Stephen Simpson 316610e932 updates
2025-12-10 11:16:55 -06:00

🚀 Rocky Man 🚀

Rocky Man generates searchable HTML documentation from the man pages shipped in the BaseOS and AppStream repositories of Rocky Linux 8, 9, and 10.

Features

  • Uses filelists.xml to pre-filter packages with man pages
  • Processes packages from BaseOS and AppStream repositories
  • Runs in containers on x86_64 and aarch64 (arm64) architectures
  • Configurable cleanup of temporary files
  • Concurrent downloads and conversions
  • Supports Rocky Linux 8, 9, and 10

Quick Start

Podman

# Build the image
podman build -t rocky-man .

# Generate for specific versions
podman run --rm -v $(pwd)/html:/data/html:Z rocky-man \
  --versions 8.10 9.6 10.0

# Keep downloaded RPMs for multiple builds
podman run --rm -it \
  -v $(pwd)/html:/data/html:Z \
  -v $(pwd)/downloads:/data/tmp/downloads:Z \
  rocky-man --versions 9.6 --keep-rpms --verbose

View the HTML Locally

Start a local web server to browse the generated documentation:

python3 -m http.server -d ./html

Then open http://127.0.0.1:8000 in your browser.

To use a different port:

python3 -m http.server 8080 -d ./html

Directory Structure in Container

The container uses the following paths:

  • /data/html - Generated HTML output
  • /data/tmp/downloads - Downloaded RPM files
  • /data/tmp/extracts - Extracted man page files

These paths are the defaults; override them with --output-dir, --download-dir, and --extract-dir respectively.

Local Development

Important: Rocky Man requires Rocky Linux because it uses the system's native python3-dnf module to interact with DNF repositories. This module cannot be installed via pip and must come from the Rocky Linux system packages.

Option 1: In a Rocky Linux Container

# Start a Rocky Linux container with your project mounted
podman run --rm -it -v $(pwd):/workspace:Z rockylinux/rockylinux:9 /bin/bash

# Inside the container, navigate to the project
cd /workspace

# Install epel-release for mandoc
dnf install -y epel-release

# Install system dependencies
dnf install -y python3 python3-pip python3-dnf mandoc rpm-build dnf-plugins-core

# Install Python dependencies
pip3 install -e .

# Run the tool
python3 -m rocky_man.main --versions 9.6 --output-dir ./html/

Option 2: On a Native Rocky Linux System

# Install epel-release for mandoc
dnf install -y epel-release

# Install system dependencies
dnf install -y python3 python3-pip python3-dnf mandoc rpm-build dnf-plugins-core

# Install Python dependencies
pip3 install -e .

# Run the tool
python3 -m rocky_man.main --versions 9.6 --output-dir ./html/

Architecture

Rocky Man is organized into the following components:

rocky-man/
├── src/rocky_man/
│   ├── models/              # Data models (Package, ManFile)
│   ├── repo/                # Repository management
│   ├── processor/           # Man page processing
│   ├── web/                 # Web page generation
│   ├── utils/               # Utilities
│   └── main.py              # Main entry point and orchestration
├── templates/               # Jinja2 templates
├── Dockerfile               # Multi-stage, arch-independent
└── pyproject.toml           # Python project configuration
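
The models/ entry above names a ManFile data model. As a rough illustration only, the sketch below shows one shape such a model could take and how the section and language of a man page can be derived from its install path. The field names, the from_path helper, the should_skip function, and the regular expression are all assumptions made for this example, not the project's actual code; the default skip list (3, 3p, 3pm) is taken from the --skip-sections help text.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical pattern for paths such as
#   /usr/share/man/man1/bash.1.gz
#   /usr/share/man/de/man1/ls.1.gz   (translated page, language "de")
_MAN_PATH = re.compile(
    r"^/usr/share/man/(?:(?P<lang>[^/]+)/)?man[^/]+/"
    r"(?P<name>.+)\.(?P<section>[0-9][^./]*)(?:\.(?:gz|bz2|xz|zst))?$"
)


@dataclass(frozen=True)
class ManFile:
    """Illustrative stand-in for the ManFile model (fields are assumed)."""

    name: str
    section: str
    language: Optional[str]  # None for pages in the default, untranslated tree

    @classmethod
    def from_path(cls, path: str) -> Optional["ManFile"]:
        """Parse a man page path; return None if it is not a man page."""
        match = _MAN_PATH.match(path)
        if match is None:
            return None
        return cls(match.group("name"), match.group("section"), match.group("lang"))


def should_skip(
    man: ManFile,
    skip_sections: Sequence[str] = ("3", "3p", "3pm"),
    skip_languages: bool = True,
) -> bool:
    """Apply section and language filters (exact section match is assumed)."""
    if man.section in skip_sections:
        return True
    return skip_languages and man.language is not None


bash = ManFile.from_path("/usr/share/man/man1/bash.1.gz")
german_ls = ManFile.from_path("/usr/share/man/de/man1/ls.1.gz")
print(bash, should_skip(bash))
print(german_ls, should_skip(german_ls))
```

Keeping the path parsing on the model means the filters only ever compare plain strings, which makes them easy to test without touching any RPM files.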

How It Works

  1. Package Discovery - Parses repository metadata (repodata/repomd.xml and filelists.xml.gz) to identify packages containing files in /usr/share/man/ directories
  2. Package Download - Downloads identified RPM packages using DNF, with configurable parallel downloads (default: 5)
  3. Man Page Extraction - Extracts man page files from RPMs using rpm2cpio, filtering by section and language based on configuration
  4. HTML Conversion - Converts troff-formatted man pages to HTML using mandoc, with parallel processing (default: 10 workers)
  5. Cross-Reference Linking - Parses converted HTML to add hyperlinks between man page references (e.g., bash(1) becomes clickable)
  6. Index Generation - Creates search indexes (JSON/gzipped) and navigation pages using Jinja2 templates
  7. Cleanup - Removes temporary files (RPMs and extracted content) unless --keep-rpms or --keep-extracts is specified
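
Step 1 above can be sketched with the Python standard library alone. The example below is a minimal illustration, assuming the filelists.xml.gz payload has already been downloaded into memory; the function name packages_with_man_pages and the sample XML are made up for this example and do not come from the project. It streams the XML and keeps the name of every package that lists at least one file under /usr/share/man/.

```python
import gzip
import io
import xml.etree.ElementTree as ET
from typing import Set

# XML namespace used by filelists metadata in RPM repositories.
FILELISTS_NS = "{http://linux.duke.edu/metadata/filelists}"
MAN_PREFIX = "/usr/share/man/"


def packages_with_man_pages(filelists_gz: bytes) -> Set[str]:
    """Return names of packages that list a file under MAN_PREFIX."""
    names: Set[str] = set()
    with gzip.open(io.BytesIO(filelists_gz)) as stream:
        # iterparse streams the document, so a large filelists file
        # is processed one <package> element at a time.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag != FILELISTS_NS + "package":
                continue
            for file_elem in elem.iter(FILELISTS_NS + "file"):
                if (file_elem.text or "").startswith(MAN_PREFIX):
                    names.add(elem.get("name"))
                    break
            elem.clear()  # drop the children of the package just handled
    return names


# Tiny hand-written sample in the filelists format (not real repository data).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="2">
  <package pkgid="a1" name="bash" arch="x86_64">
    <file>/usr/bin/bash</file>
    <file>/usr/share/man/man1/bash.1.gz</file>
  </package>
  <package pkgid="b2" name="no-docs" arch="noarch">
    <file>/usr/lib/no-docs/data.bin</file>
  </package>
</filelists>
"""

found = packages_with_man_pages(gzip.compress(SAMPLE))
print(sorted(found))  # ['bash']
```

Filtering on metadata first means only packages that can contribute man pages need to be downloaded in step 2.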

Command Line Options

usage: main.py [-h] [--versions VERSIONS [VERSIONS ...]]
               [--repo-types REPO_TYPES [REPO_TYPES ...]]
               [--output-dir OUTPUT_DIR] [--download-dir DOWNLOAD_DIR]
               [--extract-dir EXTRACT_DIR] [--keep-rpms] [--keep-extracts]
               [--parallel-downloads PARALLEL_DOWNLOADS]
               [--parallel-conversions PARALLEL_CONVERSIONS] [--mirror MIRROR]
               [--vault] [--existing-versions [VERSION ...]]
               [--template-dir TEMPLATE_DIR] [-v]
               [--skip-sections [SKIP_SECTIONS ...]]
               [--skip-packages [SKIP_PACKAGES ...]] [--skip-languages]
               [--keep-languages] [--allow-all-sections]

Generate HTML documentation for Rocky Linux man pages

optional arguments:
  -h, --help            show this help message and exit
  --versions VERSIONS [VERSIONS ...]
                        Rocky Linux versions to process (default: 8.10 9.6 10.0)
  --repo-types REPO_TYPES [REPO_TYPES ...]
                        Repository types to process (default: BaseOS AppStream)
  --output-dir OUTPUT_DIR
                        Output directory for HTML files (default: /data/html)
  --download-dir DOWNLOAD_DIR
                        Directory for downloading packages (default: /data/tmp/downloads)
  --extract-dir EXTRACT_DIR
                        Directory for extracting man pages (default: /data/tmp/extracts)
  --keep-rpms           Keep downloaded RPM files after processing
  --keep-extracts       Keep extracted man files after processing
  --parallel-downloads PARALLEL_DOWNLOADS
                        Number of parallel downloads (default: 5)
  --parallel-conversions PARALLEL_CONVERSIONS
                        Number of parallel HTML conversions (default: 10)
  --mirror MIRROR       Rocky Linux mirror URL (default: http://dl.rockylinux.org/)
  --vault               Use vault directory instead of pub (vault/rocky instead of pub/rocky)
  --existing-versions [VERSION ...]
                        List of existing versions to include in root index (e.g., 8.10 9.7)
  --template-dir TEMPLATE_DIR
                        Template directory (default: ./templates)
  -v, --verbose         Enable verbose logging
  --skip-sections [SKIP_SECTIONS ...]
                        Man sections to skip (default: 3 3p 3pm). Use empty list to skip none.
  --skip-packages [SKIP_PACKAGES ...]
                        Package names to skip (default: lapack dpdk-devel gl-manpages). Use empty list to skip none.
  --skip-languages      Skip non-English man pages (default: enabled)
  --keep-languages      Keep all languages (disables --skip-languages)
  --allow-all-sections  Include all man sections (overrides --skip-sections)
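
The interaction between --skip-sections and --allow-all-sections can be illustrated with a small argparse sketch. This is a partial reconstruction from the help text above, not the project's actual parser: only three of the options are modelled, and the effective_skip_sections helper is a hypothetical name introduced here. It relies on standard argparse behavior, where an option declared with nargs="*" yields an empty list when the flag is given without values.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Model a subset of the documented options (defaults from the help text)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--versions", nargs="+", default=["8.10", "9.6", "10.0"])
    parser.add_argument("--skip-sections", nargs="*", default=["3", "3p", "3pm"])
    parser.add_argument("--allow-all-sections", action="store_true")
    return parser


def effective_skip_sections(args: argparse.Namespace) -> List[str]:
    """Hypothetical helper: --allow-all-sections overrides --skip-sections."""
    return [] if args.allow_all_sections else list(args.skip_sections)


parser = build_parser()

# No flags: the documented defaults apply.
defaults = parser.parse_args([])

# Flag given with no values: an empty list, so no section is skipped.
none_skipped = parser.parse_args(["--versions", "9.6", "--skip-sections"])

# --allow-all-sections wins even when sections are listed explicitly.
overridden = parser.parse_args(["--skip-sections", "3", "--allow-all-sections"])

print(effective_skip_sections(defaults))
print(effective_skip_sections(none_skipped))
print(effective_skip_sections(overridden))
```

On the command line, the second case corresponds to passing --skip-sections as a bare flag, for example: python3 -m rocky_man.main --versions 9.6 --skip-sections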

Attribution

The man pages displayed in this documentation are sourced from Rocky Linux distribution packages. All man page content is copyrighted by its respective authors and distributed under the licenses specified within each man page.

This tool generates HTML documentation from man pages contained in Rocky Linux packages but does not modify the content of the man pages themselves.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Third-Party Software

This project relies on several open source components. Key dependencies include:

  • mandoc - Man page converter (ISC License)
  • python3-dnf - DNF package manager Python bindings (GPL-2.0-or-later)
  • Fuse.js - Client-side search (Apache 2.0)
  • Python packages: requests, rpmfile, Jinja2, lxml, zstandard
  • Fonts: Red Hat Display, Red Hat Text, JetBrains Mono (SIL OFL)

Trademark Notice

Rocky Linux is a trademark of the Rocky Enterprise Software Foundation (RESF). This project is not officially affiliated with or endorsed by RESF. All trademarks are the property of their respective owners. This project complies with RESF's trademark usage guidelines.
