A Python library and CLI toolkit that brings PDF files alive with the power of LLMs.
Highlights
| Feature | Details |
|---|---|
| Automatic TOC generation | Generate clickable Table of Contents (PDF bookmarks) using LLM inference with intelligent batching for arbitrarily large documents |
| Smart OCR detection | Automatically detects scanned PDFs and performs OCR via Tesseract when needed |
| Intelligent file renaming | Batch rename files using natural language instructions with LLM-powered inference and confidence scoring |
| Multi-provider LLM support | Use any LLM provider via LangChain: OpenAI, Anthropic, local models via Ollama, and more |
| TOC postprocessing | Optional second LLM pass cross-references against printed TOC pages to fix typos, remove duplicates, and correct hierarchy |
| TOML configuration | Set persistent defaults for any CLI option via pdfalive.toml config files with per-command sections |
| Built-in resilience | Automatic retry logic with exponential backoff for handling API rate limits |
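The retry behaviour in the last row can be pictured with a minimal sketch. This is not pdfalive's actual implementation: the helper name `retry_with_backoff`, its parameters, the jitter, and the choice to catch every exception are assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call` until it succeeds, doubling the wait after each failure.

    Hypothetical helper: a real client would catch only rate-limit errors.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # 1x, 2x, 4x, ... the base delay, capped at max_delay
            delay = min(base_delay * 2**attempt, max_delay)
            # up to 10% random jitter so parallel clients do not retry in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.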
Installation
Tesseract is required for OCR functionality. On macOS:
brew install tesseract
Install pdfalive via pip:
pip install pdfalive
Or run directly without installation using uvx:
uvx pdfalive --help
Usage
Use --help on any command for detailed options:
pdfalive --help
pdfalive generate-toc --help
generate-toc
Generate a clickable Table of Contents using PDF bookmarks. The tool extracts font and text features from the PDF and uses an LLM to intelligently identify chapter and section headings.
pdfalive generate-toc input.pdf output.pdf
# Or modify the file in place
pdfalive generate-toc --inplace input.pdf
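Large documents cannot be sent to an LLM in one request, which is why the feature list mentions batching. The sketch below shows one simple way to group extracted text blocks under a size budget. It is an illustration only: `batch_by_budget` is a hypothetical name, and measuring size in characters rather than tokens is an assumption, not a description of how pdfalive batches.

```python
from typing import Iterator


def batch_by_budget(blocks: list[str], max_chars: int) -> Iterator[list[str]]:
    """Yield consecutive groups of blocks whose total length fits max_chars."""
    batch: list[str] = []
    used = 0
    for block in blocks:
        # Close the current batch when the next block would exceed the budget.
        if batch and used + len(block) > max_chars:
            yield batch
            batch, used = [], 0
        # A single oversized block still gets a batch of its own.
        batch.append(block)
        used += len(block)
    if batch:
        yield batch
```

Each batch would then go to the model in a separate call, with the per-batch results merged into one outline.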
Choosing an LLM:
By default, pdfalive uses an OpenAI model (the rename command's documented default is gpt-5.2). Pass --model-identifier to select any LangChain-supported model:
# Use Claude
pdfalive generate-toc --model-identifier 'claude-sonnet-4-5' input.pdf output.pdf
# Use a local model via Ollama
pdfalive generate-toc --model-identifier 'ollama/llama3' input.pdf output.pdf
Set the appropriate API key for your provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Scanned PDFs:
OCR is enabled by default. Scanned documents without extractable text are automatically detected and processed:
# Default: OCR text layer discarded after TOC generation (preserves file size)
pdfalive generate-toc scanned.pdf output.pdf
# Include OCR text layer in output (makes PDF searchable)
pdfalive generate-toc --ocr-output scanned.pdf output.pdf
# Disable automatic OCR entirely
pdfalive generate-toc --no-ocr input.pdf output.pdf
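One plausible way to detect a scanned document is to check how many pages yield any extractable text. The sketch below works on a plain list of per-page text strings, so it does not depend on a PDF library. The function name `looks_scanned` and both thresholds are assumptions for illustration; pdfalive's own heuristic may differ.

```python
def looks_scanned(
    page_texts: list[str],
    min_chars_per_page: int = 20,
    max_text_page_ratio: float = 0.1,
) -> bool:
    """Guess whether a PDF is a scan, given the text extracted from each page.

    A page counts as "having text" when it holds at least min_chars_per_page
    non-whitespace-trimmed characters. The document is treated as scanned when
    at most max_text_page_ratio of its pages have text.
    """
    if not page_texts:
        return False  # nothing to judge; do not trigger OCR on an empty input
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= min_chars_per_page
    )
    return pages_with_text / len(page_texts) <= max_text_page_ratio
```

Tolerating a small share of text pages keeps a scan with a digitally added cover page from being misclassified.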
Postprocessing:
For documents with a printed table of contents page, enable LLM postprocessing to refine results:
Postprocessing uses an additional LLM call to:
- Remove duplicate entries and fix typos
- Cross-reference against any printed TOC found in the document
- Add missing entries and correct page numbers
Other options:
| Option | Description |
|---|---|
| `--inplace` | Modify the input file in place instead of creating a new output file |
| `--force` | Overwrite existing TOC if the PDF already has bookmarks |
| `--ocr-language` | Set OCR language (default: eng). Use Tesseract language codes |
| `--request-delay` | Delay between LLM calls for rate limiting (default: 10s) |
extract-text
Extract text from scanned PDFs using OCR and save to a new PDF with an embedded text layer:
pdfalive extract-text input.pdf output.pdf
# Or modify the file in place
pdfalive extract-text --inplace input.pdf
This creates a searchable/selectable PDF without generating a TOC.
Options:
| Option | Description |
|---|---|
| `--inplace` | Modify the input file in place instead of creating a new output file |
| `--force` | Force OCR even if document already has text |
| `--ocr-language` | Set OCR language (default: eng) |
| `--ocr-dpi` | DPI resolution for OCR processing (default: 300) |
rename
Intelligently rename files using LLM inference. Analyzes filenames and applies renaming rules based on natural language instructions.
Custom naming formats:
Specify exact formatting including special characters — the LLM respects brackets, parentheses, dashes, and other formatting:
Reading paths from a file:
When dealing with many files or long filenames that exceed command-line limits, use the -f/--input-file option to read paths from a text file (one per line):
# Generate a list of files to rename
find /path/to/docs -name "*.pdf" > files.txt
# Rename using the file list
pdfalive rename -q "Standardize filenames" -f files.txt
In the input file, lines starting with # are treated as comments, and blank lines are ignored.
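The path-list format is simple enough to parse in a few lines. The sketch below is an illustration, not pdfalive's code: `read_path_list` is a hypothetical name, and trimming surrounding whitespace is an assumption. Inline comments after a path are deliberately not handled, since `#` can legitimately appear in a filename.

```python
from pathlib import Path


def read_path_list(text: str) -> list[Path]:
    """Parse a path list: one path per line, '#' comment lines and blanks skipped."""
    paths: list[Path] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        paths.append(Path(stripped))
    return paths
```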
Workflow:
- The tool analyzes each filename and generates rename suggestions
- A preview table shows original names, proposed names, confidence scores, and reasoning
- Confirm or cancel the operation (unless -y is used)
- Files are renamed in place
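The preview-then-confirm flow above can be sketched as follows. All names here (`RenameSuggestion`, `format_preview`, `apply_renames`) are hypothetical, and skipping targets that already exist is an assumed safety choice rather than documented behaviour. In the real tool the suggestions come from the LLM; here they are plain data.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class RenameSuggestion:
    source: Path        # existing file
    new_name: str       # proposed filename (same directory)
    confidence: float   # 0.0 to 1.0
    reasoning: str


def format_preview(suggestions: list[RenameSuggestion]) -> str:
    """Render one line per suggestion for the user to review."""
    return "\n".join(
        f"{s.source.name} -> {s.new_name} ({s.confidence:.0%}): {s.reasoning}"
        for s in suggestions
    )


def apply_renames(
    suggestions: list[RenameSuggestion],
    confirm: Callable[[str], bool],
    assume_yes: bool = False,
) -> list[Path]:
    """Rename files in place after confirmation; return the new paths."""
    if not assume_yes and not confirm(format_preview(suggestions)):
        return []  # user cancelled: nothing is touched
    renamed: list[Path] = []
    for s in suggestions:
        target = s.source.with_name(s.new_name)
        if target.exists():
            continue  # assumed policy: never overwrite an existing file
        s.source.rename(target)
        renamed.append(target)
    return renamed
```

Setting `assume_yes=True` mirrors what the -y flag does in the workflow: the preview step is skipped and renames are applied directly.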
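Automatic confirmation:
# Apply renames without the confirmation prompt
pdfalive rename -q "Standardize filenames" -f files.txt -y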
Options:
| Option | Description |
|---|---|
| `-f`, `--input-file` | Read input file paths from a text file (one per line) |
| `--model-identifier` | Choose which LLM to use (default: gpt-5.2) |
| `-y`, `--yes` | Automatically apply renames without confirmation |
| `--show-token-usage` | Display token usage statistics (default: enabled) |
Configuration
pdfalive supports TOML configuration files for setting default options. This is useful for frequently used settings, such as the --query argument for rename.
Config file locations (searched in order):
1. pdfalive.toml or .pdfalive.toml in the current directory
2. pdfalive.toml or .pdfalive.toml in your home directory
3. ~/.config/pdfalive/pdfalive.toml
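The search order above could be implemented roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the first file found wins with no merging across locations, and within one directory `pdfalive.toml` is checked before `.pdfalive.toml` (the order they are listed in, which may not be the tool's actual tie-break).

```python
from pathlib import Path
from typing import Optional

CONFIG_NAMES = ("pdfalive.toml", ".pdfalive.toml")


def candidate_config_paths(cwd: Path, home: Path) -> list[Path]:
    """List config locations from highest to lowest search priority."""
    paths = [cwd / name for name in CONFIG_NAMES]       # 1. current directory
    paths += [home / name for name in CONFIG_NAMES]     # 2. home directory
    paths.append(home / ".config" / "pdfalive" / "pdfalive.toml")  # 3. ~/.config
    return paths


def find_config(cwd: Path, home: Path) -> Optional[Path]:
    """Return the first existing config file, or None if there is none."""
    for path in candidate_config_paths(cwd, home):
        if path.is_file():
            return path
    return None
```

In normal use the arguments would be `Path.cwd()` and `Path.home()`; taking them as parameters keeps the lookup easy to test.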
Example pdfalive.toml:
# Global settings (shared across commands)
[global]
model-identifier = "gpt-5.2"
show-token-usage = true
# Settings for generate-toc command
[generate-toc]
force = false
request-delay = 10.0
ocr = true
ocr-language = "eng"
ocr-dpi = 300
postprocess = false
# Settings for extract-text command
[extract-text]
ocr-language = "eng"
ocr-dpi = 300
force = false
# Settings for rename command
[rename]
query = "Rename to \"[Author Last Name] Book Title, Edition (Year).pdf\""
yes = false
Using a specific config file:
Override hierarchy:
1. Code defaults (lowest priority)
2. Config file values
3. CLI arguments (highest priority)
CLI arguments always override config file settings.
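The three-level hierarchy amounts to layered dictionary updates. The sketch below is illustrative only: `resolve_options` is a hypothetical name, the config is assumed to be already parsed into a dict (for example with the standard-library `tomllib` on Python 3.11+), and letting a per-command section override `[global]` is an assumption, since the text does not say how the two interact.

```python
from typing import Any


def resolve_options(
    command: str,
    code_defaults: dict[str, Any],
    config: dict[str, dict[str, Any]],
    cli_args: dict[str, Any],
) -> dict[str, Any]:
    """Merge option sources from lowest to highest priority."""
    merged = dict(code_defaults)                 # 1. code defaults
    merged.update(config.get("global", {}))      # 2a. [global] section
    merged.update(config.get(command, {}))       # 2b. per-command section (assumed to win)
    # 3. CLI arguments, counting only those the user actually passed
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged
```

For example, with a config that sets `ocr-language = "eng"` under `[generate-toc]`, passing `--ocr-language deu` on the command line would yield `deu`, while options not given on the command line fall back to the config file and then to the code defaults.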
Development
We use uv to manage the project:
Code quality tools:
| Tool | Purpose |
|---|---|
| ruff | Formatting and linting |
| mypy | Static type checking |
| pytest | Unit testing |
| pre-commit | Git hooks for quality checks |
# Run linting
uv run ruff check .
uv run ruff format .
# Run type checking
uv run mypy pdfalive
# Run tests
uv run pytest
License
pdfalive is distributed under the terms of the MIT License.