Free PDF Summarizer

Summarize any PDF with Claude AI. Get bullet-point summaries, key takeaways, and Q&A on document content. Handles large files automatically with smart chunking. Batch process entire folders.

Download from GitHub

Free and open source. Bring your own Claude API key - costs fractions of a cent per PDF.

View on GitHub →

What it does

This script extracts text from any PDF using pdfplumber and sends it to Claude AI for summarization. For large files, it chunks the content intelligently so nothing gets cut off. You can also run it in Q&A mode - ask any question about the document and get an answer.

Useful for summarizing research papers, legal documents, reports, e-books, or any PDF you need to quickly understand without reading the whole thing.

Features

Summarize any PDF - research papers, reports, books, contracts
Smart chunking for large files (no context limit issues)
Q&A mode: ask questions about the document
Batch process entire folders of PDFs
Export summaries to Markdown or JSON
Extracts key takeaways as bullet points

Quick start

git clone https://github.com/Get-Ai-Tools/pdf-summarizer
cd pdf-summarizer
pip install -r requirements.txt

# Summarize a PDF
python summarizer.py --api-key sk-ant-... --pdf report.pdf

# Q&A mode
python summarizer.py --api-key sk-ant-... --pdf contract.pdf --mode qa

# Batch process a folder
python summarizer.py --api-key sk-ant-... --folder ./pdfs --out ./summaries

Requirements

Python 3.8+
pdfplumber, anthropic (auto-installed)
Claude API key - get one free at console.anthropic.com

Frequently asked questions

Does it work on scanned PDFs?

It works best on PDFs with selectable text. Scanned image-only PDFs require OCR preprocessing (e.g., with Tesseract) before the text can be extracted.

How much does it cost to run?

Using claude-haiku-4-5 (the default), a typical 10-page PDF costs around $0.001-$0.005 to summarize. You can process hundreds of documents for less than a dollar.

Can it handle very large PDFs?

Yes. The tool chunks large PDFs and summarizes them section by section, then combines the results. There's no hard page limit.