
Free Reddit Scraper

Scrape posts, comments, and subreddit data with Python. No Reddit API key required. Filters by date, score, and keyword. Exports to CSV or JSON.

Download from GitHub

Free and open source. Clone the repo, add your Claude API key (optional, for AI summaries), and run.

View on GitHub →

What it does

This script uses Reddit's public JSON API (no authentication required) to scrape posts and comments from any subreddit. It filters results by score, date range, keyword, or flair, then exports clean data to CSV or JSON.
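As a sketch of how the score, date, and keyword filters might work (the function and field handling below are illustrative, not the script's actual API, though the field names match Reddit's JSON listing format):

```python
from datetime import datetime, timezone

def matches(post, min_score=0, keyword=None, after_date=None):
    """Return True if a post dict (shaped like an entry from Reddit's
    JSON listing) passes the score, keyword, and date filters.
    Illustrative sketch; the repo's actual filtering may differ."""
    if post["score"] < min_score:
        return False
    if keyword and keyword.lower() not in post["title"].lower():
        return False
    if after_date:
        created = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
        if created < after_date:
            return False
    return True

posts = [
    {"title": "My side project hit $1k MRR", "score": 850, "created_utc": 1712000000},
    {"title": "Daily discussion thread", "score": 12, "created_utc": 1712000000},
]
hits = [p for p in posts if matches(p, min_score=100, keyword="side project")]
```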

The optional Claude AI integration can summarize post content, extract key themes from a subreddit, or classify posts by sentiment. Add your API key to unlock it.
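One plausible shape for the summarization step (the prompt text, helper name, and model name below are assumptions for illustration, not the repo's actual code) is to join the scraped titles into a single prompt and send it to the Anthropic Messages API:

```python
def build_summary_prompt(posts):
    """Combine scraped post titles into one summarization prompt.
    Illustrative helper; the script's real prompt may differ."""
    titles = "\n".join(f"- {p['title']}" for p in posts)
    return (
        "Summarize the main themes in these Reddit post titles "
        "in 3 bullet points:\n" + titles
    )

prompt = build_summary_prompt([
    {"title": "Best Python libraries for data analysis"},
    {"title": "How I automated my job with Python"},
])

# Sending the prompt requires a Claude API key:
# import anthropic
# client = anthropic.Anthropic(api_key="sk-ant-...")
# msg = client.messages.create(
#     model="claude-sonnet-4-20250514",   # model name is an assumption
#     max_tokens=300,
#     messages=[{"role": "user", "content": prompt}],
# )
# print(msg.content[0].text)
```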

Features

- Scrapes posts and comments from any subreddit via Reddit's public JSON API
- No Reddit API key or authentication required
- Filters by score, date range, keyword, or flair
- Exports clean data to CSV or JSON
- Automatic pagination with rate-limit delays
- Optional Claude AI summaries, theme extraction, and sentiment classification

Quick start

git clone https://github.com/Get-Ai-Tools/reddit-scraper
cd reddit-scraper
pip install -r requirements.txt

# Scrape top 100 posts from r/python
python scraper.py --subreddit python --sort top --limit 100

# With keyword filter and CSV export
python scraper.py --subreddit entrepreneur --keyword "side project" --output csv

# With AI summarization (requires Claude API key)
python scraper.py --subreddit machinelearning --sort hot --summarize --api-key sk-ant-...
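The flags above map naturally onto an argparse interface. This is a hedged reconstruction of what scraper.py's argument parsing might look like, not the repo's verbatim code (defaults and choices are assumptions):

```python
import argparse

def build_parser():
    """Parser mirroring the flags shown in the quick start.
    A sketch; the repo's actual flag set and defaults may differ."""
    p = argparse.ArgumentParser(
        description="Scrape a subreddit via Reddit's public JSON API"
    )
    p.add_argument("--subreddit", required=True)
    p.add_argument("--sort", choices=["hot", "new", "top", "rising"], default="hot")
    p.add_argument("--limit", type=int, default=100)
    p.add_argument("--keyword")
    p.add_argument("--output", choices=["csv", "json"], default="json")
    p.add_argument("--summarize", action="store_true")
    p.add_argument("--api-key", dest="api_key")
    return p

# Parsing the first quick-start command:
args = build_parser().parse_args(
    ["--subreddit", "python", "--sort", "top", "--limit", "100"]
)
```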

Requirements

Python 3 and the dependencies listed in requirements.txt (installed via pip). A Claude API key is needed only for the optional AI summaries.

Example output

title,score,comments,url,created,flair
"Show HN: I built a free Reddit scraper",847,203,https://...,2026-04-01,Show HN
"Best Python libraries for data analysis",1204,89,https://...,2026-04-02,Discussion
...
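The CSV output loads back cleanly with the standard library. Here the sample rows above are parsed from a string using the header shown:

```python
import csv
import io

sample = """title,score,comments,url,created,flair
"Show HN: I built a free Reddit scraper",847,203,https://...,2026-04-01,Show HN
"Best Python libraries for data analysis",1204,89,https://...,2026-04-02,Discussion
"""

# DictReader keys each row by the header line
rows = list(csv.DictReader(io.StringIO(sample)))
top = max(rows, key=lambda r: int(r["score"]))
```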

Frequently asked questions

Does this Reddit scraper require an API key?

No. It uses Reddit's public JSON API (append .json to any Reddit URL) which requires no authentication. A Claude API key is only needed if you want AI-powered summaries.
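The ".json trick" takes only a couple of lines. The helper name below is mine, but the URL shape is exactly what the public endpoint expects (sending a descriptive User-Agent header is a common courtesy to avoid throttling):

```python
def to_json_url(reddit_url):
    """Append .json to a Reddit URL to get its public JSON endpoint."""
    return reddit_url.rstrip("/") + ".json"

url = to_json_url("https://www.reddit.com/r/python/top/")

# Fetching it (requires network access):
# import json, urllib.request
# req = urllib.request.Request(
#     url, headers={"User-Agent": "reddit-scraper-demo/0.1"}  # name is illustrative
# )
# data = json.load(urllib.request.urlopen(req))
```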

Is scraping Reddit legal?

Scraping publicly available data is generally legal in most jurisdictions, but you should review Reddit's Terms of Service for your specific use case. This tool respects rate limits and only accesses public data.

How many posts can I scrape at once?

Reddit's public API returns up to 100 posts per request. The script handles pagination automatically: set --limit to any number and it will make multiple requests as needed, with delays to respect rate limits.
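Pagination against the public API works by passing each response's "after" cursor into the next request. The batching logic can be sketched as follows (helper names are illustrative, not the script's actual code):

```python
PAGE_SIZE = 100  # Reddit's per-request maximum

def plan_batches(limit, page_size=PAGE_SIZE):
    """Split a requested --limit into per-request batch sizes."""
    sizes = []
    remaining = limit
    while remaining > 0:
        sizes.append(min(page_size, remaining))
        remaining -= page_size
    return sizes

# e.g. --limit 250 becomes three requests of 100, 100, and 50
batches = plan_batches(250)

# Each real request would pass the previous response's "after" cursor
# and sleep briefly between calls, roughly:
# for size in batches:
#     listing = fetch(subreddit, limit=size, after=cursor)  # hypothetical fetch()
#     cursor = listing["data"]["after"]
#     time.sleep(2)  # rate-limit delay; interval is an assumption
```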