Reddit Subreddit Scraper

Author: backhoe

215 runs

8 users

About Reddit Subreddit Scraper

Reddit Subreddit Scraper is your plug-and-play radar for Reddit communities: it harvests fresh stats from 100+ subreddits via Apify Residential proxies, returns clean JSON, and drops straight into AI pipelines or dashboards within minutes.

What does this actor do?

Reddit Subreddit Scraper is a web scraping tool that runs on the Apify platform. It fetches metadata for the subreddits you list (subscribers, title, description, active users, and similar fields) and stores one result per subreddit in an Apify dataset.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Open Reddit Subreddit Scraper on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

# Reddit Subreddit Scraper & Proxy Pipeline

Fetch live subreddit metadata with built-in ban resistance, smart fallbacks, and automatic proxy rotation.

This Actor is designed to be the most reliable way to check subreddit details (subscribers, title, description, active users, etc.) without getting blocked. It intelligently switches between Reddit's API and web endpoints to ensure data delivery.

---

## 🚀 Key Features

- 🛡️ Smart Ban Resistance: Automatically rotates Apify Residential Proxies for every single request.
- 🔄 Dual-Mode Fetching: Tries the official Reddit API first. If blocked (403/429), it seamlessly falls back to web scraping (`/r/name/about`) to get the data.
- ⚡ High Performance: Built on `asyncio` and `httpx` (HTTP/2) for maximum concurrency and speed.
- 🔗 Flexible Inputs: Accepts any format:
  - Subreddit names: `r/AskReddit`, `AskReddit`
  - Full URLs: `https://www.reddit.com/r/Python`
  - Reddit Fullnames: `t5_2qh1i`
- 📊 Rich Diagnostics: Returns detailed stats for every item, including HTTP status codes, attempt counts, and whether the fallback was used.

---

## 📥 Input Configuration

The Actor accepts a simple JSON input. You only need to provide the list of subreddits.

### Example Input

```json
{
  "subreddits": [
    "r/AskReddit",
    "https://www.reddit.com/r/machinelearning",
    "t5_2qh1i"
  ]
}
```

| Field | Type | Description |
|-------|------|-------------|
| `subreddits` | Array | Required. List of subreddits to fetch. Supports `r/name`, URLs, or `t5_` IDs. |
| `proxyConfiguration` | Object | Optional. If omitted, the Actor enables Apify Residential proxies automatically (recommended for Reddit). |

---

## 📤 Output Data

The results are stored in the default Apify Dataset. Each item represents one subreddit and contains the full response from Reddit.
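Every output item records how its input was interpreted (the `input_type` and `identifier_value` fields). As a rough illustration of the accepted input formats, the sketch below tells the three apart. It is not the Actor's actual code: the function name `classify_subreddit_input`, the `"url"` and `"fullname"` labels, and the name pattern are assumptions; only `"name"` appears in the documented output.

```python
import re
from urllib.parse import urlparse


def classify_subreddit_input(raw: str) -> tuple[str, str]:
    """Return (input_type, identifier_value) for one `subreddits` entry.

    Hypothetical helper: the "url" and "fullname" labels are illustrative.
    """
    text = raw.strip()

    # Reddit fullnames for subreddits carry the "t5_" type prefix.
    if re.fullmatch(r"t5_[0-9a-z]+", text):
        return "fullname", text

    # Full URLs: take the path segment that follows "/r/".
    if text.startswith(("http://", "https://")):
        parts = [p for p in urlparse(text).path.split("/") if p]
        if len(parts) >= 2 and parts[0].lower() == "r":
            return "url", parts[1]
        raise ValueError(f"not a subreddit URL: {raw!r}")

    # Bare names, with or without the "r/" prefix.
    name = text[2:] if text.lower().startswith("r/") else text
    name = name.strip("/")
    # Assumed pattern: letters, digits, and underscores only.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"not a subreddit name: {raw!r}")
    return "name", name
```

The fullname check runs first because a string such as `t5_2qh1i` would otherwise also pass the plain-name pattern.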
### Success Example

```json
{
  "input_raw": "r/AskReddit",
  "input_type": "name",
  "identifier_value": "AskReddit",
  "status": "success",
  "http_status": 200,
  "response": {
    "kind": "t5",
    "data": {
      "display_name": "AskReddit",
      "title": "Ask Reddit...",
      "subscribers": 57140321,
      "active_user_count": 84210,
      "public_description": "r/AskReddit is the place to ask and answer thought-provoking questions.",
      "created_utc": 1201233135.0
    }
  },
  "used_web_fallback": false,
  "attempts_used": 1
}
```

### Error Example

If a subreddit cannot be reached after multiple retries, it is marked as an error:

```json
{
  "input_raw": "r/NonExistentSub123",
  "status": "error",
  "http_status": 404,
  "error": "404 Client Error: Not Found for url: https://www.reddit.com/r/NonExistentSub123/about.json",
  "attempts_used": 6
}
```

---

## 💡 Tips & Tricks

- Use Residential Proxies: Reddit is very strict with datacenter IPs. This Actor is pre-configured to use Residential proxies for the best success rate.
- Batch Processing: You can pass thousands of subreddits in a single run. The Actor handles concurrency automatically.
- Monitoring: The output includes `attempts_used` and `used_web_fallback`. If you see `used_web_fallback: true` often, it means the API is blocking requests, but the Actor is successfully bypassing it via the web interface.

---

### License

Apache 2.0
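The monitoring tip above can be turned into a quick health check over a run's results. This is a minimal sketch that assumes the dataset items have already been downloaded as a list of dictionaries shaped like the success and error examples; the function name `summarize_items` and the sample data are hypothetical.

```python
from collections import Counter


def summarize_items(items: list[dict]) -> dict:
    """Aggregate status, fallback, and retry statistics for one run."""
    statuses = Counter(item.get("status", "unknown") for item in items)
    successes = [item for item in items if item.get("status") == "success"]
    # Count successful items that were served by the web fallback.
    fallbacks = sum(1 for item in successes if item.get("used_web_fallback"))
    attempts = [item["attempts_used"] for item in items if "attempts_used" in item]
    return {
        "total": len(items),
        "success": statuses["success"],
        "error": statuses["error"],
        # Share of successful items that needed the web fallback.
        "fallback_rate": fallbacks / len(successes) if successes else 0.0,
        "mean_attempts": sum(attempts) / len(attempts) if attempts else 0.0,
    }


# Made-up sample items that mirror the documented output fields.
sample_items = [
    {"status": "success", "used_web_fallback": False, "attempts_used": 1},
    {"status": "success", "used_web_fallback": True, "attempts_used": 2},
    {"status": "error", "http_status": 404, "attempts_used": 6},
]
summary = summarize_items(sample_items)
```

A `fallback_rate` that climbs across runs suggests the API is blocking more requests, even while the run as a whole still succeeds.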

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: backhoe
Pricing: Paid
