Reddit Subreddit Scraper
by backhoe
About Reddit Subreddit Scraper
Reddit Subreddit Scraper is your plug-and-play radar for Reddit communities: it harvests fresh stats from 100+ subreddits via Apify Residential proxies, returns clean JSON, and drops straight into AI pipelines or dashboards within minutes.
What does this actor do?
Reddit Subreddit Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
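The same steps can also be driven from code rather than the web console. The following is a minimal sketch, assuming the `apify-client` Python package is installed and an API token is available in an `APIFY_TOKEN` environment variable. The Actor ID is a placeholder, not the real identifier; only the `subreddits` input field is taken from the Actor's documentation.

```python
import os

# Placeholder: replace with the Actor's real ID from its Apify page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(subreddits):
    """Build the Actor input; `subreddits` is the only required field."""
    cleaned = [s.strip() for s in subreddits if s and s.strip()]
    if not cleaned:
        raise ValueError("Provide at least one subreddit")
    return {"subreddits": cleaned}


def run_actor(actor_id, subreddits, token):
    """Start a run, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(subreddits))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Only contacts Apify when a token is configured.
token = os.environ.get("APIFY_TOKEN")
if token:
    for item in run_actor(ACTOR_ID, ["r/AskReddit", "t5_2qh1i"], token):
        print(item.get("input_raw"), item.get("status"))
```

Keeping the input construction in its own function makes it easy to validate a subreddit list before paying for a run.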
Documentation
Reddit Subreddit Scraper & Proxy Pipeline
Fetch live subreddit metadata with built-in ban resistance, smart fallbacks, and automatic proxy rotation.

This Actor is designed to be the most reliable way to check subreddit details (subscribers, title, description, active users, etc.) without getting blocked. It intelligently switches between Reddit's API and web endpoints to ensure data delivery.

---

## 🚀 Key Features

- 🛡️ Smart Ban Resistance: Automatically rotates Apify Residential Proxies for every single request.
- 🔄 Dual-Mode Fetching: Tries the official Reddit API first. If blocked (403/429), it seamlessly falls back to web scraping (`/r/name/about`) to get the data.
- ⚡ High Performance: Built on `asyncio` and `httpx` (HTTP/2) for maximum concurrency and speed.
- 🔗 Flexible Inputs: Accepts any format:
  - Subreddit names: `r/AskReddit`, `AskReddit`
  - Full URLs: `https://www.reddit.com/r/Python`
  - Reddit Fullnames: `t5_2qh1i`
- 📊 Rich Diagnostics: Returns detailed stats for every item, including HTTP status codes, attempt counts, and whether the fallback was used.

---

## 📥 Input Configuration

The Actor accepts a simple JSON input. You only need to provide the list of subreddits.

### Example Input

```json
{
  "subreddits": [
    "r/AskReddit",
    "https://www.reddit.com/r/machinelearning",
    "t5_2qh1i"
  ]
}
```

| Field | Type | Description |
|-------|------|-------------|
| `subreddits` | Array | Required. List of subreddits to fetch. Supports `r/name`, URLs, or `t5_` IDs. |
| `proxyConfiguration` | Object | Optional. If omitted, the Actor enables Apify Residential proxies automatically (recommended for Reddit). |

---

## 📤 Output Data

The results are stored in the default Apify Dataset. Each item represents one subreddit and contains the full response from Reddit.
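Each item also records how its input was interpreted and whether the web fallback was needed. The sketch below illustrates that per-item flow under stated assumptions; it is not the Actor's actual code. The `input_type` values `"url"` and `"fullname"` are guesses (only `"name"` appears in the documented output), the two fetcher callables are hypothetical stand-ins for the API and web requests, and retries and proxy rotation are left out.

```python
import re
from typing import Callable, Tuple

# A fetcher takes an identifier and returns (http_status, parsed_body).
Fetcher = Callable[[str], Tuple[int, dict]]

FULLNAME_RE = re.compile(r"^t5_[0-9a-z]+$")
URL_RE = re.compile(
    r"^https?://(?:[\w-]+\.)?reddit\.com/r/([A-Za-z0-9_]+)", re.IGNORECASE
)
NAME_RE = re.compile(r"^(?:/?r/)?([A-Za-z0-9_]+)/?$")


def classify(raw: str) -> Tuple[str, str]:
    """Return (input_type, identifier_value) for one raw input string."""
    text = raw.strip()
    if FULLNAME_RE.match(text):
        return "fullname", text  # assumed label for t5_ IDs
    url = URL_RE.match(text)
    if url:
        return "url", url.group(1)  # assumed label for full URLs
    name = NAME_RE.match(text)
    if name:
        return "name", name.group(1)
    raise ValueError(f"Unrecognized subreddit input: {raw!r}")


def fetch_item(raw: str, api_fetch: Fetcher, web_fetch: Fetcher) -> dict:
    """Try the API first; on 403/429 fall back to the web endpoint."""
    input_type, identifier = classify(raw)
    item = {
        "input_raw": raw,
        "input_type": input_type,
        "identifier_value": identifier,
        "used_web_fallback": False,
    }
    status, body = api_fetch(identifier)
    attempts = 1
    if status in (403, 429):  # blocked or rate limited
        status, body = web_fetch(identifier)
        attempts += 1
        item["used_web_fallback"] = True
    item["http_status"] = status
    item["attempts_used"] = attempts
    if status == 200:
        item["status"] = "success"
        item["response"] = body
    else:
        item["status"] = "error"
        item["error"] = f"HTTP {status}"
    return item


# Stub fetchers simulate a blocked API and a working web endpoint.
def blocked_api(identifier: str) -> Tuple[int, dict]:
    return 429, {}


def web_ok(identifier: str) -> Tuple[int, dict]:
    return 200, {"kind": "t5", "data": {"display_name": identifier}}


item = fetch_item("r/AskReddit", blocked_api, web_ok)
print(item["status"], item["used_web_fallback"], item["attempts_used"])
# prints: success True 2
```

Passing the fetchers in as arguments keeps the fallback decision testable without touching the network.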
### Success Example

```json
{
  "input_raw": "r/AskReddit",
  "input_type": "name",
  "identifier_value": "AskReddit",
  "status": "success",
  "http_status": 200,
  "response": {
    "kind": "t5",
    "data": {
      "display_name": "AskReddit",
      "title": "Ask Reddit...",
      "subscribers": 57140321,
      "active_user_count": 84210,
      "public_description": "r/AskReddit is the place to ask and answer thought-provoking questions.",
      "created_utc": 1201233135.0
    }
  },
  "used_web_fallback": false,
  "attempts_used": 1
}
```

### Error Example

If a subreddit cannot be reached after multiple retries, it is marked as an error:

```json
{
  "input_raw": "r/NonExistentSub123",
  "status": "error",
  "http_status": 404,
  "error": "404 Client Error: Not Found for url: https://www.reddit.com/r/NonExistentSub123/about.json",
  "attempts_used": 6
}
```

---

## 💡 Tips & Tricks

- Use Residential Proxies: Reddit is very strict with datacenter IPs. This Actor is pre-configured to use Residential proxies for the best success rate.
- Batch Processing: You can pass thousands of subreddits in a single run. The Actor handles concurrency automatically.
- Monitoring: The output includes `attempts_used` and `used_web_fallback`. If you often see `used_web_fallback: true`, the API is blocking requests, but the Actor is successfully bypassing the block via the web interface.

---

### License

Apache 2.0
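
### Example: Summarizing a Run

The monitoring tip above can be applied to a finished run's dataset items with a small helper. This is a minimal sketch: `summarize_run` is a hypothetical function, and the sample items are trimmed copies of the documented success and error examples.

```python
from collections import Counter
from typing import Iterable


def summarize_run(items: Iterable[dict]) -> dict:
    """Aggregate status counts, fallback rate, and average attempts."""
    total = 0
    fallbacks = 0
    attempts = 0
    statuses = Counter()
    for item in items:
        total += 1
        statuses[item.get("status", "unknown")] += 1
        # Error items may omit the flag; treat a missing flag as False.
        fallbacks += bool(item.get("used_web_fallback", False))
        attempts += item.get("attempts_used", 0)
    return {
        "total": total,
        "success": statuses["success"],
        "error": statuses["error"],
        "fallback_rate": fallbacks / total if total else 0.0,
        "avg_attempts": attempts / total if total else 0.0,
    }


sample = [
    {"input_raw": "r/AskReddit", "status": "success", "http_status": 200,
     "used_web_fallback": False, "attempts_used": 1},
    {"input_raw": "r/NonExistentSub123", "status": "error", "http_status": 404,
     "attempts_used": 6},
]
summary = summarize_run(sample)
print(summary)
# prints: {'total': 2, 'success': 1, 'error': 1, 'fallback_rate': 0.0, 'avg_attempts': 3.5}
```

A rising `fallback_rate` across runs suggests the API path is being blocked more often, even when every item still succeeds.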
Common Use Cases
- Market Research: Gather competitive intelligence and market data
- Lead Generation: Extract contact information for sales outreach
- Price Monitoring: Track competitor pricing and product changes
- Content Aggregation: Collect and organize content from multiple sources
Ready to Get Started?
Try Reddit Subreddit Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: backhoe
- Pricing: Paid
- Total Runs: 215
- Active Users: 8