Reddit Scraper 🔥

by nocodeventure

Skip the homework. Scraping Reddit used to mean wrestling with APIs, fighting rate limits, and begging for OAuth tokens. Not anymore.

25 runs
3 users

What does this actor do?

Reddit Scraper 🔥 is a web scraping Actor on the Apify platform. It logs into Reddit like a regular user, browses subreddits or searches by keyword, and extracts posts together with their comments, running in the cloud rather than on your machine.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Skip the homework. Scraping Reddit used to mean wrestling with APIs, fighting rate limits, and begging for OAuth tokens. Not anymore.

This Actor logs into Reddit like a real user, browses subreddits, searches for keywords, and extracts posts with comments, all without touching a single API endpoint.

⚠️ Important: Do not use regular datacenter proxies for Reddit scraping. Reddit aggressively blocks non-residential IPs; always use residential proxies if scraping at scale, otherwise you will be blocked quickly.

## ✨ Features

- 🔍 Search Mode – Search all of Reddit or within a specific subreddit by keyword
- 🔐 Login Support – Uses your Reddit credentials to access content as an authenticated user
- 📜 Infinite Scroll – Automatically scrolls to load more posts (unlimited)
- 💬 Comments Extraction – Grabs threaded comments with author, upvotes, and timestamps
- 🖼️ Full Post Details – Extracts text, images, videos, galleries, and external links
- 🎭 Stealth Mode – Uses Camoufox (stealthy Firefox fork) to avoid detection
- 🔄 Flexible Filtering – Sort by Hot, New, Top, or Rising

## 🚀 Getting Started

### 📖 Two Modes: Browse vs Search

This Actor supports two modes of operation:

| Mode | When to Use | Required Fields |
|------|-------------|-----------------|
| Browse | Scrape posts from a specific subreddit feed | `subreddit` (required) |
| Search | Find posts by keyword across Reddit | `searchQuery` (required) |

---

### 🔍 Search Mode

Search for posts by keyword, either across all of Reddit or within a specific subreddit.

#### Search All of Reddit

```json
{
  "searchQuery": "artificial intelligence",
  "searchSort": "relevance",
  "searchTimeFilter": "week",
  "postCount": 20,
  "redditSessionCookie": "your-session-cookie-here"
}
```

#### Search Within a Subreddit

```json
{
  "searchQuery": "renewable energy",
  "subreddit": "technology",
  "searchSort": "top",
  "searchTimeFilter": "month",
  "postCount": 15,
  "redditSessionCookie": "your-session-cookie-here"
}
```

Search Sort Options:

- `relevance` – Best matches first (default)
- `hot` – Currently trending results
- `top` – Highest upvoted
- `new` – Most recent
- `comments` – Most discussed

Time Filter Options:

- `all` – All time (default)
- `hour` – Past hour
- `day` – Past 24 hours
- `week` – Past week
- `month` – Past month
- `year` – Past year

---

### 📂 Browse Mode

Scrape posts from a subreddit feed with your preferred sorting.

```json
{
  "subreddit": "technology",
  "filter": "hot",
  "postCount": 25,
  "redditSessionCookie": "your-session-cookie-here"
}
```

---

### 🔑 Authentication (Two Options)

You can authenticate in two ways:

#### Option 1: Session Cookie (Recommended ⭐)

Use a session cookie for faster, more reliable scraping. The cookie is valid for ~6 months.

> 💡 Tip: With a valid session cookie, you can try running without a proxy; it often works! This saves costs and speeds up your scrapes.

```json
{
  "subreddit": "technology",
  "filter": "hot",
  "postCount": 25,
  "redditSessionCookie": "your-session-cookie-here"
}
```

#### Option 2: Email & Password

Use credentials to log in (requires residential proxy). The session cookie will be printed in the logs so you can copy it for future runs.

```json
{
  "subreddit": "technology",
  "filter": "hot",
  "postCount": 25,
  "redditEmail": "your-email@example.com",
  "redditPassword": "your-password",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

### 📋 How to Get Your Session Cookie

There are two ways to get your Reddit session cookie:

---

#### Method A: Run the Actor with Login (Automated)

1. Run the Actor with your Reddit email/password AND a residential proxy enabled
2. After successful login, look for this in the logs:

   ```text
   ============================================================
   SESSION COOKIE FOR FUTURE USE (copy this value):
   eyJhbGciOiJSUzI1NiIsImtpZCI6IlNIQTI1N...
   ============================================================
   ```

3. Copy the entire cookie value (it's a long string)
4. Save it somewhere safe; this cookie is valid for ~6 months!

---

#### Method B: Extract from Your Browser (Manual)

If you prefer not to share your password, you can grab the session cookie directly from your browser:

Chrome / Edge:

1. Open Reddit and log in to your account
2. Press F12 (or right-click → "Inspect") to open Developer Tools
3. Go to the Application tab (or "Storage" tab in some browsers)
4. In the left sidebar, expand Cookies → click https://www.reddit.com
5. Find the cookie named `reddit_session` in the list
6. Double-click the Value column to select it, then copy the entire value

Firefox:

1. Open Reddit and log in to your account
2. Press F12 to open Developer Tools
3. Go to the Storage tab
4. Expand Cookies → click https://www.reddit.com
5. Find `reddit_session` and copy its value

> ⚠️ Important: The cookie value is a very long encoded string (starts with something like `eyJ...`). Make sure you copy the entire value and don't truncate it!

---

For all future runs:

- Just paste the cookie into `redditSessionCookie`
- No proxy needed! No login needed! Much faster!

> 💡 Why use the cookie? It skips the login process entirely, making runs faster and more reliable. Reddit's login page has CAPTCHAs and rate limits that can cause failures.

### Input Parameters

#### Search Mode Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `searchQuery` | String | Search term to find posts. Enables search mode. | – |
| `searchSort` | String | Sort results: `relevance`, `hot`, `top`, `new`, `comments` | `relevance` |
| `searchTimeFilter` | String | Time range: `all`, `hour`, `day`, `week`, `month`, `year` | `all` |

#### Browse Mode Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `subreddit` | String | Subreddit name without "r/" (required for browse, optional for search) | – |
| `filter` | String | Post sorting: `hot`, `new`, `top`, `rising` | `hot` |

#### Common Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `postCount` | Integer | Number of posts to scrape (0 = all) | 1 |
| `commentsPerPost` | Integer | Comments per post (0 = all) | 5 |
| `extractDetails` | Boolean | Extract full post content (text/images/videos) and comments | true |
| `redditSessionCookie` | String | Session cookie (recommended, valid ~6 months) | – |
| `redditEmail` | String | Reddit email/username (only needed to get cookie) | – |
| `redditPassword` | String | Reddit password (only needed to get cookie) | – |
| `proxyConfiguration` | Object | Proxy settings (required for login, optional with cookie) | Disabled |

## 📊 Output Example

```json
{
  "id": "1h2abc3",
  "title": "Scientists discover new renewable energy source",
  "author": "u/sciencefan42",
  "subreddit": "r/technology",
  "upvotes": 15420,
  "commentCount": 892,
  "postedTime": "5 hours ago",
  "url": "https://www.reddit.com/r/technology/comments/1h2abc3/...",
  "details": {
    "fullTitle": "Scientists discover new renewable energy source that could power cities",
    "contentType": "link",
    "textContent": null,
    "imageUrl": null,
    "videoUrl": null,
    "externalUrl": "https://example.com/article",
    "galleryUrls": []
  },
  "comments": [
    {
      "id": "kx7def8",
      "author": "techguru99",
      "content": "This could change everything!",
      "upvotes": "2.1k",
      "postedTime": "4 hours ago",
      "depth": 0,
      "permalink": "/r/technology/comments/..."
    }
  ]
}
```

## 🎯 Use Cases

- Brand Monitoring – Search for your company/product name across all of Reddit
- Market Research – Track trends, sentiment, and discussions in your industry
- Content Curation – Find viral posts and popular topics for content ideas
- Competitor Analysis – Search for competitor mentions and discussions
- Academic Research – Collect data for social media studies with keyword filtering
- Lead Generation – Find engaged communities and discussions relevant to your product

## ⚙️ How It Works

With Session Cookie (recommended):

1. Authenticate – Injects your session cookie before any navigation
2. Navigate – Goes directly to your target (subreddit or search results)
3. Scroll – Loads posts dynamically until reaching your target count
4. Extract – Visits each post to grab full details and comments
5. Output – Saves everything to a clean, structured dataset

With Email/Password:

1. Login – Authenticates with your Reddit credentials using a real browser
2. Log Cookie – Prints the session cookie for you to copy for future runs
3. Navigate – Goes to your target subreddit or search results page
4. Scroll & Extract – Same as above

Search vs Browse:

- Search Mode (`searchQuery` provided) – Navigates to Reddit's search page with your query, optionally scoped to a subreddit
- Browse Mode (no `searchQuery`) – Navigates directly to the subreddit feed with your chosen filter

## 🛡️ Proxy Support

When do you need a proxy?

- ✅ First run with email/password – Required! Use residential proxy to log in and get your session cookie
- ❌ Future runs with session cookie – Not required! Cookie-based auth works without proxy

For the initial login, use residential proxies:

```json
{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

> ⚠️ Important: Regular datacenter proxies will NOT work for Reddit. Always use residential proxies for the login step.

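The proxy rule above (residential proxy for email/password login, no proxy needed with a session cookie) can be captured in a small helper that assembles the input object. This is a minimal sketch, not part of the Actor: `build_run_input` and its arguments are hypothetical names, the JSON keys are the ones from the Input Parameters tables, and treating authentication as mandatory is an assumption based on the Authentication section.

```python
from typing import Any, Optional


def build_run_input(
    subreddit: Optional[str] = None,
    search_query: Optional[str] = None,
    post_count: int = 5,
    session_cookie: Optional[str] = None,
    email: Optional[str] = None,
    password: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build an input dict from the documented fields."""
    if not subreddit and not search_query:
        raise ValueError("Provide a subreddit (browse) or a search_query (search).")

    run_input: dict[str, Any] = {"postCount": post_count}
    if search_query:
        # Per the parameter table, the presence of searchQuery enables search mode.
        run_input["searchQuery"] = search_query
    if subreddit:
        # The subreddit name is documented as being given without the "r/" prefix.
        run_input["subreddit"] = subreddit.removeprefix("r/")

    if session_cookie:
        # Cookie auth: documented as working without a proxy.
        run_input["redditSessionCookie"] = session_cookie
    elif email and password:
        # Credential login: documented as requiring a residential proxy.
        run_input["redditEmail"] = email
        run_input["redditPassword"] = password
        run_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        }
    else:
        # Assumption: some form of authentication is always expected.
        raise ValueError("Provide a session cookie or an email and password.")
    return run_input


if __name__ == "__main__":
    print(build_run_input(subreddit="r/technology", session_cookie="your-session-cookie-here"))
```

The resulting dict could then be pasted into the Actor's input editor as JSON, or passed as `run_input` when starting the Actor through an API client such as the `apify-client` Python package; the Actor ID to use is not stated here, so it is left out of the sketch.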
## 💡 Tips

- Use the session cookie for faster, more reliable scraping (see instructions above)
- Search tip: Use quotes in `searchQuery` for exact phrase matching (e.g., "data center")
- Search tip: Combine `searchQuery` with `subreddit` to search within a specific community
- Start with a small `postCount` (5-10) to test your setup
- Use `commentsPerPost: 0` to get ALL comments (slower but complete)
- Enable `extractDetails` for text posts, images, and video URLs
- First run: Use residential proxy with email/password to get your session cookie
- Future runs: Use session cookie only; no proxy needed!

## 📜 Legal & Ethical Use

This Actor is intended for legitimate data collection purposes. Please:

- Respect Reddit's Terms of Service
- Don't scrape personal or sensitive information
- Use reasonable delays and limits
- Comply with applicable data protection laws

## 🤝 Support

Having issues? Found a bug? Want a feature?

- Open an issue on the Actor's page
- Contact the developer through Apify

---

Built by nocodeventure • Made with ❤️ for the Apify community
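A note on post-processing: in the output example in this README, the post-level `upvotes` is a number (`15420`) while the comment-level `upvotes` is an abbreviated string (`"2.1k"`). If you need numeric values throughout, a small normalizer along these lines may help. It is a sketch only: `parse_count` is a hypothetical helper, and the set of suffixes Reddit may produce (`k`, `m`) is an assumption rather than something documented here.

```python
from typing import Union

# Assumed suffixes; only "k" appears in the output example.
_SUFFIXES = {"k": 1_000, "m": 1_000_000}


def parse_count(value: Union[int, str]) -> int:
    """Convert values such as 15420, "892" or "2.1k" to an integer."""
    if isinstance(value, int):
        return value
    text = value.strip().lower().replace(",", "")
    if text and text[-1] in _SUFFIXES:
        # "2.1k" -> 2.1 * 1000, rounded to the nearest whole number.
        return round(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(text)


if __name__ == "__main__":
    comment = {"author": "techguru99", "upvotes": "2.1k"}
    print(parse_count(comment["upvotes"]))
```

Abbreviated counts are already rounded by Reddit, so the converted numbers are approximate.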

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Find engaged communities and discussions relevant to your product

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Reddit Scraper 🔥 now on Apify. Free tier available with no credit card required.


Actor Information

Developer: nocodeventure
Pricing: Paid
Total Runs: 25
Active Users: 3

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
