Producthunt Scraper
by runtime
A web scraper that extracts comprehensive product information from Product Hunt using Apify.
What does this actor do?
Producthunt Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
# Product Hunt Scraper

A web scraper that extracts comprehensive product information from Product Hunt using Apify and Crawlee.

## Quick Start

1. Input: Configure your scraping parameters in the input field
2. Run: Click "Start" to begin scraping
3. Output: Download results from the Dataset tab

## Input Configuration

> Note: If you provide a Start Date (and/or End Date), the Start URLs field will be ignored. Only one method (date range OR Start URLs) will be used per run.

### Basic Configuration

```json
{
  "startUrls": ["https://www.producthunt.com/"],
  "maxRequestRetries": 3,
  "maxConcurrency": 5,
  "maxRequestsPerCrawl": 100,
  "scrapeComingSoon": true
}
```

### Daily Leaderboard Scraping

```json
{
  "startUrls": ["https://www.producthunt.com/leaderboard/daily/2025/7/5/all"],
  "scrapeDailyLeaderboard": true,
  "maxRequestsPerCrawl": 50
}
```

### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| startUrls | Array | ["https://www.producthunt.com/"] | Starting URLs for scraping |
| maxResults | Number | 8 | Maximum number of product detail pages to process per run. Lower values reduce timeout risk. |
| maxRequestRetries | Number | 3 | Maximum retry attempts for failed requests |
| maxConcurrency | Number | 5 | Number of concurrent requests |
| maxRequestsPerCrawl | Number | 100 | Maximum pages to crawl |
| timeoutSecs | Number | 900 | Maximum runtime (seconds) before the Actor stops automatically |
| useApifyProxy | Boolean | true | Simple checkbox to enable/disable Apify Proxy without touching advanced settings |
| proxyConfiguration | Object | {} | Advanced proxy editor (choose Apify Proxy groups, custom proxy URLs, or no proxy). When left empty it inherits the useApifyProxy toggle. |
| scrapeComingSoon | Boolean | true | Whether to scrape "Coming Soon" products |
| scrapeDailyLeaderboard | Boolean | false | Whether to scrape daily leaderboard |
| sortByDate | Boolean | false | Whether to sort output by date |
| sortOrder | String | "desc" | Sort order: "asc" (oldest first) or "desc" (newest first) |
| startDate | String | null | Start date for daily leaderboard range (format: YYYY-MM-DD or YYYY/MM/DD) |
| endDate | String | null | End date for daily leaderboard range (defaults to yesterday if not specified) |
| maxCommentPages | Number | 0 | Maximum number of comment pages to scrape per product (0-10). Comments are paginated on the main product page. The scraper will automatically stop if no more pages are available. |

## Input Method Exclusivity

- If you provide a Start Date (and/or End Date), the Start URLs field will be ignored.
- If you do not provide a Start Date, the scraper will use the Start URLs.
- If neither is provided, the scraper will default to scraping the daily leaderboard for yesterday (today - 1 day).
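The precedence rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the actor's actual code: the function names `parse_date`, `leaderboard_url`, and `resolve_start_urls` are hypothetical, the URL pattern is inferred from the leaderboard URLs shown in this README, and the sketch keys on `startDate` alone (an input with only `endDate` is not modeled).

```python
from datetime import date, datetime, timedelta
from typing import List, Optional


def parse_date(value: str) -> date:
    """Accept YYYY-MM-DD or YYYY/MM/DD, the two formats listed for startDate."""
    return datetime.strptime(value.replace("/", "-"), "%Y-%m-%d").date()


def leaderboard_url(day: date) -> str:
    """Build a daily leaderboard URL (pattern assumed from the README examples)."""
    return (
        "https://www.producthunt.com/leaderboard/daily/"
        f"{day.year}/{day.month}/{day.day}/all"
    )


def resolve_start_urls(actor_input: dict, today: Optional[date] = None) -> List[str]:
    """Hypothetical helper mirroring the documented input precedence."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)

    start_raw = actor_input.get("startDate")
    if start_raw:
        # Date range wins: startUrls are ignored, endDate defaults to yesterday.
        start = parse_date(start_raw)
        end_raw = actor_input.get("endDate")
        end = parse_date(end_raw) if end_raw else yesterday
        span = (end - start).days
        return [leaderboard_url(start + timedelta(days=i)) for i in range(span + 1)]

    urls = actor_input.get("startUrls") or []
    if urls:
        # No start date: fall back to the provided Start URLs.
        return list(urls)

    # Neither provided: yesterday's daily leaderboard.
    return [leaderboard_url(yesterday)]
```

Passing a fixed `today` makes the result reproducible; a date range produces one leaderboard URL per day, which is also a quick way to judge how large `maxRequestsPerCrawl` needs to be.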
### Examples

Default (no input, scrapes yesterday's leaderboard):

```json
{}
```

This will scrape the daily leaderboard for yesterday.

Date range only (Start URLs ignored):

```json
{
  "startDate": "2025-07-01",
  "endDate": "2025-07-05"
}
```

Start URLs only (date fields empty):

```json
{
  "startUrls": [
    "https://www.producthunt.com/leaderboard/daily/2025/7/4/all"
  ]
}
```

Both provided (date range takes precedence):

```json
{
  "startDate": "2025-07-01",
  "endDate": "2025-07-01",
  "startUrls": [
    "https://www.producthunt.com/leaderboard/daily/2025/7/4/all"
  ]
}
```

Only 2025-07-01 will be scraped.

## Output Format

### Regular Products

```json
{
  "name": "Product Name",
  "tagline": "Product tagline",
  "description": "Detailed description",
  "upvotes": 1234,
  "categories": ["SaaS", "Productivity"],
  "launchDate": "Launch date",
  "imageUrl": "https://example.com/image.jpg",
  "productUrl": "https://product-website.com",
  "socialLinks": [
    {
      "platform": "twitter",
      "url": "https://twitter.com/product"
    }
  ],
  "scrapedAt": "2024-01-01T12:00:00.000Z",
  "companyWebsite": "https://company.com",
  "productHuntUrl": "https://www.producthunt.com/products/product-name",
  "makers": [
    {
      "username": "maker1",
      "name": "Maker One",
      "roles": ["Maker"]
    },
    {
      "username": "maker2",
      "name": "Maker Two",
      "roles": ["Maker", "Hunter"]
    }
  ],
  "hunter": {
    "username": "hunter1",
    "name": "Hunter Name"
  },
  "builtWith": [
    {
      "name": "shadcn/ui",
      "url": "https://www.producthunt.com/products/shadcn-ui",
      "description": "Beautifully designed components.",
      "imageUrl": "https://ph-files.imgix.net/..."
    }
  ],
  "launches": [
    {
      "postId": "1038207",
      "title": "Product Name",
      "url": "https://www.producthunt.com/products/product-name/launches/product-name",
      "tagline": "Product tagline",
      "date": "November 15th, 2025",
      "rank": 1,
      "upvotes": 308,
      "comments": 43,
      "imageUrl": "https://example.com/image.jpg",
      "imageAlt": "Product Name"
    }
  ],
  "launchCount": 1,
  "comments": [
    {
      "id": "5002031",
      "username": "adrm",
      "userName": "Adrián de la Rosa",
      "userUrl": "https://www.producthunt.com/@adrm",
      "userAvatar": "https://ph-avatars.imgix.net/...",
      "isMaker": true,
      "text": "Comment text content...",
      "html": "<div>Comment HTML content...</div>",
      "upvotes": 33,
      "timestamp": "2025-11-14T09:06:07-08:00",
      "timeAgo": "2d ago"
    }
  ],
  "commentCount": 43,
  "reviews": [],
  "reviewCount": 0
}
```

### Daily Leaderboard Products

```json
{
  "name": "Product Name",
  "tagline": "Product description",
  "categories": ["SaaS", "Productivity"],
  "upvotes": "1234",
  "launchDate": "July 5, 2025",
  "imageUrl": "https://example.com/image.jpg",
  "productUrl": "https://www.producthunt.com/products/product-name",
  "scrapedFrom": "daily-leaderboard",
  "scrapedAt": "2024-01-01T12:00:00.000Z",
  "productHuntUrl": "https://www.producthunt.com/leaderboard/daily/2025/7/5/all",
  "makers": [
    {
      "username": "maker1",
      "name": "Maker One",
      "roles": ["Maker"]
    }
  ],
  "hunter": {
    "username": "hunter1",
    "name": "Hunter Name"
  }
}
```

### Team Extraction Output

- makers: Array of all team members with the "Maker" role. Each object contains:
  - username: Product Hunt username (string)
  - name: Display name (string)
  - roles: Array of roles (e.g. ["Maker"] or ["Hunter", "Maker"])
  - title: Job title or role description (string, optional)
- hunter: Object with the first team member who has the "Hunter" role, with:
  - username: Product Hunt username (string)
  - name: Display name (string)
  - If no hunter is found, this field is null.
- Note: A hunter can also be a maker (they will appear in both makers and hunter fields).
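When consuming the dataset, the team fields described above need a little care because hunter may be null and the same person can appear in both makers and hunter. The following is a minimal sketch of how a downstream script might summarize them; `team_summary` is a hypothetical helper name, and the item shape is assumed to match the output examples in this README.

```python
from typing import Any, Dict


def team_summary(item: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the makers/hunter fields of one scraped item (hypothetical helper)."""
    makers = item.get("makers") or []
    hunter = item.get("hunter")  # documented as null when no hunter is found

    maker_usernames = [m["username"] for m in makers]
    hunter_username = hunter["username"] if hunter else None

    return {
        "makers": maker_usernames,
        "hunter": hunter_username,
        # A hunter can also be a maker, so check for overlap explicitly.
        "hunter_is_maker": hunter_username is not None
        and hunter_username in maker_usernames,
    }
```

Reading the fields with `.get()` keeps the helper usable on items where makers or hunter are missing entirely, which the Data Fields table suggests can happen when detail scraping is disabled.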
### Additional Data Fields

- builtWith: Array of tools/products used to build the product. Each object contains:
  - name: Tool/product name (string)
  - url: Product Hunt URL (string)
  - description: Tool description (string)
  - imageUrl: Tool thumbnail image URL (string, optional)
- launches: Array of all launches for the product. Each object contains:
  - postId: Launch post ID (string)
  - title: Launch title (string)
  - url: Launch URL (string)
  - tagline: Launch tagline (string)
  - date: Launch date (string)
  - rank: Daily rank (number)
  - upvotes: Number of upvotes (number)
  - comments: Number of comments (number)
  - imageUrl: Launch image URL (string, optional)
  - imageAlt: Image alt text (string, optional)
- launchCount: Total number of launches (number)
- comments: Array of comments from the main product page. Each object contains:
  - id: Comment ID (string)
  - username: Commenter's username (string)
  - userName: Commenter's display name (string)
  - userUrl: Commenter's profile URL (string)
  - userAvatar: Commenter's avatar URL (string, optional)
  - isMaker: Whether the commenter is a maker (boolean)
  - text: Comment text content (string)
  - html: Comment HTML content (string)
  - upvotes: Number of upvotes (number)
  - timestamp: ISO timestamp (string)
  - timeAgo: Human-readable time (string, e.g. "2d ago")
- commentCount: Total number of comments extracted (number)
- reviews: Array of reviews from the /reviews page (same structure as comments). Often empty as reviews are less common.
- reviewCount: Total number of reviews extracted (number)

### Comment Pagination

Comments are paginated on Product Hunt. The scraper supports pagination through the maxCommentPages parameter:

- Default: 0 (skip comment extraction for fastest runs)
- Range: 0-10 (set to 0 to disable comment extraction)
- Behavior:
  - The scraper visits up to maxCommentPages pages of comments
  - It automatically stops if no more pages are available (even if maxCommentPages is higher)
  - Each page is visited sequentially: ?page=1#comments, ?page=2#comments, etc.
  - All comments from all pages are combined into a single comments array

Example with pagination:

```json
{
  "maxCommentPages": 5
}
```

This will scrape up to 5 pages of comments per product, stopping early if fewer pages are available.

## Usage Examples

### Scrape Today's Products

```json
{
  "startUrls": ["https://www.producthunt.com/"],
  "maxRequestsPerCrawl": 50
}
```

### Scrape Daily Leaderboard

```json
{
  "startUrls": ["https://www.producthunt.com/leaderboard/daily/2025/7/5/all"],
  "scrapeDailyLeaderboard": true
}
```

### Scrape Coming Soon Products

```json
{
  "startUrls": ["https://www.producthunt.com/coming-soon"],
  "scrapeComingSoon": true
}
```

### Scrape Specific Categories

```json
{
  "startUrls": [
    "https://www.producthunt.com/categories/developer-tools",
    "https://www.producthunt.com/categories/productivity"
  ]
}
```

### Scrape with Date Sorting

```json
{
  "startUrls": ["https://www.producthunt.com/"],
  "sortByDate": true,
  "sortOrder": "desc"
}
```

### Scrape with Comment Pagination

```json
{
  "startUrls": ["https://www.producthunt.com/products/product-name"],
  "maxCommentPages": 5
}
```

This will scrape up to 5 pages of comments per product.
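The comment pagination behavior described in this README (sequential pages, a hard cap, early stop) can be sketched as a small loop. This is an illustration under stated assumptions rather than the actor's implementation: `comment_page_url` and `collect_comments` are hypothetical names, `fetch_page` stands in for whatever actually loads and parses a page, and "no more pages" is modeled as a page that yields no comments.

```python
from typing import Callable, Dict, List


def comment_page_url(product_url: str, page: int) -> str:
    """Build the paginated URL, following the ?page=N#comments pattern."""
    return f"{product_url}?page={page}#comments"


def collect_comments(
    product_url: str,
    max_comment_pages: int,
    fetch_page: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Visit up to max_comment_pages pages in order and merge their comments."""
    comments: List[Dict] = []
    for page in range(1, max_comment_pages + 1):
        batch = fetch_page(comment_page_url(product_url, page))
        if not batch:
            # Assumption: an empty page means there are no further pages.
            break
        comments.extend(batch)
    return comments
```

With `max_comment_pages` set to 0 the loop body never runs, so no comment pages are requested at all, which matches the documented default of skipping comment extraction.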
### Scrape Daily Leaderboard with Date Sorting (Oldest First)

```json
{
  "startUrls": ["https://www.producthunt.com/leaderboard/daily/2025/7/5/all"],
  "scrapeDailyLeaderboard": true,
  "sortByDate": true,
  "sortOrder": "asc"
}
```

### Scrape Daily Leaderboard Date Range

```json
{
  "startDate": "2025-07-01",
  "endDate": "2025-07-05",
  "scrapeDailyLeaderboard": true,
  "maxRequestsPerCrawl": 200
}
```

### Scrape Daily Leaderboard from Date to Yesterday

```json
{
  "startDate": "2025-07-01",
  "scrapeDailyLeaderboard": true,
  "sortByDate": true,
  "sortOrder": "desc"
}
```

### Combine Custom URLs with Date Range

```json
{
  "startUrls": [
    "https://www.producthunt.com/leaderboard/daily/2025/7/5/all",
    "https://www.producthunt.com/leaderboard/daily/2025/7/4/all"
  ],
  "startDate": "2025-07-01",
  "endDate": "2025-07-03",
  "scrapeDailyLeaderboard": true
}
```

Because a start date is provided, the date range takes precedence and the startUrls in this input are ignored (see Input Method Exclusivity).

## Data Fields

| Field | Description | Available For |
|-------|-------------|---------------|
| name | Product name | All products |
| tagline | Short product description | All products |
| description | Detailed product description | Regular products |
| upvotes | Number of upvotes | All products |
| categories | Array of product categories | All products |
| makers | Array of makers with roles | All products (when getDetails is true) |
| hunter | Hunter information | All products (when getDetails is true) |
| launchDate | Product launch date | All products |
| imageUrl | Product image URL | All products |
| productUrl | Direct product website link | All products |
| pricing | Pricing information | Regular products |
| metaKeywords | Meta keywords | Regular products |
| socialLinks | Social media links | Regular products |
| builtWith | Tools used to build the product | Regular products (when getDetails is true) |
| launches | Array of all product launches | Regular products (when getDetails is true) |
| launchCount | Total number of launches | Regular products (when getDetails is true) |
| comments | Array of comments from main page | Regular products (when getDetails is true and maxCommentPages > 0) |
| commentCount | Total number of comments | Regular products (when getDetails is true and maxCommentPages > 0) |
| reviews | Array of reviews from /reviews page | Regular products (when getDetails is true) |
| reviewCount | Total number of reviews | Regular products (when getDetails is true) |
| scrapedFrom | Data source identifier | All products |
| scrapedAt | Timestamp of scraping | All products |
| sourceUrl | Original Product Hunt URL | All products |

## Performance Tips

- Concurrency: Increase maxConcurrency for faster scraping (be mindful of rate limits)
- Retries: Higher maxRequestRetries values improve reliability but slow down scraping
- Request Limits: Adjust maxRequestsPerCrawl based on your needs
- Proxy: Use the useApifyProxy checkbox for quick on/off control, or fill proxyConfiguration for custom pools/URLs. Camoufox-based Apify Proxy is enabled by default for anti-detection.
- Sorting: Enable sortByDate to get chronologically ordered results (adds processing time for large datasets)
- Date Ranges: Large date ranges will generate many URLs; increase maxRequestsPerCrawl accordingly
- Comment Pagination: Each comment page adds ~2-3 seconds per product. The default maxCommentPages=0 skips comments entirely for maximum speed; raise it to collect discussion threads at the cost of additional time.
- Product Details: Set getDetails to false to skip detailed product pages and scrape only leaderboard summaries (much faster)

## Troubleshooting

### Common Issues

1. No products found: Product Hunt may have changed their HTML structure
2. Rate limiting: Reduce maxConcurrency or add delays
3. Missing data: Some products may not have all fields available
4. Daily leaderboard issues: Check if the URL format is correct

### Debug Mode

To debug a run, check the actor logs in the Apify console.
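For the daily leaderboard issue listed under Common Issues, it can help to validate URLs before starting a run. The check below is a rough sketch: the pattern is inferred only from the leaderboard URLs shown in this README (year/month/day without zero padding, ending in /all), so treat it as an assumption rather than an official format, and `parse_leaderboard_url` as a hypothetical helper.

```python
import re
from datetime import date
from typing import Optional

# Pattern assumed from examples such as .../leaderboard/daily/2025/7/5/all
LEADERBOARD_RE = re.compile(
    r"^https://www\.producthunt\.com/leaderboard/daily/"
    r"(\d{4})/(\d{1,2})/(\d{1,2})/all/?$"
)


def parse_leaderboard_url(url: str) -> Optional[date]:
    """Return the leaderboard date, or None if the URL does not fit the pattern."""
    match = LEADERBOARD_RE.match(url)
    if not match:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Structurally valid URL with an impossible date, e.g. month 13.
        return None
```

A None result points to a malformed URL or an impossible calendar date, both of which are worth ruling out before suspecting a change in Product Hunt's HTML structure.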
## Legal Notice

- Respect Product Hunt's robots.txt and terms of service
- Use reasonable request rates
- Use scraped data responsibly and in accordance with applicable laws
- This scraper is for educational and research purposes

## Support

For issues and questions:

1. Check the troubleshooting section
2. Review the actor logs
3. Contact Apify support

---

Note: Always respect website terms of service and use data responsibly.

## Related Actors

- CNN Top Headlines Scraper Actor: Scrape the latest top news headlines and full article details from CNN.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Actor Information
- Developer: runtime
- Pricing: Paid
- Total Runs: 604
- Active Users: 63
Related Actors
🏯 Tweet Scraper V2 - X / Twitter Scraper
by apidojo
Google Search Results Scraper
by apify
Instagram Profile Scraper
by apify
Tweet Scraper|$0.25/1K Tweets | Pay-Per Result | No Rate Limits
by kaitoeasyapi