HackerNoon Scraper
by dadhalfdev
About HackerNoon Scraper
Extract articles from HackerNoon across 22 different categories (AI, Web3, Business, Finance, and many more). Discover high-quality tech stories from +45000 contributing writers.
What does this actor do?
HackerNoon Scraper is an actor on the Apify platform that extracts article data from HackerNoon. It runs in the cloud, so you can collect articles by category without setting anything up locally.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
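As a rough illustration of the API access mentioned above, the sketch below builds a request against Apify's `run-sync-get-dataset-items` endpoint using only the Python standard library. The actor ID `dadhalfdev~hackernoon-scraper` is an assumption inferred from the developer and actor names on this page, and `build_run_request` and `fetch_articles` are hypothetical helper names, so check the actor's page on Apify for the exact ID before relying on it.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumption: actor ID guessed from the developer and actor names; verify on Apify.
ACTOR_ID = "dadhalfdev~hackernoon-scraper"


def build_run_request(token: str, run_input: dict,
                      actor_id: str = ACTOR_ID) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the actor synchronously."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_articles(token: str, run_input: dict) -> list:
    """Send the request and return the dataset items as parsed JSON.

    This performs a real network call and needs a valid Apify API token.
    """
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the request easy to inspect before any paid run is triggered. A call might look like `fetch_articles(my_token, {"category": "AI", "max_posts": 100})`, where `my_token` is your own API token.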
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
# HackerNoon Scraper

Extract articles from [HackerNoon](https://hackernoon.com/) across 22 different categories (AI, Web3, Business, Finance and many more). Discover high quality tech stories from +45000 contributing writers.

## 🚀 Features

- Multi-Category Support: Scrape from 22 different HackerNoon categories
- Rich Data Extraction: Extracts comprehensive article metadata including content, author information, images, and engagement metrics
- Configurable Limits: Set maximum number of articles to scrape

## 📊 Supported Categories

The scraper supports all major HackerNoon categories:

- Technology: AI, Programming, Tech Companies, Tech Stories, Cybersecurity, Cloud, Data Science
- Business: Business, Finance, Startups, Management, Product Management
- Lifestyle: Life Hacking, Remote Work, Gaming, Writing
- Science & Innovation: Science, Futurism, Web3
- Community: HackerNoon, Society, Media
- Special: Top Stories (trending articles)

## 📋 Data Fields Extracted

Each scraped article includes the following information:

### Article Metadata

- id - Unique article identifier
- title - Article title
- slug - URL slug
- link - Full article URL
- excerpt - Article excerpt/summary
- tldr - Too Long; Didn't Read summary
- articleBody - Full article content
- createdAt - Publication date and time
- parentCategory - Primary category
- tags - Array of article tags
- commentsCount - Number of comments
- pageViews - Number of page views/reads
- arweave - Arweave blockchain reference

### Media Information

- mainImage - Main article image URL
- mainImageHeight - Main image height (pixels)
- mainImageWidth - Main image width (pixels)
- socialPreviewImage - Social media preview image

### Author Information

- author_name - Author display name
- author_handle - Author username/handle
- author_avatar - Author profile picture URL
- author_bio - Author biography
- author_isBrand - Whether the author is a brand account
- author_isTrusted - Whether the author is verified/trusted

## ⚙️ Configuration

### Input Parameters

The scraper accepts the following input parameters:

```json
{
  "category": "AI",
  "max_posts": 100
}
```

- category (required): Choose from available categories
  - Default: "Top Stories"
  - Options: See full list of supported categories above
- max_posts (optional): Maximum number of articles to scrape
  - Default: No limit (up to 500 articles max)
  - Range: 50-500
  - ⚠️ Note: For "Top Stories" category, maximum output is limited to 150 articles due to the time it takes to render compared to the other categories.

## 📊 Output Format

The scraper outputs data in JSON format, with each article as a separate record:

```json
{
  "id": "rxhvxiLNxsRwnMGivMFc",
  "title": "The Future of AI in Healthcare",
  "slug": "the-future-of-ai-in-healthcare",
  "link": "https://hackernoon.com/the-future-of-ai-in-healthcare",
  "excerpt": "Exploring how AI is revolutionizing medical diagnosis...",
  "tldr": "AI is transforming healthcare through improved diagnostics and personalized treatment.",
  "articleBody": "Full article content here...",
  "createdAt": "2023-09-30T21:36:56.367Z",
  "mainImage": "https://hackernoon.imgix.net/images/...",
  "mainImageHeight": 1024,
  "mainImageWidth": 1536,
  "socialPreviewImage": "https://hackernoon.imgix.net/images/...",
  "parentCategory": "ai",
  "tags": ["artificial-intelligence", "healthcare", "machine-learning"],
  "commentsCount": 15,
  "pageViews": 348080,
  "arweave": "QuKn6Hew8wrwpJ9Zt0OFoeVt5yQwBQyZf30TtejOOno",
  "author_name": "Dr. Sarah Johnson",
  "author_handle": "sarahj_ai",
  "author_avatar": "https://cdn.hackernoon.com/images/...",
  "author_bio": "AI researcher and healthcare innovation expert",
  "author_isBrand": false,
  "author_isTrusted": true
}
```
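To make the input limits and the output record shape more concrete, here is a minimal Python sketch. The helper names (`build_input`, `expected_max_output`, `summarize`) are hypothetical and not part of the actor. The sketch assumes that out-of-range `max_posts` values should be rejected on the client side and that the 150-article cap for "Top Stories" acts as a simple ceiling; how the actor itself handles these cases is not documented here.

```python
from datetime import datetime

MAX_POSTS_RANGE = (50, 500)   # documented range for max_posts
TOP_STORIES_CAP = 150         # documented output cap for "Top Stories"


def build_input(category: str = "Top Stories", max_posts=None) -> dict:
    """Build an input dict, checking max_posts against the documented range."""
    run_input = {"category": category}
    if max_posts is not None:
        low, high = MAX_POSTS_RANGE
        if not low <= max_posts <= high:
            raise ValueError(f"max_posts must be between {low} and {high}")
        run_input["max_posts"] = max_posts
    return run_input


def expected_max_output(run_input: dict) -> int:
    """Upper bound on returned articles (assumes the caps act as ceilings)."""
    if run_input["category"] == "Top Stories":
        cap = TOP_STORIES_CAP
    else:
        cap = MAX_POSTS_RANGE[1]
    return min(run_input.get("max_posts", cap), cap)


def summarize(record: dict) -> dict:
    """Reduce one output record to a few fields, parsing the ISO timestamp."""
    created = datetime.fromisoformat(record["createdAt"].replace("Z", "+00:00"))
    return {
        "title": record["title"],
        "link": record["link"],
        "author": record["author_name"],
        "published": created.date().isoformat(),
        "tags": list(record.get("tags", [])),
    }
```

For example, `build_input("AI", 100)` produces the same input shown in the configuration section, and `summarize` can be mapped over the list of records returned by a run.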
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try HackerNoon Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: dadhalfdev
- Pricing: Paid
- Total Runs: 75
- Active Users: 3