In Depth News Scraper
by sync-network
Scrape complete news articles, not just headlines. This tool extracts full-length content from top sources for research, feeds, and data analysis.
About In Depth News Scraper
Tired of news scrapers that only grab headlines? I've been there. The In Depth News Scraper is the actor I built my projects around because it actually pulls the full article text from major news sites. It goes beyond the snippet you see in search results, fetching the complete story so you get context, analysis, and the full narrative. You can configure it to deliver exactly what you need, whether that's a clean summary for a dashboard or the entire formatted article text for your database.
I use it primarily for two things: building curated news feeds on specific topics and feeding clean, structured data into analysis tools. Instead of manually visiting dozens of sites, this automates the collection of the latest updates from top sources. The key benefit is the depth; you're working with the real content, not just metadata. This makes it reliable for monitoring brand mentions, tracking industry trends, or compiling research datasets where headlines alone are useless.
Setting it up is straightforward. You define your target sources and topics, and it handles the extraction, dealing with pagination and article layouts. The output is consistent JSON you can pipe directly into other apps or data warehouses. For anyone needing substantive news content at scale, this scraper eliminates the biggest pain point: getting past the lead paragraph.
What does this actor do?
In Depth News Scraper is a web scraping actor available on the Apify platform. It runs in the cloud and extracts full news articles as structured data.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
In-Depth News Scraper
An Apify actor that extracts complete news articles, not just headlines, from major categories and outlets. It provides structured, analysis-ready data.
Overview
This scraper fetches full article content from various news sources. You can filter by category, refine with keywords, and control the output detail. It's built for automation and data integration workflows where you need more than basic headlines.
Key Features
- Full-Text Extraction: Gets the complete article body, not just summaries or metadata.
- Category & Keyword Filtering: Target specific news categories (World, Business, Technology, etc.) and add keywords to narrow results.
- Output Control: Choose between full articles or summaries via the `contentLength` parameter.
- Structured Data: Returns consistent JSON with title, URL, date, source, content, and optional image URL.
- Exclusion Filters: Use `filterBadKeywords` to block articles containing terms like "sponsored".
- Time-Range Selection: Scrape current or historical articles.
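
The exclusion filter can also be approximated on the client side, for example to re-filter a dataset that was already downloaded. The sketch below is not the actor's own implementation: the function name `drop_bad_articles`, the case-insensitive substring match, and the choice of fields to search are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


def drop_bad_articles(
    items: Iterable[Dict[str, Any]], bad_keywords: Iterable[str]
) -> List[Dict[str, Any]]:
    """Return only the items whose text contains none of the bad keywords."""
    # Assumption: matching is a case-insensitive substring test.
    lowered = [keyword.lower() for keyword in bad_keywords if keyword]
    kept = []
    for item in items:
        # Assumption: title, summary, and content are the fields worth checking.
        text = " ".join(
            str(item.get(field) or "") for field in ("title", "summary", "content")
        ).lower()
        if not any(keyword in text for keyword in lowered):
            kept.append(item)
    return kept
```

Calling `drop_bad_articles(articles, ["sponsored", "advertisement"])` keeps the same item shape as the dataset, so it can be slotted in before any downstream processing.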
How to Use
Configure the actor with input parameters, then run it. The dataset will contain structured article objects.
- Set your target `newsCategory`.
- Optionally add `additionalKeywords` to refine the search within that category.
- Configure other parameters like `numberOfItems` or `contentLength`.
- Execute the actor.
- Download or process the resulting dataset.
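
The steps above can also be driven from code. The following is a minimal sketch that assumes the official `apify-client` Python package; the helper name `run_news_scraper` is hypothetical, and the actor ID is left as a placeholder because the page does not state it. The client is passed in as an argument so the helper itself has no network dependency.

```python
from typing import Any, Dict, List


def run_news_scraper(
    client: Any, actor_id: str, run_input: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # call() starts a run and blocks until it finishes, returning run metadata.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return any metadata")
    # Results of a run are stored in its default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   articles = run_news_scraper(
#       client,
#       "<ACTOR_ID>",  # placeholder: copy the real ID from the actor's Apify page
#       {"newsCategory": "Technology", "numberOfItems": 5},
#   )
```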
Input
Configure the actor using these parameters in a JSON object:
| Parameter | Type | Description |
|---|---|---|
| `newsCategory` | String | Required. News category (e.g., "Technology", "World"). |
| `additionalKeywords` | String | Optional. Keywords to refine search within the category. |
| `numberOfItems` | Number | Articles to retrieve (default: 10, max: 100). |
| `filterBadKeywords` | Array | Optional. Keywords to exclude from results (e.g., ["sponsored"]). |
| `contentLength` | String | "Full" for complete article or "Summary" (default: "Full"). |
| `timeRange` | String | Time period for articles (e.g., "Past week"). |
| `retrieveImage` | Boolean | Include `imageUrl` in output (default: false). |
Example Configuration:
{
  "newsCategory": "Technology",
  "additionalKeywords": "artificial intelligence",
  "numberOfItems": 20,
  "filterBadKeywords": ["sponsored", "advertisement"],
  "contentLength": "Full",
  "timeRange": "Past week",
  "retrieveImage": false
}
Supported Categories: World, Business, Technology, Entertainment, Health, Science, Sports, Politics.
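
Because the documented constraints are simple, it can be worth checking an input locally before spending a run on it. The sketch below builds the input object from the parameters listed above and rejects values outside the documented limits. The function name `build_input` is hypothetical, and two details are assumptions rather than documented behaviour: that category names are case-sensitive, and that the minimum for `numberOfItems` is 1.

```python
from typing import Any, Dict, List, Optional

# Values taken from the parameter table and the supported-categories list.
SUPPORTED_CATEGORIES = {
    "World", "Business", "Technology", "Entertainment",
    "Health", "Science", "Sports", "Politics",
}
CONTENT_LENGTHS = {"Full", "Summary"}
MAX_ITEMS = 100


def build_input(
    news_category: str,
    additional_keywords: Optional[str] = None,
    number_of_items: int = 10,
    filter_bad_keywords: Optional[List[str]] = None,
    content_length: str = "Full",
    time_range: Optional[str] = None,
    retrieve_image: bool = False,
) -> Dict[str, Any]:
    """Build an actor input dict, raising ValueError on out-of-range values."""
    # Assumption: category names must match the documented spelling exactly.
    if news_category not in SUPPORTED_CATEGORIES:
        raise ValueError(f"Unsupported newsCategory: {news_category!r}")
    # Assumption: the lower bound is 1; only the maximum (100) is documented.
    if not 1 <= number_of_items <= MAX_ITEMS:
        raise ValueError(f"numberOfItems must be between 1 and {MAX_ITEMS}")
    if content_length not in CONTENT_LENGTHS:
        raise ValueError(f"contentLength must be one of {sorted(CONTENT_LENGTHS)}")

    run_input: Dict[str, Any] = {
        "newsCategory": news_category,
        "numberOfItems": number_of_items,
        "contentLength": content_length,
        "retrieveImage": retrieve_image,
    }
    # Optional parameters are only included when a value was supplied.
    if additional_keywords:
        run_input["additionalKeywords"] = additional_keywords
    if filter_bad_keywords:
        run_input["filterBadKeywords"] = list(filter_bad_keywords)
    if time_range:
        run_input["timeRange"] = time_range
    return run_input
```

The accepted values for `timeRange` are not listed beyond the "Past week" example, so the sketch passes that string through unchecked.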
Output
The actor outputs a dataset where each item is a structured JSON object representing one article.
{
  "title": "Article headline",
  "link": "Article URL",
  "pubDate": "2025-02-05T10:00:00.000Z",
  "source": "Publishing outlet name",
  "summary": "Brief article overview",
  "content": "Full article text (length depends on 'contentLength' setting)",
  "imageUrl": "Main image URL (if 'retrieveImage' is true)"
}
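
For downstream analysis it often helps to convert each item into a typed record with a real timestamp. The following is a sketch under assumptions: the `Article` dataclass and `parse_article` helper are hypothetical names, `pubDate` is assumed to always be an ISO 8601 string like the sample above, and every field other than `pubDate` is treated as possibly absent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class Article:
    title: str
    link: str
    published: datetime
    source: str
    summary: str
    content: str
    image_url: Optional[str] = None


def parse_article(item: Dict[str, Any]) -> Article:
    """Convert one dataset item into an Article with a timezone-aware date."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so the suffix is rewritten as an explicit UTC offset first.
    published = datetime.fromisoformat(item["pubDate"].replace("Z", "+00:00"))
    return Article(
        title=item.get("title", ""),
        link=item.get("link", ""),
        published=published,
        source=item.get("source", ""),
        summary=item.get("summary", ""),
        content=item.get("content", ""),
        # Assumption: imageUrl is missing or empty when retrieveImage is false.
        image_url=item.get("imageUrl") or None,
    )
```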
Performance & Notes
- Speed: Full article extraction takes approximately 5-10 seconds per item.
- Volume: Efficiently handles up to 100 articles per run. For faster results, limit `numberOfItems` to 50.
- Reliability: Includes automatic retries for failed connections and dynamic delays to manage request rates.
- Recommendations: Use specific keywords for relevant results. If you don't need images, disable image retrieval (`"retrieveImage": false`) to improve speed. Network conditions and source website performance can affect run time.
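
The per-item figure above gives a rough way to budget a run. The sketch below simply multiplies the stated 5-10 seconds by the item count; it assumes articles are processed one after another with no parallelism and no startup overhead, neither of which the documentation confirms, so treat the result as a ballpark only. The function name `estimate_run_seconds` is hypothetical.

```python
from typing import Tuple


def estimate_run_seconds(
    number_of_items: int,
    min_per_item: float = 5.0,   # lower bound quoted for full extraction
    max_per_item: float = 10.0,  # upper bound quoted for full extraction
) -> Tuple[float, float]:
    """Return a (low, high) estimate of run time in seconds."""
    if number_of_items < 0:
        raise ValueError("number_of_items must not be negative")
    # Assumption: items are fetched sequentially, so time scales linearly.
    return (number_of_items * min_per_item, number_of_items * max_per_item)


low, high = estimate_run_seconds(50)
print(f"50 items: roughly {low / 60:.1f} to {high / 60:.1f} minutes")
```

Under these assumptions a 100-item run would take between 500 and 1000 seconds, which is consistent with the advice to cap `numberOfItems` at 50 when speed matters.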
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Actor Information
- Developer: sync-network
- Pricing: Paid
- Total Runs: 287
- Active Users: 8
Related Actors
Smart Article Extractor
by lukaskrivka
Google Search
by devisty
Twitter Tweets Scraper
by gentle_cloud
Twitter Profile
by danek