In Depth News Scraper

by sync-network

Scrape complete news articles, not just headlines. This tool extracts full-length content from top sources for research, feeds, and data analysis.

287 runs
8 users
About In Depth News Scraper

Tired of news scrapers that only grab headlines? I've been there. The In Depth News Scraper is the actor I built my projects around because it actually pulls the full article text from major news sites. It goes beyond the snippet you see in search results, fetching the complete story so you get context, analysis, and the full narrative. You can configure it to deliver exactly what you need, whether that's a clean summary for a dashboard or the entire formatted article text for your database.

I use it primarily for two things: building curated news feeds on specific topics and feeding clean, structured data into analysis tools. Instead of manually visiting dozens of sites, this automates the collection of the latest updates from top sources. The key benefit is the depth; you're working with the real content, not just metadata. This makes it reliable for monitoring brand mentions, tracking industry trends, or compiling research datasets where headlines alone are useless.

Setting it up is straightforward. You define your target news category and keywords, and it handles the extraction, dealing with pagination and article layouts. The output is consistent JSON you can pipe directly into other apps or data warehouses. For anyone needing substantive news content at scale, this scraper eliminates the biggest pain point: getting past the lead paragraph.

What does this actor do?

In Depth News Scraper is a web scraping actor available on the Apify platform. It runs in the cloud and extracts full news articles as structured data.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

In-Depth News Scraper

An Apify actor that extracts complete news articles, not just headlines, from major categories and outlets. It provides structured, analysis-ready data.

Overview

This scraper fetches full article content from various news sources. You can filter by category, refine with keywords, and control the output detail. It's built for automation and data integration workflows where you need more than basic headlines.

Key Features

  • Full-Text Extraction: Gets the complete article body, not just summaries or metadata.
  • Category & Keyword Filtering: Target specific news categories (World, Business, Technology, etc.) and add keywords to narrow results.
  • Output Control: Choose between full articles or summaries via the contentLength parameter.
  • Structured Data: Returns consistent JSON with title, URL, date, source, content, and optional image URL.
  • Exclusion Filters: Use filterBadKeywords to block articles containing terms like "sponsored".
  • Time-Range Selection: Scrape current or historical articles.
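The matching rules behind filterBadKeywords are not documented here. As an illustration of the idea only, the sketch below applies a similar exclusion filter on the client side to items that have already been downloaded. It assumes case-insensitive substring matching against the title and content fields; the function name exclude_bad_keywords is hypothetical and not part of the actor.

```python
def exclude_bad_keywords(articles, bad_keywords):
    """Drop articles whose title or content contains any excluded keyword.

    Assumption: case-insensitive substring matching. The actor's own
    filterBadKeywords behavior may differ.
    """
    lowered = [kw.lower() for kw in bad_keywords if kw]
    kept = []
    for article in articles:
        text = f"{article.get('title', '')} {article.get('content', '')}".lower()
        if not any(kw in text for kw in lowered):
            kept.append(article)
    return kept
```

A post-filter like this can be useful when you want to tighten the exclusion list without re-running the actor.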

How to Use

Configure the actor with input parameters, then run it. The dataset will contain structured article objects.

  1. Set your target newsCategory.
  2. Optionally add additionalKeywords to refine the search within that category.
  3. Configure other parameters like numberOfItems or contentLength.
  4. Execute the actor.
  5. Download or process the resulting dataset.
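The steps above can also be driven from code through Apify's REST API. The sketch below uses only the Python standard library and the run-sync-get-dataset-items endpoint, which runs an actor and returns its dataset items in one call. The actor ID sync-network~in-depth-news-scraper is a guess based on the developer and actor names, so copy the real ID from the actor's page; the token is a placeholder.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumption: the real actor ID may differ; check the actor's Apify page.
ACTOR_ID = "sync-network~in-depth-news-scraper"


def build_run_request(token, run_input, actor_id=ACTOR_ID):
    """Build a POST request that runs the actor and returns its dataset items."""
    quoted_id = urllib.parse.quote(actor_id, safe="~")
    url = f"{API_BASE}/acts/{quoted_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, run_input, timeout=300.0):
    """Send the request and decode the JSON array of article objects."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling run_actor("YOUR_APIFY_TOKEN", {"newsCategory": "Technology", "numberOfItems": 5}) would start a run and return the articles as a list of dicts. Synchronous endpoints can time out on long runs, so larger numberOfItems values may be better served by starting the run asynchronously and fetching the dataset afterwards.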

Input

Configure the actor using these parameters in a JSON object:

Parameter            Type      Description
newsCategory         String    Required. News category (e.g., "Technology", "World").
additionalKeywords   String    Optional. Keywords to refine search within the category.
numberOfItems        Number    Articles to retrieve (default: 10, max: 100).
filterBadKeywords    Array     Optional. Keywords to exclude from results (e.g., ["sponsored"]).
contentLength        String    "Full" for complete article or "Summary" (default: "Full").
timeRange            String    Time period for articles (e.g., "Past week").
retrieveImage        Boolean   Include imageUrl in output (default: false).

Example Configuration:

{
  "newsCategory": "Technology",
  "additionalKeywords": "artificial intelligence",
  "numberOfItems": 20,
  "filterBadKeywords": ["sponsored", "advertisement"],
  "contentLength": "Full",
  "timeRange": "Past week",
  "retrieveImage": false
}

Supported Categories: World, Business, Technology, Entertainment, Health, Science, Sports, Politics.
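If you build the input programmatically, it can help to check values against the documented constraints before starting a run. The helper below is a minimal sketch, not part of the actor: build_input is a hypothetical name, the lower bound of 1 for numberOfItems is an assumption (only the maximum of 100 is documented), and so is the case-sensitive category match.

```python
SUPPORTED_CATEGORIES = {
    "World", "Business", "Technology", "Entertainment",
    "Health", "Science", "Sports", "Politics",
}
CONTENT_LENGTHS = {"Full", "Summary"}


def build_input(news_category, additional_keywords=None, number_of_items=10,
                filter_bad_keywords=None, content_length="Full",
                time_range=None, retrieve_image=False):
    """Return an input dict for the actor, raising ValueError on bad values."""
    if news_category not in SUPPORTED_CATEGORIES:
        raise ValueError(f"Unsupported newsCategory: {news_category!r}")
    # Assumption: at least 1 item; the documentation only states max 100.
    if not 1 <= number_of_items <= 100:
        raise ValueError("numberOfItems must be between 1 and 100")
    if content_length not in CONTENT_LENGTHS:
        raise ValueError('contentLength must be "Full" or "Summary"')

    run_input = {
        "newsCategory": news_category,
        "numberOfItems": number_of_items,
        "contentLength": content_length,
        "retrieveImage": retrieve_image,
    }
    # Optional parameters are only included when the caller provides them.
    if additional_keywords:
        run_input["additionalKeywords"] = additional_keywords
    if filter_bad_keywords:
        run_input["filterBadKeywords"] = list(filter_bad_keywords)
    if time_range:
        run_input["timeRange"] = time_range
    return run_input
```

For example, build_input("Technology", additional_keywords="artificial intelligence", number_of_items=20, filter_bad_keywords=["sponsored", "advertisement"], time_range="Past week") produces the same fields as the example configuration above.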

Output

The actor outputs a dataset where each item is a structured JSON object representing one article.

{
  "title": "Article headline",
  "link": "Article URL",
  "pubDate": "2025-02-05T10:00:00.000Z",
  "source": "Publishing outlet name",
  "summary": "Brief article overview",
  "content": "Full article text (length depends on 'contentLength' setting)",
  "imageUrl": "Main image URL (if 'retrieveImage' is true)"
}
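When consuming the dataset in Python, one option is to map each item onto a small typed record and parse pubDate into a timezone-aware datetime. This is a sketch under assumptions: the Article class and parse_article function are hypothetical names, and it is not documented whether imageUrl is omitted or null when retrieveImage is false, so the code tolerates both.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    title: str
    link: str
    pub_date: datetime
    source: str
    summary: str
    content: str
    image_url: Optional[str] = None


def parse_article(item):
    """Convert one dataset item (a dict) into an Article."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    pub_date = datetime.fromisoformat(item["pubDate"].replace("Z", "+00:00"))
    return Article(
        title=item["title"],
        link=item["link"],
        pub_date=pub_date,
        source=item["source"],
        summary=item.get("summary", ""),
        content=item.get("content", ""),
        image_url=item.get("imageUrl"),  # None when the field is missing
    )
```

Having pub_date as a real datetime makes it straightforward to sort articles or filter them by age after download.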

Performance & Notes

  • Speed: Full article extraction takes approximately 5-10 seconds per item.
  • Volume: Handles up to 100 articles per run (the numberOfItems maximum). For faster runs, set numberOfItems to 50 or fewer.
  • Reliability: Includes automatic retries for failed connections and dynamic delays to manage request rates.
  • Recommendations: Use specific keywords for relevant results. Disable image retrieval ("retrieveImage": false) if you don't need images to improve speed. Network conditions and source website performance can affect run time.
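The per-item timing above gives a rough way to budget a run. The sketch below simply multiplies the stated 5-10 seconds by the item count; it assumes articles are processed one after another, which the documentation does not confirm, so treat the result as an upper-bound style estimate rather than a guarantee. The function name estimate_runtime_seconds is hypothetical.

```python
def estimate_runtime_seconds(number_of_items, min_per_item=5.0, max_per_item=10.0):
    """Return (low, high) run-time estimates in seconds for full extraction.

    Assumption: items are fetched sequentially at 5-10 seconds each.
    """
    if number_of_items < 0:
        raise ValueError("number_of_items must not be negative")
    return (number_of_items * min_per_item, number_of_items * max_per_item)


low, high = estimate_runtime_seconds(100)
print(f"100 articles: roughly {low / 60:.0f} to {high / 60:.0f} minutes")
```

Under that assumption, a maximum-size run of 100 articles lands at roughly 8 to 17 minutes, and a 50-article run at about half that.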

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: sync-network
Pricing: Paid
Total Runs: 287
Active Users: 8

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
