🔥Czech News Scraper

by p-brother

Scrape Czech news from Novinky.cz, Seznam Zprávy & more into JSON. Get real-time or historical data, with access to over 1 million articles for analysis.

146 runs
7 users

About 🔥Czech News Scraper

Need to monitor Czech media without checking each site by hand? I built this scraper to pull articles from major Czech news sites, including Novinky.cz, Seznam Zprávy, Super.cz, and Proženy.cz, directly into structured JSON. It handles both the latest headlines and historical archives, so you get a clean dataset without dealing with each site's layout. I've used it for market research and trend analysis, and having over a million articles available makes spotting patterns much easier.

The main benefit is consistency. Instead of writing and maintaining a separate script for each news portal, you run one actor that covers them all. You can schedule runs for real-time monitoring or dig into past coverage. The JSON output typically includes the title, URL, publication date, and full text, ready for your database or analysis pipeline. It's a solid foundation for building a media dashboard, tracking brand mentions, or conducting academic research on Czech current affairs.

What does this actor do?

🔥Czech News Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Czech News Scraper

Overview

This actor scrapes article text and metadata from major Czech news websites. It is built for speed and delivers structured JSON for both real-time and historical articles. It currently covers over 1 million articles from Novinky.cz, Seznam Zprávy, Super.cz, and Proženy.cz, with more sites to be added.

Key Features

  • High-speed scraping: Processes thousands of articles in seconds
  • Structured JSON output: Ready for direct integration and processing
  • Comprehensive metadata: Extracts author, title, publication/update dates, tags, categories, and more
  • Clean content format: Article text provided in Markdown for analysis
  • Consistent schema: Uniform data structure across all supported websites
  • Flexible filtering: Filter by full-text query, creation date range, or update date range
  • Multiple sort options: Sort by creation date, update date, or relevance rank
  • Pagination: Retrieve up to 100 articles per page

How to Use

Run the actor via the Apify platform. Configure your scrape using the following input parameters:

Basic configuration:
- websites: Select which news sites to scrape (default: all supported)
- maxItems: Set maximum number of articles to retrieve
- query: Full-text search across article content

Filtering options:
- createdAfter / createdBefore: Date range for article publication
- updatedAfter / updatedBefore: Date range for article updates

Sorting and pagination:
- sortBy: Choose "created", "updated", or "rank"
- sortOrder: "asc" or "desc"
- offset: Pagination starting point
- limit: Articles per page (max 100)
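The parameters above can be combined into a single input object. The following is a minimal Python sketch, assuming the parameter names listed above are the exact input keys. `build_run_input` and `page_offsets` are hypothetical helpers, not part of the actor. The `websites` and date-range parameters are left out because their accepted values and formats are not specified here.

```python
from typing import Any, Dict, Iterator, Optional

MAX_LIMIT = 100  # documented maximum number of articles per page
SORT_FIELDS = {"created", "updated", "rank"}
SORT_ORDERS = {"asc", "desc"}


def build_run_input(
    query: Optional[str] = None,
    sort_by: str = "created",
    sort_order: str = "desc",
    offset: int = 0,
    limit: int = MAX_LIMIT,
) -> Dict[str, Any]:
    """Check values against the documented options and build an input dict."""
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"sortBy must be one of {sorted(SORT_FIELDS)}")
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"sortOrder must be one of {sorted(SORT_ORDERS)}")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must not be negative")

    run_input: Dict[str, Any] = {
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "offset": offset,
        "limit": limit,
    }
    if query:
        run_input["query"] = query  # full-text search across article content
    return run_input


def page_offsets(total: int, limit: int = MAX_LIMIT) -> Iterator[int]:
    """Yield the offset of each page needed to cover `total` articles."""
    yield from range(0, total, limit)


# One input per page for the first 250 matches of a query.
pages = [build_run_input(query="blackout", offset=o) for o in page_offsets(250)]
print([p["offset"] for p in pages])  # [0, 100, 200]
```

Each resulting dict could be passed as `run_input` to the Apify Python client's `actor(...).call(...)`. The actor ID is not given in this section, so that call is not shown.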

Test configurations in the Start Console section. You pay only for results received, and free Apify credits are available to get started.

Input/Output

Input: Configure via the actor's input schema as described above.

Output: Results are available as a dataset in the Output tab, viewable as a table or raw JSON. Each article includes:

{
  "articleId": 40528950,
  "created": 1751640522,
  "updated": 1751652939,
  "url": "https://www.novinky.cz/clanek/...",
  "section": "ekonomika",
  "tags": ["Elektřina", "Blackout"],
  "authors": ["Martin Procházka"],
  "title": "Article title here",
  "perex": "Article excerpt...",
  "contentMarkdown": "Full article text in Markdown format..."
}

Additional fields include recommendedUntilDate, domicile, relatedArticles, and various content flags.
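To load these records into an analysis pipeline, the timestamps usually need converting first. Here is a minimal Python sketch, assuming `created` and `updated` are Unix epoch seconds in UTC. The sample values are consistent with that reading, but this section does not state it. `Article` and `parse_article` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Article:
    article_id: int
    url: str
    title: str
    section: str
    tags: List[str]
    authors: List[str]
    created: datetime
    updated: datetime


def to_utc(seconds: int) -> datetime:
    """Convert epoch seconds to a timezone-aware UTC datetime (assumed unit)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def parse_article(record: dict) -> Article:
    """Map one output record onto a typed object, tolerating missing lists."""
    return Article(
        article_id=record["articleId"],
        url=record["url"],
        title=record["title"],
        section=record.get("section", ""),
        tags=record.get("tags", []),
        authors=record.get("authors", []),
        created=to_utc(record["created"]),
        updated=to_utc(record["updated"]),
    )


# The sample record from above, trimmed to the fields used here.
sample = json.loads("""
{
  "articleId": 40528950,
  "created": 1751640522,
  "updated": 1751652939,
  "url": "https://www.novinky.cz/clanek/...",
  "section": "ekonomika",
  "tags": ["Elektřina", "Blackout"],
  "authors": ["Martin Procházka"],
  "title": "Article title here"
}
""")

article = parse_article(sample)
print(article.created.isoformat())  # 2025-07-04T14:48:42+00:00
print(article.updated - article.created)  # 3:26:57
```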

Dataset size: The actor currently provides access to over 1,076,000 articles across supported websites.

To request support for additional Czech news websites, create an issue in the actor's repository.

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try 🔥Czech News Scraper now on Apify. Free tier available with no credit card required.

Actor Information

Developer: p-brother
Pricing: Paid
Total Runs: 146
Active Users: 7

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
