Google News Actor

by early_kiosk

16 runs
2 users

About Google News Actor

Google News Scraper collects localized Google News Search, Top Stories, and Topic feeds with infinite scroll, keyword operators, and hashed-topic coverage, and deduplicates results across platform migrations.

What does this actor do?

Google News Actor is a web scraping tool on the Apify platform. It runs in the cloud and collects articles from Google News Search, Top Stories, and Topic feeds as structured data.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
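For step 3, the input fields listed in the Documentation section (`mode`, `query`, `language`, `resolveRedirects`, `maxItems`) can be assembled and sanity-checked before a run. The sketch below is illustrative only: `build_run_input` is a hypothetical helper, and its validation rules are assumptions drawn from the parameter table, not the actor's own checks.

```python
import re

VALID_MODES = {"SEARCH", "TOP_STORIES", "TOPIC"}
LANGUAGE_RE = re.compile(r"^[a-z]{2}-[A-Z]{2}$")  # ll-CC, e.g. en-US


def build_run_input(language, mode="SEARCH", query=None,
                    resolve_redirects=False, max_items=100):
    """Assemble an input object using the field names from the README.

    The checks are assumptions for illustration; the actor may accept
    or reject inputs differently.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not LANGUAGE_RE.match(language):
        raise ValueError("language must look like ll-CC, e.g. en-US")
    if mode == "SEARCH" and not query:
        raise ValueError("SEARCH mode needs a query")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")

    run_input = {
        "mode": mode,
        "language": language,
        "resolveRedirects": resolve_redirects,
        "maxItems": max_items,
    }
    if query:
        run_input["query"] = query
    return run_input


example_input = build_run_input("en-GB", query='site:reuters.com "earnings"',
                                max_items=50)
```

An object like `example_input` is what you would paste into the input form, or pass as the run input when starting the actor through the Apify API.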

Documentation

# Google News Scraper Actor

Collect real-time Google News Search, Top Stories, and Topic feeds with localized filtering, infinite scroll coverage, and optional redirect unwrapping for clean publisher URLs.

## Why teams pick this actor

- Rank-ready coverage: Mirrors the long-form, marketing-first READMEs of top marketplace actors while focusing on actionable data (headlines, snippets, publishers, timestamps, thumbnails, canonical URLs).
- 3 capture modes: `SEARCH` for keyword monitoring, `TOP_STORIES` for country/region dashboards, `TOPIC` (incl. hashed topic IDs) for the curated Google News sections journalists rely on.
- Localization first: Every run requires an explicit language like `en-US` or `de-DE`, so you can mirror the behavior of your target newsroom or SEO locale.
- Stateful infinite scroll: Playwright + Chrome scrolls until `maxItems`, deduplicates via persistent state, and survives platform migrations without repeating articles.
- Redirect intelligence: Toggle `resolveRedirects` to unwrap news.google.com links via HTTP first, then fall back to a headless browser for stubborn publishers.
- Operator friendly: Resource throttling (blocked assets, batching, optional proxy pools) keeps compute predictable, so you can undercut $20/month competitors while still monetizing premium options.
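The "stateful infinite scroll" point is easier to picture with a small example. The sketch below is not the actor's code: `SeenState` is a hypothetical class, and keying on `originalUrl` is an assumption (chosen because the README says that field is always present). It shows how a serializable set of seen URLs lets a resumed run skip articles it already emitted.

```python
import json


class SeenState:
    """Tracks article URLs already emitted so a resumed run skips them."""

    def __init__(self, seen=None):
        self.seen = set(seen or [])

    def is_new(self, url):
        """Return True the first time a URL is offered, False afterwards."""
        if url in self.seen:
            return False
        self.seen.add(url)
        return True

    def dump(self):
        """Serialize to JSON, e.g. for a key-value store record."""
        return json.dumps(sorted(self.seen))

    @classmethod
    def load(cls, payload):
        """Rebuild the state from a previously dumped JSON string."""
        return cls(json.loads(payload)) if payload else cls()


# Simulated scroll batch containing one repeated card.
state = SeenState()
articles = [
    {"originalUrl": "https://news.google.com/articles/a"},
    {"originalUrl": "https://news.google.com/articles/b"},
    {"originalUrl": "https://news.google.com/articles/a"},
]
fresh = [a for a in articles if state.is_new(a["originalUrl"])]

# Simulated migration: persist, then restore on the new machine.
restored = SeenState.load(state.dump())
```

After the restore, `restored.is_new(...)` still returns `False` for both URLs seen before the migration, which is the behavior the bullet describes.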
## Perfect for

- Trend and sentiment tracking dashboards
- Competitive/brand monitoring in multiple languages
- Feeding LLM/RAG pipelines with fresh, normalized news snippets
- SEO teams mapping content velocity or backlink opportunities
- Research teams exporting CSV/JSON data into BI tools

## Data you get

| Field | Description |
| --- | --- |
| `title` | Headline as displayed on Google News |
| `source` | Publisher name extracted from the card |
| `publishedAt` | ISO timestamp normalized from relative strings (e.g., “3 hours ago”) |
| `originalUrl` | Google News redirect URL (always present) |
| `finalUrl` | Set when `resolveRedirects=true`; points at the publisher site |
| `thumbnailUrl` | Image from the article card, when available |
| `snippet` | Reserved for future excerpt support |

## Modes & localization cheatsheet

| Mode | When to use | URL that gets crawled |
| --- | --- | --- |
| `SEARCH` | Keyword monitoring, advanced query operators | `https://news.google.com/search?q={query}` |
| `TOP_STORIES` | Country-level front page | `https://news.google.com/topstories` |
| `TOPIC` | Curated sections & hashed topics | Either `https://news.google.com/topics/{topicId}` or a fallback search |

### Topic IDs & sections

1. Open the desired topic/section on news.google.com.
2. Copy the hashed ID from the URL (e.g., `CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB`).
3. Pass it in `query` when `mode="TOPIC"` to lock the crawler to that curated feed.
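The mode table and the topic ID steps translate into a small URL builder. This is a sketch under stated assumptions, not the actor's implementation: `locale_params` and `build_url` are hypothetical names, and the way an `ll-CC` language is split into `hl`, `gl`, and `ceid` values is an assumption based on how Google News URLs commonly look. The "fallback search" branch for `TOPIC` mode is not modeled.

```python
from urllib.parse import quote, urlencode

BASE = "https://news.google.com"


def locale_params(language):
    """Split 'll-CC' into hl/gl/ceid values (assumed mapping)."""
    lang, country = language.split("-")
    return {"hl": language, "gl": country, "ceid": f"{country}:{lang}"}


def build_url(mode, language, query=None):
    """Return the URL for a mode, following the cheatsheet table."""
    if mode in ("SEARCH", "TOPIC") and not query:
        raise ValueError(f"{mode} mode needs a query or topic ID")

    params = locale_params(language)
    if mode == "SEARCH":
        path = "/search"
        params = {"q": query, **params}  # keep q first for readability
    elif mode == "TOP_STORIES":
        path = "/topstories"
    elif mode == "TOPIC":
        # query carries the hashed topic ID copied from the browser URL
        path = f"/topics/{quote(query, safe='')}"
    else:
        raise ValueError(f"unknown mode: {mode}")
    return f"{BASE}{path}?{urlencode(params)}"


search_url = build_url("SEARCH", "en-US", "tesla -stock")
top_url = build_url("TOP_STORIES", "de-DE")
```

Here `search_url` is `https://news.google.com/search?q=tesla+-stock&hl=en-US&gl=US&ceid=US%3Aen`, so the same query can be pointed at a different locale by changing only the `language` argument.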
## Input parameters

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `mode` | `'SEARCH' \| 'TOP_STORIES' \| 'TOPIC'` | No | `SEARCH` | Controls which Google News surface we crawl |
| `query` | string | SEARCH / custom TOPIC | – | Keywords or hashed topic IDs |
| `language` | `ll-CC` | Yes | `en-US` | Drives `hl`, `gl`, and `ceid` params; always set it (best practice from Apify leaderboard actors) |
| `resolveRedirects` | boolean | No | `false` | Adds an HTTP + optional browser hop per article to unwrap publisher URLs |
| `maxItems` | number | No | `100` | Hard stop for infinite scroll + dataset pushes |
| `proxyConfiguration` | object | No | Apify auto | Pass your proxy group or custom proxy URL |

## Advanced search operators

- Exact match: `"artificial intelligence"`
- Source filter: `site:reuters.com "earnings"`
- Title only: `intitle:"climate"`
- Exclusions: `tesla -stock`
- Date bounds: `after:2024-01-01 before:2024-06-30`

Combine operators to reproduce the saved searches marketing teams monitor daily.

## Quick start

1. Add the actor from the Apify Store and click “Try for free”.
2. Choose the mode (`SEARCH`, `TOP_STORIES`, `TOPIC`) and fill in `language` (e.g., `en-GB`).
3. Paste your keywords or topic IDs; bump `maxItems` if you need deeper coverage.
4. Optional: enable `resolveRedirects` for canonical publisher URLs.
5. Run & download the dataset as JSON, CSV, Excel, or stream it via the Apify API.

## Output example

```json
{
  "title": "OpenAI ships GPT-Next",
  "source": "TechCrunch",
  "publishedAt": "2025-11-25T14:30:00.000Z",
  "originalUrl": "https://news.google.com/rss/articles/CBMiYmh0dHBzOi8vbmV3cy5nb29nbGUuY29tLy4uLg",
  "finalUrl": "https://techcrunch.com/2025/11/25/openai-gpt-next",
  "thumbnailUrl": "https://lh3.googleusercontent.com/..."
}
```

### Dataset schema

| Field | Type | Example | Notes |
| --- | --- | --- | --- |
| `title` | string | `"Tesla unveils new Model"` | Headline pulled from the card |
| `source` | string | `"Reuters"` | Publisher label |
| `publishedAt` | string (ISO-8601) | `"2025-11-25T14:30:00.000Z"` | Normalized by `parseGoogleDate` |
| `originalUrl` | string | `"https://news.google.com/..."` | Always present, Google redirect |
| `finalUrl` | string | `"https://www.reuters.com/..."` | Only populated when `resolveRedirects=true` or when Google already links directly |
| `thumbnailUrl` | string | `"https://lh3.googleusercontent.com/..."` | Optional image URL |
| `snippet` | string | `null` | Reserved for future excerpt extraction |

Every dataset item is stored in the default Apify Dataset, so you can download JSON/CSV/Excel or stream via the Dataset API.

## Cost & performance tips

- Redirect resolution is the main price lever. Keep it off for cheaper monitoring runs and upsell it as a premium add-on when clients need canonical URLs.
- Because the crawler blocks images/fonts and reuses sessions, `SEARCH` mode can pull ~100 results on the default memory tier. Increase the run memory only when `maxItems` is very high.
- Persistent state (`STATE` key-value store) prevents duplicates if the run migrates, so re-runs won’t waste your monthly budget.

## Need help?

Open an issue in the Apify actor console or ping us with your run ID. We actively benchmark against the highest-ranking Google News actors (e.g., api-empire/google-news-scraper at $19.99/month) to keep this README, and the crawler, competitive.
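The `publishedAt` normalization mentioned in the dataset schema (relative strings such as “3 hours ago” turned into ISO timestamps) might look roughly like the sketch below. It is not the actor's `parseGoogleDate`, whose source is not shown here. `parse_relative_date` is a hypothetical stand-in that assumes English phrasing and covers only minutes, hours, days, and weeks.

```python
import re
from datetime import datetime, timedelta, timezone

# Maps the unit captured from the text to a timedelta keyword argument.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}
_RELATIVE = re.compile(r"^(\d+)\s+(minute|hour|day|week)s?\s+ago$", re.IGNORECASE)


def parse_relative_date(text, now=None):
    """Convert strings like '3 hours ago' to an ISO-8601 UTC timestamp.

    `now` should be a timezone-aware datetime; it defaults to the
    current UTC time. Returns None when the text is not recognized.
    """
    now = now or datetime.now(timezone.utc)
    match = _RELATIVE.match(text.strip())
    if not match:
        return None
    amount = int(match.group(1))
    unit = match.group(2).lower()
    moment = (now - timedelta(**{_UNITS[unit]: amount})).astimezone(timezone.utc)
    # Millisecond precision with a trailing Z, matching the output example.
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


reference = datetime(2025, 11, 25, 17, 30, tzinfo=timezone.utc)
stamp = parse_relative_date("3 hours ago", now=reference)
```

Passing `now` explicitly keeps the function deterministic, which helps when testing it or replaying stored pages.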

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: early_kiosk
Pricing: Paid
Total Runs: 16
Active Users: 2

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
