Cnn Top Headlines
by runtime
An Apify actor that scrapes top headlines from CNN's homepage and articles. Get structured news data for aggregation, research, or feeds without managing your own scraper.
About Cnn Top Headlines
Need a reliable way to pull the latest news from CNN without dealing with rate limits or parsing complex HTML? This Apify actor is a straightforward scraper I've used to consistently extract top headlines from CNN's homepage and individual article pages. You get clean, structured data (headline text, article URLs, and timestamps) in a format (JSON, CSV) that's ready to drop into your database, spreadsheet, or application. It runs on Apify's platform, so you can schedule it daily or hourly to keep your dataset current. I typically use it for building news aggregators, tracking media trends, or feeding a live news ticker on a website. It saves you from maintaining your own scraper every time CNN tweaks its site layout. If you're in media monitoring, research, or just need a steady stream of headline data, this actor handles the heavy lifting.
What does this actor do?
Cnn Top Headlines is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
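The same steps can also be scripted through the API. The sketch below assumes the official apify-client Python package (pip install apify-client) and an API token; ACTOR_ID is a placeholder, not the actor's confirmed ID, so copy the real one from the actor's page on Apify. The helper takes the client as a parameter so it can be exercised without network access.

```python
from typing import Any, Dict, List

# Placeholder (assumption): replace with the ID shown on the actor's Apify page.
ACTOR_ID = "<username>/<actor-name>"


def run_and_collect(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for it to finish, and return its dataset items."""
    # With apify-client, actor(...).call(...) blocks until the run ends
    # and returns a dict describing the run (or None on failure to start).
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    # Each run writes its results to a default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires network access and a valid token):
#   import os
#   from apify_client import ApifyClient
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   for item in run_and_collect(client, {"maxHeadlines": 5}):
#       print(item["title"], item["url"])
```

Passing the client in, rather than constructing it inside the function, keeps the token handling in one place and makes the helper easy to test with a stand-in object.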
Documentation
CNN Top Headlines Scraper
This Apify actor scrapes the latest top news headlines from CNN (https://www.cnn.com/) or CNN International (https://edition.cnn.com/). It can optionally extract full article details.
Key Features
- Extracts real-time headlines from the CNN homepage or specific section pages.
- Optionally follows links to scrape full article content, author, and publish date.
- Outputs structured JSON data for easy processing or analysis.
How to Use
Configure your actor run using the input fields below. The main options are:
- startUrls (array): The URLs to start scraping from (e.g., the homepage or a section like World or Business). Defaults to ["https://www.cnn.com/"].
- maxHeadlines (integer): The maximum number of headlines to extract and visit. Default is 20.
- includeArticleDetails (boolean): When set to true, the actor will visit each headline link and scrape the full article text, author, and published date. Default is false.
Example Input
{
  "startUrls": [
    { "url": "https://edition.cnn.com/" }
  ],
  "maxHeadlines": 10,
  "includeArticleDetails": true
}
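If you build this input programmatically, a small helper can apply the documented defaults and catch obvious mistakes before a run starts. This is a minimal sketch: build_input and DEFAULT_START_URL are hypothetical names, and the { "url": ... } object form for startUrls is taken from the example input above rather than from the full input schema.

```python
import json
from typing import Any, Dict, List, Optional

# Default start URL as stated in the option list above.
DEFAULT_START_URL = "https://www.cnn.com/"


def build_input(
    start_urls: Optional[List[str]] = None,
    max_headlines: int = 20,
    include_article_details: bool = False,
) -> Dict[str, Any]:
    """Build an actor input dict using the documented field names and defaults."""
    if max_headlines < 1:
        raise ValueError("max_headlines must be a positive integer")
    urls = start_urls or [DEFAULT_START_URL]
    return {
        # Assumption: each start URL is wrapped as {"url": ...}, as in the example input.
        "startUrls": [{"url": u} for u in urls],
        "maxHeadlines": max_headlines,
        "includeArticleDetails": include_article_details,
    }


# Reproduces the example input shown above.
example = build_input(["https://edition.cnn.com/"], 10, True)
print(json.dumps(example, indent=2))
```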
Input & Output
Input Schema
Refer to .actor/input_schema.json for the complete specification.
Output Format
Results are saved to the default Apify dataset, available for download as JSON, CSV, or Excel.
Headline-only output:
{
  "title": "Superman smashes box office expectations",
  "url": "https://www.cnn.com/2025/07/13/entertainment/superman-box-office-intl",
  "source": "CNN",
  "scrapedAt": "2025-07-13T12:56:40.535Z"
}
Output with article details (when includeArticleDetails is true):
{
  "title": "Superman smashes box office expectations",
  "content": "Full article text ...",
  "author": "CNN Staff",
  "publishedDate": "2025-07-13T10:00:00Z",
  "url": "https://www.cnn.com/2025/07/13/entertainment/superman-box-office-intl",
  "source": "CNN",
  "scrapedAt": "2025-07-13T12:56:40.535Z"
}
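Because the detail fields appear only when includeArticleDetails is true, downstream code needs to treat them as optional. The sketch below shows one way to load either record shape into a typed object; Headline, parse_item, and parse_timestamp are hypothetical names, and it assumes timestamps always follow the ISO 8601 form shown in the samples.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2025-07-13T12:56:40.535Z."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


@dataclass
class Headline:
    title: str
    url: str
    source: str
    scraped_at: datetime
    # Present only when the run used includeArticleDetails = true.
    content: Optional[str] = None
    author: Optional[str] = None
    published_date: Optional[datetime] = None


def parse_item(item: Dict[str, Any]) -> Headline:
    """Convert one dataset item (either output shape) into a Headline."""
    published = item.get("publishedDate")
    return Headline(
        title=item["title"],
        url=item["url"],
        source=item["source"],
        scraped_at=parse_timestamp(item["scrapedAt"]),
        content=item.get("content"),
        author=item.get("author"),
        published_date=parse_timestamp(published) if published else None,
    )
```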
How It Works
- The actor loads the provided startUrls.
- It extracts up to maxHeadlines news headlines from those pages.
- If includeArticleDetails is enabled, it navigates to each headline's URL to scrape the article body, author, and publication date.
- All results are stored in the dataset.
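The steps above can be pictured as a short loop. This is an illustration of the described flow, not the actor's actual source: collect_headlines, list_headlines, and fetch_details are hypothetical names, the page-fetching logic is injected as callables, and it assumes maxHeadlines is a single cap across all start URLs.

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]


def collect_headlines(
    start_urls: Iterable[str],
    max_headlines: int,
    include_article_details: bool,
    list_headlines: Callable[[str], Iterable[Record]],
    fetch_details: Callable[[str], Record],
) -> List[Record]:
    """Walk the start pages, cap the headline count, optionally add details."""
    results: List[Record] = []
    for page_url in start_urls:
        # list_headlines stands in for whatever extracts headline links from a page.
        for headline in list_headlines(page_url):
            if len(results) >= max_headlines:
                return results  # cap reached; stop early
            record = dict(headline)
            if include_article_details:
                # fetch_details stands in for the per-article scrape
                # (body, author, publication date).
                record.update(fetch_details(headline["url"]))
            results.append(record)
    return results
```

Note that enabling details costs one extra page visit per headline, so run time grows roughly with maxHeadlines.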
Important Notes
- Terms of Service: Use this actor responsibly and in compliance with CNN's terms and robots.txt.
- Rate Limiting: Avoid aggressive request rates to prevent overloading servers.
- Proxies: Consider using proxies for large-scale scraping to avoid IP blocking.
- Data Usage: Ensure you have the right to use scraped data for your intended purpose.
- Scope: The actor only scrapes publicly accessible CNN articles.
Legal Disclaimer
This tool is for educational and research purposes. You are responsible for using it in a legal and responsible manner that does not harm CNN's infrastructure.
Related Actors
- Booking.com Hotel Scraper: Scrape hotel data, prices, ratings, and more from Booking.com with advanced anti-detection and flexible extraction limits.
- Product Hunt Scraper: Extract product listings, launch details, votes, and comments from Product Hunt for market research and trend analysis.
Common Use Cases
- Market Research: Gather competitive intelligence and market data
- Lead Generation: Extract contact information for sales outreach
- Price Monitoring: Track competitor pricing and product changes
- Content Aggregation: Collect and organize content from multiple sources
Ready to Get Started?
Try Cnn Top Headlines now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: runtime
- Pricing: Paid
- Total Runs: 140
- Active Users: 6
Related Actors
- Smart Article Extractor by lukaskrivka
- Google Search by devisty
- Twitter Tweets Scraper by gentle_cloud
- Twitter Profile by danek