RSS News Aggregator

Name: RSS News Aggregator
Author: louvre

317 runs

4 users

About RSS News Aggregator

What does this actor do?

RSS News Aggregator is an actor on the Apify platform that fetches multiple RSS and Atom feeds, sorts their articles by publication date, and groups them by source. It runs in the cloud, so no local setup is needed.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
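
The same steps can also be scripted. The following is a minimal sketch, assuming the `apify-client` Python package is installed and an API token is available in an `APIFY_TOKEN` environment variable. The actor ID `louvre/rss-news-aggregator` is a guess based on the author and actor name and is not confirmed by this page; the helper names are illustrative. The input fields `rss_feeds` and `raw_data` come from the actor's README.

```python
import os


def build_run_input(feed_urls, raw_data=False):
    """Build the actor input: a list of {"url": ...} objects plus the raw_data flag."""
    return {
        "rss_feeds": [{"url": url} for url in feed_urls],
        "raw_data": raw_data,
    }


def run_aggregator(token, actor_id, run_input):
    """Start a run, wait for it to finish, and return the dataset items as a list."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    run_input = build_run_input(["http://feeds.bbci.co.uk/news/rss.xml"])
    token = os.environ.get("APIFY_TOKEN")
    if token:
        # ASSUMPTION: hypothetical actor ID, check the actor's page for the real one.
        items = run_aggregator(token, "louvre/rss-news-aggregator", run_input)
        print(f"Fetched {len(items)} dataset items")
    else:
        # Without a token, just show the input that would be sent.
        print(run_input)
```

Keeping the input builder separate from the network call makes it easy to inspect or store the payload before spending a run on it.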

Documentation

# RSS News Aggregator

## 📌 Introduction

### 🎯 What is this actor?

The RSS News Aggregator is a powerful Apify actor designed to fetch, combine, and process multiple RSS and Atom feeds. It takes a list of feed URLs, sorts all articles by publication date, and groups them by their source, making it easy to create customized news streams.

### 🚀 Key Features

- Aggregate Multiple Feeds: Combine articles from various RSS/Atom sources.
- Group by Source: Automatically groups articles by their root domain (e.g., `marketwatch.com`).
- Global Sorting: Sorts all articles chronologically across all feeds.
- Flexible Output: Choose between a clean, normalized format or the raw, unprocessed data from the feeds.
- Deploy as API: Runs on the Apify platform and can be used as a serverless API.

### 🔍 Use Cases

- News Dashboards: Build a personal or professional dashboard that pulls in news from all your favorite sources.
- Content Curation: Curate content for newsletters, websites, or social media by monitoring specific topics across different blogs and news sites.
- Market Research: Keep track of industry news, competitor announcements, and market trends in one place.
- Data Analysis: Use the raw data output to perform text analysis, sentiment analysis, or other research on a large corpus of articles.

---

## 📥 Inputs

The actor requires the following inputs.

| Field       | Type      | Required | Description |
| ----------- | --------- | -------- | ----------- |
| `rss_feeds` | `Array`   | Yes      | An array of objects, where each object contains a `url` key pointing to a valid RSS or Atom feed URL. |
| `raw_data`  | `Boolean` | No       | If set to `true`, the actor will return the raw, unprocessed JSON for each feed, grouped by source domain, without applying normalization or pagination. Defaults to `false`. |

### ✅ Example Input

```json
{
  "rss_feeds": [
    { "url": "https://www.federalreserve.gov/feeds/press_monetary.xml" },
    { "url": "https://www.marketwatch.com/rss/topstories" },
    { "url": "http://feeds.bbci.co.uk/news/rss.xml" }
  ],
  "raw_data": false
}
```

---

## 📤 Outputs

The actor's output format depends on the `raw_data` input parameter.

### Standard Output (`raw_data`: false)

By default, the actor performs a global sort on all articles from all feeds and then groups the results by their source. The output is an array of objects, where each object represents a source and contains its articles.

#### ✅ Example Standard Output

```json
[
  {
    "source": "federalreserve.gov",
    "feeds": [
      {
        "title": "Fed Announces Monetary Policy Update",
        "link": "https://www.federalreserve.gov/newsevents/pressreleases/monetary20231026a.htm",
        "pub_date": "2023-10-26T18:00:00.000Z",
        "source": "federalreserve.gov",
        "guid": "https://www.federalreserve.gov/newsevents/pressreleases/monetary20231026a.htm",
        "author": "",
        "category": "Monetary Policy",
        "enclosure": null
      }
    ]
  },
  {
    "source": "marketwatch.com",
    "feeds": [
      {
        "title": "Market Hits Record Highs After Tech Rally",
        "link": "https://www.marketwatch.com/story/market-hits-record-highs-11635280000",
        "pub_date": "2023-10-26T16:30:00.000Z",
        "source": "marketwatch.com",
        "guid": "https://www.marketwatch.com/story/market-hits-record-highs-11635280000",
        "author": "John Doe",
        "category": "Top Stories",
        "enclosure": {
          "url": "https://example.com/image.jpg",
          "type": "image/jpeg",
          "length": "12345"
        }
      }
    ]
  }
]
```

### Raw Data Output (`raw_data`: true)

When `raw_data` is `true`, the actor fetches the raw XML from each feed, converts it to JSON, and groups the full JSON objects into an array of source-specific objects. This mode does not apply any sorting. It is designed to give you the complete, original data structure for each feed.
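
Because raw mode skips normalization and sorting, a consumer usually has to do both. The sketch below is one possible approach, not the actor's own logic. It assumes the shape shown in this README's raw example: RSS-style feeds with an `item` list, text content under a `"_"` key, and attributes under a `"$"` key. Atom feeds, which typically use `entry` rather than `item`, are not handled. All function names are illustrative.

```python
from email.utils import parsedate_to_datetime


def text_of(value):
    """Return plain text from a raw field that may be a string or a parser-style object."""
    if isinstance(value, dict):
        if "_" in value:
            return value["_"]  # element text, e.g. a guid with attributes
        return (value.get("$") or {}).get("href")  # attribute-only element, e.g. a link
    return value


def iter_raw_items(raw_output):
    """Yield one flat dict per article from the grouped raw output."""
    for group in raw_output:
        for feed in group.get("feeds", []):
            items = feed.get("item", [])
            if isinstance(items, dict):  # ASSUMPTION: a lone item may arrive unwrapped
                items = [items]
            for item in items:
                yield {
                    "source": group.get("source"),
                    "title": text_of(item.get("title")),
                    "link": text_of(item.get("link")),
                    "guid": text_of(item.get("guid")),
                    "pubDate": item.get("pubDate"),
                }


def published_ts(item):
    """Parse an RFC 822 style pubDate into a timestamp; undated items sort last."""
    try:
        return parsedate_to_datetime(item["pubDate"]).timestamp()
    except (TypeError, ValueError):
        return float("-inf")


def newest_first(items):
    """Return the items sorted by publication date, newest first."""
    return sorted(items, key=published_ts, reverse=True)
```

Calling `newest_first(iter_raw_items(raw_output))` on the dataset items gives a flat, chronologically ordered list similar in spirit to the standard output.
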
#### ✅ Example Raw Data Output

```json
[
  {
    "source": "federalreserve.gov",
    "feeds": [
      {
        "title": "Federal Reserve Board - Press Release",
        "link": {
          "$": {
            "href": "https://www.federalreserve.gov/feeds/press_monetary.xml",
            "rel": "self"
          }
        },
        "description": "Recent press releases on monetary policy.",
        "lastBuildDate": "Thu, 26 Oct 2023 18:00:00 GMT",
        "item": [
          {
            "title": "Fed Announces Monetary Policy Update",
            "link": "https://www.federalreserve.gov/newsevents/pressreleases/monetary20231026a.htm",
            "description": "The Federal Reserve issued a statement on its latest monetary policy decisions...",
            "pubDate": "Thu, 26 Oct 2023 18:00:00 GMT",
            "guid": {
              "_": "https://www.federalreserve.gov/newsevents/pressreleases/monetary20231026a.htm",
              "isPermaLink": "true"
            }
          }
        ]
      }
    ]
  }
]
```

---

## ⚙️ How to Use

### 🏁 Running on Apify Platform

1. Log in or Sign up on the Apify platform.
2. Find the "RSS News Aggregator" Actor in the Apify Store.
3. Click "Try actor" and then create a new task.
4. Enter your desired RSS feed URLs and any other optional parameters in the Input tab.
5. Start the task by clicking the "Start" button. The results will be available in the task's Dataset tab once the run is finished.

---

## 🛠️ Troubleshooting

| Issue               | Possible Cause | Solution |
| ------------------- | -------------- | -------- |
| `Invalid URL error` | One or more URLs in `rss_feeds` are malformed or not valid RSS/Atom feeds. | Check all provided URLs to ensure they are correct and point to a valid feed. |
| `Parsing Error`     | The feed's XML structure is broken or does not conform to RSS/Atom standards. | Try a different feed from that source or contact the source's administrator. |
| `Empty Output`      | The provided feeds are empty, or there was a network issue preventing them from being fetched. | Ensure the source feeds contain articles and that your network connection is stable. |
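
Malformed entries in `rss_feeds` can be caught locally before a run is started. The following is a small pre-flight sketch using only the Python standard library. The function name `find_malformed_feeds` is illustrative, and the check covers URL syntax only: it cannot tell whether a well-formed URL actually serves a valid RSS or Atom feed.

```python
from urllib.parse import urlparse


def find_malformed_feeds(run_input):
    """Return (index, reason) pairs for rss_feeds entries that are not usable URLs."""
    problems = []
    for index, feed in enumerate(run_input.get("rss_feeds", [])):
        url = feed.get("url") if isinstance(feed, dict) else None
        if not isinstance(url, str):
            problems.append((index, "missing 'url' key"))
            continue
        try:
            parsed = urlparse(url.strip())
        except ValueError:
            # urlparse raises ValueError for some malformed hosts, e.g. a broken IPv6 literal.
            problems.append((index, f"unparseable URL: {url!r}"))
            continue
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append((index, f"not an absolute http(s) URL: {url!r}"))
    return problems
```

An empty list means every entry passed the syntax check; any returned pair points at the position in `rss_feeds` that needs fixing.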

Actor Information

Developer: louvre
Pricing: Paid
Total Runs: 317
Active Users: 4
