Sitemap Change Orchestrator
by tri_angle
About Sitemap Change Orchestrator
Monitor website sitemaps for new, updated, or removed URLs. Integration with the Website Content Crawler (WCC) allows feeding only relevant URLs. This ensures your web crawls are efficient, targeted, and resource-optimized, keeping your datasets fresh for any application.
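As a rough illustration of what "new, updated, or removed" means in practice, the sketch below compares two sitemap snapshots and labels each URL with one of the change types the actor reports (NEW, UPDATED, REMOVED, SAME). It is only a sketch under assumptions: the `diff_snapshots` helper, the snapshot shape (URL mapped to its `lastmod` value), and the rule that a different `lastmod` means UPDATED are all hypothetical, since the listing does not describe how the Sitemap Change Detector decides internally.

```python
from enum import Enum
from typing import Optional


class Change(str, Enum):
    NEW = "NEW"
    UPDATED = "UPDATED"
    REMOVED = "REMOVED"
    SAME = "SAME"


def diff_snapshots(
    previous: dict[str, Optional[str]],
    current: dict[str, Optional[str]],
) -> dict[str, Change]:
    """Classify every URL that appears in either snapshot.

    Each snapshot maps a URL to its lastmod string (None when the sitemap
    entry has no lastmod). Assumption: a differing lastmod means UPDATED.
    """
    changes: dict[str, Change] = {}
    for url, lastmod in current.items():
        if url not in previous:
            changes[url] = Change.NEW
        elif previous[url] != lastmod:
            changes[url] = Change.UPDATED
        else:
            changes[url] = Change.SAME
    # URLs present before but missing now were removed from the sitemap.
    for url in previous:
        if url not in current:
            changes[url] = Change.REMOVED
    return changes


previous = {
    "https://example.com/a": "2024-01-01",
    "https://example.com/b": "2024-01-01",
    "https://example.com/c": None,
}
current = {
    "https://example.com/a": "2024-01-01",
    "https://example.com/b": "2024-02-01",
    "https://example.com/d": None,
}
changes = diff_snapshots(previous, current)

# Only NEW and UPDATED URLs would need to be crawled again.
to_crawl = [u for u, c in changes.items() if c in (Change.NEW, Change.UPDATED)]
```

Filtering on the change type is what keeps a follow-up crawl targeted: unchanged pages are never fetched again.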
What does this actor do?
Sitemap Change Orchestrator is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
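The same steps can also be driven from code instead of the web console. The sketch below builds an input object and starts a run with the `apify-client` Python package. Treat it as an illustration under assumptions: the `ACTOR_ID` value is a guess at the Store identifier and should be checked on the actor's page, `build_input` and `run_orchestrator` are hypothetical helper names, and the input fields mirror the example input in the actor's documentation rather than a full schema.

```python
import json

# Assumed Store ID; confirm the exact value on the actor's page.
ACTOR_ID = "tri_angle/sitemap-change-orchestrator"


def build_input(start_url: str, change_types=("NEW", "UPDATED")) -> dict:
    """Build a minimal input object modelled on the documented example."""
    return {
        "changeTypes": list(change_types),
        "discoverSitemaps": True,
        "skipWcc": False,
        "snapshotKeyPrefix": "APIFY",
        "wccInput": {"startUrls": [{"url": start_url, "method": "GET"}]},
    }


def run_orchestrator(token: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_input("https://www.apify.com")
print(json.dumps(run_input, indent=2))
```

Calling `run_orchestrator(token, run_input)` with an Apify API token would then return the merged items from the run's default dataset.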
Documentation
Sitemap Change Orchestrator

Monitor sitemaps, detect changes, orchestrate content crawls, and merge results into a single dataset.

## What is Sitemap Change Orchestrator?

This actor orchestrates running the Sitemap Change Detector to identify changed URLs in sitemaps and then triggers parallel Website Content Crawler runs to fetch page content. Finally, it merges and deduplicates all crawler outputs by URL into one unified dataset.

## Key Features

- Detect sitemap changes (NEW, UPDATED, REMOVED, SAME)
- Orchestrate parallel crawl runs with configurable memory and timeout
- Merge and dedupe Website Content Crawler results into a single output
- Store and retrieve sitemap snapshots in a named key-value store

## How it Works

1. Run the Sitemap Change Detector with your settings
2. Collect changed URLs and batch them into Website Content Crawler runs
3. Trigger Website Content Crawler runs in parallel
4. Merge and dedupe all crawler run datasets by URL

## How to Use

1. Open the Sitemap Change Orchestrator actor on the Apify Store.
2. Configure memory, timeouts, and whether to skip crawling.
3. Paste your Website Content Crawler JSON input.
4. Set WCC batching options.
5. Save and click Run.
6. Review merged and deduplicated output in the default dataset.

### Example Input

```json
{
  "addRemovedUrlsToKvs": false,
  "addWccUrlsToScd": true,
  "changeTypes": ["NEW", "UPDATED"],
  "discoverSitemaps": true,
  "skipWcc": false,
  "snapshotKeyPrefix": "APIFY",
  "wccInput": {
    "startUrls": [
      {
        "url": "https://www.apify.com",
        "method": "GET"
      }
    ]
    // ...
  }
}
```

## Output

- Merged and deduplicated output from all Website Content Crawler runs in the default dataset
- Additionally, sitemap snapshots and removed-URL lists are stored in a named key-value store under your prefix

## FAQ

### Can I export data using API?

Yes, you can access this actor using your own applications through the Apify API. Click on the API tab for code examples or check out the Apify API reference docs at https://docs.apify.com/api/v2 for full details.

### Can I use Sitemap Change Orchestrator through an MCP Server?

This actor, like all Apify actors, works on the Apify MCP server. For more information and instructions, read the Apify MCP server integration guide at https://docs.apify.com/platform/integrations/mcp.

### Can I integrate data from Sitemap Change Orchestrator with other apps?

Yes. Sitemap Change Orchestrator can be connected with almost any cloud service or web app. Read more about the possibilities on our integrations page at https://apify.com/integrations.

### Is it legal to scrape data using Sitemap Change Orchestrator?

This actor only extracts publicly available data. It does not collect private user data. However, you should ensure your reason for scraping is legitimate. Consult legal counsel if unsure. For more on scraping legality and ethics, see:

- https://blog.apify.com/is-web-scraping-legal/
- https://blog.apify.com/what-is-ethical-web-scraping-and-how-do-you-do-it/

### Your feedback

We welcome feedback to improve this actor. If you encounter issues or have suggestions, please create an issue on the actor’s Issues tab.
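The batching and merge steps listed under How it Works can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the actor's implementation: `batch_urls` and `merge_by_url` are hypothetical helpers, each crawler item is assumed to carry a `url` field, and keeping the first item seen per URL is an arbitrary choice, since the documentation does not say which duplicate wins.

```python
from typing import Iterable, Iterator


def batch_urls(urls: list[str], batch_size: int) -> Iterator[list[str]]:
    """Split changed URLs into fixed-size batches, one per crawler run."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


def merge_by_url(datasets: Iterable[Iterable[dict]]) -> list[dict]:
    """Merge items from several crawler datasets, keeping one item per URL."""
    merged: dict[str, dict] = {}
    for dataset in datasets:
        for item in dataset:
            url = item.get("url")
            if url is None:
                continue  # assumption: items without a URL cannot be deduplicated
            merged.setdefault(url, item)  # assumption: the first item seen wins
    return list(merged.values())


batches = list(batch_urls(["a", "b", "c", "d", "e"], batch_size=2))

run_1 = [{"url": "a", "text": "first"}]
run_2 = [{"url": "a", "text": "duplicate"}, {"url": "b", "text": "second"}]
merged = merge_by_url([run_1, run_2])
```

Smaller batches mean more parallel crawler runs, each handling fewer URLs, while the merge step stays the same regardless of how many runs produced the data.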
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Actor Information
- Developer: tri_angle
- Pricing: Paid
- Total Runs: 90
- Active Users: 33
Related Actors
Google Search Results Scraper
by apify
Website Content Crawler
by apify
🔥 Leads Generator - $3/1k 50k leads like Apollo
by microworlds
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq
Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.