Website Broken Links & Redirects Checker
by smart-digital
About Website Broken Links & Redirects Checker
Analyzes websites to detect broken links (4xx/5xx) and redirects (3xx). Checks internal/external links on single pages or crawls entire sites. Provides detailed reports per page and site summary.
What does this actor do?
Website Broken Links & Redirects Checker is an actor that runs on the Apify platform. It checks the links found on a single page, or on every page it crawls, and reports those that return broken (4xx/5xx) or redirect (3xx) status codes.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
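The "configure the input parameters" step can also be done in code before a run. The following is a minimal Python sketch, assuming the parameter names and defaults listed in the Documentation section; `DEFAULTS` and `build_input` are hypothetical names used for illustration and are not part of the actor or of any Apify library.

```python
import json

# Defaults as stated in the Documentation section of this page.
DEFAULTS = {
    "crawlPages": False,
    "maxPages": 50,
    "maxConcurrency": 5,
    "sameDomain": True,
    "checkExternal": False,
    "timeout": 10000,  # milliseconds per link check
}


def build_input(start_urls, **overrides):
    """Merge user overrides into the documented defaults (hypothetical helper)."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"startUrls": list(start_urls), **DEFAULTS, **overrides}


# Crawl up to 20 pages of one site and include external links in the check.
actor_input = build_input(
    ["https://example.com"], crawlPages=True, maxPages=20, checkExternal=True
)
print(json.dumps(actor_input, indent=2))
```

The resulting JSON can be pasted into the actor's input form. Rejecting unknown keys catches typos such as `maxPage` before a run is started.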
Documentation
# Broken Links Checker

Apify Actor to analyze broken links and redirects on a website.

## Description

This actor analyzes a website to detect broken links (404, 500, etc.) and redirects (301, 302, etc.). It can analyze a single page or crawl multiple pages of a site to check all internal and external links.

## Features

- Detection of broken links (HTTP 4xx and 5xx status codes)
- Detection of redirects (HTTP 3xx status codes) with destination URL
- Optional crawling of pages from the same domain
- Verification of internal and external links (optional)
- Detailed report per page with counters
- Global site summary with complete statistics
- Response time measurement for each link (in milliseconds)

## Input

```json
{
  "startUrls": ["https://example.com"],
  "crawlPages": false,
  "maxPages": 50,
  "maxConcurrency": 5,
  "sameDomain": true,
  "checkExternal": false,
  "timeout": 10000
}
```

### Parameters

- `startUrls` (required): List of starting URLs to analyze
- `crawlPages` (optional, default: `false`): Enable page crawling. If disabled, only the starting URLs are analyzed
- `maxPages` (optional, default: `50`): Maximum number of pages to crawl (only if `crawlPages` is enabled)
- `maxConcurrency` (optional, default: `5`): Number of pages to crawl in parallel
- `sameDomain` (optional, default: `true`): Only crawl links from the same domain (only if `crawlPages` is enabled)
- `checkExternal` (optional, default: `false`): Also check external links (to other domains)
- `timeout` (optional, default: `10000`): Timeout in milliseconds to check each link

## Output

The actor generates two types of records:

### Page Record

```json
{
  "type": "page",
  "pageUrl": "https://example.com/page",
  "title": "Page Title",
  "httpStatus": 200,
  "linksCount": 25,
  "brokenLinksCount": 2,
  "redirectLinksCount": 3,
  "links": [
    {
      "url": "https://example.com/link",
      "text": "Link Text",
      "isInternal": true,
      "httpStatus": 404,
      "responseTime_ms": 150
    },
    {
      "url": "http://example.com/old-page",
      "text": "Old Page",
      "isInternal": true,
      "httpStatus": 301,
      "responseTime_ms": 120,
      "redirectUrl": "https://example.com/new-page"
    }
  ]
}
```

Page Record Fields:

- `linksCount`: Total number of links checked on the page
- `brokenLinksCount`: Number of broken links (HTTP 4xx and 5xx status codes)
- `redirectLinksCount`: Number of redirects (HTTP 3xx status codes)

### Site Summary

```json
{
  "type": "site-summary",
  "pagesCrawled": 10,
  "linksTotal": 250,
  "brokenLinksTotal": 15,
  "redirectLinksTotal": 8,
  "byStatus": {
    "200": 227,
    "301": 5,
    "302": 3,
    "404": 10,
    "500": 2
  },
  "byType": {
    "internal": 200,
    "external": 50
  },
  "topBrokenLinks": [
    {
      "url": "https://example.com/broken",
      "count": 5,
      "pages": ["https://example.com/page1", "https://example.com/page2"],
      "httpStatus": 404
    }
  ]
}
```

Site Summary Fields:

- `pagesCrawled`: Total number of pages analyzed
- `linksTotal`: Total number of links checked
- `brokenLinksTotal`: Total number of broken links (4xx, 5xx)
- `redirectLinksTotal`: Total number of redirects (3xx)
- `byStatus`: Distribution of links by HTTP status code
- `byType`: Distribution of links by type (internal/external)
- `topBrokenLinks`: Top 20 most frequent broken links with the pages where they appear

Link Fields:

- `url`: Absolute URL of the link
- `text`: Link text (content of the `<a>` tag)
- `isInternal`: `true` if the link is on the same domain, `false` otherwise
- `httpStatus`: HTTP response status code (200, 301, 302, 404, 500, etc.)
- `responseTime_ms`: Response time in milliseconds
- `redirectUrl`: Destination URL if the link is a redirect (3xx)
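
To make the relationship between the two record types concrete, here is a minimal Python sketch that rebuilds a site-summary-shaped dictionary from a list of page records, using the status ranges given in the field descriptions (4xx/5xx broken, 3xx redirect). The `summarize` function is a hypothetical helper, not the actor's own code, and it assumes that `count` means the number of occurrences of a broken URL while `pages` lists the distinct pages containing it.

```python
from collections import Counter


def summarize(page_records, top_n=20):
    """Rebuild a site-summary-shaped dict from page records (illustrative only)."""
    by_status, by_type = Counter(), Counter()
    broken = {}  # url -> topBrokenLinks-style entry
    redirects = 0

    for page in page_records:
        for link in page.get("links", []):
            status = link["httpStatus"]
            by_status[str(status)] += 1  # JSON object keys are strings
            by_type["internal" if link["isInternal"] else "external"] += 1
            if 400 <= status <= 599:  # broken: 4xx and 5xx
                entry = broken.setdefault(
                    link["url"],
                    {"url": link["url"], "count": 0, "pages": [], "httpStatus": status},
                )
                entry["count"] += 1
                if page["pageUrl"] not in entry["pages"]:
                    entry["pages"].append(page["pageUrl"])
            elif 300 <= status <= 399:  # redirect: 3xx
                redirects += 1

    # Most frequent broken links first, capped at top_n entries.
    top = sorted(broken.values(), key=lambda e: e["count"], reverse=True)[:top_n]
    return {
        "type": "site-summary",
        "pagesCrawled": len(page_records),
        "linksTotal": sum(by_status.values()),
        "brokenLinksTotal": sum(e["count"] for e in broken.values()),
        "redirectLinksTotal": redirects,
        "byStatus": dict(by_status),
        "byType": dict(by_type),
        "topBrokenLinks": top,
    }


# Two made-up page records that share one broken link.
pages = [
    {"type": "page", "pageUrl": "https://example.com/page1", "links": [
        {"url": "https://example.com/broken", "isInternal": True, "httpStatus": 404},
        {"url": "http://example.com/old-page", "isInternal": True, "httpStatus": 301},
    ]},
    {"type": "page", "pageUrl": "https://example.com/page2", "links": [
        {"url": "https://example.com/broken", "isInternal": True, "httpStatus": 404},
        {"url": "https://other.example/ok", "isInternal": False, "httpStatus": 200},
    ]},
]
summary = summarize(pages)
print(summary["topBrokenLinks"])
```

A helper like this can be useful for recomputing statistics over a filtered subset of pages, for example after excluding a section of the site from the downloaded dataset.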
Actor Information
- Developer: smart-digital
- Pricing: Paid
- Total Runs: 234
- Active Users: 11