URLs List - Extract ALL website urls

Name: URLs List - Extract ALL website urls
Author: lofomachines


81 runs

37 users


About URLs List - Extract ALL website urls

Automatically discovers and extracts ALL URLs from any website. Perfect for SEO analysis, content inventory, and bulk URL extraction from multiple websites. Get complete URL lists with metadata including last modified dates and priority levels.

What does this actor do?

URLs List - Extract ALL website urls is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

URLs List - Extract ALL Website URLs

Comprehensive URL Extractor for SEO audits, content inventory, and bulk analysis.


This actor automatically discovers and extracts ALL URLs from any target website. It is designed to be the entry point for SEO audits, site migrations, and content analysis pipelines. It crawls recursively to build a complete map of a domain.

✨ Key Features

* 🔍 Automatic Discovery: Intelligently finds all available URLs from any website structure.
* 💨 Fast & Efficient: Optimized for speed to handle large sites (50k+ URLs).
* 📦 Bulk Processing: Accepts multiple domain roots to process simultaneously.
* 🏷️ Rich Metadata: Extracts last modified dates, priority levels, and update frequency (where available).
* 🗜️ Smart Handling: Works with standard sitemaps, recursive crawling, and standard web formats.
* 🛡️ Resilient: Automatic retries on temporary errors and infinite loop prevention.
* 🎯 Result Limiting: Control the maximum number of URLs extracted with `maxResults` or enable `returnAll` for complete extraction.
* 🔎 Keyword Filtering: Filter URLs by keywords - only URLs containing all specified keywords will be returned.
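The metadata listed above (last modified date, priority, update frequency) matches the optional `<lastmod>`, `<priority>`, and `<changefreq>` elements of the standard sitemap protocol. The following is a minimal Python sketch of how such entries can be read from a sitemap document with the standard library. It is not the Actor's actual implementation; the function name `parse_sitemap`, the sample data, and the record field names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text: str) -> list[dict]:
    """Return one record per <url> entry; missing optional fields become None."""
    root = ET.fromstring(xml_text)
    records = []
    for url_el in root.findall("sm:url", SITEMAP_NS):
        loc = url_el.findtext("sm:loc", default=None, namespaces=SITEMAP_NS)
        if not loc:
            continue  # <loc> is required by the protocol; skip malformed entries
        records.append({
            # Field names below are hypothetical, chosen for this sketch only.
            "url": loc.strip(),
            "lastmod": url_el.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS),
            "changefreq": url_el.findtext("sm:changefreq", default=None, namespaces=SITEMAP_NS),
            "priority": url_el.findtext("sm:priority", default=None, namespaces=SITEMAP_NS),
        })
    return records


# Made-up sample sitemap used only to exercise the function.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/blog/article</loc>
  </url>
</urlset>"""

records = parse_sitemap(SAMPLE)
for record in records:
    print(record)
```

Entries that omit the optional elements still produce a record, which mirrors the "where available" caveat in the feature list.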

🎯 Use Cases

| Use Case | Description |
| :--- | :--- |
| SEO Audit | Extract all URLs to analyze site architecture and identify orphan pages. |
| Content Inventory | Create a comprehensive list of all existing pages for migration planning. |
| Monitoring | Track lastmod dates to identify which content has been updated recently. |
| Data Pipelines | Feed the output URLs into other scrapers (e.g., Scrape HTML, Google Sheets export). |
| Targeted Extraction | Use keyword filtering to extract only specific sections (e.g., all blog posts, product pages). |
| Sampling | Use `maxResults` to extract a sample of URLs for quick analysis without processing entire sites. |
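The Targeted Extraction and Sampling use cases can be illustrated with a small Python sketch. It assumes that keyword filtering is a case-insensitive substring match requiring every keyword, and that a positive limit simply truncates the result list. The function `filter_urls` and the sample URLs are hypothetical and do not reflect the Actor's internal code.

```python
def filter_urls(urls, keywords=(), max_results=0):
    """Keep URLs that contain ALL keywords (case-insensitive substring match).

    A max_results of 0 means "no limit" in this sketch.
    """
    lowered = [keyword.lower() for keyword in keywords]
    matched = [url for url in urls if all(k in url.lower() for k in lowered)]
    return matched[:max_results] if max_results > 0 else matched


# Made-up URLs used only to exercise the function.
urls = [
    "https://example.com/",
    "https://example.com/blog/article-one",
    "https://example.com/Blog/news",
    "https://example.com/products/widget",
]

blog_urls = filter_urls(urls, keywords=["blog"])               # targeted extraction
blog_articles = filter_urls(urls, keywords=["Blog", "article"])  # all keywords must match
sample = filter_urls(urls, max_results=2)                      # sampling

print(blog_urls)
print(blog_articles)
print(sample)
```

Because every keyword must match, adding keywords narrows the result set rather than widening it.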

💰 Cost of Usage

This scraper is designed to be lightweight. It parses URL structures without rendering full-page JavaScript (unless necessary), keeping costs low.

* Small Sites (< 1,000 URLs): Cents per run.
* Medium Sites (10,000 URLs): Typically < $1.00.
* Large Sites: Efficiency scales well, but usage depends on the complexity of the target site's architecture.

> Tip: Always use Apify Proxy (enabled by default) to ensure consistent access and avoid blocking.

📥 Input Configuration

The Actor expects a JSON input defining the websites to scan.

### Example Input

```json
{
  "startUrls": [
    { "url": "https://apify.com" },
    { "url": "https://crawlee.dev" }
  ],
  "proxyConfiguration": { "useApifyProxy": true },
  "returnAll": true,
  "maxResults": 1000,
  "keywords": ["blog", "article"]
}
```

### Input Parameters

| Parameter | Type | Required | Default | Description |
| :--- | :--- | :--- | :--- | :--- |
| `startUrls` | Array | ✅ Yes | `[{ url: "https://apify.com" }]` | List of website URLs to extract pages from. |
| `proxyConfiguration` | Object | ❌ No | `{ useApifyProxy: false }` | Proxy settings for reliable access. |
| `returnAll` | Boolean | ❌ No | `true` | If true, extracts all available URLs regardless of `maxResults`. If false, applies the `maxResults` limit. |
| `maxResults` | Integer | ❌ No | `1000` | Maximum number of URLs to extract. Ignored if `returnAll` is true or set to 0. |
| `keywords` | Array | ❌ No | `[]` | Filter URLs to only include those containing ALL specified keywords. Case-insensitive matching. Example: `["blog"]` returns only URLs containing "blog" (e.g., https://example.com/blog/article). |
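As a rough illustration of how the parameters above interact, the following Python sketch assembles an input object from the documented defaults and works out which result limit would apply. The helpers `build_input` and `effective_limit` are hypothetical names, not part of any Apify library. The sketch also assumes that "set to 0" refers to `maxResults` being 0 and means "no limit".

```python
import copy
import json

# Defaults copied from the Input Parameters table.
DEFAULTS = {
    "proxyConfiguration": {"useApifyProxy": False},
    "returnAll": True,
    "maxResults": 1000,
    "keywords": [],
}


def build_input(start_urls, **overrides):
    """Assemble an input dict; startUrls is the only required parameter."""
    if not start_urls:
        raise ValueError("startUrls must contain at least one URL")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = copy.deepcopy(DEFAULTS)  # avoid sharing nested defaults
    run_input.update(overrides)
    run_input["startUrls"] = [{"url": url} for url in start_urls]
    return run_input


def effective_limit(run_input):
    """Return the applicable URL cap, or None when no limit applies.

    Assumption: maxResults is ignored when returnAll is true or maxResults is 0.
    """
    if run_input["returnAll"] or run_input["maxResults"] == 0:
        return None
    return run_input["maxResults"]


limited = build_input(
    ["https://apify.com", "https://crawlee.dev"],
    returnAll=False,
    maxResults=50,
    keywords=["blog"],
)
unlimited = build_input(["https://apify.com"])

print(json.dumps(limited, indent=2))
print(effective_limit(limited), effective_limit(unlimited))
```

With the documented defaults, `returnAll` is true, so `maxResults` only takes effect once `returnAll` is explicitly set to false.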

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources


Actor Information

Developer: lofomachines
Pricing: Paid
Total Runs: 81
Active Users: 37



