Get URLs from link

Name: Get URLs from link
Author: boring_code

4,407 runs

205 users

About Get URLs from link

Extracts URLs from a sitemap or webpage with intuitive path matching. Use comma-separated patterns to include or exclude URL paths with smart matching: '/tags/' for exact paths, '/product' for paths starting with, or simple text for substring matches.
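To make the two input kinds concrete, here is a minimal sketch of how URLs could be pulled from a sitemap or a webpage using only the Python standard library. It is not the actor's code: the function names `sitemap_urls` and `page_urls` and the sample data are hypothetical, and the sketch assumes a sitemap lists its URLs in `<loc>` elements and a webpage lists them in `<a href>` attributes.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin


def sitemap_urls(xml_text: str) -> list[str]:
    """Return the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


class _AnchorCollector(HTMLParser):
    """Collects href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def page_urls(html_text: str, base_url: str) -> list[str]:
    """Return absolute URLs for every <a href="..."> in the HTML."""
    collector = _AnchorCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical sample data, for illustration only.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/products/1</loc></url>"
    "<url><loc>https://example.com/tags/red/</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    print(sitemap_urls(SAMPLE_SITEMAP))
    print(page_urls('<a href="/blog/">Blog</a>', "https://example.com/"))
```

Fetching the document over HTTP is left out on purpose so the sketch stays runnable offline.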

What does this actor do?

Get URLs from link is an actor on the Apify platform. It collects the URLs found in a sitemap or on a webpage, filters them by path pattern and file extension, and writes the matches to a dataset. It runs in the Apify cloud, so no local setup is required.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Open the actor's page on Apify.com
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
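If you prepare the input programmatically, a small helper can assemble the JSON for you. The sketch below is an assumption-laden illustration: the function `build_run_input` and its parameter names are hypothetical, while the JSON keys (`link`, `urlPattern`, `maxUrls`, `excludeExtensions`, `customExcludePattern`) are the fields listed in the Input Configuration table of the documentation.

```python
import json


def build_run_input(link, url_patterns=None, max_urls=None,
                    exclude_extensions=None, exclude_patterns=None):
    """Assemble an input object using the actor's documented field names."""
    if not link:
        raise ValueError("'link' is required")
    run_input = {"link": link}
    if url_patterns:
        # The actor expects a single comma-separated string.
        run_input["urlPattern"] = ",".join(url_patterns)
    if max_urls is not None:
        if max_urls < 0:
            raise ValueError("'max_urls' must be 0 (no limit) or positive")
        run_input["maxUrls"] = max_urls
    if exclude_extensions:
        # Accept ".png" or "png"; the documented example uses bare extensions.
        run_input["excludeExtensions"] = ",".join(
            ext.lstrip(".") for ext in exclude_extensions
        )
    if exclude_patterns:
        run_input["customExcludePattern"] = ",".join(exclude_patterns)
    return run_input


if __name__ == "__main__":
    example = build_run_input(
        "https://example.com/sitemap.xml",
        url_patterns=["/products/", "productId"],
        max_urls=100,
    )
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor or sent as the body of an API run request.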

Documentation

# Get URLs from link

This actor extracts URLs from a sitemap or any webpage containing links. It provides intuitive URL path matching and flexible filtering options to get exactly the URLs you need.

## Features

- Extract URLs from XML sitemaps or webpages
- Smart URL path matching:
  - Use '/tags/' to match exact path
  - Use '/product' to match paths starting with /product
  - Use 'product' to match URLs containing this text anywhere
- Exclude specific file extensions (e.g., images)
- Exclude URLs using the same smart path matching
- Limit the number of processed URLs
- Simple comma-separated syntax for filters

## Input Configuration

| Field | Type | Description |
|-------|------|-------------|
| `link` | String | URL to process (required) |
| `urlPattern` | String | List of URL parts to include (comma separated). Use '*' to include all URLs. When using slashes: '/tags/' matches exact path, '/tags' matches path starting with /tags, 'tags/' matches path ending with tags/. Without slashes (e.g., 'product') matches anywhere in URL |
| `maxUrls` | Integer | Maximum number of URLs to process (0 for no limit). Good for testing purposes |
| `excludeExtensions` | String | List of file extensions to exclude (comma separated). Example: jpg,jpeg,png,gif |
| `customExcludePattern` | String | List of URL parts to exclude (comma separated). Uses same pattern matching as urlPattern. Examples: '/tags/,category' or '/blog/,author' |

## Output

The actor outputs a dataset containing URLs that match your specified criteria. Each record has the following field:

```json
{
  "url": "https://example.com/page"
}
```

## Usage Examples

### Basic Usage

Extract all URLs from a sitemap:

```json
{
  "link": "https://example.com/sitemap.xml"
}
```

### Smart Path Matching

Get only product URLs with different matching options:

```json
{
  "link": "https://example.com/sitemap.xml",
  "urlPattern": "/products/,productId,deals/"
}
```

This will match:

- URLs containing exact '/products/' path
- URLs containing 'productId' anywhere
- URLs ending with 'deals/'

### Exclude File Types and Sections

Get URLs excluding images and specific sections:

```json
{
  "link": "https://example.com/sitemap.xml",
  "excludeExtensions": "jpg,jpeg,png,gif",
  "customExcludePattern": "/tags/,/category/,author"
}
```

### Limit Results

Get first 100 URLs for testing:

```json
{
  "link": "https://example.com/sitemap.xml",
  "maxUrls": 100
}
```
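The matching rules described above can be sketched in a few lines of Python. This is one possible reading of the documentation, not the actor's actual implementation. The names `matches`, `has_extension`, and `filter_urls` are hypothetical, and three points are assumptions: a pattern with slashes on both sides (such as '/tags/') is treated as a whole path segment that may appear anywhere in the path, an omitted `urlPattern` behaves like '*', and `maxUrls` caps the number of URLs returned.

```python
from urllib.parse import urlparse


def _split(csv: str) -> list[str]:
    """Split a comma-separated string, dropping empty entries."""
    return [part.strip() for part in csv.split(",") if part.strip()]


def matches(url: str, pattern: str) -> bool:
    """Apply the slash-based matching rules to a single pattern."""
    if pattern == "*":
        return True
    path = urlparse(url).path or "/"
    starts, ends = pattern.startswith("/"), pattern.endswith("/")
    if starts and ends:
        # Assumption: '/tags/' must appear as a complete path segment.
        return pattern in (path if path.endswith("/") else path + "/")
    if starts:
        return path.startswith(pattern)   # '/tags' -> path starts with it
    if ends:
        return path.endswith(pattern)     # 'tags/' -> path ends with it
    return pattern in url                 # 'tags'  -> anywhere in the URL


def has_extension(url: str, extensions: list[str]) -> bool:
    """True if the URL path ends with one of the given file extensions."""
    path = urlparse(url).path.lower()
    return any(path.endswith("." + ext.lower().lstrip(".")) for ext in extensions)


def filter_urls(urls, url_pattern="*", exclude_extensions="",
                custom_exclude_pattern="", max_urls=0):
    """Filter URLs using strings shaped like the actor's input fields."""
    include = _split(url_pattern) or ["*"]
    exclude = _split(custom_exclude_pattern)
    extensions = _split(exclude_extensions)
    kept = []
    for url in urls:
        if not any(matches(url, p) for p in include):
            continue
        if any(matches(url, p) for p in exclude):
            continue
        if has_extension(url, extensions):
            continue
        kept.append(url)
        if max_urls and len(kept) >= max_urls:  # 0 means no limit
            break
    return kept
```

For example, `filter_urls(urls, url_pattern="/products/,productId,deals/")` keeps `https://example.com/products/1` and `https://example.com/blog/deals/` but drops `https://example.com/products-old/2`, because '/products/' does not appear there as a full segment.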

Common Use Cases

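Market Research

Map a competitor's site structure from its sitemap to see which sections and pages it publishes

Lead Generation

Collect the URLs of pages that a separate scraper can then mine for contact information

Price Monitoring

Build the list of product page URLs to feed into a price-tracking scraper

Content Aggregation

Gather article or post URLs from multiple sites for a downstream content crawler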

Ready to Get Started?

Try Get URLs from link now on Apify. Free tier available with no credit card required.

Actor Information

Developer: boring_code
Pricing: Paid
Total Runs: 4,407
Active Users: 205

Related Actors

Web Scraper

by apify

Cheerio Scraper

by apify

Website Content Crawler

by apify

Legacy PhantomJS Crawler

by apify

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
