Domain Scraper

Author: ib4ngz

411 runs

61 users

About Domain Scraper

This actor scrapes unique domains from a list of provided URLs. It crawls each page, extracts domains, and stores them in a dataset. The actor respects a defined maximum depth and filters domains based on whether they are ICANN-approved and whether private domains are allowed.
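The crawl described above can be sketched as a breadth-first traversal with a depth limit and a set of already-seen hosts. This is a minimal illustration, not the actor's actual implementation: the `LINKS` table is a hypothetical in-memory stand-in for fetching pages, hostnames stand in for fully parsed domains, and treating start URLs as depth 0 is an assumption.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "web": page URL -> links found on that page.
# A real crawler would fetch each page over HTTP and parse its links.
LINKS = {
    "https://a.example/": ["https://b.example/page", "https://a.example/about"],
    "https://a.example/about": ["https://c.example/"],
    "https://b.example/page": ["https://d.example/", "https://a.example/"],
    "https://c.example/": [],
    "https://d.example/": ["https://e.example/"],
}


def crawl_hostnames(start_urls, max_depth):
    """Breadth-first crawl that records each hostname once, up to max_depth."""
    seen_urls = set(start_urls)
    seen_hosts = set()
    hostnames = []  # unique hostnames in discovery order
    queue = deque((url, 0) for url in start_urls)

    while queue:
        url, depth = queue.popleft()
        host = urlparse(url).hostname
        if host and host not in seen_hosts:
            seen_hosts.add(host)
            hostnames.append(host)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(url, []):
            if link not in seen_urls:
                seen_urls.add(link)
                queue.append((link, depth + 1))
    return hostnames


print(crawl_hostnames(["https://a.example/"], max_depth=1))
# ['a.example', 'b.example']
```

Raising `max_depth` widens the set of pages visited, which is why the depth limit doubles as a control on resource usage.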

What does this actor do?

Domain Scraper is an actor on the Apify platform that runs in the cloud. It crawls the URLs you provide and collects the unique domains it finds along the way.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
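As a rough illustration of the configuration step, the input could be assembled and sanity-checked before a run. The key names below (`startUrls`, `maxDepth`, `allowPrivateDomains`, `icannOnly`, and so on) and the default values are hypothetical; the actor's real input keys and defaults are defined by its input schema on Apify and may differ.

```python
import json


def build_input(start_urls, max_depth=1, allow_private=False, icann_only=True,
                min_concurrency=1, max_concurrency=10):
    """Assemble and sanity-check a run input. All key names are hypothetical."""
    if not start_urls:
        raise ValueError("at least one start URL is required")
    if max_depth < 0:
        raise ValueError("max_depth must be zero or greater")
    if min_concurrency < 1 or min_concurrency > max_concurrency:
        raise ValueError("need 1 <= min_concurrency <= max_concurrency")
    return {
        "startUrls": [{"url": url} for url in start_urls],
        "maxDepth": max_depth,
        "allowPrivateDomains": allow_private,
        "icannOnly": icann_only,
        "proxyConfiguration": {"useApifyProxy": True},  # assumed proxy setting
        "minConcurrency": min_concurrency,
        "maxConcurrency": max_concurrency,
    }


run_input = build_input(["https://example.com"], max_depth=2)
print(json.dumps(run_input, indent=2))
```

Validating locally catches obvious mistakes, such as an empty URL list, before a run is started.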

Documentation

# Domain Scraper

This actor scrapes domains from a list of provided URLs. It recursively crawls the pages, extracts unique domains, and stores them in a dataset. The actor respects a defined maximum depth and filters domains based on whether they are ICANN-approved and whether private domains are allowed. Only unique domains are saved, preventing duplicates during the crawling process.

## Features

- Domain Extraction: Extracts domains from a list of provided URLs and recursively explores linked pages to gather additional domains.
- Recursive Crawling: Crawls web pages to a user-defined maximum depth, enabling detailed exploration while managing resource usage.
- Domain Filtering: Processes domains based on ICANN approval and user-defined preferences for private domains.
- Unique Dataset: Ensures only unique domains are saved by preventing duplicates during the crawling process.

## Input Schema

- Start URLs (required): A list of URLs to start crawling from.
- Maximum Depth: The maximum depth for crawling, defining how deep the crawler should explore.
- Allow Private Domains: Option to enable or disable crawling of private domains.
- ICANN Domains Only: Option to restrict processing to ICANN-approved domains only.
- Proxy Configuration: Configuration settings for selecting and using proxies during crawling.
- Minimum Concurrency: The minimum number of concurrent requests or pages to process.
- Maximum Concurrency: The maximum number of concurrent requests or pages to process.

## Dataset Schema

- domain: The full domain name.
- domainWithoutSuffix: The domain without the public suffix (e.g., example from example.com).
- hostname: The hostname of the domain.
- isIcann: Indicates whether the domain is ICANN-approved (boolean).
- publicSuffix: The public suffix of the domain (e.g., .com, .org).
- isPrivate: Indicates whether the domain is a private domain (boolean).
- subdomain: The subdomain part of the hostname (e.g., sub from sub.example.com).

## How to Use

1. Set up the Actor

   Start by providing a list of URLs to begin the crawling process. You can either enter the URLs manually or provide a list in the actor configuration.

2. Configure the Input Parameters

   - Start URLs: Provide the initial URLs from which the crawler will start.
   - Maximum Depth: Define how deep the crawler should explore.
   - Allow Private Domains: Choose whether to allow crawling of private domains.
   - ICANN Domains Only: Set whether to crawl only ICANN-approved domains.
   - Proxy Configuration: If necessary, configure the proxy settings for your crawler.
   - Concurrency: Adjust the minimum and maximum concurrency based on your needs.

3. Run the Actor

   Once the input parameters are configured, run the actor to start the crawling process. The actor will crawl the pages, extract unique domains, and store the results in the dataset.

4. View Results

   After the actor finishes running, you can view the extracted domains in the dataset. The data is displayed in a table with the following fields:

   - Domain
   - Domain Without Suffix
   - Hostname
   - ICANN Domain
   - Public Suffix
   - Private Domain
   - Subdomain

5. Export Data

   You can export the dataset for further processing or analysis. The results are saved in a structured format for easy integration with other tools.

6. Modify Parameters

   Adjust the configuration and rerun the actor as needed to gather additional data or refine the crawling process.

## Conclusion

This actor scrapes unique domains from a list of URLs by recursively crawling the provided pages up to a defined maximum depth. Filtering on ICANN approval and private domain allowance gives you control over which types of domains are collected, and deduplication keeps the resulting dataset free of repeats.
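To make the Dataset Schema fields concrete, the following sketch splits a hostname into those fields using a tiny, hand-written suffix table and then applies the two filter options. It is an illustration under stated assumptions, not the actor's code: `SUFFIXES`, `parse_hostname`, and `keep` are hypothetical names, the table covers only a few suffixes (a real public suffix list has thousands), suffixes are written without a leading dot, and dropping filtered records is only one possible reading of how the filters behave.

```python
# Tiny illustrative suffix table: suffix -> True if ICANN-managed,
# False if privately registered. Hypothetical and far from complete.
SUFFIXES = {
    "com": True,
    "org": True,
    "io": True,
    "co.uk": True,
    "github.io": False,
}


def parse_hostname(hostname):
    """Split a hostname into fields named after the Dataset Schema."""
    hostname = hostname.lower()
    labels = hostname.split(".")
    # Longest suffix wins: try the candidates with the most labels first.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SUFFIXES:
            is_icann = SUFFIXES[suffix]
            return {
                "hostname": hostname,
                "publicSuffix": suffix,
                "domainWithoutSuffix": labels[i - 1],
                "domain": ".".join(labels[i - 1:]),
                "subdomain": ".".join(labels[:i - 1]),
                "isIcann": is_icann,
                "isPrivate": not is_icann,
            }
    return None  # no known suffix, or nothing to the left of it


def keep(record, allow_private, icann_only):
    """Apply the two filter options to one parsed record."""
    if record["isPrivate"] and not allow_private:
        return False
    if icann_only and not record["isIcann"]:
        return False
    return True


hosts = ["sub.example.com", "user.github.io", "shop.example.co.uk"]
records = [parse_hostname(h) for h in hosts]
kept = [r["domain"] for r in records if keep(r, allow_private=False, icann_only=True)]
print(kept)
```

Here `sub.example.com` yields the subdomain `sub`, the domain `example.com`, and the suffix `com`, while `user.github.io` is flagged as private and filtered out under these settings.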

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: ib4ngz
Pricing: Paid
Total Runs: 411
Active Users: 61
