LeadScraper
by cdubiel
About LeadScraper
Scrape a list of URLs and receive business contact information, social media links, and a description of the services offered. The actor crawls multiple pages from each site's sitemap and assigns a confidence score to every phone number and email address it finds.
What does this actor do?
LeadScraper is a web scraping and automation actor available on the Apify platform. Point it at a list of business websites and it extracts structured lead data in the cloud: contact details, social media links, and a description of each company's services.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
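For the API and webhook features, one common pattern is to register a webhook so your own service is notified whenever a run of this actor succeeds. Below is a minimal sketch against Apify's generic webhook endpoint; the actor ID and callback URL are placeholders and nothing here is specific to LeadScraper.

```python
import requests

APIFY_TOKEN = "<YOUR_APIFY_TOKEN>"

# Placeholder values: take the actor ID from the actor's page in the Apify
# console, and point requestUrl at an endpoint you control.
payload = {
    "eventTypes": ["ACTOR.RUN.SUCCEEDED"],
    "condition": {"actorId": "<LEADSCRAPER_ACTOR_ID>"},
    "requestUrl": "https://example.com/hooks/leadscraper-finished",
}

resp = requests.post(
    "https://api.apify.com/v2/webhooks",
    params={"token": APIFY_TOKEN},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Webhook created:", resp.json()["data"]["id"])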
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
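If you prefer to drive the same flow from code instead of the console, here is a minimal sketch using the official apify-client Python package. The actor identifier "cdubiel/leadscraper" is a guess based on this listing, so substitute the exact ID shown on the actor's Apify page; the input keys follow the schema documented below.

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Input keys follow the documented input schema.
run_input = {
    "urls": ["https://www.example1.com/", "https://www.example2.com/"],
    "verifySSL": True,
    "bypassCloudflare": True,
}

# Start the actor and wait for the run to finish.
run = client.actor("cdubiel/leadscraper").call(run_input=run_input)

# Download the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("url"), "->", "ok" if item.get("success") else item.get("error"))
```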
Documentation
# Service Company Website Scraper

An Apify actor that scrapes service company websites and extracts structured information about the business, including contact information, services offered, hours of operation, and more.

## Features

- Extracts company name, description, and contact information
- Identifies services offered by the company
- Extracts business hours, social media links, and reviews
- Finds pricing information and FAQs
- Handles multiple URLs in a single run
- Supports SSL verification options
- Optional Cloudflare bypass capability

## Input

The actor accepts the following input parameters:

- `urls` - An array of service company website URLs to scrape (required)
- `verifySSL` - Whether to verify SSL certificates (default: `true`)
- `bypassCloudflare` - Whether to attempt to bypass Cloudflare protection (default: `true`)
- `metadata` - Optional custom metadata to include with each result

Example input:

```json
{
  "urls": [
    "https://www.example1.com/",
    "https://www.example2.com/"
  ],
  "verifySSL": true,
  "bypassCloudflare": true,
  "metadata": {
    "project_id": "example-project",
    "source": "manual",
    "category": "roofing"
  }
}
```

## Output

The actor outputs a JSON object for each URL containing the following information:

- `url` - The URL of the scraped website
- `title` - The title of the website
- `meta_description` - The meta description of the website
- `main_content` - The main content of the website
- `contact_information` - Contact information extracted from the website
- `phones` - List of phone numbers with confidence scores
- `main_phone` - The main phone number with the highest confidence
- `emails` - List of email addresses with confidence scores
- `main_email` - The main email address with the highest confidence
- `address` - The physical address of the business
- `services` - List of services offered by the company
- `hours_of_operation` - Business hours by day of the week
- `social_media_links` - Links to social media profiles
- `reviews` - Customer reviews found on the website
- `pricing` - Pricing information for services
- `faqs` - Frequently asked questions
- `success` - Whether the scraping was successful
- `error` - Error message if scraping failed

## Example Usage

```javascript
const Apify = require('apify');

Apify.main(async () => {
    const input = {
        urls: [
            "https://www.example1.com/",
            "https://www.example2.com/"
        ],
        verifySSL: true,
        bypassCloudflare: true,
        metadata: {
            project_id: "example-project",
            source: "manual",
            category: "roofing"
        }
    };

    // Run the actor and wait for it to finish
    const run = await Apify.call('your-username/service-company-scraper', input);

    // Print the results
    const dataset = await Apify.openDataset(run.defaultDatasetId);
    const { items } = await dataset.getData();
    console.log('Results:', items);
});
```

## Development

### Project Structure

- `main.py` - Entry point for the Apify actor
- `scraper.py` - Contains the `ServiceCompanyScraper` class
- `requirements.txt` - Python dependencies
- `INPUT_SCHEMA.json` - Input schema for the Apify actor
- `OUTPUT_SCHEMA.json` - Output schema for the Apify actor
- `Dockerfile` - Docker configuration for the Apify actor

### Adding New Features

To add new extraction capabilities:

1. Add a new method to the `ServiceCompanyScraper` class in `scraper.py`
2. Call the method from the `scrape` method
3. Update the output schema if necessary
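As a concrete illustration of step 1, the sketch below shows what a new extraction helper might look like. The real `ServiceCompanyScraper` internals are not included in this README, so the function name, the keyword list, and the use of BeautifulSoup are all assumptions; treat it as a starting point rather than the actor's actual code.

```python
from bs4 import BeautifulSoup


def extract_certifications(html: str) -> list[str]:
    """Collect text snippets that mention licensing or certification keywords."""
    soup = BeautifulSoup(html, "html.parser")
    keywords = ("licensed", "certified", "insured", "accredited")
    hits = []
    for el in soup.find_all(["li", "p"]):
        text = el.get_text(" ", strip=True)
        if any(k in text.lower() for k in keywords):
            hits.append(text)
    return hits


# Step 2 would then call the new method from scrape(), e.g.
#     result["certifications"] = self.extract_certifications(html)
# and step 3 would add a "certifications" field to OUTPUT_SCHEMA.json.
```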
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach (a filtering sketch follows below)
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
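The lead-generation case pairs naturally with the confidence scores the actor attaches to every phone number and email. Below is a minimal post-processing sketch using the apify-client Python package; the dataset ID is a placeholder, and the exact nesting and field names of the email entries are assumptions, so adjust them to the actual output of your run.

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# "<RUN_DATASET_ID>" is the defaultDatasetId of a finished LeadScraper run.
leads = []
for item in client.dataset("<RUN_DATASET_ID>").iterate_items():
    if not item.get("success"):
        continue
    # Emails may sit at the top level or under contact_information; check both.
    contact = item.get("contact_information") or {}
    emails = item.get("emails") or contact.get("emails") or []
    # The confidence field name inside each entry is assumed.
    good = [e for e in emails if isinstance(e, dict) and e.get("confidence", 0) >= 0.8]
    if good:
        leads.append({"url": item.get("url"), "emails": good})

print(f"{len(leads)} sites with high-confidence email addresses")
```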
Ready to Get Started?
Try LeadScraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: cdubiel
- Pricing: Paid
- Total Runs: 4,836
- Active Users: 114
Related Actors
- 🏯 Tweet Scraper V2 - X / Twitter Scraper (by apidojo)
- Google Search Results Scraper (by apify)
- Instagram Profile Scraper (by apify)
- Tweet Scraper | $0.25/1K Tweets | Pay-Per Result | No Rate Limits (by kaitoeasyapi)