Website Email Scraper - All Contacts

Name: Website Email Scraper - All Contacts
Author: thenetaji

5,810 runs

611 users

About Website Email Scraper - All Contacts

Extract videos, images, audio, APKs & emails from websites. This Apify actor crawls pages to discover media links with configurable depth, proxy support & domain filtering. Boost content research & lead gen.

What does this actor do?

Website Email Scraper - All Contacts is an Apify actor that crawls websites and extracts email addresses along with links to videos, images, audio files, and APKs. It runs in the cloud on the Apify platform, so no local setup is needed.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
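
The same steps can also be driven from code rather than the web console. The following is a minimal sketch, not the author's own client: it prepares a request for Apify's `run-sync-get-dataset-items` endpoint using only the Python standard library. The actor ID `thenetaji~website-email-scraper` is a hypothetical placeholder (copy the real ID from the actor's page), the helper names are illustrative, and the input fields are the ones listed in the Documentation section.

```python
import json
import urllib.request

# Hypothetical actor ID; replace it with the ID shown on the actor's Apify page.
ACTOR_ID = "thenetaji~website-email-scraper"

# Values documented for the mediaType parameter.
MEDIA_TYPES = {"video", "audio", "image", "apk", "email", "all"}


def build_run_input(start_url, media_type="email", max_crawl_depth=1, max_urls_to_crawl=100):
    """Build the actor input using the documented field names."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unsupported mediaType: {media_type!r}")
    return {
        "startUrls": [{"url": start_url}],
        "mediaType": media_type,
        "maxCrawlDepth": max_crawl_depth,
        "maxUrlsToCrawl": max_urls_to_crawl,
    }


def build_request(run_input, token, actor_id=ACTOR_ID):
    """Prepare (but do not send) a POST that runs the actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(run_input, token):
    """Send the request and return the dataset items as a list of dicts."""
    with urllib.request.urlopen(build_request(run_input, token)) as response:
        return json.load(response)
```

Calling `run_actor(build_run_input("https://example.com"), token)` with a valid API token would start a run and wait for its results. The synchronous endpoint suits short crawls; for long ones, starting the run asynchronously and fetching the dataset afterwards is likely the safer choice.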

Documentation

Website Email Extractor - Most efficient

## 🔍 Overview

Media Link Extractor is a powerful Apify actor that automatically crawls websites to discover and extract various types of media links including videos, images, audio files, APK files, and email addresses. Perfect for content aggregation, SEO research, lead generation, and digital asset management.

## ✨ Key Features

- Multi-Media Support: Extract various media types (videos, images, audio, APKs, emails)
- Configurable Crawling: Set crawl depth, concurrency, and URL limits to suit your needs
- Smart Extraction: Uses multiple detection methods including URL patterns, HTML tags, and CSS selectors
- Proxy Support: Optional Apify proxy integration for better scraping success rates
- Domain Filtering: Stays on the same domain to focus crawling on relevant content
- Detailed Output: Organized dataset with source URLs, timestamps, and media metadata
- Rate Limiting Protection: Built-in mechanisms to avoid overloading target websites

## 🎯 Use Cases

- Content Creators: Find media resources for projects and presentations
- Digital Marketers: Discover image and video assets for competitor analysis
- App Developers: Locate APK distribution points for competitive research
- Lead Generation: Extract email addresses for business outreach campaigns
- SEO Specialists: Analyze media usage patterns across websites
- Researchers: Gather media files for analysis and archiving projects

## 🛠️ Input Parameters

```json
{
  "startUrls": [{ "url": "https://example.com" }],
  "mediaType": "all",
  "maxCrawlDepth": 1,
  "maxConcurrency": 10,
  "maxRequestRetries": 3,
  "maxUrlsToCrawl": 100,
  "useProxy": {
    "useApifyProxy": false,
    "apifyProxyGroups": [],
    "apifyProxyCountry": ""
  }
}
```

### Parameter Details

| Parameter           | Type   | Description                                                                   |
| ------------------- | ------ | ----------------------------------------------------------------------------- |
| `startUrls`         | Array  | List of URLs where the crawler will begin                                     |
| `mediaType`         | String | Type of media to extract: `video`, `audio`, `image`, `apk`, `email`, or `all` |
| `maxCrawlDepth`     | Number | How many links deep the crawler will go                                       |
| `maxConcurrency`    | Number | Maximum parallel requests                                                     |
| `maxRequestRetries` | Number | Number of retry attempts for failed requests                                  |
| `maxUrlsToCrawl`    | Number | Maximum number of URLs to process                                             |
| `useProxy`          | Object | Configuration for Apify proxy usage                                           |

## 📊 Output Format

The actor stores results in the default dataset with this structure:

```json
{
  "sourceUrl": "https://example.com/page",
  "pageTitle": "Example Page Title",
  "mediaLinks": [
    {
      "url": "https://example.com/video.mp4",
      "sourceUrl": "https://example.com/page",
      "title": "Example Page Title",
      "type": "video",
      "foundAt": "2025-04-10T06:40:01.000Z"
    }
  ],
  "timestamp": "2025-04-10T06:40:01.000Z"
}
```

## ⚙️ Technical Implementation

Media Link Extractor uses a combination of techniques to find media resources:

1. CSS Selectors: Targets specific HTML elements containing media
2. URL Pattern Matching: Identifies file extensions and URL patterns
3. Context Analysis: Examines surrounding elements for media indicators
4. Domain Adherence: Maintains focus on the original domain

## 💡 Best Practices

- Start Small: Begin with a low `maxUrlsToCrawl` value to test results
- Respect Websites: Use reasonable `maxConcurrency` values to avoid overloading sites
- Optimize Depth: Most valuable media is often found within 1-2 levels of crawl depth
- Target Specific Media: Use the appropriate `mediaType` parameter instead of "all" for more focused results

## 📚 Examples

### Extract Videos from a Website

```json
{
  "startUrls": [{ "url": "https://example.com/videos" }],
  "mediaType": "video",
  "maxCrawlDepth": 2,
  "maxUrlsToCrawl": 50
}
```

### Find Email Addresses for Lead Generation

```json
{
  "startUrls": [{ "url": "https://company.com/about" }],
  "mediaType": "email",
  "maxCrawlDepth": 3,
  "maxUrlsToCrawl": 200
}
```

### Collect APK Files from Android Sites

```json
{
  "startUrls": [{ "url": "https://apksite.com" }],
  "mediaType": "apk",
  "maxCrawlDepth": 2,
  "maxUrlsToCrawl": 100
}
```

## 📈 Performance Considerations

- Processing speed depends on website complexity and response times
- Typical extraction rates: 5-10 pages per second without proxy, 2-5 pages per second with proxy
- Memory usage scales with concurrency and page complexity
- Consider using Apify proxy for rate-limited or IP-blocking websites

## 🔗 Integration Ideas

- Connect with Apify Storage for permanent dataset archiving
- Combine with Google Sheets integration for easy team collaboration
- Use with Zapier or Make to automate workflows with extracted media
- Export data to S3 or other cloud storage for batch processing
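
### Post-Processing the Output

For lead generation, the records described under Output Format usually need to be reduced to a clean list of addresses. The sketch below shows one way to do that in Python. It rests on two assumptions the documentation does not confirm: that email entries carry `"type": "email"`, and that the address sits in the `url` field, possibly with a `mailto:` prefix. The function name `extract_emails` is illustrative.

```python
def extract_emails(items):
    """Return unique email addresses, in first-seen order, from dataset items."""
    seen = set()
    emails = []
    for item in items:
        for link in item.get("mediaLinks", []):
            # Assumption: email entries are tagged with type "email".
            if link.get("type") != "email":
                continue
            # Assumption: the address is stored in "url", optionally as a mailto: link.
            address = link.get("url", "")
            if address.lower().startswith("mailto:"):
                address = address[len("mailto:"):]
            # Drop any query string (e.g. ?subject=...) and normalize case.
            address = address.split("?", 1)[0].strip().lower()
            if address and address not in seen:
                seen.add(address)
                emails.append(address)
    return emails


# Illustrative records shaped like the documented output; the values are made up.
sample_items = [
    {
        "sourceUrl": "https://example.com/page",
        "mediaLinks": [
            {"url": "mailto:Info@example.com?subject=Hi", "type": "email"},
            {"url": "https://example.com/video.mp4", "type": "video"},
        ],
    },
    {
        "sourceUrl": "https://example.com/about",
        "mediaLinks": [
            {"url": "info@example.com", "type": "email"},
            {"url": "sales@example.com", "type": "email"},
        ],
    },
]

print(extract_emails(sample_items))  # ['info@example.com', 'sales@example.com']
```

Lowercasing before deduplication treats `Info@example.com` and `info@example.com` as the same contact, which is usually what an outreach list wants.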

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Website Email Scraper - All Contacts now on Apify. Free tier available with no credit card required.

Actor Information

Developer: thenetaji
Pricing: Paid
Total Runs: 5,810
Active Users: 611

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
