Automae Email Extractor

by automae-theo-jim

An Apify actor that crawls websites to extract email addresses automatically, with anti-detection and Cloudflare decoding to avoid blocks. Perfect for lead gen and research.

2,989 runs
11 users

About Automae Email Extractor

Need to pull email addresses from websites without getting blocked? I've been there. The Automae Email Extractor is an Apify actor I built to handle exactly that. It crawls the site you point it at and digs through its pages to find and collect email addresses automatically.

The real trick is its stealth. It includes anti-detection measures to avoid tripping alarms, and it decodes Cloudflare-obfuscated email addresses, which are a common headache with simpler scrapers. You feed it a starting URL and it does the rest, saving you hours of manual searching or writing your own fragile scripts.

I use it mostly for building contact lists for outreach and lead generation. It also suits developers who need to verify site contacts or gather data for research with far less hassle from blocks. It runs on Apify's platform, so you get cloud execution and can scale up if you have a big list of sites to process. It's a straightforward tool that solves one specific problem well.

What does this actor do?

Automae Email Extractor is an email-scraping tool available on the Apify platform. It crawls a website you specify, collects the email addresses it finds, and runs in the cloud so you don't have to host anything yourself.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Automae Email Extractor

An Apify actor that crawls websites to extract email addresses. It intelligently navigates to contact pages, uses multiple extraction methods, and includes anti-detection measures to avoid being blocked.
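The contact-page navigation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's actual code: the function name `pickContactPages`, the shape of the `links` array, and the three-keyword list are hypothetical (the keywords are only the examples this page mentions), and matching on the URL path plus link text is one plausible heuristic.

```javascript
// Hypothetical keyword list; the actor's real default list is not shown on this page.
const CONTACT_KEYWORDS = ["contact", "kontakt", "contactez"];

// Return up to `maxContactPages` same-origin URLs whose path or link text
// contains one of the keywords, in the order the links appear.
function pickContactPages(baseUrl, links, maxContactPages = 2, keywords = CONTACT_KEYWORDS) {
  const origin = new URL(baseUrl).origin;
  const picked = [];
  for (const { href, text = "" } of links) {
    let url;
    try {
      url = new URL(href, baseUrl); // resolve relative links against the base URL
    } catch {
      continue; // skip hrefs that cannot be parsed
    }
    if (url.origin !== origin) continue; // stay on the same site
    url.hash = ""; // "/kontakt#form" and "/kontakt" are the same page
    if (picked.includes(url.href)) continue; // skip duplicates
    const haystack = `${url.pathname} ${text}`.toLowerCase();
    if (keywords.some((k) => haystack.includes(k))) {
      picked.push(url.href);
      if (picked.length >= maxContactPages) break;
    }
  }
  return picked;
}
```

A crawler would call this with the anchors collected from the start page, for example `pickContactPages("https://example.com", [{ href: "/contact", text: "Contact us" }])`, and then visit the returned URLs.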

Key Features

  • Multi-Source Email Extraction: Finds emails from mailto: links, decodes Cloudflare-protected data-cfemail addresses, scans HTML content with regex, and checks <meta> tags.
  • Smart Navigation: Automatically identifies and prioritizes contact pages using a configurable list of multilingual keywords (e.g., contact, kontakt, contactez).
  • Anti-Detection: Uses realistic browser fingerprints, random delays between actions, human-like headers, and session management to mimic organic traffic and reduce the risk of bans.
  • Email Filtering & Validation: Prioritizes common business emails (e.g., contact@, info@), filters out unwanted addresses (e.g., no-reply@), validates email format, and removes duplicates.
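To make the multi-source extraction more concrete, here is a minimal sketch of two of the methods listed above: decoding a Cloudflare `data-cfemail` attribute and scanning HTML with a regex (which also picks up `mailto:` links). The decoding follows the widely documented scheme in which the first hex byte is an XOR key for the remaining bytes. The function names and the deliberately simple email pattern are assumptions for illustration; the actor's own regex and code are not shown on this page.

```javascript
// Decode a Cloudflare `data-cfemail` hex string: the first byte is an XOR key,
// and each following byte XORed with that key gives one character code.
function decodeCfEmail(hex) {
  const key = parseInt(hex.slice(0, 2), 16);
  let out = "";
  for (let i = 2; i < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16) ^ key);
  }
  return out;
}

// Simplified pattern for illustration only (an assumption, not the actor's regex).
const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;

// Collect lowercase, de-duplicated addresses from raw HTML.
function extractEmails(html) {
  const found = new Set();
  // 1. Cloudflare-obfuscated addresses
  for (const m of html.matchAll(/data-cfemail="([0-9a-f]+)"/gi)) {
    found.add(decodeCfEmail(m[1]).toLowerCase());
  }
  // 2. mailto: links and plain-text addresses are both caught by the regex scan
  for (const m of html.matchAll(EMAIL_RE)) {
    found.add(m[0].toLowerCase());
  }
  return [...found];
}
```

A regex over raw HTML is cheap but can produce false positives, which is why a filtering and validation pass like the one described in the last bullet is still needed afterwards.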

How to Use

Input Configuration

Configure the actor using a JSON input. Only the baseUrl is required.

{
  "baseUrl": "https://example.com",
  "maxContactPages": 2,
  "navigationTimeoutMs": 30000,
  "blacklist": ["spam@", "test@", "@example.org"]
}

  • baseUrl (required): The starting URL to crawl.
  • maxContactPages (optional, default: 2): Maximum number of contact pages to analyze.
  • navigationTimeoutMs (optional, default: 30000): Page load timeout in milliseconds.
  • blacklist (optional): An array of email patterns to exclude. These are added to the default blacklist (no-reply@, noreply@, donotreply@, @mail.com).
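One way the blacklist option could combine with the defaults is sketched below. The default patterns are the ones listed above; everything else is an assumption made for illustration: the function names, the matching rule (a pattern ending in `@` matches the start of an address, a pattern starting with `@` matches its end), and the priority prefixes, which are taken from the examples in Key Features.

```javascript
// Defaults as listed in the input documentation.
const DEFAULT_BLACKLIST = ["no-reply@", "noreply@", "donotreply@", "@mail.com"];
// Hypothetical priority order, based on the examples in Key Features.
const PRIORITY_PREFIXES = ["contact@", "info@"];

// Assumed matching rule: "spam@" is a local-part rule, "@example.org" a domain rule.
function isBlacklisted(email, pattern) {
  if (pattern.endsWith("@")) return email.startsWith(pattern);
  if (pattern.startsWith("@")) return email.endsWith(pattern);
  return email === pattern; // otherwise treat the pattern as a full address
}

function filterEmails(emails, userBlacklist = []) {
  // User patterns are added to the defaults, not substituted for them.
  const blacklist = [...DEFAULT_BLACKLIST, ...userBlacklist].map((p) => p.toLowerCase());
  const unique = [...new Set(emails.map((e) => e.trim().toLowerCase()))];
  const kept = unique.filter((e) => !blacklist.some((p) => isBlacklisted(e, p)));
  // Priority prefixes first; the stable sort keeps the original order otherwise.
  const rank = (e) => {
    const i = PRIORITY_PREFIXES.findIndex((p) => e.startsWith(p));
    return i === -1 ? PRIORITY_PREFIXES.length : i;
  };
  return kept.sort((a, b) => rank(a) - rank(b));
}
```

Anchoring patterns to the start or end of the address avoids accidental matches such as `test@` excluding `latest@example.com`; whether the actor matches this way is not stated on this page.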

Execution

Run the actor on the Apify platform or locally:

# Local execution
npm start
# or
node main.js

Input & Output

Input: The JSON configuration object detailed above.

Output: The actor returns a structured JSON result.

{
  "hit": true,
  "primaryEmail": "contact@example.com",
  "domain": "example.com",
  "emails": ["contact@example.com", "info@example.com"],
  "sourceUrl": "https://example.com/contact",
  "scanned": ["https://example.com", "https://example.com/contact"],
  "baseUrl": "https://example.com"
}

  • hit: Indicates if any emails were found.
  • primaryEmail: The highest-priority email address discovered.
  • domain: The domain of the crawled site (example.com in the sample above).
  • emails: An array of all valid, filtered emails found.
  • sourceUrl: The page from which the primaryEmail was extracted.
  • scanned: The list of URLs that were crawled.
  • baseUrl: The starting URL passed in the input.
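As a usage illustration, a result object like the one above could be flattened into a row for a contact list. This is only a sketch: the helper name `toLeadRow` and the row layout are hypothetical, and it assumes that `primaryEmail` is missing or empty when `hit` is false, which this page does not confirm.

```javascript
// Turn one actor result into a flat row for a contact list.
// Returns null when no email was found.
function toLeadRow(result) {
  if (!result.hit || !result.primaryEmail) return null;
  // Every address other than the primary one is kept as an alternate.
  const alternates = (result.emails || []).filter((e) => e !== result.primaryEmail);
  return {
    domain: result.domain,
    email: result.primaryEmail,
    source: result.sourceUrl,
    alternates: alternates.join(";"),
    pagesScanned: (result.scanned || []).length,
  };
}
```

Rows built this way can be written to CSV or pushed to a CRM; results that return null can be queued for a manual check.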

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Automae Email Extractor now on Apify. Free tier available with no credit card required.


Actor Information

Developer: automae-theo-jim
Pricing: Paid
Total Runs: 2,989
Active Users: 11

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

