German Imprint Scraper with Decision Makers Names Extraction

by dominic-quaiser

462,179 runs
301 users

About German Imprint Scraper with Decision Makers Names Extraction

Need to find the right contacts at German companies? This scraper handles the tedious part for you. It automatically visits German website imprint pages (the legally required "Impressum") and pulls out the exact details you need for outreach or verification. It reliably extracts the company name, full postal address, listed phone numbers, and email addresses. The real time-saver is its ability to specifically identify and extract the names of decision-makers (often listed as "Entscheider" or "Entscheidungsträger" in the imprint). This means you get a clean list of leads with key contacts, not just a generic office address. I use it for building targeted B2B contact lists, verifying business information, and streamlining lead generation for the German market. Instead of manually checking hundreds of sites, you set it running and get structured, usable data. It’s built for developers and marketers who need accurate German business data without the manual hassle. Just point it at your list of German website URLs, and it fetches the structured imprint data directly into your workflow.

What does this actor do?

German Imprint Scraper with Decision Makers Names Extraction is a web scraping tool available on the Apify platform. Given a list of German website URLs, it locates each site's imprint page and returns structured contact and legal data, running entirely in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

# German Imprint Scraper

A Python-based Apify Actor designed to find and extract contact and legal information from German imprint pages ("Impressum"). Simply provide a list of website homepages, and the actor will automatically locate the imprint page and scrape key details like company name, address, phone number, email, and commercial register number.

Beta Version Notice: This actor is currently in beta. While it's fully functional and returns results, you may encounter occasional quirks or incomplete features. I welcome your feedback! Please report any issues or suggestions you have.

## 💡 Features

- Automatic Imprint Page Discovery: Intelligently crawls websites to find the correct imprint page from your starting URLs.
- Selective Data Extraction: Choose exactly which data points you need, from basic contact info to advanced details like company decision-makers.
- Dual Fetching Technology:
  - HTTP Mode: A fast, lightweight method for scraping simple, server-rendered websites.
  - Headless Browser Mode (Playwright): A powerful option for modern, JavaScript-heavy websites. The actor can be configured to use this mode for all sites or as an automatic fallback if the standard HTTP method fails, ensuring maximum success rates.
- Proxy Support: Integrates seamlessly with Apify's proxy service to handle IP rotation and avoid blocking.
- Customizable Output: Include optional metadata or error records for detailed analysis and troubleshooting.
- Structured JSON Output: Delivers clean, well-structured data ready for use in your applications, databases, or CRM systems.

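To make the idea of imprint page discovery concrete, here is a minimal sketch of one way to find imprint links in a homepage's HTML. It is not the actor's actual implementation, which is not documented here; the keyword list and the names `IMPRINT_KEYWORDS`, `ImprintLinkFinder`, and `find_imprint_links` are assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin

# Hypothetical keyword list; the actor's real heuristics are not documented.
IMPRINT_KEYWORDS = ("impressum", "imprint", "anbieterkennzeichnung")


class ImprintLinkFinder(HTMLParser):
    """Collects <a> targets whose href or link text mentions an imprint keyword."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.candidates: List[str] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # href is None for anchors without a link target.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            haystack = (self._href + " " + "".join(self._text)).lower()
            if any(keyword in haystack for keyword in IMPRINT_KEYWORDS):
                # Resolve relative links against the homepage URL.
                url = urljoin(self.base_url, self._href)
                if url not in self.candidates:
                    self.candidates.append(url)
            self._href = None


def find_imprint_links(html: str, base_url: str) -> List[str]:
    """Return candidate imprint URLs in document order, without duplicates."""
    finder = ImprintLinkFinder(base_url)
    finder.feed(html)
    return finder.candidates
```

A real crawler would also need to fetch the page, follow redirects, and fall back to a headless browser when links are rendered by JavaScript, which is the case the Playwright mode is described as covering.
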
## 📥 Input Parameters

Configure the actor's behavior using these fields in the Apify Console Input tab or via API:

|Field|Type|Description|Default|Required|
|---|---|---|---|---|
|startUrls|Array|Enter the homepage URLs of the websites to process.|[{ "url": "https://www.vita-cola.de/" }]|Yes|
|fieldsToExtract|Array|Choose the specific pieces of information you want to collect.|["company_name", "business_address"]|No|
|usePlaywright|Boolean|Use a headless browser for all websites. Slower but more reliable for JavaScript-heavy sites.|false|No|
|metaData|Boolean|Include technical details in the output.|false|No|
|errorOutput|Boolean|Include a row in the output for each website that failed to process.|false|No|
|debugLog|Boolean|Generate a verbose log for troubleshooting.|false|No|
|proxyConfiguration|Object|Proxy settings. Apify Proxy is recommended.|{ "useApifyProxy": true }|No|

## 📤 Output Data Structure

The exact fields depend on your fieldsToExtract selection.

### Example Output

```json
{
  "start_url": "https://muster-firma.de/",
  "imprint_url": "https://muster-firma.de/impressum",
  "company_name": {
    "name": "Muster GmbH",
    "confidence": 1
  },
  "business_address": {
    "full_address": "Musterstraße 123, 12345 Berlin",
    "street": "Musterstraße",
    "house_number": "123",
    "postal_code": "12345",
    "city": "Berlin"
  },
  "phone_number": {
    "phone_1": "+493012345678"
  },
  "emails": {
    "email_1": "kontakt@muster-firma.de"
  },
  "register_number": {
    "number": "HRB 12345 B",
    "court": "Amtsgericht Charlottenburg"
  },
  "social_media": {
    "linkedin": "https://www.linkedin.com/company/muster-firma"
  },
  "decision_makers": [
    "Max Mustermann"
  ],
  "metadata": {
    "domain": "muster-firma.de",
    "fetch_method": "http",
    "fallback_attempted": false,
    "scraped_at": "2025-08-28T12:04:48.003780"
  }
}
```

Note: Numbered outputs such as emails and phone numbers are sorted by confidence, so the first entry (email_1, phone_1) is the one most likely to be the company's main contact detail.

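The input and output shapes above can be handled with a few lines of Python. The sketch below builds a run input from the documented fields and flattens one result item to its top-ranked contact details. The helper names `build_run_input` and `primary_contact` are hypothetical, and the handling of missing fields (absent or null) is an assumption, since the documentation only says the output depends on fieldsToExtract.

```python
from typing import Any, Dict, Iterable, Optional


def build_run_input(urls: Iterable[str], fields: Iterable[str],
                    use_playwright: bool = False) -> Dict[str, Any]:
    """Assemble an input object using the field names from the table above."""
    return {
        "startUrls": [{"url": url} for url in urls],
        "fieldsToExtract": list(fields),
        "usePlaywright": use_playwright,
        "proxyConfiguration": {"useApifyProxy": True},
    }


def primary_contact(item: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one output item to the highest-confidence contact details.

    Assumes unselected or unavailable fields are either absent or null.
    """
    def first(group_key: str, prefix: str) -> Optional[str]:
        # Numbered entries are sorted by confidence, so *_1 is the best guess.
        group = item.get(group_key) or {}
        return group.get(prefix + "_1")

    return {
        "company": (item.get("company_name") or {}).get("name"),
        "address": (item.get("business_address") or {}).get("full_address"),
        "phone": first("phone_number", "phone"),
        "email": first("emails", "email"),
        "decision_makers": item.get("decision_makers") or [],
    }


run_input = build_run_input(
    ["https://muster-firma.de/"],
    ["company_name", "business_address", "phone_number", "emails"],
)
```

The resulting `run_input` dictionary can be pasted into the Console's JSON input view or passed to whichever API client you use to start the run.
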
## 📊 Extractable Data in Detail

You can select any combination of the following fields for extraction:

|Field|Description|Data Structure|
|---|---|---|
|company_name|Extracts the official company name. The result includes a confidence score indicating the likelihood of a correct match. The higher the number, the lower the confidence.|Object|
|business_address|Parses the full business address into structured components: full_address, street, house_number, postal_code, and city.|Object|
|phone_number|Finds and extracts one or more phone numbers from the page. Results are keyed as phone_1, phone_2, etc.|Object|
|emails|Finds and extracts one or more email addresses. The extractor prioritizes emails that match the website's domain.|Object|
|register_number|Extracts the commercial register number ("Handelsregisternummer") and the corresponding registration court ("Registergericht").|Object|
|social_media|Scans for and extracts links to common social media platforms like LinkedIn, Xing, Facebook, Instagram, etc.|Object|
|decision_makers|(Premium) Identifies and extracts the names of key decision-makers ("Entscheidungsträger"). This feature uses an external NER (Named Entity Recognition) machine learning model to ensure accuracy.|Array|

## 💲 Pricing

This actor uses a pay-per-event pricing model. You are charged based on your usage, ensuring you only pay for what you need.

The costs are as follows:

- Actor Start: $0.10 per run
- Per Website:
  - Website Processed: $0.0004 for each URL from your input list
  - Successful Result: $0.0026 for each website where data is successfully extracted
  - Decision Maker Extracted: $0.0006 for decision-makers found per website (this is in addition to the successful result charge)
  - Maximum Sum: $0.0036 per website
- Per 1000 Websites:
  - Website Processed: $0.40 for 1000 URLs from your input list
  - Successful Result: $2.60 for 1000 websites where data is successfully extracted
  - Decision Maker Extracted: $0.60 for decision-makers found per 1000 websites (this is in addition to the successful result charge)
  - Maximum Sum: $3.60 per 1000 websites

## ⚙️ Usage

1. Input URLs: Go to the Input tab and paste the homepage URLs of the websites you want to scrape.
2. Select Data: In the fieldsToExtract dropdown, select all the data points you wish to collect.
3. Configure Settings: Adjust settings like usePlaywright or proxyConfiguration as needed.
4. Start the Actor: Click the Start button.
5. Get Data: Once the run is finished, find your results in the Storage > Dataset tab.

### 🤖 Other Actors

🔗 Combine with Other Actors for Powerful Workflows

You can enhance your data processing pipelines by combining the German Imprint Scraper with other Apify actors. For example, you might also check out:

- Gelbe Seiten (German Yellow Pages) Scraper - Extract business listings from Germany's Yellow Pages with three detail levels
- Phone Number Formatter - Parse, validate, and format phone numbers in bulk across international formats

## 🎯 Use Cases

- Lead Generation: Build targeted contact lists for sales and marketing.
- Compliance & Verification: Check for legally compliant imprint information.
- Market Research: Aggregate data on companies in a specific industry or region.
- Data Enrichment: Enhance existing company profiles with official contact and registration details.

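For budgeting any of these use cases, the pay-per-event rates from the Pricing section can be turned into a small estimator. This is a sketch under two assumptions: the start fee is charged once per run, and the decision-maker event is charged at most once per successfully processed website. The function name `estimate_cost` and its parameters are hypothetical.

```python
from decimal import Decimal

# Rates in USD, copied from the Pricing section.
ACTOR_START = Decimal("0.10")
WEBSITE_PROCESSED = Decimal("0.0004")
SUCCESSFUL_RESULT = Decimal("0.0026")
DECISION_MAKER_EXTRACTED = Decimal("0.0006")


def estimate_cost(urls: int, successes: int, with_decision_makers: int,
                  runs: int = 1) -> Decimal:
    """Estimate the total charge in USD for a batch of websites.

    urls: number of input URLs processed
    successes: websites where data was successfully extracted
    with_decision_makers: successful websites where decision-makers were found
    runs: number of actor starts (assumed to be billed once each)
    """
    if not 0 <= with_decision_makers <= successes <= urls:
        raise ValueError("expected 0 <= with_decision_makers <= successes <= urls")
    if runs < 1:
        raise ValueError("runs must be at least 1")
    return (
        ACTOR_START * runs
        + WEBSITE_PROCESSED * urls
        + SUCCESSFUL_RESULT * successes
        + DECISION_MAKER_EXTRACTED * with_decision_makers
    )


# Worst case for 1000 URLs in one run: the $3.60 maximum plus the start fee.
worst_case = estimate_cost(urls=1000, successes=1000, with_decision_makers=1000)
```

`Decimal` is used instead of floats so that the sub-cent rates add up exactly. Note that the documented maximum of $3.60 per 1000 websites does not include the per-run start fee.
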
## ⚖️ Legal Disclaimer

You are solely responsible for determining the legality of your use of this actor and the data it generates. The scraping and handling of data, particularly personal information, is subject to complex legal frameworks like the General Data Protection Regulation (GDPR/DSGVO), copyright laws, and the terms of service of the websites you scrape. It is your responsibility to ensure your use case is compliant with all applicable laws. This text does not constitute legal advice.

### GDPR Notice: "Decision Makers" Feature

Please be aware that the decision_makers feature uses an external API hosted on a private server in Europe for data processing.

- What is Processed: The text of the imprint page is sent to this API to identify personal names.
- Why: This is necessary for the Named Entity Recognition (NER) model to accurately extract decision-makers.
- Data Controller: You, the user, are the data controller. The actor's developer acts as the data processor for this specific task.
- Location & Compliance: All processing for this feature occurs within the EU (Germany) and is subject to GDPR (DSGVO).
- Data Storage: The text is processed in-memory and is not stored or logged on the external server.
- Important: This processing is external to the Apify platform and is not covered by Apify's DPA. By using this feature, you acknowledge this separate data processing activity.

## 🛠️ Maintainer

- Author: Dominic M. Quaiser
- Contact: mail@dominic-quaiser.io
- Website: dominic-quaiser.io

Common Use Cases

  • Market Research: Gather competitive intelligence and market data
  • Lead Generation: Extract contact information for sales outreach
  • Price Monitoring: Track competitor pricing and product changes
  • Content Aggregation: Collect and organize content from multiple sources

Ready to Get Started?

Try German Imprint Scraper with Decision Makers Names Extraction now on Apify. Free tier available with no credit card required.

Actor Information

Developer: dominic-quaiser
Pricing: Paid
Total Runs: 462,179
Active Users: 301

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
