Google Ads Analyzer

by amernas

412 runs
7 users
About Google Ads Analyzer

Extract ad data from Google Ads Transparency Center by domain. Three modes: FULL (basic data), OCR (AI text extraction from images - headlines, descriptions, URLs), and LITE (summary counts). Filter by date range and region. Perfect for competitor analysis and ad research.

What does this actor do?

Google Ads Analyzer is a web scraping tool that runs on the Apify platform. Given a domain, it collects that domain's ads from the Google Ads Transparency Center and returns either per-creative records or per-format counts, depending on the run mode.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
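
For step 3, the input is a JSON object whose fields are listed in the Documentation section. The following is a minimal sketch, not the Actor's own code, of assembling and sanity-checking such an input in Python before pasting it into Apify. The helper names `build_input`, `extract_domain`, and `_check_date` are hypothetical, and reading the domain from the URL's `domain` query parameter is an assumption about how a Transparency Center URL maps to a domain.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlparse

# Allowed values as listed in the Actor's input documentation.
RUN_MODES = {"FULL", "OCR", "LITE"}
DATE_PRESETS = {"ANYTIME", "LAST_7_DAYS", "LAST_30_DAYS", "CUSTOM_RANGE"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract_domain(keywords):
    """Return the domain from a bare domain or a Transparency Center URL.

    Assumption: a URL input carries the domain in its `domain` query parameter.
    """
    if keywords.startswith(("http://", "https://")):
        values = parse_qs(urlparse(keywords).query).get("domain")
        if not values:
            raise ValueError("URL has no 'domain' query parameter")
        return values[0]
    return keywords


def _check_date(name, value):
    """Raise ValueError unless value is a real date in YYYY-MM-DD form."""
    if not isinstance(value, str) or not DATE_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must use the YYYY-MM-DD format")
    datetime.strptime(value, "%Y-%m-%d")  # rejects dates such as 2024-02-31


def build_input(keywords, run_mode="FULL", date_preset="ANYTIME",
                start=None, end=None, count=None, region="anywhere"):
    """Assemble an input dict with the documented field names (hypothetical helper)."""
    if not isinstance(keywords, str) or not keywords.strip():
        raise ValueError("keywords is required")
    if run_mode not in RUN_MODES:
        raise ValueError(f"runMode must be one of {sorted(RUN_MODES)}")
    if date_preset not in DATE_PRESETS:
        raise ValueError(f"dateRangePreset must be one of {sorted(DATE_PRESETS)}")

    payload = {
        "keywords": keywords,
        "runMode": run_mode,
        "dateRangePreset": date_preset,
        "region": region,
    }
    # The custom dates are only used with CUSTOM_RANGE, so only check them there.
    if date_preset == "CUSTOM_RANGE":
        _check_date("customStartDate", start)
        _check_date("customEndDate", end)
        if start > end:  # ISO dates sort correctly as plain strings
            raise ValueError("customStartDate must not be after customEndDate")
        payload["customStartDate"] = start
        payload["customEndDate"] = end
    # Leave `count` out when unset so the Actor applies its own default.
    if count is not None:
        if isinstance(count, bool) or not isinstance(count, int) or count < 1:
            raise ValueError("count must be a positive integer")
        payload["count"] = count
    return payload
```

Passing the result of `build_input("clay.com", run_mode="OCR", date_preset="LAST_7_DAYS", count=10)` to `json.dumps` gives a JSON object in the shape the Actor documents. Omitting `count` is a deliberate choice here, since the documented default differs between FULL/OCR and LITE modes.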

Documentation

# Google Ads Transparency Scraper (Python)

## 🔎 What is this Actor?

This Apify Actor is a Python-based tool designed to scrape data from the Google Ads Transparency Center. It allows you to extract information about advertisers and their ads based on a single domain name (e.g., clay.com, apify.com) and flexible date range options.

## 📊 What Google ads data can I extract?

The Actor supports three run modes:

* FULL Mode: Extracts detailed information for each ad creative WITHOUT OCR, including:
  * Ad metadata (advertiser ID, creative ID, format, dates)
  * Preview image URLs
  * Domain information
* OCR Mode: Extracts detailed information for each ad creative WITH OCR text extraction, including:
  * All data from FULL mode
  * OCR text extraction from ad images (headline, description, click URL)
* LITE Mode: Extracts a summary for each matched advertiser, providing counts of their ad creatives by format.

## 📖 How to use

### ⬇️ Input

The Actor expects a JSON input with the following fields:

- `keywords` (string, required): A single domain name or Google Ads Transparency Center URL to search for ads. You can provide either:
  - Domain name: e.g., "clay.com" or "apify.com"
  - Full URL: e.g., "https://adstransparency.google.com/?region=anywhere&domain=clay.com"

  The actor will search for all ads associated with this domain.
- `runMode` (string, optional, default: "FULL"):
  - "FULL": Fetches detailed ad information for each creative WITHOUT OCR.
  - "OCR": Fetches detailed ad information for each creative WITH OCR text extraction.
  - "LITE": Fetches only counts of ad creatives per format for each matched advertiser.
- `dateRangePreset` (string, optional, default: "ANYTIME"): Controls the date range for fetching ads. Options:
  - "ANYTIME": No date filtering.
  - "LAST_7_DAYS": Ads shown in the last 7 days.
  - "LAST_30_DAYS": Ads shown in the last 30 days.
  - "CUSTOM_RANGE": Specify a custom date range using `customStartDate` and `customEndDate`.
- `customStartDate` (string, optional, format: YYYY-MM-DD): The start date for a custom range (e.g., "2023-01-01"). Used only if `dateRangePreset` is "CUSTOM_RANGE".
- `customEndDate` (string, optional, format: YYYY-MM-DD): The end date for a custom range (e.g., "2023-01-31"). Used only if `dateRangePreset` is "CUSTOM_RANGE".
- `count` (integer, optional, default: 10):
  - In FULL mode or OCR mode: The maximum number of ad creatives to retrieve for each domain within the specified date range.
  - In LITE mode: The maximum number of creative summaries to fetch for counting (default: 2000).
- `region` (string, optional, default: "anywhere"): The region code to filter ads by (e.g., 'US', 'GB'). Use "anywhere" for no specific region.
- `proxyConfig` (object, optional): Standard Apify proxy configuration.

Example Input (FULL Mode with URL):

```json
{
  "keywords": "https://adstransparency.google.com/?region=anywhere&domain=clay.com",
  "runMode": "FULL",
  "dateRangePreset": "LAST_30_DAYS",
  "count": 5,
  "region": "anywhere"
}
```

Example Input (FULL Mode with Domain Name):

```json
{
  "keywords": "apify.com",
  "runMode": "FULL",
  "dateRangePreset": "CUSTOM_RANGE",
  "customStartDate": "2024-01-01",
  "customEndDate": "2024-01-31",
  "count": 5,
  "region": "US"
}
```

Example Input (OCR Mode with OCR Text Extraction):

```json
{
  "keywords": "clay.com",
  "runMode": "OCR",
  "dateRangePreset": "LAST_7_DAYS",
  "count": 10,
  "region": "anywhere"
}
```

Example Input (LITE Mode, Last 7 Days):

```json
{
  "keywords": "clay.com",
  "runMode": "LITE",
  "dateRangePreset": "LAST_7_DAYS",
  "region": "anywhere"
}
```

### ⬆️ Output

The extracted data is stored in the default Apify dataset. The structure of items in the dataset depends on the selected `runMode`.

Output Structure for `runMode: "FULL"` and `runMode: "OCR"`

Each item is a JSON object representing a detailed ad creative with the following fields:

- `originalKeyword` (string): The input keyword that led to this ad being scraped.
- `advertiserId` (string): The unique ID of the advertiser.
- `advertiserName` (string): The name of the advertiser.
- `creativeId` (string): The unique ID of the ad creative.
- `format` (string): The format of the ad. Possible values: "TEXT", "IMAGE", "VIDEO", "UNKNOWN".
- `previewUrl` (string | null): URL to the ad preview image, if available.
- `imgHtml` (string | null): The raw HTML of the ad preview image, if available.
- `domain` (string | null): The domain associated with the ad creative.
- `firstShown` (string | null): Date the ad was first shown (YYYY-MM-DD format), if available.
- `lastShown` (string | null): Date the ad was last shown (YYYY-MM-DD format), if available.
- `ocrData` (object | null): Only included in OCR mode. Text extracted from the ad creative image using OCR (Tesseract). This object contains:
  - `rawText` (string | null): The complete raw text extracted by OCR from the ad image.
  - `ocrError` (string | null): Any error message if OCR extraction failed. Null if successful.

Example Output (FULL Mode - No OCR):

```json
{
  "originalKeyword": "clay.com",
  "advertiserId": "AR12345678901234567890",
  "advertiserName": "Clay",
  "creativeId": "CR98765432109876543210",
  "format": "IMAGE",
  "previewUrl": "https://tpc.googlesyndication.com/...",
  "imgHtml": "<img src=\"https://tpc.googlesyndication.com/...\">",
  "domain": "clay.com",
  "firstShown": "2024-01-15",
  "lastShown": "2024-01-31"
}
```

Example Output (OCR Mode - With OCR):

```json
{
  "originalKeyword": "clay.com",
  "advertiserId": "AR12345678901234567890",
  "advertiserName": "Clay",
  "creativeId": "CR98765432109876543210",
  "format": "IMAGE",
  "previewUrl": "https://tpc.googlesyndication.com/...",
  "imgHtml": "<img src=\"https://tpc.googlesyndication.com/...\">",
  "domain": "clay.com",
  "firstShown": "2024-01-15",
  "lastShown": "2024-01-31",
  "ocrData": {
    "rawText": "Sponsored\nA clay.com\nwww.clay.com/\nAutomate Inbound\nScore and route leads automatically. Enrich, score, and route inbound in real time.",
    "ocrError": null
  }
}
```

Output Structure for `runMode: "LITE"`

Each item is a JSON object summarizing ad counts for all advertisers associated with the searched domain, filtered by the specified date range and region:

- `originalKeyword` (string): The input keyword (domain or URL) that led to this summary.
- `keyword` (string): The extracted domain name (e.g., "clay.com").
- `advertisersFound` (object): A dictionary of advertiser IDs mapped to advertiser names found for this domain.
- `textCreativeCount` (integer): Number of TEXT ad creatives found for this domain within the specified parameters.
- `imageCreativeCount` (integer): Number of IMAGE ad creatives.
- `videoCreativeCount` (integer): Number of VIDEO ad creatives.
- `unknownFormatCount` (integer): Number of creatives with an undetermined format.
- `regionSearched` (string): The region parameter used for this count (e.g., "US", "anywhere").
- `totalCreativesCountedFromSearch` (integer): The actual number of creative summaries processed to derive the format counts for the specified date range and region.

## ⚙️ Setup and Running

This Actor is designed to run as a Docker container on the Apify platform.

1. Build the Actor: `apify build` (from within the Actor's directory). Or, manually with Docker: `docker build -t google-ads-scraper-actor .`
2. Run the Actor: `apify run` (this will use the INPUT.json file in the Actor's .actor directory if present, or you can specify input via CLI or Apify Console)
3. Push to Apify Platform: `apify push`

## ❓ Frequently Asked Questions (FAQs)

### Is it legal to scrape Google Ads data?

Scraping publicly available data is generally permissible, but you should always be mindful of the website's terms of service, robots.txt, and relevant data privacy regulations (like GDPR, CCPA). Ensure your scraping activities are ethical and do not overload the target servers. Consult legal advice if you are unsure.
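
Returning to the output structures described earlier: once FULL or OCR items have been downloaded from the dataset, they can be post-processed locally. The sketch below is an illustration under stated assumptions, not part of the Actor. It tallies items by `format` into counters named after the LITE-mode fields and splits `ocrData.rawText` into non-empty lines. The function names `summarize` and `ocr_lines` are hypothetical, and treating any unrecognised `format` value as "UNKNOWN" is a choice made for this example.

```python
from collections import Counter

# Maps the documented `format` values to the LITE-mode counter names.
FORMAT_TO_COUNTER = {
    "TEXT": "textCreativeCount",
    "IMAGE": "imageCreativeCount",
    "VIDEO": "videoCreativeCount",
    "UNKNOWN": "unknownFormatCount",
}


def summarize(items):
    """Build a LITE-style summary dict from a list of FULL/OCR items."""
    counts = Counter()
    advertisers = {}
    for item in items:
        fmt = item.get("format")
        if fmt not in FORMAT_TO_COUNTER:
            fmt = "UNKNOWN"  # example-only fallback for missing or odd values
        counts[fmt] += 1
        if item.get("advertiserId"):
            advertisers[item["advertiserId"]] = item.get("advertiserName")
    summary = {name: counts[fmt] for fmt, name in FORMAT_TO_COUNTER.items()}
    summary["advertisersFound"] = advertisers
    summary["totalCreativesCountedFromSearch"] = len(items)
    return summary


def ocr_lines(item):
    """Return the non-empty OCR text lines of one item, or [] if unavailable."""
    ocr = item.get("ocrData") or {}
    if ocr.get("ocrError") or not ocr.get("rawText"):
        return []
    return [line.strip() for line in ocr["rawText"].splitlines() if line.strip()]
```

Calling `summarize(items)` on a list of downloaded items gives per-format counts without a second LITE run, though the totals only cover the items actually retrieved (bounded by `count`). The `ocr_lines` helper returns raw lines only; which line is the headline or the click URL is not something the documented output specifies.
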
## 💬 Your feedback

If you have any feedback or feature requests, please let us know!

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Google Ads Analyzer now on Apify. Free tier available with no credit card required.

Actor Information

Developer: amernas
Pricing: Paid
Total Runs: 412
Active Users: 7

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
