Image Text Extractor

by m3web

407 runs
34 users

About Image Text Extractor

Extract text from images using OCR (Optical Character Recognition) via direct URLs or uploaded JSON/CSV files. Works with multiple languages and automatically enriches your structured file with the text found inside images.

What does this actor do?

Image Text Extractor is an OCR Actor on the Apify platform. It takes image URLs, supplied directly or read from an uploaded JSON or CSV file, extracts the text found in each image, and runs entirely in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

# 🖼️ Image Text Extractor

Extract text from images using OCR (Optical Character Recognition) via direct URLs or uploaded JSON/CSV files. Works with multiple languages and automatically enriches your structured file with the text found inside images.

---

## ✅ Features

- Accepts image URLs either:
  - Directly through `startUrls`, or
  - From uploaded `.json` or `.csv` files
- Applies OCR (Optical Character Recognition) to each image and extracts:
  - `extractedText`: Full raw text detected
  - `paragraphs`: Text split into readable blocks
  - `urls`: Any links found within the image text
- Supports Tesseract OCR with multiple languages (e.g. English, German, Spanish, etc.)
- Saves results in Apify Key-Value Store with a shareable download link
- Logs are clean and easy to follow

---

## 📥 Input

This Actor accepts these input fields:

| Field | Type | Description |
|--------------------------|--------|------------------------------------------------------------------------|
| Image URLs | array | (Optional) One or more direct image URLs to process |
| Upload a structured file | file | (Optional) Upload a `.json` or `.csv` file that contains image URLs |
| Field name for image URL | string | The name of the column or field in your file that holds the image URLs |
| language | string | Choose the OCR language from the dropdown (default is English) |

### 👇 Explaining Field name for image URL in simple terms

If you're uploading a `.json` or `.csv` file, you need to tell the Actor which part of each item contains the image URL. This is what the Field name for image URL is for:

- 🔒 In a CSV file, each column has a name (like "image_url" or "photo"). You should type in the exact column name where the image URL is located.
  - Example:

    ```csv
    title,image_url
    Product 1,https://example.com/image1.jpg
    Product 2,https://example.com/image2.jpg
    ```

    In this case, you'd set Field name for image URL to `image_url`.

- 🧱 In a JSON file, each object has a label for its fields. You need to write the name of the field that stores the image link.
  - Example:

    ```json
    [
      { "name": "Item A", "photo": "https://example.com/photo.jpg" }
    ]
    ```

    Here, you'd set Field name for image URL to `photo`.

- 💬 You can also use dot notation to reach inside nested fields. For example, if your JSON file looks like this:
  - Example:

    ```json
    [
      { "assets": { "image": "https://example.com/image.jpg" } }
    ]
    ```

    Then set Field name for image URL to `assets.image`.

- 🔒 Multiple Images in One Row

  If your `.json` or `.csv` file contains more than one image URL per item, you can still process them all! Simply point to the field that holds an array of URLs.
  - Example `.json` input:

    ```json
    {
      "title": "Product Set",
      "images": [
        "https://example.com/photo1.jpg",
        "https://example.com/photo2.jpg"
      ]
    }
    ```

    Set Field name for image URL to `images`, and the Actor will automatically process all image URLs inside that array.

  This also works with dot notation for nested arrays:

    ```json
    {
      "media": {
        "photos": [
          "https://example.com/one.jpg",
          "https://example.com/two.jpg"
        ]
      }
    }
    ```

    In this case, set Field name for image URL to `media.photos`.

---

### 🌍 OCR Language

The Actor supports many languages beyond English. At the input step, you'll see a dropdown menu labeled `language`. Select the appropriate language for your images (e.g. German, French, Spanish); the default language is English. This helps the OCR engine correctly detect and read the text in your image.

---

## 📤 Output

After processing, you'll receive:

1. A structured CSV or JSON file with enriched data:
   - `extractedText`: All text found in each image
   - `paragraphs`: Text broken into readable chunks
   - `urls`: Any links found inside the image text
2. 🔗 A downloadable link to your processed file saved in Apify's Key-Value Store
3. 📊 OCR results also pushed to Apify Dataset (optional)

---

## 🚀 Example Use Cases

- Extracting text from screenshot-based Google Ads
- Enriching scraped product data with visible text
- Identifying links or CTAs from image banners

---

## 🤖 Behind the Scenes

This Actor uses:

- Tesseract.js for OCR
- Sharp for image preprocessing (grayscale, normalize)
- Support for both in-memory JSON and CSV parsing/stringifying
- Output is clean and downloadable, with clear logs and no clutter

## 💡 Tip

Want to extract thousands of image ads from Google's Ad Transparency Center? Combine this with a crawler that scrapes adstransparency.google.com, then feed that structured JSON into this Actor. Boom: text from image ads, at scale.
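
The Field name for image URL lookup described in the Input section (a plain field, dot notation for nested fields, and arrays of URLs) can be illustrated with a short Python sketch. This is not the Actor's source code, which according to the README runs on Tesseract.js and Sharp. The function name `resolve_image_urls` is hypothetical, and its handling of missing fields is an assumption made for illustration only.

```python
from typing import Any, List


def resolve_image_urls(item: Any, field_path: str) -> List[str]:
    """Follow a dot-separated path into one parsed item and collect URL strings.

    Hypothetical helper: the Actor's real lookup logic is not published here.
    """
    current = item
    for key in field_path.split("."):
        # Assumption: a missing or non-object step simply yields no URLs.
        if not isinstance(current, dict) or key not in current:
            return []
        current = current[key]

    if isinstance(current, str):
        return [current]  # single image URL
    if isinstance(current, list):
        # Array of URLs: keep only string entries
        return [value for value in current if isinstance(value, str)]
    return []


# Sample items taken from the README examples
flat = {"name": "Item A", "photo": "https://example.com/photo.jpg"}
nested = {"assets": {"image": "https://example.com/image.jpg"}}
multi = {
    "media": {
        "photos": [
            "https://example.com/one.jpg",
            "https://example.com/two.jpg",
        ]
    }
}

print(resolve_image_urls(flat, "photo"))
print(resolve_image_urls(nested, "assets.image"))
print(resolve_image_urls(multi, "media.photos"))
```

Returning a list in every case keeps the single-URL and multi-URL paths uniform, so whatever step consumes the result can loop over it without checking the type first.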

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: m3web
Pricing: Paid
Total Runs: 407
Active Users: 34

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
