Receipt OCR API

by happitap

89 runs

10 users

About Receipt OCR API

Receipt OCR API - Multi-Model Text Extraction : Extract structured data from receipt images using advanced OCR technology with support for multiple AI models including Google Vision, OpenAI, Azure, AWS Textract, Gemini, Hugging Face, DeepSeek, and Native OCR.

What does this actor do?

Receipt OCR API is an actor on the Apify platform that extracts structured data from receipt images. It runs in the cloud and sends each image to the OCR engine you select.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

# Receipt OCR API - Multi-Model Text Extraction

Extract structured data from receipt images using advanced OCR technology with support for multiple AI models including Google Vision, OpenAI, Azure, AWS Textract, Gemini, Hugging Face, DeepSeek, and Native OCR.

## 🌟 Features

### Multi-Model OCR Support

Choose from 8 different OCR engines based on your needs:

- Google Vision API - High accuracy, excellent for printed receipts
- DeepSeek OCR - Advanced AI-powered text extraction
- Amazon Textract - Specialized for document and receipt analysis
- Azure AI Vision - Microsoft's computer vision service
- OpenAI GPT-4 Vision - State-of-the-art vision model
- Hugging Face - Open-source OCR models
- Google Gemini - Latest Google multimodal AI
- Native (Tesseract.js) - Free, no API key required

### Intelligent Data Extraction

- Merchant Information: Name, address, contact details
- Transaction Details: Date, time, receipt number
- Financial Data: Total amount, subtotal, tax, currency
- Line Items: Individual items with prices
- Payment Method: Credit card, cash, etc.
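
The financial fields and line items listed above can be cross-checked on the client side. The sketch below is one possible check, not the actor's own verification logic: `checkReceiptTotals` and `toCents` are hypothetical helpers, the field names (`subtotal`, `tax`, `totalAmount`, `lineItems`) follow the detailed output example in this documentation, and it assumes that line item prices exclude tax.

```javascript
// Round to whole cents so floating-point noise does not cause false mismatches.
const toCents = (amount) => Math.round(amount * 100);

// Hypothetical helper: compares subtotal + tax against totalAmount and,
// when line items are present, compares their sum against subtotal.
function checkReceiptTotals(receipt) {
  const errors = [];
  const { subtotal, tax, totalAmount, lineItems = [] } = receipt;

  if (toCents(subtotal) + toCents(tax) !== toCents(totalAmount)) {
    errors.push('subtotal + tax does not equal totalAmount');
  }
  if (lineItems.length > 0) {
    const itemsCents = lineItems.reduce((sum, item) => sum + toCents(item.price), 0);
    if (itemsCents !== toCents(subtotal)) {
      errors.push('line item prices do not sum to subtotal');
    }
  }
  return { isValid: errors.length === 0, errors };
}
```

Comparing integer cents rather than raw floats avoids spurious failures such as `0.1 + 0.2 !== 0.3`. The return shape mirrors the `calculationVerification` object in the output example only for convenience.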
### Advanced Features

- ✅ Automatic Calculation Verification - Validates totals and tax amounts
- 📊 Batch Processing - Process multiple receipts simultaneously
- 🔄 Multi-Format Support - JPG, PNG, PDF files
- 📋 Structured JSON Output - Machine-readable data format
- 🎯 High Accuracy - Advanced parsing algorithms

## 🚀 Quick Start

### Input Configuration

```json
{
  "ocrModel": "native",
  "receiptUrls": [
    "https://example.com/receipt1.jpg",
    "https://example.com/receipt2.png"
  ],
  "extractLineItems": true,
  "verifyCalculations": true,
  "outputFormat": "detailed"
}
```

### Required API Keys by Model

| Model | Required Fields |
|-------|----------------|
| Google Vision | `googleVisionApiKey` |
| DeepSeek OCR | `deepseekApiKey` |
| Amazon Textract | `awsAccessKeyId`, `awsSecretAccessKey`, `awsRegion` |
| Azure AI Vision | `azureEndpoint`, `azureApiKey` |
| OpenAI | `openaiApiKey` |
| Hugging Face | `huggingfaceApiKey` |
| Gemini | `geminiApiKey` |
| Native | None (uses Tesseract.js) |

## 📋 Input Parameters

### Required Parameters

- `ocrModel` (string) - OCR model to use
  - Options: `google-vision`, `deepseek-ocr`, `amazon-textract`, `azure-vision`, `openai`, `huggingface`, `gemini`, `native`
  - Default: `native`
- `receiptUrls` (array) - Array of receipt image URLs
  - Supports: HTTP/HTTPS URLs, data URLs, Apify key-value store URLs
  - Formats: JPG, PNG, PDF

### Optional Parameters

- `extractLineItems` (boolean) - Extract individual line items
  - Default: `true`
- `verifyCalculations` (boolean) - Verify totals and tax calculations
  - Default: `true`
- `outputFormat` (string) - Output data format
  - Options: `json` (compact), `detailed` (with metadata)
  - Default: `detailed`

### API Keys (Model-Specific)

Configure the appropriate API keys based on your selected OCR model. See the table above for required fields.
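
Because the required fields differ per model, it can help to check an input object before starting a run. The following is a minimal sketch: `REQUIRED_FIELDS` and `findMissingKeys` are hypothetical names, and the mapping assumes that each `ocrModel` option corresponds to the table row with the matching display name.

```javascript
// Required credential fields per ocrModel option, copied from the table above.
const REQUIRED_FIELDS = {
  'google-vision': ['googleVisionApiKey'],
  'deepseek-ocr': ['deepseekApiKey'],
  'amazon-textract': ['awsAccessKeyId', 'awsSecretAccessKey', 'awsRegion'],
  'azure-vision': ['azureEndpoint', 'azureApiKey'],
  openai: ['openaiApiKey'],
  huggingface: ['huggingfaceApiKey'],
  gemini: ['geminiApiKey'],
  native: [],
};

// Hypothetical helper: returns the names of required fields that are
// missing or empty in the given input object.
function findMissingKeys(input) {
  const model = input.ocrModel ?? 'native'; // documented default
  if (!Object.hasOwn(REQUIRED_FIELDS, model)) {
    throw new Error(`Unknown ocrModel: ${model}`);
  }
  return REQUIRED_FIELDS[model].filter((field) => !input[field]);
}

console.log(findMissingKeys({ ocrModel: 'azure-vision', azureApiKey: 'YOUR_AZURE_KEY' }));
// → [ 'azureEndpoint' ]
```

An empty array means the input carries every field the table lists for that model. The check says nothing about whether the keys themselves are valid.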
## 📤 Output Format

### Detailed Output Example

```json
{
  "receiptUrl": "https://example.com/receipt.jpg",
  "ocrModel": "google-vision",
  "success": true,
  "extractedAt": "2024-01-15T10:30:00.000Z",
  "merchantName": "SuperMart Store",
  "merchantAddress": "123 Main Street, City, State 12345",
  "date": "01/15/2024",
  "time": "10:25 AM",
  "receiptNumber": "TXN-12345",
  "currency": "USD",
  "subtotal": 45.50,
  "tax": 3.64,
  "totalAmount": 49.14,
  "paymentMethod": "Credit Card",
  "lineItems": [
    { "name": "Product A", "price": 15.99 },
    { "name": "Product B", "price": 29.51 }
  ],
  "calculationVerification": {
    "isValid": true,
    "errors": []
  },
  "metadata": {
    "confidence": 0.95,
    "processingTime": 1234,
    "imageSize": 524288
  },
  "rawText": "SuperMart Store\n123 Main Street..."
}
```

## 🎯 Use Cases

### Expense Management

- Automate receipt data entry for expense reports
- Track business expenses in real-time
- Integrate with accounting software

### Accounting & Bookkeeping

- Digitize paper receipts for record keeping
- Verify transaction details automatically
- Generate financial reports from receipt data

### E-commerce & Retail

- Receipt verification systems
- Customer purchase tracking
- Warranty and return management

### Fintech Applications

- Personal finance tracking apps
- Budget management tools
- Tax preparation software

## 🔧 Model Comparison

| Model | Speed | Accuracy | Cost | Best For |
|-------|-------|----------|------|----------|
| Native | ⚡⚡⚡ | ⭐⭐⭐ | Free | Testing, low volume |
| Google Vision | ⚡⚡ | ⭐⭐⭐⭐⭐ | $$ | High accuracy needs |
| Amazon Textract | ⚡⚡ | ⭐⭐⭐⭐⭐ | $$ | Receipt-specific |
| OpenAI | ⚡ | ⭐⭐⭐⭐⭐ | $$$ | Complex receipts |
| Azure Vision | ⚡⚡ | ⭐⭐⭐⭐ | $$ | Microsoft ecosystem |
| Gemini | ⚡⚡ | ⭐⭐⭐⭐ | $$ | Latest AI tech |
| DeepSeek | ⚡⚡ | ⭐⭐⭐⭐ | $$ | Alternative to OpenAI |
| Hugging Face | ⚡⚡ | ⭐⭐⭐ | $ | Open-source models |

## 🔐 Security & Privacy

- All API keys are stored securely as secrets
- Images are processed in memory and not permanently stored
- Supports private/internal image URLs
- GDPR and data privacy compliant

## 💡 Tips for Best Results

1. Image Quality: Use high-resolution, well-lit images
2. Format: Straight, unfolded receipts work best
3. Contrast: Ensure good contrast between text and background
4. Model Selection:
   - Use Native for testing and low-volume processing
   - Use Google Vision or Textract for production workloads
   - Use OpenAI for complex or damaged receipts

## 🐛 Error Handling

The actor handles various error scenarios:

- Invalid or unreachable image URLs
- OCR processing failures
- Missing or invalid API keys
- Malformed receipt data

Each result includes a `success` field, plus an `error` message when processing fails.

## 📊 Batch Processing

Process multiple receipts in a single run:

```json
{
  "ocrModel": "google-vision",
  "receiptUrls": [
    "https://example.com/receipt1.jpg",
    "https://example.com/receipt2.jpg",
    "https://example.com/receipt3.jpg"
  ]
}
```

The actor processes all receipts and returns an individual result for each.

## 🔗 Integration

### API Integration

```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
    token: 'YOUR_API_TOKEN',
});

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
    // Start the actor and wait for the run to finish.
    const run = await client.actor('YOUR_ACTOR_ID').call({
        ocrModel: 'google-vision',
        googleVisionApiKey: 'YOUR_GOOGLE_API_KEY',
        receiptUrls: ['https://example.com/receipt.jpg'],
    });

    // Fetch the extracted receipts from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
}

main().catch(console.error);
```

### Webhook Integration

Configure webhooks to receive results automatically when processing completes.
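
However the results are retrieved, each item carries the `success` and `error` fields described under Error Handling, so a batch can be split into usable results and failures. This is a minimal sketch under that assumption; `partitionResults` is a hypothetical helper, and the exact wording of `error` messages is not specified in this documentation.

```javascript
// Hypothetical helper: splits dataset items into successes and failures
// using the documented `success` and `error` fields.
function partitionResults(items) {
  const succeeded = [];
  const failed = [];
  for (const item of items) {
    if (item.success) {
      succeeded.push(item);
    } else {
      failed.push({
        receiptUrl: item.receiptUrl,
        error: item.error ?? 'unknown error', // fallback if no message is present
      });
    }
  }
  return { succeeded, failed };
}
```

The `failed` list keeps the original `receiptUrl`, so the same URLs can be passed back as `receiptUrls` in a later run, for example with a different `ocrModel`.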
## 📈 Performance

- Processing Speed: 2-10 seconds per receipt (varies by model)
- Concurrent Processing: Up to 10 receipts simultaneously
- Maximum Image Size: 50MB per image
- Supported Formats: JPG, PNG, PDF

## 🆘 Support

For issues, questions, or feature requests:

- Check the Apify documentation
- Review the input schema for parameter details
- Ensure API keys are valid and have sufficient quota

## 🔄 Version History

### v1.0.0

- Initial release
- Support for 8 OCR models
- Intelligent receipt parsing
- Batch processing
- Calculation verification
- Multi-format image support

---

Built with ❤️ using Apify Platform

Actor Information

Developer: happitap
Pricing: Paid
Total Runs: 89
Active Users: 10
