Pdf Power Tools

Name: Pdf Power Tools
Author: agenscrape

36 runs

5 users

About Pdf Power Tools

Split, merge, compress, convert & OCR PDFs via API. Extract text from scanned documents in 14 languages. Compress files for email, convert pages to PNG/JPEG/WebP, split by pages or ranges, merge multiple PDFs. Perfect for document automation & data extraction workflows.

What does this actor do?

Pdf Power Tools is a PDF processing actor available on the Apify platform. It splits, merges, compresses, converts, and OCRs PDF files in the cloud, and can be called through the API from your own applications.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
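The same steps can also be driven programmatically. The following is a minimal sketch, not the author's official client: it only builds (and does not send) a request against the Apify API v2 "run actor" endpoint using the Python standard library. The actor ID `agenscrape~pdf-power-tools` and the token placeholder are assumptions made for illustration; check the actor's page on Apify for the real ID. The input fields are taken from the actor's documentation.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(token, actor_id, run_input):
    """Build (but do not send) a POST request that starts an actor run."""
    url = f"{API_BASE}/acts/{actor_id}/runs"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_run_request(
    token="YOUR_APIFY_TOKEN",               # placeholder, not a real token
    actor_id="agenscrape~pdf-power-tools",  # hypothetical actor ID (assumption)
    run_input={
        "operation": "compress",
        "pdfUrl": "https://example.com/large-file.pdf",
        "compressionPreset": "high",
    },
)
print(request.get_method(), request.full_url)
# To actually start the run, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the URL and JSON body before any billable run is started.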

Documentation

# PDF Power Tools

Facing an issue, unexpected error, edge case, or have a feature suggestion? Post it here and we'll address it within 24 hours.

## What is PDF Power Tools?

PDF Power Tools is a comprehensive PDF processing API that handles all your PDF manipulation needs in the cloud. Whether you need to split large documents, merge multiple PDFs, compress files for email, extract text from scanned documents using OCR, or convert PDF pages to images - this actor does it all.

Perfect for:

- Document automation workflows - Process PDFs at scale without local software
- Data extraction pipelines - Extract text from scanned invoices, receipts, contracts
- Content management systems - Generate thumbnails, compress uploads, split documents
- Archival and digitization - OCR historical documents, enhance scanned pages
- Web applications - Server-side PDF processing via API

## Features

### Split PDF

Break down large PDF documents into smaller, manageable files. Split options include:

- Each page separate - Create individual PDFs for every page
- By page ranges - Split into custom ranges (e.g., pages 1-10, 11-20, 21-30)
- Split in half - Divide document into two equal parts
- Extract specific pages - Pull out only the pages you need
- By file size - Automatically split when file exceeds size limit

### Merge PDF

Combine multiple PDF files into a single document:

- Merge unlimited PDFs in sequence
- Custom merge order
- Interleave pages from multiple documents
- Insert pages from one PDF into another at specific positions

### Compress PDF

Reduce PDF file size for email attachments, web uploads, or storage optimization:

- Low compression - Minimal size reduction, highest quality
- Medium compression - Balanced quality and file size (default)
- High compression - Maximum size reduction
- Screen preset - Optimized for on-screen viewing
- Print preset - Optimized for printing quality

### Convert PDF to Images

Transform PDF pages into high-quality images:

- Output formats: PNG, JPEG, WebP, TIFF
- Customizable DPI (72-600)
- Convert all pages or specific page selection
- Combine all pages into single tall image
- Generate thumbnails

### OCR - Text Extraction from Scanned PDFs

Extract text from scanned documents, images, and non-searchable PDFs using Tesseract OCR:

- 14 supported languages: English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Russian, Chinese (Simplified & Traditional), Japanese, Korean, Arabic
- Image preprocessing for improved accuracy
- Confidence scores per page
- Word and line count statistics

### Enhance Scanned PDFs

Improve readability of scanned documents:

- Sharpen blurry text and images
- Reduce noise and artifacts
- Adjust contrast and brightness
- Configurable DPI settings

### Page Manipulation

Fine-grained control over PDF pages:

- Reorder pages within a document
- Remove unwanted pages
- Insert pages at specific positions

### PDF Information

Analyze PDF files before processing:

- Page count and dimensions
- File size breakdown
- Detect if PDF is scanned or native text
- Compression estimate

## Input Options

### Basic Input

```json
{
  "operation": "split",
  "pdfUrl": "https://example.com/document.pdf"
}
```

### Using Base64 Input

```json
{
  "operation": "compress",
  "pdfBase64": "JVBERi0xLjcKCjEgMCBvYmoK..."
}
```

## Operation Examples

### Get PDF Information

```json
{
  "operation": "info",
  "pdfUrl": "https://example.com/document.pdf"
}
```

### Split Into Individual Pages

```json
{
  "operation": "split",
  "pdfUrl": "https://example.com/large-document.pdf",
  "splitMode": "each_page"
}
```

### Split By Page Ranges

```json
{
  "operation": "split",
  "pdfUrl": "https://example.com/document.pdf",
  "splitMode": "ranges",
  "ranges": ["1-10", "11-20", "21-30"]
}
```

### Extract Specific Pages

```json
{
  "operation": "split",
  "pdfUrl": "https://example.com/document.pdf",
  "splitMode": "extract",
  "pages": [1, 5, 10, 15]
}
```

### Merge Multiple PDFs

```json
{
  "operation": "merge",
  "pdfUrls": [
    "https://example.com/part1.pdf",
    "https://example.com/part2.pdf",
    "https://example.com/part3.pdf"
  ]
}
```

### Merge With Custom Order

```json
{
  "operation": "merge",
  "pdfUrls": ["doc1.pdf", "doc2.pdf", "doc3.pdf"],
  "order": [2, 0, 1]
}
```

### Compress PDF

```json
{
  "operation": "compress",
  "pdfUrl": "https://example.com/large-file.pdf",
  "compressionPreset": "high"
}
```

### Convert PDF to PNG Images

```json
{
  "operation": "convert",
  "pdfUrl": "https://example.com/document.pdf",
  "outputFormat": "png",
  "dpi": 200,
  "quality": 95
}
```

### Convert Specific Pages to JPEG

```json
{
  "operation": "convert",
  "pdfUrl": "https://example.com/document.pdf",
  "outputFormat": "jpeg",
  "pages": [1, 3, 5],
  "dpi": 150
}
```

### OCR - Extract Text from Scanned PDF

```json
{
  "operation": "ocr",
  "pdfUrl": "https://example.com/scanned-document.pdf",
  "language": "eng",
  "preprocess": true
}
```

### OCR in French

```json
{
  "operation": "ocr",
  "pdfUrl": "https://example.com/french-scan.pdf",
  "language": "fra"
}
```

### Enhance Scanned Document

```json
{
  "operation": "enhance",
  "pdfUrl": "https://example.com/old-scan.pdf",
  "sharpen": true,
  "denoise": true,
  "contrast": 1.3,
  "brightness": 1.1
}
```

### Generate Thumbnail

```json
{
  "operation": "thumbnail",
  "pdfUrl": "https://example.com/document.pdf",
  "thumbnailWidth": 300,
  "outputFormat": "png"
}
```

### Remove Pages

```json
{
  "operation": "merge",
  "pdfUrl": "https://example.com/document.pdf",
  "pagesToRemove": [2, 5, 8]
}
```

### Reorder Pages

```json
{
  "operation": "merge",
  "pdfUrl": "https://example.com/document.pdf",
  "newPageOrder": [4, 3, 2, 1, 5, 6]
}
```

## Output

Results are saved to the run's Key-Value Store for easy download:

| Operation | Output Files |
|-----------|-------------|
| Split | `page_001.pdf`, `page_002.pdf`, ... or `pages_1-10.pdf`, etc. |
| Merge | `merged.pdf` |
| Compress | `compressed.pdf` |
| Convert | `page_001.png`, `page_002.png`, ... |
| OCR | `extracted_text.txt` + Dataset with per-page results |
| Enhance | `enhanced.pdf` |
| Thumbnail | `thumbnail.png` |

### Sample Output

```json
{
  "operation": "compress",
  "preset": "high",
  "pageCount": 25,
  "originalSize": "4.5 MB",
  "compressedSize": "1.2 MB",
  "compressionRatio": "73.3%",
  "outputKey": "compressed.pdf"
}
```

## Supported Languages for OCR

| Code | Language |
|------|----------|
| `eng` | English |
| `fra` | French |
| `deu` | German |
| `spa` | Spanish |
| `ita` | Italian |
| `por` | Portuguese |
| `nld` | Dutch |
| `pol` | Polish |
| `rus` | Russian |
| `chi_sim` | Chinese (Simplified) |
| `chi_tra` | Chinese (Traditional) |
| `jpn` | Japanese |
| `kor` | Korean |
| `ara` | Arabic |

## Compression Presets

| Preset | Image Quality | Best For |
|--------|--------------|----------|
| `low` | 90% | Archives, legal documents |
| `medium` | 75% | General use, email |
| `high` | 50% | Web uploads, storage saving |
| `screen` | 60% | On-screen viewing |
| `print` | 85% | Print-quality output |

## Pricing

| Event | Price | Description |
|-------|-------|-------------|
| `pdf-loaded` | $0.005 | Each PDF loaded from URL or base64 |
| `page-enhanced` | $0.01 | Each page enhanced (sharpen, denoise) |
| `page-processed` | $0.002 | Each page processed (split, merge, compress) |
| `ocr-page` | $0.02 | Each page with OCR text extraction |
| `pdf-compressed` | $0.01 | PDF compression completed |
| `page-converted` | $0.005 | Each page converted to image |
| `pdf-merged` | $0.01 | PDF merge operation completed |
| `metadata-extracted` | $0.005 | PDF info/metadata extraction |
| `text-extracted` | $0.005 | Text extraction completed |

## Use Cases

- Invoice Processing - Extract data from scanned invoices using OCR
- Document Splitting - Break down large reports into chapters
- PDF Compression - Reduce file size for email attachments
- Image Generation - Create thumbnails for document previews
- Document Merging - Combine multiple contracts into one file
- Archival - Enhance and OCR historical scanned documents
- Web Publishing - Convert PDF pages to web-friendly images
- Data Extraction - Pull text from non-searchable PDFs
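To get a rough idea of what a run might cost, the per-event prices from the Pricing table can be summed in a small helper. This is only a sketch: the prices are copied from the table, but which events a given operation actually triggers is not stated in the documentation, so the event counts in the example (one `pdf-loaded` plus one `ocr-page` per page for a 10-page scan) are an assumption. The function name `estimate_cost` is hypothetical and not part of the actor.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the Pricing table.
EVENT_PRICES = {
    "pdf-loaded": Decimal("0.005"),
    "page-enhanced": Decimal("0.01"),
    "page-processed": Decimal("0.002"),
    "ocr-page": Decimal("0.02"),
    "pdf-compressed": Decimal("0.01"),
    "page-converted": Decimal("0.005"),
    "pdf-merged": Decimal("0.01"),
    "metadata-extracted": Decimal("0.005"),
    "text-extracted": Decimal("0.005"),
}


def estimate_cost(event_counts):
    """Sum price * count over a mapping of event name -> number of events."""
    total = Decimal("0")
    for event, count in event_counts.items():
        if event not in EVENT_PRICES:
            raise KeyError(f"unknown pricing event: {event}")
        if count < 0:
            raise ValueError(f"negative count for event: {event}")
        total += EVENT_PRICES[event] * count
    return total


# Assumed event mix for OCR on a 10-page scanned PDF (not confirmed by the docs).
ocr_cost = estimate_cost({"pdf-loaded": 1, "ocr-page": 10})
print(f"Estimated OCR cost: ${ocr_cost}")  # prints "Estimated OCR cost: $0.205"
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly. Treat the result as an estimate and compare it against the charges reported for a real run.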

Actor Information

Developer: agenscrape
Pricing: Paid
Total Runs: 36
Active Users: 5
