Universal Document Format Transformer
by actorify
About Universal Document Format Transformer
A cloud-based Apify Actor that converts documents (DOCX, PPTX, HTML, TXT) into Markdown, JSON, CSV, HTML, TXT, or PDF using Pandoc. PDF is supported as an output format only. The Actor exposes a REST API for automations (n8n, Zapier, Make) and includes production-ready error handling and security controls.
What does this actor do?
Universal Document Format Transformer is a document conversion tool available on the Apify platform. It downloads a file from a URL you provide and converts it to the requested format in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
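The same run can be started programmatically. The sketch below only builds the HTTP request described by the curl example in the documentation (same endpoint, Actor ID, and input fields); `build_run_request` is a hypothetical helper name, and `YOUR_TOKEN` is a placeholder for your own Apify API token.

```python
import json
import urllib.parse
import urllib.request

# Actor ID and endpoint are taken from the curl example in the documentation.
ACTOR_ID = "WgRQY2Ta2VKQE5NgO"
RUNS_ENDPOINT = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs"


def build_run_request(token, file_url, from_format, to_format):
    """Build (but do not send) the POST request that starts a conversion run.

    Hypothetical helper for illustration; it mirrors the documented curl call.
    """
    query = urllib.parse.urlencode({"token": token})
    payload = {
        "fileUrl": file_url,
        "fromFormat": from_format,
        "toFormat": to_format,
    }
    return urllib.request.Request(
        f"{RUNS_ENDPOINT}?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "YOUR_TOKEN",
    "https://example.com/document.docx",
    "docx",
    "markdown",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` would send it. How the result record is retrieved afterwards is not covered by this sketch.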
Documentation
# Universal Document Format Transformer

> Convert documents between formats instantly without installing any software. Just provide a URL and get your converted file in seconds.

## 🚀 Get Started in 30 Seconds

1. Go to the Actor: Universal Document Format Transformer
2. Click "Run Actor"
3. Enter your file details:

   ```json
   {
     "fileUrl": "https://example.com/your-document.docx",
     "fromFormat": "docx",
     "toFormat": "markdown"
   }
   ```

4. Get your converted file - Download link appears in results

That's it! No software to install, no API keys needed for basic use.

## ✨ What You Can Do

- 🔄 Convert DOCX to Markdown - Perfect for GitHub documentation
- 📊 Extract tables from HTML to CSV - Great for data analysis
- 📄 Transform PPTX to PDF - Ideal for sharing presentations
- 📝 Convert TXT to HTML - Useful for web publishing
- 📋 Process multiple formats - DOCX, PPTX, HTML, TXT → Markdown, JSON, CSV, HTML, TXT, PDF

## 📋 Supported Formats

### What You Can Convert From (Input)

| Format | Best For | Example Files |
|--------|----------|---------------|
| DOCX | Reports, articles, documentation | `*.docx` |
| PPTX | Presentations, slides, training | `*.pptx` |
| HTML | Web pages, online articles | `*.html`, `*.htm` |
| TXT | Plain text, simple data | `*.txt` |

### What You Can Convert To (Output)

| Format | Perfect For | When to Use |
|--------|--------------|-------------|
| Markdown | GitHub docs, technical writing | Converting Word docs for code repositories |
| JSON | Data processing, APIs | Extracting structured content |
| CSV | Spreadsheets, data analysis | Pulling tables from web pages |
| HTML | Web publishing, email | Converting docs for websites |
| TXT | Simple text, logging | Extracting plain text from any format |
| PDF | Sharing, printing | Final document distribution |

### ⚠️ Important: PDF Limitations

- ❌ Cannot convert FROM PDF - This is a technical limitation
- ✅ Can convert TO PDF - Perfect for final output
- 💡 Workaround: Convert PDF to HTML online first, then use this actor

## 💡 Popular Use Cases

### 📝 Content Teams & Bloggers

Convert Word documents to Markdown for GitHub

```json
{
  "fileUrl": "https://example.com/blog-post.docx",
  "fromFormat": "docx",
  "toFormat": "markdown"
}
```

Perfect for:

- Technical documentation
- GitHub README files
- Markdown-based blogs
- Documentation sites

### 📊 Data Analysts & Researchers

Extract tables from web pages to CSV

```json
{
  "fileUrl": "https://example.com/financial-report.html",
  "fromFormat": "html",
  "toFormat": "csv"
}
```

Perfect for:

- Financial data extraction
- Research data processing
- Spreadsheet analysis
- Data import to Excel

### 🎯 Product Managers & Business Users

Convert presentations to PDF for sharing

```json
{
  "fileUrl": "https://example.com/presentation.pptx",
  "fromFormat": "pptx",
  "toFormat": "pdf"
}
```

Perfect for:

- Client presentations
- Training materials
- Meeting handouts
- Document distribution

### 🤖 Automation Builders

Process text files to structured data

```json
{
  "fileUrl": "https://example.com/data.txt",
  "fromFormat": "txt",
  "toFormat": "json"
}
```

Perfect for:

- n8n workflows
- Zapier automations
- Make.com integrations
- API data processing

## 📝 How to Use

### Step 1: Prepare Your File URL

Your file must be:

- ✅ Publicly accessible (no login required)
- ✅ Direct link to file (not a web page)
- ✅ Under 50MB in size
- ✅ HTTP or HTTPS protocol

Good URLs:

- `https://example.com/report.docx`
- `https://cdn.example.com/files/presentation.pptx`
- `https://storage.googleapis.com/bucket/document.html`

Bad URLs:

- `https://drive.google.com/file/d/123/view` (requires login)
- `https://example.com/page.html` (web page, not file)
- `ftp://example.com/file.docx` (wrong protocol)

### Step 2: Choose Your Formats

Check format compatibility:

| From \ To | Markdown | JSON | CSV | HTML | TXT | PDF |
|------------|:--------:|:-----:|:----:|:----:|:---:|:---:|
| DOCX | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| PPTX | ✅ | ✅ | ⚠️ | ✅ | ✅ | ✅ |
| HTML | ✅ | ⚠️ | ⚠️ | ✅ | ✅ | ✅ |
| TXT | ✅ | ⚠️ | ⚠️ | ✅ | ✅ | ✅ |

Legend:

- ✅ Excellent - High quality conversion
- ⚠️ Limited - Works but may lose some formatting

### Step 3: Run the Conversion

Option A: Use Apify Web Interface (Easiest)

1. Go to Actor Page
2. Click "Run Actor"
3. Enter your JSON input
4. Click "Run"
5. Download your converted file

Option B: Use API (For Automation)

```bash
curl -X POST "https://api.apify.com/v2/acts/WgRQY2Ta2VKQE5NgO/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "fileUrl": "https://example.com/document.docx",
    "fromFormat": "docx",
    "toFormat": "markdown"
  }'
```

## 📤 What You Get Back

After conversion, you'll receive:

```json
{
  "downloadUrl": "https://api.apify.com/v2/key-value-stores/...",
  "inputFormat": "docx",
  "outputFormat": "markdown",
  "fileSize": 12345,
  "processingTime": 2.5,
  "status": "success"
}
```

What each field means:

- `downloadUrl`: Link to download your converted file (works for 7 days)
- `inputFormat`: The format we detected from your file
- `outputFormat`: The format you requested
- `fileSize`: Size of your converted file in bytes
- `processingTime`: How long the conversion took
- `status`: "success" or "error"

## 🚨 Common Problems & Solutions

### ❌ "Invalid URL format"

Problem: Your URL doesn't work

Solution:

- Check URL starts with `http://` or `https://`
- Test the URL in your browser first
- Make sure it's a direct file link, not a web page

### ❌ "File not found"

Problem: The file doesn't exist or moved

Solution:

- Verify the URL is correct
- Check if the file was deleted or moved
- Try uploading the file again

### ❌ "Access denied"

Problem: File requires login or permission

Solution:

- Use a publicly accessible file
- Upload to public cloud storage (Google Drive, Dropbox, etc.)
- Make sure sharing permissions allow public access

### ❌ "Unsupported input format"

Problem: You tried to convert from PDF

Solution:

- PDF cannot be used as input (technical limitation)
- Convert PDF to HTML first using online tools
- Then use this actor to convert HTML to your desired format

### ⏰ "Conversion timed out"

Problem: File is too large or complex

Solution:

- Keep files under 50MB
- Try a simpler output format
- Split large documents into smaller parts

## 💡 Pro Tips

### 🎯 For Best Results

1. Test with small files first - Make sure everything works
2. Choose the right format combination - Check the compatibility matrix
3. Use direct file URLs - Avoid web pages that require login
4. Check file size - Keep under 50MB for reliable processing

### 🔗 Getting File URLs

Google Drive:

1. Right-click file → "Share"
2. Set to "Anyone with the link can view"
3. Copy link and change `.../view?usp=sharing` to `.../uc?export=download`

Dropbox:

1. Right-click file → "Share"
2. Create link with "Can view" permissions
3. Copy the direct download link

OneDrive:

1. Right-click file → "Share"
2. Set to "Anyone with the link can view"
3. Copy the link and ensure it's a direct download URL

S3/Cloud Storage:

1. Set bucket/object to public read
2. Generate pre-signed URL if needed
3. Ensure URL points directly to the file

### ⚡ Speed Tips

- TXT files convert fastest - Use when possible
- DOCX to Markdown is very reliable - Great for documentation
- HTML to TXT preserves text well - Good for content extraction
- Simple conversions work best - Avoid complex format chains

## 🔧 Advanced Configuration

### For Power Users

If you're using this in automation, you can adjust these settings:

| Setting | Default | What it Does |
|---------|----------|---------------|
| File Size Limit | 50MB | Maximum input file size |
| Timeout | 60 seconds | Maximum conversion time |
| Retry Attempts | 3 | How many times to retry failed downloads |

### API Usage

For high-volume usage, consider:

- Batch processing - Process multiple files sequentially
- Error handling - Check status before processing next file
- Download timing - Files are available for 7 days only

## 🆘 Need Help?

### Quick Troubleshooting

1. File not working? - Try the URL in your browser first
2. Conversion failed? - Check if format combination is supported
3. Taking too long? - File might be too large or complex
4. Wrong output? - Verify your `fromFormat` matches the actual file type

### Get Support

- 📖 Actor Page: View on Apify
- 🐛 Report Issues: GitHub Issues
- 💬 Community: Apify Forum
- 📧 Direct Help: Contact through Apify platform

## 🎉 Ready to Start?

▶️ Run the Actor Now

No registration required for basic use. Free tier includes processing credits.

---

⭐ Rate this actor if it helped you!

Made with ❤️ for easy document conversion
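The URL rules in Step 1 and the compatibility matrix in Step 2 can be checked locally before a run is started. The following is a minimal sketch, assuming the format identifiers are the lowercase strings used in the JSON examples; `check_conversion` and the constant names are hypothetical and not part of the Actor.

```python
from urllib.parse import urlparse

# Formats as listed in the Supported Formats tables.
INPUT_FORMATS = {"docx", "pptx", "html", "txt"}
OUTPUT_FORMATS = {"markdown", "json", "csv", "html", "txt", "pdf"}

# Pairs marked "Limited" in the compatibility matrix; all others are "Excellent".
LIMITED_PAIRS = {
    ("pptx", "csv"),
    ("html", "json"),
    ("html", "csv"),
    ("txt", "json"),
    ("txt", "csv"),
}


def check_conversion(file_url, from_format, to_format):
    """Return "excellent" or "limited", or raise ValueError for invalid input."""
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("fileUrl must be an HTTP or HTTPS URL")

    src, dst = from_format.lower(), to_format.lower()
    if src == "pdf":
        raise ValueError("PDF cannot be used as input; convert it to HTML first")
    if src not in INPUT_FORMATS:
        raise ValueError(f"Unsupported input format: {from_format}")
    if dst not in OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output format: {to_format}")

    return "limited" if (src, dst) in LIMITED_PAIRS else "excellent"


print(check_conversion("https://example.com/report.docx", "docx", "markdown"))
print(check_conversion("https://example.com/financial-report.html", "html", "csv"))
```

This check cannot tell whether a URL is publicly accessible or under 50MB; those conditions only show up when the file is actually downloaded.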
Ready to Get Started?
Try Universal Document Format Transformer now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: actorify
- Pricing: Paid
- Total Runs: 110
- Active Users: 2