🔥 FireScrape AI Website Content Markdown Scraper
by mohamedgb00714
About 🔥 FireScrape AI Website Content Markdown Scraper
Advanced web scraper powered by Crawlee and Puppeteer — extracts website content, converts it to Markdown, and structures it for LLM training datasets.
What does this actor do?
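🔥 FireScrape AI Website Content Markdown Scraper is an actor on the Apify platform, built with Crawlee and Puppeteer. Starting from the URLs you provide, it crawls a website, extracts each page's content, converts it to Markdown, and outputs one structured JSON record per page, which makes it suitable for building LLM training datasets. It runs in the Apify cloud, so nothing needs to be installed locally.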
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
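As a rough illustration of the API access listed above, the sketch below builds (but does not send) an HTTP request that would start an actor run through the Apify API v2 `runs` endpoint. The actor ID `mohamedgb00714~firescrape` and the token value are placeholders that do not come from this page, so look up the real ID on the actor's API tab on Apify. The input fields `startUrls` and `maxPages` are taken from the actor's input schema.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that would start an actor run with the given input."""
    url = f"{API_BASE}/acts/{actor_id}/runs"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The API token is passed as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_run_request(
    actor_id="mohamedgb00714~firescrape",  # hypothetical ID, not confirmed by this page
    token="YOUR_APIFY_TOKEN",              # placeholder
    actor_input={"startUrls": [{"url": "https://apify.com"}], "maxPages": 5},
)
print(request.get_method(), request.full_url)
# To actually start the run, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the example free of network calls and makes it easy to inspect the URL and payload first.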
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
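For the "configure the input parameters" step, a minimal sketch of assembling an input object in Python before pasting it into the Apify console or sending it through the API. The field names and defaults are copied from the input schema in the Documentation section. The helper `build_input` is hypothetical, and rejecting an empty URL list is this sketch's own choice (the schema only marks `startUrls` as required).

```python
import copy
from typing import Any

# Defaults copied from the actor's documented input schema.
DEFAULTS: dict[str, Any] = {
    "maxPages": 50,
    "proxyConfig": {"useApifyProxy": True},
    "screenshot": True,
    "enqueue": True,
    "getText": False,
    "getHtml": False,
}


def build_input(start_urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Merge overrides into the documented defaults and run basic checks."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    actor_input = copy.deepcopy(DEFAULTS)
    actor_input.update(overrides)

    max_pages = actor_input["maxPages"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("maxPages must be an integer >= 1")
    for flag in ("screenshot", "enqueue", "getText", "getHtml"):
        if not isinstance(actor_input[flag], bool):
            raise ValueError(f"{flag} must be a boolean")

    # The schema expects a list of objects with a "url" key.
    actor_input["startUrls"] = [{"url": url} for url in start_urls]
    return actor_input


config = build_input(["https://apify.com"], maxPages=10, screenshot=False, getText=True)
print(config)
```

Note that `getText` and `getHtml` both default to `false` in the schema, so set them explicitly if you want `textContent` or `htmlContent` in the output.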
Documentation
🔥 FireScrape AI Website Content Markdown Scraper

## Overview

FireScrape is a powerful web scraper built with Crawlee and Puppeteer. It crawls websites, extracts content, converts it into Markdown format, and structures the data, which makes it well suited to generating datasets for LLMs.

---

## 🎯 Features

- Extracts visible text or full HTML content
- Converts content to Markdown
- Captures screenshots
- Supports proxy configurations
- Follows links for deep crawling

---

## 🛠️ Input Schema

```json
{
  "title": "FireScrape Input Schema",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "startUrls": {
      "title": "Start URLs",
      "type": "array",
      "description": "List of URLs to start crawling from.",
      "editor": "requestListSources",
      "prefill": [{ "url": "https://apify.com" }]
    },
    "maxPages": {
      "title": "Maximum Pages",
      "type": "integer",
      "description": "The maximum number of pages to crawl.",
      "default": 50,
      "minimum": 1
    },
    "proxyConfig": {
      "title": "Proxy Configuration",
      "type": "object",
      "description": "Select proxy settings.",
      "editor": "proxy",
      "default": { "useApifyProxy": true }
    },
    "screenshot": {
      "title": "Take Screenshots",
      "type": "boolean",
      "description": "Enable this to capture a screenshot of each page.",
      "default": true
    },
    "enqueue": {
      "title": "Enqueue Links",
      "type": "boolean",
      "description": "Whether to follow and enqueue new links on the page.",
      "default": true
    },
    "getText": {
      "title": "Extract Text Content",
      "type": "boolean",
      "description": "Extract only the visible text content from the page.",
      "default": false
    },
    "getHtml": {
      "title": "Extract HTML Content",
      "type": "boolean",
      "description": "Extract the full HTML content of the page.",
      "default": false
    }
  },
  "required": ["startUrls"]
}
```

---

## ✅ Output Format

Each successfully scraped page will output a structured JSON object:

```json
{
  "url": "https://example.com",
  "title": "Example Page",
  "metadata": {
    "description": "An example page",
    "keywords": ["example", "page"]
  },
  "markdown": "# Example Page\n\nThis is an example page content...",
  "textContent": "This is an example page content...",
  "htmlContent": "<html><body><h1>Example Page</h1>...</body></html>",
  "screenshot": "data:image/png;base64,iVBORw..."
}
```

---

## 🚀 How to Run

1. Deploy the actor on Apify.
2. Input the desired URLs and configuration.
3. Start the scraper and monitor progress.
4. Download results as JSON or Markdown.

---

## 🔧 Customization

Feel free to extend FireScrape with additional features, such as handling dynamic content, authentication, or specialized formatting.

---

## 🎁 Bonus: n8n Workflow Integration

As a free bonus for using FireScrape, you can integrate these n8n workflows with this actor:

- Instagram Automation Suite
- Automated YouTube Leads

These workflows can help automate post-scraping actions and expand your automation capabilities.

Happy scraping! 🚀🔥
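To show how the documented output records might be consumed, here is a small sketch that turns one record into a training-text row and decodes its screenshot data URI into raw bytes. The helper names `to_training_row` and `decode_screenshot` and the row layout are hypothetical. The sample record mirrors the documented output format, except that the screenshot payload is a tiny stand-in (just the 8-byte PNG signature), since the documented example is truncated.

```python
import base64


def decode_screenshot(data_uri: str) -> bytes:
    """Decode a `data:<mime>;base64,<payload>` URI into raw bytes."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)


def to_training_row(record: dict) -> dict:
    """Keep only the fields useful as LLM training text (layout is an assumption)."""
    return {
        "source": record["url"],
        "title": record.get("title", ""),
        "text": record.get("markdown", "").strip(),
    }


# Sample record shaped like the documented output; the screenshot is a stand-in.
sample = {
    "url": "https://example.com",
    "title": "Example Page",
    "metadata": {"description": "An example page", "keywords": ["example", "page"]},
    "markdown": "# Example Page\n\nThis is an example page content...",
    "screenshot": "data:image/png;base64,iVBORw0KGgo=",
}

row = to_training_row(sample)
image_bytes = decode_screenshot(sample["screenshot"])
print(row["source"], len(row["text"]), len(image_bytes))
```

Using `.get` for the optional fields keeps the helper tolerant of records where a field is missing.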
Common Use Cases
- Market Research: Gather competitive intelligence and market data
- Lead Generation: Extract contact information for sales outreach
- Price Monitoring: Track competitor pricing and product changes
- Content Aggregation: Collect and organize content from multiple sources
Actor Information
- Developer: mohamedgb00714
- Pricing: Paid
- Total Runs: 16,048
- Active Users: 221
Related Actors
Google Search Results Scraper
by apify
Website Content Crawler
by apify
🔥 Leads Generator - $3/1k 50k leads like Apollo
by microworlds
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq
Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.