Markdownify MCP Server
by crawlerbros
Turn any webpage into clean Markdown for AI, knowledge bases, or docs. An MCP server that handles the messy conversion work for you.
About Markdownify MCP Server
Ever needed to pull content from a website but got stuck with messy HTML? Markdownify MCP Server is what I use to solve that. It takes any URL and strips it down to clean, readable Markdown, which is exactly the format most AI tools and documentation systems prefer. I built a knowledge base with it last month, and it saved me hours of manual formatting.

The core job is simple: conversion. You feed it a webpage, and it gives you back structured Markdown with preserved elements like headings, lists, and links. This makes it practical for a few key tasks: scraping documentation to create your own local copies, migrating blog content between platforms without losing structure, or preparing clean data for AI models and chatbots. The output is ready to feed into Obsidian, Notion, or your custom scripts.

It works as a Model Context Protocol (MCP) server, so it integrates directly into your existing AI and development workflows. You don't have to wrestle with complex parsers or write your own scrubbing logic. For developers building content pipelines or automating research, it just handles the grunt work. Give it a list of URLs, and you've got a batch of formatted content ready to go.
What does this actor do?
Markdownify MCP Server is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Markdownify MCP Server
Converts webpages into clean, formatted Markdown, optimized for AI consumption and content processing workflows.
Overview
This Apify actor takes a list of URLs, fetches the HTML content, and converts it to structured Markdown. It's designed for building knowledge bases, scraping documentation, migrating web content, and preparing data for AI agents and RAG systems. Output is formatted for use with Model Context Protocol (MCP) servers.
Key Features
- Web to Markdown Conversion: Transforms HTML from any public or authenticated webpage into clean Markdown.
- Precise Content Targeting: Use CSS selectors to include specific page sections or exclude unwanted elements like navbars, footers, or ads.
- JavaScript Rendering: Optional Playwright integration handles dynamic, JavaScript-heavy pages.
- Authentication Support: Configure HTTP Basic Auth credentials for accessing restricted content.
- Customizable Output: Control Markdown heading style (ATX or SETEXT) and specify which HTML tags to strip entirely.
- Structured MCP Output: Returns data in a consistent JSON format suitable for AI tools and pipelines.
How to Use
Provide the target URLs and any configuration via the actor's input. The process is:
1. Fetch: Retrieves page content using HTTP or, if enabled, a headless browser (Playwright).
2. Extract: Applies include/exclude CSS selectors to filter the HTML.
3. Convert: Transforms the filtered HTML into Markdown.
4. Output: Saves each result as a separate item in an Apify dataset.
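The four steps can be pictured as a small per-URL pipeline. The sketch below is only an illustration of that flow, not the actor's actual code: `convertPage`, `convertAll`, and the injected `fetchHtml`, `extract`, and `toMarkdown` functions are hypothetical stand-ins. It also assumes, based on the `success` and `error` fields in the output format, that a page that fails to convert still produces an item.

```javascript
// Hypothetical per-URL pipeline: fetch -> extract -> convert -> output item.
// fetchHtml, extract, and toMarkdown are injected stand-ins, not the actor's internals.
async function convertPage(url, { fetchHtml, extract, toMarkdown }) {
  try {
    const html = await fetchHtml(url);      // 1. Fetch
    const filtered = extract(html);         // 2. Extract
    const markdown = toMarkdown(filtered);  // 3. Convert
    return { url, markdown, markdown_length: markdown.length, success: true, error: null };
  } catch (err) {
    // Assumption: a failed page still yields an item, so one bad URL
    // does not discard the rest of the batch.
    return { url, markdown: "", markdown_length: 0, success: false, error: String(err.message || err) };
  }
}

async function convertAll(urls, steps) {
  // 4. Output: one item per URL, in input order.
  return Promise.all(urls.map((url) => convertPage(url, steps)));
}
```

Injecting the three steps keeps the sketch runnable without any scraping or conversion library; in a real setup they would be backed by an HTTP client or Playwright, a DOM parser, and an HTML-to-Markdown converter.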
Input Parameters
Required:
* urls (array): List of webpage URLs to convert.
Optional:
* includeSelectors (array): CSS selectors for sections to include (e.g., ["article", ".main-content"]).
* excludeSelectors (array): CSS selectors for sections to exclude (e.g., ["nav", "footer", ".advertisement"]).
* useJavaScript (boolean): Enable Playwright to render JavaScript. Default: false.
* headingStyle (string): "ATX" (# Heading) or "SETEXT". Default: "ATX".
* stripTags (array): HTML tags to remove completely. Default: ["script", "style", "iframe", "noscript"].
* auth (object): HTTP Basic Auth credentials: {"username": "user", "password": "pass"}.
* timeout (integer): Request timeout in seconds (10-120). Default: 30.
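To make the headingStyle option concrete, here is a minimal sketch of how the same heading looks in each style. `renderHeading` is a hypothetical helper written for this illustration. Setext headings only have underline forms for levels 1 and 2, so the sketch falls back to ATX for deeper levels; whether the actor does the same is an assumption.

```javascript
// Render one heading in either style. Setext has underline forms only for
// levels 1 and 2, so deeper levels fall back to ATX in this sketch.
function renderHeading(text, level, style = "ATX") {
  if (style === "SETEXT" && level <= 2) {
    const underline = (level === 1 ? "=" : "-").repeat(text.length);
    return `${text}\n${underline}`;
  }
  return `${"#".repeat(level)} ${text}`;
}

console.log(renderHeading("Example Domain", 1, "ATX"));    // "# Example Domain"
console.log(renderHeading("Example Domain", 1, "SETEXT")); // title, then a line of "="
```

ATX is usually the safer choice for AI pipelines, since every heading level is written on a single line with the same prefix syntax.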
Input Example:
{
  "urls": ["https://apify.com/docs?fpr=python_automation", "https://en.wikipedia.org/wiki/Markdown"],
  "excludeSelectors": ["nav", "footer", ".advertisement"],
  "useJavaScript": false,
  "headingStyle": "ATX",
  "timeout": 30
}
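Before starting a run, it can help to check an input object against the documented constraints locally. The following is a minimal sketch, assuming the defaults and ranges listed under Input Parameters; `buildInput` is a hypothetical helper, the actor performs its own validation, and the non-empty `urls` requirement is an assumption.

```javascript
// Defaults copied from the documented parameter list.
const DEFAULTS = {
  useJavaScript: false,
  headingStyle: "ATX",
  stripTags: ["script", "style", "iframe", "noscript"],
  timeout: 30,
};

// Hypothetical client-side check; not part of the actor itself.
function buildInput(options) {
  const input = { ...DEFAULTS, ...options };
  if (!Array.isArray(input.urls) || input.urls.length === 0) {
    throw new Error("urls must be a non-empty array");
  }
  for (const url of input.urls) {
    new URL(url); // throws TypeError on a malformed URL
  }
  if (!["ATX", "SETEXT"].includes(input.headingStyle)) {
    throw new Error('headingStyle must be "ATX" or "SETEXT"');
  }
  if (!Number.isInteger(input.timeout) || input.timeout < 10 || input.timeout > 120) {
    throw new Error("timeout must be an integer between 10 and 120 seconds");
  }
  return input;
}
```

Calling `buildInput({ urls: ["https://example.com"] })` returns the input with the documented defaults filled in, while a `timeout` of 5 is rejected before any run is started.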
Input/Output
Output Format
Each converted page is saved as a JSON object with the following structure:
{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use...",
  "markdown_length": 1234,
  "success": true,
  "error": null,
  "scraped_at": "2025-10-24T10:30:00.000Z",
  "meta": {
    "method": "http",
    "heading_style": "ATX",
    "stripped_tags": ["script", "style"],
    "used_include_selectors": false,
    "used_exclude_selectors": true
  }
}
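Because every item carries `success` and `error` fields, a consumer can separate converted pages from failures before passing anything downstream. A minimal sketch, assuming items shaped like the example above; `summarize` is a hypothetical helper, not part of the actor or the Apify client.

```javascript
// Split dataset items into converted pages and failures,
// and total the Markdown length of the successful ones.
function summarize(items) {
  const ok = items.filter((item) => item.success);
  const failed = items.filter((item) => !item.success);
  return {
    ok,
    failed: failed.map(({ url, error }) => ({ url, error })),
    totalChars: ok.reduce((sum, item) => sum + item.markdown_length, 0),
  };
}
```

The `failed` list keeps only the URL and error message, which is usually enough to decide whether to retry a page with `useJavaScript` enabled or a longer `timeout`.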
API Integration Example (JavaScript)
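const { ApifyClient } = require("apify-client");

// CommonJS modules do not allow top-level await, so the calls run inside an async function.
async function main() {
  // Authenticate with your Apify API token.
  const client = new ApifyClient({ token: "YOUR_API_TOKEN" });
  const input = {
    urls: ["https://example.com"],
    excludeSelectors: ["nav", "footer"],
  };
  // Start the actor and wait for the run to finish.
  const run = await client.actor("YOUR_ACTOR_ID").call(input);
  // Each converted page is one item in the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  items.forEach((item) => {
    console.log(`Title: ${item.title}`);
    console.log(`Markdown length: ${item.markdown_length}`);
    console.log(item.markdown);
  });
}

main().catch(console.error);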
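Once the dataset items are available, a common next step is writing each page to its own .md file, for example into an Obsidian vault folder. The sketch below assumes items shaped like the documented output; `fileNameFor` and `saveItems` are hypothetical helpers, and the file naming scheme is an arbitrary choice for illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Build a filesystem-safe name from the page title, falling back to the URL.
function fileNameFor(item) {
  const base = (item.title || item.url)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${base || "page"}.md`;
}

// Write every successfully converted item to outDir and return the paths written.
function saveItems(items, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const written = [];
  for (const item of items) {
    if (!item.success) continue; // skip pages that could not be converted
    const file = path.join(outDir, fileNameFor(item));
    fs.writeFileSync(file, item.markdown, "utf8");
    written.push(file);
  }
  return written;
}
```

Two pages with the same title would map to the same file name and the later one would overwrite the earlier, so a real pipeline may want to add a suffix derived from the URL.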
Common Use Cases
- Market Research: Gather competitive intelligence and market data
- Lead Generation: Extract contact information for sales outreach
- Price Monitoring: Track competitor pricing and product changes
- Content Aggregation: Collect and organize content from multiple sources
Actor Information
- Developer: crawlerbros
- Pricing: Paid
- Total Runs: 11
- Active Users: 8