Markdownify MCP Server

by crawlerbros

Turn any webpage into clean Markdown for AI, knowledge bases, or docs. An MCP server that handles the messy conversion work for you.

11 runs

8 users

About Markdownify MCP Server

Ever needed to pull content from a website but got stuck with messy HTML? Markdownify MCP Server is what I use to solve that. It takes any URL and strips it down to clean, readable Markdown, which is exactly the format most AI tools and documentation systems prefer. I built a knowledge base with it last month, and it saved me hours of manual formatting.

The core job is simple: conversion. You feed it a webpage, and it gives you back structured Markdown with preserved elements like headings, lists, and links. This makes it incredibly practical for a few key tasks. It's perfect for scraping documentation to create your own local copies, migrating blog content between platforms without losing structure, or preparing clean data for AI models and chatbots. The output is ready to feed into Obsidian, Notion, or your custom scripts.

It works as a Model Context Protocol (MCP) server, so it integrates directly into your existing AI and development workflows. You don't have to wrestle with complex parsers or write your own scrubbing logic. For developers building content pipelines or automating research, it just handles the grunt work. Give it a list of URLs, and you've got a batch of formatted content ready to go.

What does this actor do?

Markdownify MCP Server is an Apify actor that fetches webpages and converts their HTML to Markdown. It runs in the cloud on the Apify platform, so there is nothing to install locally.

Key Features

* Cloud-based execution - no local setup required
* Scalable infrastructure for large-scale operations
* API access for integration with your applications
* Built-in proxy rotation and anti-blocking measures
* Scheduled runs and webhooks for automation

How to Use

1. Click "Try This Actor" to open it on Apify
2. Create a free Apify account if you don't have one
3. Configure the input parameters as needed
4. Run the actor and download your results

Documentation

Markdownify MCP Server

Converts webpages into clean, formatted Markdown, optimized for AI consumption and content processing workflows.

Overview

This Apify actor takes a list of URLs, fetches the HTML content, and converts it to structured Markdown. It's designed for building knowledge bases, scraping documentation, migrating web content, and preparing data for AI agents and RAG systems. Output is formatted for use with Model Context Protocol (MCP) servers.

Key Features

* Web to Markdown Conversion: Transforms HTML from public webpages, or from pages protected by HTTP Basic Auth, into clean Markdown.
* Precise Content Targeting: Use CSS selectors to include specific page sections or exclude unwanted elements like navbars, footers, or ads.
* JavaScript Rendering: Optional Playwright integration handles dynamic, JavaScript-heavy pages.
* Authentication Support: Configure HTTP Basic Auth credentials for accessing restricted content.
* Customizable Output: Control Markdown heading style (ATX or SETEXT) and specify which HTML tags to strip entirely.
* Structured MCP Output: Returns data in a consistent JSON format suitable for AI tools and pipelines.
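
The two heading styles differ only in how a heading is written out. The sketch below illustrates the difference. `renderHeading` is a hypothetical helper written for this example, not part of the actor, and the fallback to ATX for levels 3 and deeper is an assumption based on the fact that Setext syntax only defines two heading levels.

```javascript
// Hypothetical helper illustrating the two heading styles; not the actor's code.
function renderHeading(text, level, style = "ATX") {
  // Setext syntax only defines level 1 (=) and level 2 (-) headings.
  if (style === "SETEXT" && level <= 2) {
    const underline = (level === 1 ? "=" : "-").repeat(text.length);
    return `${text}\n${underline}`;
  }
  // ATX headings (and, by assumption, deeper Setext levels) use leading # characters.
  return `${"#".repeat(level)} ${text}`;
}

console.log(renderHeading("Example Domain", 1, "ATX"));
console.log(renderHeading("Example Domain", 1, "SETEXT"));
```

ATX is usually the safer choice for AI pipelines, since each heading stays on a single line and its level can be read from the number of `#` characters.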

How to Use

Provide the target URLs and any configuration via the actor's input. The actor then processes each URL in four steps:
1. Fetch: Retrieves page content using HTTP or, if enabled, a headless browser (Playwright).
2. Extract: Applies include/exclude CSS selectors to filter the HTML.
3. Convert: Transforms the filtered HTML into Markdown.
4. Output: Saves each result as a separate item in an Apify dataset.
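
The four steps above can be pictured as a small pipeline. This is a rough sketch of the flow only, not the actor's actual implementation: `fetchHtml`, `extract`, and `convert` are hypothetical stand-ins passed in by the caller, and the idea that a failed page still produces an item (with `success: false`) is an assumption drawn from the `success` and `error` fields in the output format.

```javascript
// Illustrative only: fetchHtml, extract, and convert are hypothetical stand-ins
// for the actor's internal fetch, selector-filtering, and conversion steps.
async function processUrl(url, { fetchHtml, extract, convert }) {
  try {
    const html = await fetchHtml(url);   // 1. Fetch
    const filtered = extract(html);      // 2. Extract
    const markdown = convert(filtered);  // 3. Convert
    return { url, markdown, markdown_length: markdown.length, success: true, error: null };
  } catch (err) {
    // Assumption: a failed page is recorded as an item instead of stopping the batch.
    return { url, markdown: null, markdown_length: 0, success: false, error: String(err.message || err) };
  }
}

// 4. Output: one item per URL, collected here in an array instead of an Apify dataset.
async function processAll(urls, steps) {
  const items = [];
  for (const url of urls) {
    items.push(await processUrl(url, steps));
  }
  return items;
}
```

Passing the three steps in as functions keeps the sketch independent of any particular HTTP client or HTML-to-Markdown library.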

Input Parameters

Required:
* urls (array): List of webpage URLs to convert.

Optional:
* includeSelectors (array): CSS selectors for sections to include (e.g., ["article", ".main-content"]).
* excludeSelectors (array): CSS selectors for sections to exclude (e.g., ["nav", "footer", ".advertisement"]).
* useJavaScript (boolean): Enable Playwright to render JavaScript. Default: false.
* headingStyle (string): "ATX" (# Heading) or "SETEXT" (heading text underlined with = or -). Default: "ATX".
* stripTags (array): HTML tags to remove completely. Default: ["script", "style", "iframe", "noscript"].
* auth (object): HTTP Basic Auth credentials: {"username": "user", "password": "pass"}.
* timeout (integer): Request timeout in seconds (10-120). Default: 30.
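
When building the input in code, it can help to apply the documented defaults and check the documented constraints before starting a run. The following is a minimal sketch under that assumption. `buildInput` and `DEFAULTS` are hypothetical names, the requirement that `urls` be non-empty is an assumption, and the actor itself may validate input differently.

```javascript
// Hypothetical helper: applies the documented defaults and checks the documented
// constraints before the input is sent to the actor. The actor may validate differently.
const DEFAULTS = {
  useJavaScript: false,
  headingStyle: "ATX",
  stripTags: ["script", "style", "iframe", "noscript"],
  timeout: 30,
};

function buildInput(options) {
  const input = { ...DEFAULTS, ...options };
  // Assumption: an empty urls array is treated as invalid.
  if (!Array.isArray(input.urls) || input.urls.length === 0) {
    throw new Error("urls is required and must be a non-empty array");
  }
  if (!["ATX", "SETEXT"].includes(input.headingStyle)) {
    throw new Error('headingStyle must be "ATX" or "SETEXT"');
  }
  if (!Number.isInteger(input.timeout) || input.timeout < 10 || input.timeout > 120) {
    throw new Error("timeout must be an integer between 10 and 120 seconds");
  }
  return input;
}

// Example: restrict conversion to the main article and supply Basic Auth credentials.
const exampleInput = buildInput({
  urls: ["https://example.com"],
  includeSelectors: ["article", ".main-content"],
  auth: { username: "user", password: "pass" },
});
console.log(JSON.stringify(exampleInput, null, 2));
```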

Input Example:

{
  "urls": ["https://apify.com/docs?fpr=python_automation", "https://en.wikipedia.org/wiki/Markdown"],
  "excludeSelectors": ["nav", "footer", ".advertisement"],
  "useJavaScript": false,
  "headingStyle": "ATX",
  "timeout": 30
}

Input/Output

Output Format

Each converted page is saved as a JSON object with the following structure:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use...",
  "markdown_length": 1234,
  "success": true,
  "error": null,
  "scraped_at": "2025-10-24T10:30:00.000Z",
  "meta": {
    "method": "http",
    "heading_style": "ATX",
    "stripped_tags": ["script", "style"],
    "used_include_selectors": false,
    "used_exclude_selectors": true
  }
}
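
Because every item carries `success` and `error` fields, a consumer can separate converted pages from failures before passing content downstream. The sketch below shows one way to do that. `partitionResults` is a hypothetical helper, and the shape of the failed sample item (null `title` and `markdown`, a short error string) is an assumption, since the documentation only shows a successful item.

```javascript
// Splits dataset items into converted pages and failures, based on the
// documented `success` and `error` fields.
function partitionResults(items) {
  const converted = [];
  const failed = [];
  for (const item of items) {
    if (item.success) {
      converted.push(item);
    } else {
      failed.push({ url: item.url, error: item.error });
    }
  }
  return { converted, failed };
}

// Sample items shaped like the output above; the second one is a made-up failure.
const sample = [
  { url: "https://example.com", title: "Example Domain", markdown: "# Example Domain", success: true, error: null },
  { url: "https://example.org/missing", title: null, markdown: null, success: false, error: "HTTP 404" },
];
const { converted, failed } = partitionResults(sample);
console.log(`${converted.length} converted, ${failed.length} failed`);
```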

API Integration Example (JavaScript)

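const { ApifyClient } = require("apify-client");

// Replace with your Apify API token.
const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const input = {
  urls: ["https://example.com"],
  excludeSelectors: ["nav", "footer"],
};

// Top-level await is not available in CommonJS, so run the calls inside an async function.
async function main() {
  // Start the actor and wait for the run to finish.
  const run = await client.actor("YOUR_ACTOR_ID").call(input);
  // Read the converted pages from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();

  items.forEach((item) => {
    console.log(`Title: ${item.title}`);
    console.log(`Markdown length: ${item.markdown_length}`);
    console.log(item.markdown);
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});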
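
To use the results in a note-taking tool or a local knowledge base, the items returned by `listItems()` can be written out as `.md` files. The following is a minimal sketch of that step. `fileNameFor` and `saveItems` are hypothetical helpers, and the naming scheme (host plus path, lowercased and hyphenated) is just one possible choice. It ignores query strings, so two URLs that differ only in their query would map to the same file.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: derives a file name from a page URL.
function fileNameFor(url) {
  const { hostname, pathname } = new URL(url);
  const slug = `${hostname}${pathname}`
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into a hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  return `${slug}.md`;
}

// Writes each successfully converted item to <dir>/<slug>.md and returns the paths.
function saveItems(items, dir) {
  fs.mkdirSync(dir, { recursive: true });
  return items
    .filter((item) => item.success && item.markdown)
    .map((item) => {
      const filePath = path.join(dir, fileNameFor(item.url));
      fs.writeFileSync(filePath, item.markdown, "utf8");
      return filePath;
    });
}
```

Calling `saveItems(items, "./markdown-output")` after the dataset has been read would then leave one Markdown file per converted page in that folder.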

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Markdownify MCP Server now on Apify. Free tier available with no credit card required.

Actor Information

Developer: crawlerbros
Pricing: Paid
Total Runs: 11
Active Users: 8
