Markdownify MCP Server

by crawlerbros

Turn any webpage into clean Markdown for AI, knowledge bases, or docs. An MCP server that handles the messy conversion work for you.

11 runs
8 users

About Markdownify MCP Server

Ever needed to pull content from a website but got stuck with messy HTML? Markdownify MCP Server is what I use to solve that. It takes a URL and strips the page down to clean, readable Markdown, a format that most AI tools and documentation systems handle well. I built a knowledge base with it last month, and it saved me hours of manual formatting.

The core job is simple: conversion. You feed it a webpage, and it gives you back structured Markdown that preserves elements such as headings, lists, and links. That makes it practical for a few key tasks: scraping documentation to create local copies, migrating blog content between platforms without losing structure, and preparing clean data for AI models and chatbots. The output is ready to feed into Obsidian, Notion, or your own scripts.

It works as a Model Context Protocol (MCP) server, so it integrates directly into existing AI and development workflows. You don't have to wrestle with complex parsers or write your own scrubbing logic. For developers building content pipelines or automating research, it handles the grunt work: give it a list of URLs, and you get back a batch of formatted content.

What does this actor do?

Markdownify MCP Server is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Open the actor's page on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Markdownify MCP Server

Converts webpages into clean, formatted Markdown, optimized for AI consumption and content processing workflows.

Overview

This Apify actor takes a list of URLs, fetches the HTML content, and converts it to structured Markdown. It's designed for building knowledge bases, scraping documentation, migrating web content, and preparing data for AI agents and RAG systems. Output is formatted for use with Model Context Protocol (MCP) servers.

Key Features

  • Web to Markdown Conversion: Transforms HTML from public webpages, or from pages protected by HTTP Basic Auth, into clean Markdown.
  • Precise Content Targeting: Use CSS selectors to include specific page sections or exclude unwanted elements like navbars, footers, or ads.
  • JavaScript Rendering: Optional Playwright integration handles dynamic, JavaScript-heavy pages.
  • Authentication Support: Configure HTTP Basic Auth credentials for accessing restricted content.
  • Customizable Output: Control Markdown heading style (ATX or SETEXT) and specify which HTML tags to strip entirely.
  • Structured MCP Output: Returns data in a consistent JSON format suitable for AI tools and pipelines.

How to Use

Provide the target URLs and any configuration via the actor's input. The process is:
1. Fetch: Retrieves page content using HTTP or, if enabled, a headless browser (Playwright).
2. Extract: Applies include/exclude CSS selectors to filter the HTML.
3. Convert: Transforms the filtered HTML into Markdown.
4. Output: Saves each result as a separate item in an Apify dataset.

Input Parameters

Required:
* urls (array): List of webpage URLs to convert.

Optional:
* includeSelectors (array): CSS selectors for sections to include (e.g., ["article", ".main-content"]).
* excludeSelectors (array): CSS selectors for sections to exclude (e.g., ["nav", "footer", ".advertisement"]).
* useJavaScript (boolean): Enable Playwright to render JavaScript. Default: false.
* headingStyle (string): "ATX" (# Heading) or "SETEXT" (heading text underlined with = or -). Default: "ATX".
* stripTags (array): HTML tags to remove completely. Default: ["script", "style", "iframe", "noscript"].
* auth (object): HTTP Basic Auth credentials: {"username": "user", "password": "pass"}.
* timeout (integer): Request timeout in seconds (10-120). Default: 30.
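
The constraints listed above can be checked on the client side before a run is started. The following is a minimal sketch, not part of the actor: `buildInput` is a hypothetical helper that applies the documented defaults and rejects values outside the documented ranges. Treating an empty `urls` array and non-HTTP(S) schemes as errors is an assumption, and how the actor itself reacts to invalid input is not described here.

```javascript
// Hypothetical helper (not part of the actor): applies the documented
// defaults and validates values against the documented constraints.
const DEFAULTS = {
  useJavaScript: false,
  headingStyle: "ATX",
  stripTags: ["script", "style", "iframe", "noscript"],
  timeout: 30,
};

function buildInput(options) {
  const input = { ...DEFAULTS, ...options };

  // Assumption: an empty list is treated as an error, since urls is required.
  if (!Array.isArray(input.urls) || input.urls.length === 0) {
    throw new Error("urls must be a non-empty array");
  }
  for (const url of input.urls) {
    // new URL() throws a TypeError on strings that are not valid absolute URLs.
    const { protocol } = new URL(url);
    if (protocol !== "http:" && protocol !== "https:") {
      throw new Error(`Unsupported URL scheme: ${url}`);
    }
  }
  if (!["ATX", "SETEXT"].includes(input.headingStyle)) {
    throw new Error('headingStyle must be "ATX" or "SETEXT"');
  }
  if (!Number.isInteger(input.timeout) || input.timeout < 10 || input.timeout > 120) {
    throw new Error("timeout must be an integer between 10 and 120");
  }
  return input;
}

const input = buildInput({
  urls: ["https://en.wikipedia.org/wiki/Markdown"],
  excludeSelectors: ["nav", "footer"],
});
console.log(input.timeout); // 30
```

Failing fast on the client avoids spending a run on input that could never succeed.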

Input Example:

{
  "urls": ["https://apify.com/docs?fpr=python_automation", "https://en.wikipedia.org/wiki/Markdown"],
  "excludeSelectors": ["nav", "footer", ".advertisement"],
  "useJavaScript": false,
  "headingStyle": "ATX",
  "timeout": 30
}
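
The headingStyle option selects between the two heading syntaxes that Markdown defines. The sketch below only illustrates the difference. `renderHeading` is a hypothetical function, and the fallback to ATX for levels 3 to 6 is an assumption: Setext syntax only defines levels 1 and 2, and the actor's behaviour for deeper headings is not documented here.

```javascript
// Hypothetical illustration of the two heading styles; not the actor's code.
function renderHeading(text, level, style = "ATX") {
  if (style === "SETEXT" && level <= 2) {
    // Setext: the text on one line, underlined with "=" (level 1) or "-" (level 2).
    const underline = (level === 1 ? "=" : "-").repeat(text.length);
    return `${text}\n${underline}`;
  }
  // ATX: one "#" per heading level, then the text.
  // Assumption: levels 3-6 use ATX even when SETEXT is requested.
  return `${"#".repeat(level)} ${text}`;
}

console.log(renderHeading("Example Domain", 1, "ATX"));
// # Example Domain

console.log(renderHeading("Example Domain", 1, "SETEXT"));
// Example Domain
// ==============
```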

Input/Output

Output Format

Each converted page is saved as a JSON object with the following structure:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use...",
  "markdown_length": 1234,
  "success": true,
  "error": null,
  "scraped_at": "2025-10-24T10:30:00.000Z",
  "meta": {
    "method": "http",
    "heading_style": "ATX",
    "stripped_tags": ["script", "style"],
    "used_include_selectors": false,
    "used_exclude_selectors": true
  }
}
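
Because every item carries success and error fields, a batch can be split into converted pages and failures before further processing. A minimal sketch, assuming items shaped like the example above: `summarize` is a hypothetical helper, and the failed sample item (including its error text) is invented for illustration.

```javascript
// Hypothetical helper: splits a batch of result items by their success flag.
function summarize(items) {
  const ok = items.filter((item) => item.success);
  const failed = items.filter((item) => !item.success);
  return {
    converted: ok.length,
    failed: failed.map((item) => ({ url: item.url, error: item.error })),
    totalMarkdownLength: ok.reduce((sum, item) => sum + item.markdown_length, 0),
  };
}

const sampleItems = [
  { url: "https://example.com", success: true, error: null, markdown_length: 1234 },
  // Invented failure record, for illustration only.
  { url: "https://example.org/missing", success: false, error: "HTTP 404", markdown_length: 0 },
];

console.log(summarize(sampleItems));
```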

API Integration Example (JavaScript)

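const { ApifyClient } = require("apify-client");

// Replace the placeholder with your own Apify API token.
const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const input = {
  urls: ["https://example.com"],
  excludeSelectors: ["nav", "footer"],
};

// Top-level await is not available in CommonJS modules,
// so the calls are wrapped in an async function.
async function main() {
  // Start the actor and wait for the run to finish.
  const run = await client.actor("YOUR_ACTOR_ID").call(input);
  // Read the converted pages from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();

  items.forEach((item) => {
    console.log(`Title: ${item.title}`);
    console.log(`Markdown length: ${item.markdown_length}`);
    console.log(item.markdown);
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});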
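
To get the results into a note-taking tool or a docs folder, each item's markdown can be written to its own file. A minimal sketch using only Node.js built-ins: `fileNameFor` and `saveItems` are hypothetical helpers, and the file-naming scheme is an assumption rather than anything the actor prescribes. Query strings are ignored, so two URLs that differ only in their query would map to the same file.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Derive a file name such as "en.wikipedia.org-wiki-markdown.md" from a URL.
function fileNameFor(url) {
  const { hostname, pathname } = new URL(url);
  const slug = `${hostname}${pathname}`
    .toLowerCase()
    .replace(/[^a-z0-9.]+/g, "-") // collapse unsafe characters into a dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${slug}.md`;
}

// Write the markdown of every successful item into outDir
// and return the list of paths that were written.
function saveItems(items, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const written = [];
  for (const item of items) {
    if (!item.success) continue; // skip pages that failed to convert
    const filePath = path.join(outDir, fileNameFor(item.url));
    fs.writeFileSync(filePath, item.markdown, "utf8");
    written.push(filePath);
  }
  return written;
}
```

With the items returned by the dataset call, `saveItems(items, "./markdown")` would leave one .md file per converted page in a local markdown folder.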

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: crawlerbros
Pricing: Paid
Total Runs: 11
Active Users: 8
