Bluesky & Mastodon Scraper - Decentralized Social Media

by barrierefix

Extract posts, users, and threads from Bluesky & Mastodon. The straightforward scraper for decentralized social media data, built for research, monitoring, and AI training.

154 runs
5 users

About Bluesky & Mastodon Scraper - Decentralized Social Media

Need to track conversations on Bluesky or Mastodon? It's a different ballgame compared to scraping Twitter or Reddit. Their decentralized nature (the AT Protocol and the Fediverse) makes it tricky to get clean, structured data at scale. That's exactly why I built this scraper. It handles the technical quirks of both networks so you don't have to, letting you extract posts, user profiles, and threads with a simple configuration.

Whether you're keeping an eye on brand mentions, gathering data for a research project, or building a dataset to train an AI model, this tool pulls it all into a consistent, usable format like JSON or CSV. I use it myself for social listening in niches where these platforms are booming, and for collecting real-time public posts for sentiment analysis. It saves you the headache of wrestling with different APIs and rate limits, and delivers the data where you need it, whether that's a webhook, Google Sheets, or your own database. If you're working with decentralized social media, this is the reliable data pipeline you've been looking for.

What does this actor do?

Bluesky & Mastodon Scraper - Decentralized Social Media is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Bluesky & Mastodon Scraper

Overview

An Apify actor that collects public posts from decentralized social networks—Bluesky (AT Protocol) and Mastodon (Fediverse)—through a single API. It returns data in a unified JSON format, suitable for social listening, research, or data pipelines. You only pay for the posts collected.

Key Features

  • Multi-Platform Scraping: Collect from Bluesky and Mastodon simultaneously.
  • Flexible Search: Find posts by keyword or track specific user handles.
  • Time-Based Filtering: Set since and until parameters for historical data.
  • Real-Time Updates: Optionally send new posts directly to your endpoint via webhooks (emitWebhooks).
  • Normalized Output: A consistent data schema across both platforms.
  • Cost Control: Pay-per-post pricing model.

How to Use

Configure the actor with a JSON input. Here are common patterns:

Search by keyword across platforms:

{
  "platforms": ["bluesky", "mastodon"],
  "query": "artificial intelligence",
  "maxItems": 100,
  "languages": ["en"]
}
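
As a rough sketch of how the keyword input above might be submitted programmatically, the snippet below builds a request against Apify's run-sync-get-dataset-items REST endpoint using only the Python standard library. The actor ID and token are placeholders (this page does not state the actor's real ID), and `build_run_request` is a hypothetical helper name, not part of the actor.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs an actor
    synchronously and returns its dataset items."""
    # The REST API addresses actors as "username~actor-name".
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{path_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


keyword_input = {
    "platforms": ["bluesky", "mastodon"],
    "query": "artificial intelligence",
    "maxItems": 100,
    "languages": ["en"],
}

# Placeholder actor ID and token: replace both before sending.
request = build_run_request("your-username/your-actor-name", "APIFY_TOKEN_HERE", keyword_input)

# To actually run the actor (needs a valid token and the real actor ID):
# with urllib.request.urlopen(request) as response:
#     posts = json.load(response)
```

The request is only constructed here, so the snippet runs without network access; uncomment the last lines once real credentials are in place.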

Track specific users:

{
  "platforms": ["bluesky", "mastodon"],
  "handles": ["jay.bsky.social", "@gargron@mastodon.social"],
  "maxItems": 500
}
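
The two handles above use different shapes: a Bluesky handle is a domain name, while a Mastodon address has the form @user@instance. How the actor itself routes handles is not documented here, so the following is only a heuristic sketch for sorting a mixed list before submitting it. `classify_handle` is a hypothetical helper.

```python
def classify_handle(handle: str) -> str:
    """Guess the platform for a handle from its shape alone."""
    stripped = handle.strip()
    # Drop one leading "@" so "@user@instance" and "user@instance" match.
    body = stripped[1:] if stripped.startswith("@") else stripped
    if body.count("@") == 1:
        user, instance = body.split("@")
        if user and "." in instance:
            return "mastodon"
    # Bluesky handles are domain names such as "jay.bsky.social".
    if "@" not in body and "." in body:
        return "bluesky"
    return "unknown"


handles = ["jay.bsky.social", "@gargron@mastodon.social"]
by_platform = {handle: classify_handle(handle) for handle in handles}
```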

Historical data with a date range:

{
  "platforms": ["bluesky"],
  "query": "climate change",
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z",
  "maxItems": 1000
}
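
The since and until values above are ISO 8601 timestamps in UTC. A small sketch for generating them from Python datetime objects, and for catching a reversed range before a run is started, could look like this. `iso_utc` and `date_range_input` are hypothetical helper names, and whether the actor itself rejects a reversed range is not stated on this page.

```python
from datetime import datetime, timezone


def iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC with a trailing Z."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_input(query: str, start: datetime, end: datetime, max_items: int = 1000) -> dict:
    """Build a Bluesky-only input for a bounded time window."""
    if start >= end:
        raise ValueError("since must be earlier than until")
    return {
        "platforms": ["bluesky"],
        "query": query,
        "since": iso_utc(start),
        "until": iso_utc(end),
        "maxItems": max_items,
    }


run_input = date_range_input(
    "climate change",
    datetime(2025, 9, 1, tzinfo=timezone.utc),
    datetime(2025, 10, 1, tzinfo=timezone.utc),
)
```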

Input/Output

Key Input Parameters

Parameter           Type     Required  Description
platforms           Array    Yes       ["bluesky"], ["mastodon"], or both.
query               String   No        Keywords to search for.
handles             Array    No        Specific user handles to track.
since / until       String   No        ISO 8601 date strings for filtering.
maxItems            Integer  No        Maximum posts to collect (default: 1000).
languages           Array    No        Filter by BCP-47 language codes (e.g., ["en", "de"]).
emitWebhooks        Boolean  No        Set to true to enable webhook delivery.
webhooks            Array    No        Configure your webhook endpoints.
blueskyCredentials  Object   No        Optional for higher rate limits.
mastodonInstances   Array    No        Specific Mastodon instances to query.
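
The parameter table can be turned into a quick pre-flight check that runs locally before any paid run is started. The sketch below mirrors only the types and the one required field listed above. `validate_input` is a hypothetical helper, and the actor's own validation may be stricter or accept parameters not listed here, so unknown names are simply ignored.

```python
EXPECTED_TYPES = {
    "platforms": list,
    "query": str,
    "handles": list,
    "since": str,
    "until": str,
    "maxItems": int,
    "languages": list,
    "emitWebhooks": bool,
    "webhooks": list,
    "blueskyCredentials": dict,
    "mastodonInstances": list,
}
KNOWN_PLATFORMS = ("bluesky", "mastodon")


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    platforms = run_input.get("platforms")
    if not platforms:
        problems.append("platforms is required")
    for name, value in run_input.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            continue  # Not in the table above; leave it to the actor.
        # bool is a subclass of int in Python, so reject it for integers.
        wrong_bool = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or wrong_bool:
            problems.append(f"{name} should be {expected.__name__}")
    if isinstance(platforms, list):
        for platform in platforms:
            if platform not in KNOWN_PLATFORMS:
                problems.append(f"unsupported platform: {platform}")
    return problems
```

For example, `validate_input({"platforms": ["bluesky"], "maxItems": "100"})` reports that maxItems should be an int, since the value was passed as a string.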

Output

The actor outputs a dataset of items, each representing a normalized post. Every item includes core fields like text, authorHandle, platform, timestamp, url, and metrics (like/repost counts). Media attachments and post metadata are also included.

Platform-Specific Notes:

  • Bluesky: Supports keyword search, user feeds, and includes data for quotes, replies, and reposts.
  • Mastodon: Supports search across multiple instances and includes boosts, replies, and media alt text.
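
To illustrate working with the normalized output, here is a minimal sketch that summarizes a list of dataset items. The top-level keys (text, authorHandle, platform, timestamp, url, metrics) come from the description above; the keys inside metrics ("likes", "reposts") and all sample values are assumptions made up for illustration, so check a real item before relying on them.

```python
from collections import Counter

# Made-up records for illustration only. The "metrics" sub-keys are assumed.
sample_items = [
    {
        "text": "First example post",
        "authorHandle": "alice.example.com",
        "platform": "bluesky",
        "timestamp": "2025-09-01T12:00:00Z",
        "url": "https://example.com/post/1",
        "metrics": {"likes": 3, "reposts": 1},
    },
    {
        "text": "Second example post",
        "authorHandle": "@bob@example.social",
        "platform": "mastodon",
        "timestamp": "2025-09-02T08:30:00Z",
        "url": "https://example.social/@bob/2",
        "metrics": {"likes": 5, "reposts": 2},
    },
    {
        "text": "Post without metrics",
        "authorHandle": "carol.example.com",
        "platform": "bluesky",
        "timestamp": "2025-09-03T17:45:00Z",
        "url": "https://example.com/post/3",
    },
]


def posts_per_platform(items: list[dict]) -> Counter:
    """Count items by their platform field."""
    return Counter(item.get("platform", "unknown") for item in items)


def total_metric(items: list[dict], name: str) -> int:
    """Sum one metric across items, treating missing values as zero."""
    return sum((item.get("metrics") or {}).get(name, 0) for item in items)


counts = posts_per_platform(sample_items)
likes = total_metric(sample_items, "likes")
```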

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: barrierefix
Pricing: Paid
Total Runs: 154
Active Users: 5

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
