Bluesky & Mastodon Scraper - Decentralized Social Media
by barrierefix
Extract posts, users, and threads from Bluesky & Mastodon. The straightforward scraper for decentralized social media data, built for research, monitoring, and AI training.
About Bluesky & Mastodon Scraper - Decentralized Social Media
Need to track conversations on Bluesky or Mastodon? It's a different ballgame compared to scraping Twitter or Reddit. Their decentralized nature—the AT Protocol and the Fediverse—makes it tricky to get clean, structured data at scale. That's exactly why I built this scraper. It handles the technical quirks of both networks so you don't have to, letting you extract posts, user profiles, and threads with a simple configuration. Whether you're keeping an eye on brand mentions, gathering data for a research project, or building a dataset to train an AI model, this tool pulls it all into a consistent, usable format like JSON or CSV. I use it myself for social listening in niches where these platforms are booming, and for collecting real-time public posts for sentiment analysis. It saves you the headache of wrestling with different APIs and rate limits, delivering the data where you need it, whether that's a webhook, Google Sheets, or your own database. If you're working with decentralized social media, this is the reliable data pipeline you've been looking for.
What does this actor do?
Bluesky & Mastodon Scraper - Decentralized Social Media is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Bluesky & Mastodon Scraper
Overview
An Apify actor that collects public posts from decentralized social networks—Bluesky (AT Protocol) and Mastodon (Fediverse)—through a single API. It returns data in a unified JSON format, suitable for social listening, research, or data pipelines. You only pay for the posts collected.
Key Features
- Multi-Platform Scraping: Collect from Bluesky and Mastodon simultaneously.
- Flexible Search: Find posts by keyword or track specific user handles.
- Time-Based Filtering: Set `since` and `until` parameters for historical data.
- Real-Time Updates: Optionally send new posts directly to your endpoint via webhooks (`emitWebhooks`).
- Normalized Output: A consistent data schema across both platforms.
- Cost Control: Pay-per-post pricing model.
How to Use
Configure the actor with a JSON input. Here are common patterns:
Search by keyword across platforms:
{
  "platforms": ["bluesky", "mastodon"],
  "query": "artificial intelligence",
  "maxItems": 100,
  "languages": ["en"]
}
Track specific users:
{
  "platforms": ["bluesky", "mastodon"],
  "handles": ["jay.bsky.social", "@gargron@mastodon.social"],
  "maxItems": 500
}
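The two handles in the example above follow different conventions: the Bluesky one looks like a domain name, while the Mastodon one has the form `@user@instance`. Here is a minimal Python sketch of how a caller might sort a mixed list before building the input. `classify_handle` and `group_handles` are hypothetical helpers, not part of the actor, and the rule they apply is an assumption inferred from the example rather than the actor's own parsing logic.

```python
def classify_handle(handle):
    """Guess the platform of a handle (assumed heuristic, not the actor's logic)."""
    h = handle.strip()
    # Drop one optional leading "@" so "@user@host" and "user@host" match alike.
    body = h[1:] if h.startswith("@") else h
    # Mastodon-style: exactly one "@" separating user and a dotted instance name.
    if body.count("@") == 1:
        user, instance = body.split("@")
        if user and "." in instance:
            return "mastodon"
    # Bluesky-style: a domain-like handle with no "@" inside it.
    if "@" not in body and "." in body:
        return "bluesky"
    return "unknown"


def group_handles(handles):
    """Bucket handles by guessed platform, preserving input order."""
    groups = {"bluesky": [], "mastodon": [], "unknown": []}
    for h in handles:
        groups[classify_handle(h)].append(h)
    return groups


if __name__ == "__main__":
    print(group_handles(["jay.bsky.social", "@gargron@mastodon.social"]))
```

Anything that lands in the `unknown` bucket is worth checking by hand before a run, since a mistyped handle could otherwise go unnoticed.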
Historical data with a date range:
{
  "platforms": ["bluesky"],
  "query": "climate change",
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z",
  "maxItems": 1000
}
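When the date range is computed rather than typed by hand, the `since` and `until` strings can be generated from `datetime` values. The following is a sketch only: `build_range_input` is a hypothetical helper, and treating timezone-naive datetimes as UTC is an assumption made here for illustration, not documented actor behavior.

```python
from datetime import datetime, timezone


def to_utc(dt):
    """Return dt as an aware UTC datetime (naive values are assumed to be UTC)."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def iso_utc(dt):
    """Format a datetime like 2025-09-01T00:00:00Z."""
    return to_utc(dt).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_range_input(query, since, until, platforms=("bluesky",), max_items=1000):
    """Build an input dict shaped like the date-range example above."""
    if to_utc(since) >= to_utc(until):
        raise ValueError("since must be earlier than until")
    return {
        "platforms": list(platforms),
        "query": query,
        "since": iso_utc(since),
        "until": iso_utc(until),
        "maxItems": max_items,
    }


if __name__ == "__main__":
    import json
    run_input = build_range_input(
        "climate change", datetime(2025, 9, 1), datetime(2025, 10, 1)
    )
    print(json.dumps(run_input, indent=2))
```

Rejecting an inverted range up front avoids starting a run that can only come back empty.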
Input/Output
Key Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `platforms` | Array | Yes | `["bluesky"]`, `["mastodon"]`, or both. |
| `query` | String | No | Keywords to search for. |
| `handles` | Array | No | Specific user handles to track. |
| `since` / `until` | String | No | ISO 8601 date strings for filtering. |
| `maxItems` | Integer | No | Maximum posts to collect (default: 1000). |
| `languages` | Array | No | Filter by BCP-47 language codes (e.g., `["en", "de"]`). |
| `emitWebhooks` | Boolean | No | Set to `true` to enable webhook delivery. |
| `webhooks` | Array | No | Configure your webhook endpoints. |
| `blueskyCredentials` | Object | No | Optional for higher rate limits. |
| `mastodonInstances` | Array | No | Specific Mastodon instances to query. |
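The parameter table can double as a pre-flight check before starting a run. Below is a rough sketch of a client-side validator: `validate_input` and `SPEC` are hypothetical names, the type mapping (Array to `list`, Object to `dict`, and so on) is an assumption, and the actor may well apply stricter or different checks of its own.

```python
# Parameter name -> (expected Python type, required?), mirroring the table above.
SPEC = {
    "platforms": (list, True),
    "query": (str, False),
    "handles": (list, False),
    "since": (str, False),
    "until": (str, False),
    "maxItems": (int, False),
    "languages": (list, False),
    "emitWebhooks": (bool, False),
    "webhooks": (list, False),
    "blueskyCredentials": (dict, False),
    "mastodonInstances": (list, False),
}
ALLOWED_PLATFORMS = ("bluesky", "mastodon")


def validate_input(run_input):
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []
    for name, (expected, required) in SPEC.items():
        if name not in run_input:
            if required:
                errors.append(f"missing required parameter: {name}")
            continue
        value = run_input[name]
        # bool is a subclass of int in Python, so reject True/False for Integer.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            errors.append(f"{name} should be of type {expected.__name__}")
    for name in run_input:
        if name not in SPEC:
            errors.append(f"unknown parameter: {name}")
    platforms = run_input.get("platforms")
    if isinstance(platforms, list):
        if not platforms or any(p not in ALLOWED_PLATFORMS for p in platforms):
            errors.append("platforms must contain 'bluesky', 'mastodon', or both")
    return errors


if __name__ == "__main__":
    print(validate_input({"platforms": ["bluesky"], "maxItems": "100"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a configuration at once.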
Output
The actor outputs a dataset of items, each representing a normalized post. Every item includes core fields like text, authorHandle, platform, timestamp, url, and metrics (like/repost counts). Media attachments and post metadata are also included.
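To illustrate working with the normalized items, here is a small Python sketch that loads dataset items into a typed record and counts posts per platform. The top-level field names come from the description above; the `Post` class, the helper functions, the sample values, and the key used inside `metrics` are all hypothetical, since the exact schema is not spelled out here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    author_handle: str
    platform: str
    timestamp: str
    url: str
    # Keys inside metrics are not specified in this document, so keep it a dict.
    metrics: dict = field(default_factory=dict)


def parse_post(item):
    """Build a Post from one dataset item, tolerating missing fields."""
    return Post(
        text=item.get("text", ""),
        author_handle=item.get("authorHandle", ""),
        platform=item.get("platform", ""),
        timestamp=item.get("timestamp", ""),
        url=item.get("url", ""),
        metrics=dict(item.get("metrics") or {}),
    )


def posts_per_platform(items):
    """Count how many items came from each platform."""
    return Counter(parse_post(item).platform for item in items)


if __name__ == "__main__":
    # Made-up sample items for illustration only.
    sample = [
        {"text": "hello", "authorHandle": "a.example.com", "platform": "bluesky"},
        {"text": "hi", "authorHandle": "@b@example.social", "platform": "mastodon",
         "metrics": {"exampleCount": 3}},
    ]
    print(posts_per_platform(sample))
```

Because both platforms share one schema, the same parsing code can serve either source without branching on `platform`.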
Platform-Specific Notes:
* Bluesky: Supports keyword search, user feeds, and includes data for quotes, replies, and reposts.
* Mastodon: Supports search across multiple instances and includes boosts, replies, and media alt text.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try Bluesky & Mastodon Scraper - Decentralized Social Media now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: barrierefix
- Pricing: Paid
- Total Runs: 154
- Active Users: 5