Substack Posts Scraper 📚

by easyapi

369 runs
62 users

About Substack Posts Scraper 📚

Scrape Substack posts and articles by keywords. Extract comprehensive post data including title, author, publication details, podcast information, reactions, and more. Perfect for content analysis and research.

What does this actor do?

Substack Posts Scraper 📚 is a web scraping actor available on the Apify platform. It searches Substack for posts that match your keywords and returns structured data for each result, running entirely in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation
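For the API access item, a request that starts a run might be built as in the sketch below. It assumes Apify's `run-sync-get-dataset-items` endpoint with Bearer-token authentication. The actor identifier `easyapi~substack-posts-scraper` and the placeholder token are assumptions, not values stated on this page, so check the actor's API tab for the real ones. The request is only constructed here, not sent.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"
# Assumption: hypothetical actor identifier; confirm it on the actor page.
ACTOR_ID = "easyapi~substack-posts-scraper"


def build_run_request(token, keywords, max_items):
    """Build (but do not send) a POST request that runs the actor synchronously."""
    url = f"{APIFY_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    body = json.dumps({"keywords": keywords, "maxItems": max_items}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_run_request("YOUR_APIFY_TOKEN", ["ai"], 50)

# To send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     posts = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before any paid run is started.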

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
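For step 3, the documentation further down gives the input as a JSON object with a `keywords` list and an optional `maxItems` number. A minimal sketch of preparing that input follows. The helper name `build_input` and its validation rules are illustrative assumptions; the actor itself may accept or reject different values.

```python
import json


def build_input(keywords, max_items=None):
    """Return the actor input as a dict, rejecting obviously bad values."""
    # Drop empty or whitespace-only keywords (an assumed convenience).
    cleaned = [k.strip() for k in keywords if k and k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    run_input = {"keywords": cleaned}
    if max_items is not None:
        # Assumed rule: maxItems should be a positive integer.
        if not isinstance(max_items, int) or max_items < 1:
            raise ValueError("maxItems must be a positive integer")
        run_input["maxItems"] = max_items
    return run_input


# Prints the JSON you could paste into the actor's input editor.
print(json.dumps(build_input(["ai"], 50), indent=2))
```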

Documentation

Powerful scraper for extracting posts and articles from Substack based on keywords. Get detailed information about posts, publications, and authors with advanced search capabilities.

## Features ✨

- 🔍 Search posts by keywords
- 📊 Extract comprehensive post metadata
- 🎙️ Support for podcast episodes data
- 👥 Get author and publication details
- ❤️ Capture engagement metrics (reactions, comments)
- 🔄 Auto-scrolling for pagination
- ⚡ High-performance with Puppeteer
- 🛡️ Built-in anti-detection mechanisms

## Output Data Structure 📋

The actor provides rich post data including:

- Post title, subtitle, and description
- Publication details
- Author information
- Podcast episode data (if applicable)
- Cover images and media
- Engagement metrics
- Tags and categories
- Publication timestamps
- And much more!

## Usage 💡

Simply provide:

1. Keywords to search for
2. Maximum number of items to scrape (optional)

The actor will automatically:

- Search Substack for your keywords
- Scroll through results
- Extract detailed post information
- Handle pagination
- Export structured JSON data

## Use Cases 🎯

- Content Research
- Market Analysis
- Topic Monitoring
- Audience Engagement Analysis
- Content Aggregation
- Newsletter Analytics
- Competitive Analysis

## Limitations ⚠️

- Respects Substack's terms of service
- Public posts only
- Rate limiting applied for stability

### Input Example

An example input in JSON:

```json
{
  "keywords": [
    "ai"
  ],
  "maxItems": 50
}
```

### Output sample

The results will be wrapped into a dataset which you can always find in the Storage tab. Here is an excerpt, in JSON, of the data you would get with the input parameters above. You can choose in which format to download your data: JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML.
```json
[
  {
    "keyword": "ai",
    "id": 156491923,
    "editor_v2": false,
    "publication_id": 2270667,
    "title": "New AI image models, free AI music generators, GPT can THINK now, new top AI models, DeepSeek Janus",
    "social_title": null,
    "search_engine_title": null,
    "search_engine_description": null,
    "type": "podcast",
    "slug": "new-ai-image-models-free-ai-music",
    "post_date": "2025-02-04T23:10:38.722Z",
    "audience": "everyone",
    "podcast_duration": 2684.9436,
    "video_upload_id": null,
    "podcast_upload_id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
    "write_comment_permissions": "everyone",
    "should_send_free_preview": false,
    "free_unlock_required": false,
    "default_comment_sort": null,
    "canonical_url": "https://aisearch.substack.com/p/new-ai-image-models-free-ai-music",
    "section_id": null,
    "top_exclusions": [],
    "pins": [],
    "is_section_pinned": false,
    "section_slug": null,
    "section_name": null,
    "reactions": { "❤": 0 },
    "restacked_post_id": null,
    "restacked_post_slug": null,
    "restacked_pub_name": null,
    "restacked_pub_logo_url": null,
    "position": 1,
    "subtitle": "Welcome to the AI Search podcast. Here are the top highlights in AI this week.",
    "cover_image": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
    "cover_image_is_square": true,
    "cover_image_is_explicit": false,
    "podcast_episode_image_url": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
    "podcast_episode_image_info": {
      "url": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
      "isDefaultArt": false,
      "isDefault": false
    },
    "podcast_url": "https://api.substack.com/api/v1/audio/upload/06e2c81a-16e8-4c32-a936-a1f89d596005/src",
    "videoUpload": null,
    "podcastFields": {
      "post_id": 156491923,
      "podcast_episode_number": null,
      "podcast_season_number": null,
      "podcast_episode_type": null,
      "should_syndicate_to_other_feed": null,
      "syndicate_to_section_id": null,
      "hide_from_feed": false,
      "free_podcast_url": null,
      "free_podcast_duration": null
    },
    "podcast_preview_upload_id": null,
    "podcastUpload": {
      "id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
      "name": "news-24.mp3",
      "created_at": "2025-02-04T23:09:13.865Z",
      "uploaded_at": "2025-02-04T23:09:23.872Z",
      "publication_id": 2270667,
      "state": "transcoded",
      "post_id": 156491923,
      "user_id": 191014175,
      "duration": 2684.9436,
      "height": null,
      "width": null,
      "thumbnail_id": 1,
      "preview_start": null,
      "preview_duration": null,
      "media_type": "audio",
      "primary_file_size": "42959560",
      "is_mux": false,
      "mux_asset_id": null,
      "mux_playback_id": null,
      "mux_preview_asset_id": null,
      "mux_preview_playback_id": null,
      "mux_rendition_quality": null,
      "mux_preview_rendition_quality": null,
      "explicit": false,
      "copyright_infringement": null,
      "src_media_upload_id": null,
      "live_stream_id": null,
      "transcription": {
        "media_upload_id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
        "created_at": "2025-02-04T23:10:05.279Z",
        "requested_by": 191014175,
        "status": "transcribed",
        "modal_call_id": "fc-01JK9KMQDG2KRT327JTSR976DS",
        "approved_at": "2025-02-04T23:12:43.876Z",
        "transcript_url": "s3://substack-video/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/transcription.json",
        "attention_vocab": null,
        "speaker_map": null,
        "captions_map": {
          "en": {
            "url": "s3://substack-video/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/en.vtt",
            "language": "en",
            "original": true
          }
        },
        "cdn_url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/transcription.json?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=kkyvVNWtOpL1VD3Y4n0LkIioiMU3r10tYrBXXlrNC927uzauCnLvzbp4j4-VU5UBwCm5HTayghaSnVeUPWfr6GPD-YTYgeHufNrkwCtmgqinTner3DwKh7z4EsvxbTkH58qXOAR82qLG8MHuu~iSTsXJ5CARuEeGPTW121bHK74poh6QH6jMT3iW-8qqRv4VP4aioSWL8OQyolUxoalTWSiejR6RE9RTxdRUMUbg8pk60GN3nzq3NTRff0qiZtnwJuvh~-A0L4FiTCiNtdFJsHOfYmcieyRydEDj7rHLsgY7yzuFXnsQx2qau9aoF79XAsJ5s4T1EySb~vg7fMTPNQ__",
        "cdn_unaligned_url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/unaligned_transcription.json?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=TpwJtOqaEBWXzOc2orXqEOjPumCJZzfg0wCNb22pdu~T8GQ2PCRcJaPAd6qZZzEPMFLsHbtiEvxC23ZIiD4D6ft94TvlQVYJV~TKKfTy0nC8Ut77ni9FJvzfRQhfYw1tUPdQ2mQEx2s5~TVCKlplUoaceWJ03B55xSURcT9apy4-8X2MjZk57O9Z-almjQ2QtkvxOUxNWvGiM1HN4RGKCPNfu211OpEn1rVMbmU~0WdZ3Sz7QaaXRnJbc3~tQKkev4MWXyq-E8lYa88lpNFj1LJnpF9piIAwfJUtkPym8SXdqiMLFHj0B7pBc4Th15eTUeL1MgdM145boHfnuAlP~w__",
        "signed_captions": [
          {
            "language": "en",
            "url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/en.vtt?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=JpSxDJtOUrhIB698DRiCsap2PZTwPNbQaH1zBYxivUyFGmr8oTHIU80z0SELPqmnsaJmpRTgXjuAKJMbBBMeZlg7FwCNAPxijb9jhS5Oai-SZrDKH4jG21RJwh2TiF0Yg0yp9kF0z8xpK56RzqeS7JCHUJG8iVQmjqSNrrvCVtOGBGN3fBIHlr7Z3RKioxD0grIAatAkDbCwjkc49~~XWa9hd-awKhqByx2o2w6uHlNfuC5ZRRpwnaiX8Ju6rxU9MwW24QQCKCpjbND6S6kaGJ-Z~N90shdh-fD31FymCJ4quq6M1JaEyonZIjcwyUc6FduiKeQi25x-MvUIITtrow__",
            "original": true
          }
        ]
      }
    },
    "podcastPreviewUpload": null,
    "voiceover_upload_id": null,
    "voiceoverUpload": null,
    "has_voiceover": false,
    "description": "Welcome to the AI Search podcast. Here are the top highlights in AI this week.",
    "body_json": null,
    "body_html": null,
    "truncated_body_text": "INSANE AI news: OpenAI o3-mini, DeepSeek Janus-Pro, Qwen2.5-Max, Riffusion FUZZ, YuE AI music generator, Doubao 1.5 Pro, Google Daily Listen, Tulu 3",
    "wordcount": 38,
    "postTags": [
      { "id": "06ec7467-035f-41d3-aa2a-f2dafd005ba2", "publication_id": 2270667, "name": "research", "slug": "research", "hidden": false },
      { "id": "134583b8-fd61-4289-83f8-2768f0e74637", "publication_id": 2270667, "name": "machine learning", "slug": "machine-learning", "hidden": false },
      { "id": "3b762592-4665-4885-b8c3-5c01a13fbd93", "publication_id": 2270667, "name": "artificial intelligence", "slug": "artificial-intelligence", "hidden": false },
      { "id": "aae204af-3183-4835-9184-ac27d860a342", "publication_id": 2270667, "name": "science", "slug": "science", "hidden": false },
      { "id": "d07870f1-c05d-4f6b-9a7e-c94b7e0ba2c5", "publication_id": 2270667, "name": "tech", "slug": "tech", "hidden": false },
      { "id": "e9d24f2a-02b0-4774-bc85-8b311ff1ab12", "publication_id": 2270667, "name": "ai", "slug": "ai", "hidden": false }
    ],
    "teaser_post_eligible": true,
    "postCountryBlocks": [],
    "coverImagePalette": {
      "Vibrant": { "rgb": [ 60, 180, 252 ], "population": 3621 },
      "DarkVibrant": { "rgb": [ 109, 52, 68 ], "population": 14 },
      "LightVibrant": { "rgb": [ 100, 196, 252 ], "population": 5 },
      "Muted": { "rgb": [ 164, 87, 108 ], "population": 5 },
      "DarkMuted": { "rgb": [ 86, 47, 61 ], "population": 115 },
      "LightMuted": { "rgb": [ 218, 174, 194 ], "population": 99 }
    },
    "publishedBylines": [
      {
        "id": 191014175,
        "name": "AI Search",
        "handle": "aisearch",
        "previous_name": null,
        "photo_url": "https://substack-post-media.s3.amazonaws.com/public/images/e1ef43b4-d382-41ad-8e0b-86080c6f0b2a_1400x1400.png",
        "bio": "Stay up to date with AI news, tech, & research",
        "profile_set_up_at": "2024-01-18T18:32:07.954Z",
        "publicationUsers": [
          {
            "id": 2288577,
            "user_id": 191014175,
            "publication_id": 2270667,
            "role": "admin",
            "public": true,
            "is_primary": false,
            "publication": {
              "id": 2270667,
              "name": "AI Search",
              "subdomain": "aisearch",
              "custom_domain": null,
              "custom_domain_optional": false,
              "hero_text": "Welcome to the AI Search newsletter. We bring you the highlights in AI every week. No fluff, just the interesting stuff. \n\nSubscribe and get a FREE cheat sheet on the top 50 most useful AI tools!",
              "logo_url": "https://substack-post-media.s3.amazonaws.com/public/images/2905d20a-608c-4fa8-9bdf-7af0c3792e1a_1280x1280.png",
              "author_id": 191014175,
              "theme_var_background_pop": "#FF0000",
              "created_at": "2024-01-18T18:32:47.729Z",
              "rss_website_url": null,
              "email_from_name": "AI Search",
              "copyright": "AI Search",
              "founding_plan_name": null,
              "community_enabled": false,
              "invite_only": false,
              "payments_state": "disabled",
              "language": null,
              "explicit": false,
              "is_personal_mode": false
            }
          }
        ],
        "is_guest": false,
        "bestseller_tier": null
      }
    ],
    "reaction": null,
    "reaction_count": 0,
    "comment_count": 0,
    "child_comment_count": 0,
    "is_geoblocked": false,
    "hasCashtag": false,
    "scrapedAt": "2025-02-10T05:26:49.500Z"
  },
  ...
]
```

## Related Actors

- 📚 Substack Publications Scraper - Extract detailed newsletter information from Substack with comprehensive publication metrics
- 📚 Substack People Scraper - Extract comprehensive Substack author and publication data using keywords
- 🔍 Substack Notes Scraper - Extract notes and comments from Substack's search results with engagement metrics
- 📊 Substack Leaderboard Scraper - Get insights about top newsletters including subscriber counts and pricing
- 📄 Article Content Extractor - Extract clean article content and metadata from any web page
- 🔍 Medium Posts Search Scraper - Extract comprehensive article data from Medium's search results
- 📝 Medium User Posts Scraper - Extract detailed post data from Medium user profiles
- 🎯 Medium Publications Search Scraper - Extract publication details from Medium's search results
- 📰 Google News Scraper - Collect up to 5000 news articles with flexible search options
- 🔬 Nature Search Results Scraper - Extract comprehensive research article data from Nature.com
- 📚 arXiv Search Scraper - Extract research paper data from arXiv search results
- 📚 PubMed Search Scraper - Scrape research papers and academic articles from PubMed
- 📚 Goodreads Book Scraper - Extract comprehensive book data from Goodreads search results
- 📚 Goodreads Review Scraper - Extract detailed book reviews from Goodreads
- 📚 Wattpad Story Scraper - Scrape stories from Wattpad with advanced filtering options
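Each record in the output sample is deeply nested, so downstream analysis usually starts by flattening the fields of interest. The sketch below shows one way to do that, using field names taken from the sample (`post_date`, `podcast_duration`, `postTags`, `publishedBylines`, and so on). The helper name `summarize`, the choice of fields, and the abridged record are illustrative assumptions rather than part of the actor.

```python
from datetime import datetime

# Abridged record: keys and values copied from the output sample, most fields omitted.
sample_post = {
    "title": "New AI image models, free AI music generators, GPT can THINK now, new top AI models, DeepSeek Janus",
    "type": "podcast",
    "canonical_url": "https://aisearch.substack.com/p/new-ai-image-models-free-ai-music",
    "post_date": "2025-02-04T23:10:38.722Z",
    "podcast_duration": 2684.9436,
    "reaction_count": 0,
    "comment_count": 0,
    "postTags": [{"name": "research"}, {"name": "ai"}],
    "publishedBylines": [{"name": "AI Search"}],
}


def summarize(post):
    """Flatten one scraped post into a small dict for analysis or CSV export."""
    duration = post.get("podcast_duration")  # seconds, or None for non-podcast posts
    duration_text = None
    if duration is not None:
        minutes, seconds = divmod(int(duration), 60)
        duration_text = f"{minutes}:{seconds:02d}"
    # %z accepts the trailing "Z" used in the sample timestamps.
    published = datetime.strptime(post["post_date"], "%Y-%m-%dT%H:%M:%S.%f%z")
    return {
        "title": post["title"],
        "url": post["canonical_url"],
        "type": post["type"],
        "published": published.date().isoformat(),
        "authors": [b["name"] for b in post.get("publishedBylines", [])],
        "tags": [t["name"] for t in post.get("postTags", [])],
        "reactions": post.get("reaction_count", 0),
        "comments": post.get("comment_count", 0),
        "duration": duration_text,
    }


summary = summarize(sample_post)
print(summary["published"], summary["duration"], summary["authors"])
```

Applied to a full dataset export, the same function can be mapped over every record before writing the rows out with the `csv` module.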

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Substack Posts Scraper 📚 now on Apify. Free tier available with no credit card required.

Actor Information

Developer: easyapi
Pricing: Paid
Total Runs: 369
Active Users: 62

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
