YouTube Transcript & Captions Scraper
by benthepythondev
About YouTube Transcript & Captions Scraper
Extract transcripts from any YouTube video with captions. Supports 100+ languages, auto-generated captions, and translation. Output as plain text, SRT, VTT, or JSON with timestamps. Includes video metadata (title, channel, views). Perfect for content repurposing and AI training.
What does this actor do?
YouTube Transcript & Captions Scraper is an actor on the Apify platform that extracts transcripts, captions, and subtitles from YouTube videos. It runs in the cloud, so no local setup is required.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
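The same steps can also be scripted instead of clicked through. The following is a minimal sketch, not the developer's own client code: it assumes the `apify-client` Python package is installed, that the field names match the Input table in the Documentation section, and that the `actor_id` you pass is copied from the actor's Apify page (the `USERNAME/ACTOR_NAME` value in the usage note is a placeholder, not the real ID). The helper names `build_run_input`, `split_results`, and `run_actor` are hypothetical.

```python
from typing import Any, Iterable, Optional

# Output formats listed in the actor's documentation.
ALLOWED_FORMATS = {"text", "timestamped", "srt", "vtt", "json"}


def build_run_input(
    video_urls: Iterable[str],
    output_format: str = "text",
    preferred_languages: Optional[Iterable[str]] = None,
    translate_to: Optional[str] = None,
    include_metadata: bool = True,
) -> dict:
    """Build the actor input, using the documented defaults for omitted fields."""
    urls = list(video_urls)
    if not urls:
        raise ValueError("videoUrls is required and must not be empty")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format!r}")
    return {
        "videoUrls": urls,
        "preferredLanguages": list(preferred_languages or ["en"]),
        "includeAutoGenerated": True,
        "outputFormat": output_format,
        "includeVideoMetadata": include_metadata,
        "translateTo": translate_to,
        "maxVideos": 0,  # 0 = unlimited, per the documentation
    }


def split_results(items: Iterable[dict]) -> tuple:
    """Separate dataset items by the documented `success` flag."""
    succeeded, failed = [], []
    for item in items:
        (succeeded if item.get("success") else failed).append(item)
    return succeeded, failed


def run_actor(token: str, actor_id: str, run_input: dict) -> list:
    """Start a run, wait for it to finish, and return its dataset items.

    Requires network access and `pip install apify-client`.
    """
    from apify_client import ApifyClient  # imported lazily; third-party package

    client = ApifyClient(token)
    run: Any = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call might look like `items = run_actor(os.environ["APIFY_TOKEN"], "USERNAME/ACTOR_NAME", build_run_input(["dQw4w9WgXcQ"], output_format="srt"))`, followed by `ok, failed = split_results(items)` to inspect the `error` field of any failed records.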
Documentation
# YouTube Transcript & Captions Scraper

Extract transcripts, captions, and subtitles from any YouTube video. Supports auto-generated and manual captions in 100+ languages with multiple output formats.

## Features

- Universal Transcript Extraction: Works with any YouTube video that has captions enabled
- 100+ Languages: Supports all languages available on YouTube
- Auto-Generated Fallback: Falls back to YouTube's auto-generated captions when manual captions unavailable
- Translation: Translate transcripts to any supported language
- Multiple Output Formats: Plain text, timestamped, SRT, VTT, or JSON
- Video Metadata: Optional extraction of title, channel, views, duration
- Batch Processing: Process multiple videos in a single run
- High Success Rate: API-based extraction (no browser needed)

## Use Cases

- Content Repurposing: Turn video content into blog posts, articles, or social media
- SEO Optimization: Extract text for video descriptions and metadata
- Accessibility: Generate captions for accessibility compliance
- AI Training Data: Build datasets from YouTube content
- Research: Analyze video content at scale
- Translation: Get transcripts in your preferred language
- Note Taking: Quickly extract key points from educational videos

## Input

```json
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw",
    "dQw4w9WgXcQ"
  ],
  "preferredLanguages": ["en", "en-US"],
  "includeAutoGenerated": true,
  "outputFormat": "text",
  "includeVideoMetadata": true,
  "translateTo": null,
  "maxVideos": 0
}
```

### Input Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| videoUrls | array | required | YouTube URLs or video IDs |
| preferredLanguages | array | ["en"] | Language codes in order of preference |
| includeAutoGenerated | boolean | true | Include auto-generated captions as fallback |
| outputFormat | string | "text" | Output format (text, timestamped, srt, vtt, json) |
| includeVideoMetadata | boolean | true | Fetch video title, channel, views, etc. |
| translateTo | string | null | Translate to this language code |
| maxVideos | integer | 0 | Limit videos to process (0 = unlimited) |

### Supported URL Formats

- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/embed/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- VIDEO_ID (direct 11-character ID)

## Output

```json
{
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "success": true,
  "transcript_text": "We're no strangers to love. You know the rules and so do I...",
  "formatted_transcript": "We're no strangers to love. You know the rules...",
  "output_format": "text",
  "language": "English",
  "language_code": "en",
  "is_auto_generated": false,
  "is_translated": false,
  "available_languages": ["en", "es", "fr", "de", "ja"],
  "word_count": 423,
  "character_count": 2156,
  "segment_count": 87,
  "duration_seconds": 212.5,
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel_name": "Rick Astley",
  "view_count": 1400000000,
  "thumbnail_url": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "error": null,
  "extraction_time_ms": 1234
}
```

### Output Fields

| Field | Description |
|-------|-------------|
| video_id | YouTube video ID |
| success | Whether extraction was successful |
| transcript_text | Plain text transcript |
| formatted_transcript | Transcript in requested format |
| language_code | Language code of transcript |
| is_auto_generated | True if using YouTube's auto-generated captions |
| is_translated | True if transcript was translated |
| word_count | Number of words in transcript |
| segment_count | Number of caption segments |
| available_languages | All available transcript languages |

## Output Formats

| Format | Description | Use Case |
|--------|-------------|----------|
| text | Continuous plain text | Reading, AI processing |
| timestamped | Text with [MM:SS] timestamps | Note taking, navigation |
| srt | SubRip subtitle format | Video editors, media players |
| vtt | WebVTT format | Web video players, HTML5 |
| json | Detailed segments with timing | Custom processing, analysis |

## Language Codes

Common language codes: en, en-US, es, fr, de, it, pt, ja, ko, zh-Hans, zh-Hant, ru, ar, hi

For a full list, see YouTube's supported languages.

## Error Handling

The scraper handles various error conditions gracefully:

| Error Type | Description |
|------------|-------------|
| transcripts_disabled | Video owner has disabled captions |
| no_transcript_found | No transcript in requested language |
| video_unavailable | Video is private, deleted, or region-restricted |
| no_transcript_available | Video has no captions at all |

## Limitations

- Captions Required: Cannot extract from videos without captions
- Rate Limits: YouTube may rate-limit excessive requests
- Private Videos: Cannot access private or unlisted videos without authorization
- Live Streams: May not work with ongoing live streams

## Pricing

$3 per 1,000 transcripts extracted.

## Examples

### Extract single video transcript

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]
}
```

### Get Spanish transcript with translation

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "preferredLanguages": ["es"],
  "translateTo": "en"
}
```

### Generate SRT subtitles

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "outputFormat": "srt"
}
```

### Batch process without metadata

```json
{
  "videoUrls": ["VIDEO_ID_1", "VIDEO_ID_2", "VIDEO_ID_3"],
  "includeVideoMetadata": false,
  "maxVideos": 100
}
```

## Support

For questions or issues, contact the developer or open an issue in the repository.
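When preparing a batch, it can help to normalize inputs to bare video IDs before submitting them, for example to remove duplicates. The sketch below covers only the URL formats listed under Supported URL Formats. It is an illustration written for this page, not the actor's own parsing logic, which is not documented here; the function name `extract_video_id` is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# A YouTube video ID is 11 characters from the URL-safe base64 alphabet.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the 11-character video ID from a URL or bare ID, or None."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # https://www.youtube.com/embed/VIDEO_ID or /shorts/VIDEO_ID
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
                candidate = parts[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Inputs that do not match any of the listed formats return `None`, so a caller can drop or report them before starting a paid run.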
Ready to Get Started?
Try YouTube Transcript & Captions Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: benthepythondev
- Pricing: Paid
- Total Runs: 15
- Active Users: 3
Related Actors
Google Search Results Scraper
by apify
Website Content Crawler
by apify
🔥 Leads Generator - $3/1k 50k leads like Apollo
by microworlds
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq
Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.