YouTube Transcript & Captions Scraper

by benthepythondev

15 runs
3 users

About YouTube Transcript & Captions Scraper

Extract transcripts from any YouTube video with captions. Supports 100+ languages, auto-generated captions, and translation. Output as plain text, SRT, VTT, or JSON with timestamps. Includes video metadata (title, channel, views). Perfect for content repurposing and AI training.
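
As a rough illustration of what the SRT output involves, the sketch below renders timed caption segments as SubRip text. The segment shape (`start`, `duration`, `text`) and both function names are assumptions made for this example; the listing does not document the actor's internal segment structure.

```python
def format_srt_time(seconds):
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render segments as SRT cues.

    Each segment is assumed (hypothetically) to be a dict with
    'start' and 'duration' in seconds and a 'text' string.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = seg["start"]
        end = start + seg["duration"]
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{seg['text']}"
        )
    # Cues are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "duration": 2.5, "text": "First caption"},
        {"start": 2.5, "duration": 3.0, "text": "Second caption"},
    ]
    print(segments_to_srt(demo))
```

WebVTT uses a very similar cue layout, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.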

What does this actor do?

YouTube Transcript & Captions Scraper is an actor on the Apify platform that extracts transcripts, captions, and subtitles from YouTube videos. It runs in the cloud, so no local setup is needed.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
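
For step 3, the input can be assembled and sanity-checked locally before it is pasted into the actor's input editor. The sketch below uses the field names and defaults from the Input Fields table in the Documentation section. The `build_input` helper and its validation rules are hypothetical conveniences for this example, not part of the actor.

```python
import json

# Allowed values as listed for outputFormat in the Input Fields table.
ALLOWED_FORMATS = {"text", "timestamped", "srt", "vtt", "json"}


def build_input(video_urls, preferred_languages=None, output_format="text",
                include_auto_generated=True, include_video_metadata=True,
                translate_to=None, max_videos=0):
    """Assemble an input object using the documented field names and defaults.

    Hypothetical helper: the checks here are local sanity checks only.
    """
    if not video_urls:
        raise ValueError("videoUrls is required and must not be empty")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format!r}")
    if max_videos < 0:
        raise ValueError("maxVideos must be 0 (unlimited) or a positive integer")
    return {
        "videoUrls": list(video_urls),
        "preferredLanguages": list(preferred_languages or ["en"]),
        "includeAutoGenerated": include_auto_generated,
        "outputFormat": output_format,
        "includeVideoMetadata": include_video_metadata,
        "translateTo": translate_to,
        "maxVideos": max_videos,
    }


if __name__ == "__main__":
    actor_input = build_input(["https://youtu.be/jNQXAC9IVRw"], output_format="srt")
    print(json.dumps(actor_input, indent=2))
```

The printed JSON can be pasted into the input editor as-is.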

Documentation

# YouTube Transcript & Captions Scraper

Extract transcripts, captions, and subtitles from any YouTube video. Supports auto-generated and manual captions in 100+ languages with multiple output formats.

## Features

- Universal Transcript Extraction: Works with any YouTube video that has captions enabled
- 100+ Languages: Supports all languages available on YouTube
- Auto-Generated Fallback: Falls back to YouTube's auto-generated captions when manual captions unavailable
- Translation: Translate transcripts to any supported language
- Multiple Output Formats: Plain text, timestamped, SRT, VTT, or JSON
- Video Metadata: Optional extraction of title, channel, views, duration
- Batch Processing: Process multiple videos in a single run
- High Success Rate: API-based extraction (no browser needed)

## Use Cases

- Content Repurposing: Turn video content into blog posts, articles, or social media
- SEO Optimization: Extract text for video descriptions and metadata
- Accessibility: Generate captions for accessibility compliance
- AI Training Data: Build datasets from YouTube content
- Research: Analyze video content at scale
- Translation: Get transcripts in your preferred language
- Note Taking: Quickly extract key points from educational videos

## Input

```json
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw",
    "dQw4w9WgXcQ"
  ],
  "preferredLanguages": ["en", "en-US"],
  "includeAutoGenerated": true,
  "outputFormat": "text",
  "includeVideoMetadata": true,
  "translateTo": null,
  "maxVideos": 0
}
```

### Input Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| videoUrls | array | required | YouTube URLs or video IDs |
| preferredLanguages | array | ["en"] | Language codes in order of preference |
| includeAutoGenerated | boolean | true | Include auto-generated captions as fallback |
| outputFormat | string | "text" | Output format (text, timestamped, srt, vtt, json) |
| includeVideoMetadata | boolean | true | Fetch video title, channel, views, etc. |
| translateTo | string | null | Translate to this language code |
| maxVideos | integer | 0 | Limit videos to process (0 = unlimited) |

### Supported URL Formats

- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/embed/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- VIDEO_ID (direct 11-character ID)

## Output

```json
{
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "success": true,
  "transcript_text": "We're no strangers to love. You know the rules and so do I...",
  "formatted_transcript": "We're no strangers to love. You know the rules...",
  "output_format": "text",
  "language": "English",
  "language_code": "en",
  "is_auto_generated": false,
  "is_translated": false,
  "available_languages": ["en", "es", "fr", "de", "ja"],
  "word_count": 423,
  "character_count": 2156,
  "segment_count": 87,
  "duration_seconds": 212.5,
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel_name": "Rick Astley",
  "view_count": 1400000000,
  "thumbnail_url": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "error": null,
  "extraction_time_ms": 1234
}
```

### Output Fields

| Field | Description |
|-------|-------------|
| video_id | YouTube video ID |
| success | Whether extraction was successful |
| transcript_text | Plain text transcript |
| formatted_transcript | Transcript in requested format |
| language_code | Language code of transcript |
| is_auto_generated | True if using YouTube's auto-generated captions |
| is_translated | True if transcript was translated |
| word_count | Number of words in transcript |
| segment_count | Number of caption segments |
| available_languages | All available transcript languages |

## Output Formats

| Format | Description | Use Case |
|--------|-------------|----------|
| text | Continuous plain text | Reading, AI processing |
| timestamped | Text with [MM:SS] timestamps | Note taking, navigation |
| srt | SubRip subtitle format | Video editors, media players |
| vtt | WebVTT format | Web video players, HTML5 |
| json | Detailed segments with timing | Custom processing, analysis |

## Language Codes

Common language codes: en, en-US, es, fr, de, it, pt, ja, ko, zh-Hans, zh-Hant, ru, ar, hi

For a full list, see YouTube's supported languages.

## Error Handling

The scraper handles various error conditions gracefully:

| Error Type | Description |
|------------|-------------|
| transcripts_disabled | Video owner has disabled captions |
| no_transcript_found | No transcript in requested language |
| video_unavailable | Video is private, deleted, or region-restricted |
| no_transcript_available | Video has no captions at all |

## Limitations

- Captions Required: Cannot extract from videos without captions
- Rate Limits: YouTube may rate-limit excessive requests
- Private Videos: Cannot access private or unlisted videos without authorization
- Live Streams: May not work with ongoing live streams

## Pricing

$3 per 1,000 transcripts extracted.

## Examples

### Extract single video transcript

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]
}
```

### Get Spanish transcript with translation

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "preferredLanguages": ["es"],
  "translateTo": "en"
}
```

### Generate SRT subtitles

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "outputFormat": "srt"
}
```

### Batch process without metadata

```json
{
  "videoUrls": ["VIDEO_ID_1", "VIDEO_ID_2", "VIDEO_ID_3"],
  "includeVideoMetadata": false,
  "maxVideos": 100
}
```

## Support

For questions or issues, contact the developer or open an issue in the repository.
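
The supported URL formats listed in the documentation all reduce to an 11-character video ID. The sketch below shows one way such normalization could be done on the client side, for example to deduplicate a batch before submitting it. The `extract_video_id` function is a hypothetical helper, and its parsing rules are an assumption about reasonable behavior, not the actor's actual implementation.

```python
import re
from urllib.parse import urlparse, parse_qs

# Video IDs are 11 characters drawn from letters, digits, '-' and '_'.
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value):
    """Return the video ID from a supported URL form or a bare ID, else None.

    Hypothetical helper. URLs without a scheme (e.g. 'youtu.be/...') are
    not recognized by this sketch.
    """
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # already a bare 11-character ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # https://www.youtube.com/embed/VIDEO_ID or /shorts/VIDEO_ID
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
                candidate = parts[1]

    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None


if __name__ == "__main__":
    for item in ("https://youtu.be/jNQXAC9IVRw", "dQw4w9WgXcQ"):
        print(item, "->", extract_video_id(item))
```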

Ready to Get Started?

Try YouTube Transcript & Captions Scraper now on Apify. Free tier available with no credit card required.

Actor Information

Developer: benthepythondev
Pricing: Paid
Total Runs: 15
Active Users: 3

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
