TikTok & YouTube Transcript Extractor Scraper

by memo23

8,401 runs
473 users
Try This Actor


About TikTok & YouTube Transcript Extractor Scraper

Need to pull clean, ready-to-use transcripts from TikTok and YouTube videos? This scraper does exactly that. It fetches the transcript and key metadata from any public video on either platform, delivering everything in a standard WebVTT file that’s easy to work with. I use it for a few specific jobs. First, for accessibility—getting captions for my own content. Second, for content analysis, like running sentiment checks on competitor videos or tracking keyword mentions. And third, for repurposing; having the text makes it simple to turn a video into a blog post or social media clips. You can paste in a single URL or a whole list for batch processing. It handles language selection if a video has multiple tracks, and you can route requests through proxies if you're scraping at scale. The output keeps the timing data intact, which is crucial for editing or syncing. Ultimately, it saves the hours you’d spend manually transcribing or wrestling with inconsistent APIs. You get structured data that’s ready for your next step, whether that’s feeding it into an analysis tool, an editor, or your CMS.

What does this actor do?

TikTok & YouTube Transcript Extractor Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
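
The "configure the input parameters" step can also be prepared as JSON ahead of time. The sketch below assembles an input object from a list of video URLs, using field names taken from the example input in the Documentation section (`startUrls`, `maxConcurrency`, `minConcurrency`, `includeYoutubeVideosDetails`, `youtubeTranscriptLanguage`, `proxy`). The helper names `is_supported_url` and `build_run_input`, the host check, and the chosen defaults are assumptions for illustration only; they are not part of the actor.

```python
import json
from urllib.parse import urlparse

# Hosts accepted by this sketch. This is an illustrative assumption,
# not a list published by the actor.
SUPPORTED_HOSTS = ("tiktok.com", "youtube.com")


def is_supported_url(url: str) -> bool:
    """Return True for http(s) URLs whose host is a TikTok or YouTube domain."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme in ("http", "https") and any(
        host == h or host.endswith("." + h) for h in SUPPORTED_HOSTS
    )


def build_run_input(urls, language="en", include_details=True, max_concurrency=10):
    """Assemble an input object using the field names from the documented example."""
    rejected = [u for u in urls if not is_supported_url(u)]
    if rejected:
        raise ValueError(f"Unsupported video URLs: {rejected}")
    return {
        "startUrls": [{"url": u} for u in urls],
        "maxConcurrency": max_concurrency,
        "minConcurrency": 1,
        "includeYoutubeVideosDetails": include_details,
        "youtubeTranscriptLanguage": language,
        "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    }


if __name__ == "__main__":
    run_input = build_run_input(
        [
            "https://www.tiktok.com/@stoolpresidente/video/7451747413649263915",
            "https://www.youtube.com/watch?v=aAkMkVFwAoo",
        ]
    )
    # The printed JSON can be pasted into the actor's input editor.
    print(json.dumps(run_input, indent=2))
```

Validating URLs before the run keeps obviously wrong entries out of a batch; fields left out here (such as the retry count) are simply not set by this sketch.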

Documentation

The TikTok & YouTube Transcript Extractor Scraper is a tool designed to automate the extraction of captions and transcripts from TikTok and YouTube videos in WebVTT format. This scraper provides an easy way to retrieve video subtitles for accessibility, analysis, or repurposing.

---

## Features

- Extract Transcripts from TikTok & YouTube: Automatically retrieve captions in WebVTT format from TikTok and YouTube videos.
- Start with Specific URLs: Input TikTok or YouTube video URLs to fetch transcripts from desired videos.
- Customizable Concurrency: Optimize the scraping speed with configurable concurrency settings.
- Retry Mechanism: Ensure data accuracy with automatic retries for failed requests.
- Proxy Support: Use proxies for anonymity and uninterrupted operations.
- YouTube Language Selection: Choose the transcript language for YouTube videos.
- Include YouTube Video Details: Optionally include extra metadata for YouTube videos.

---

## Input Configuration

Below is the structure of the input configuration for the scraper:

### 1. Start URLs

- Title: Start URLs
- Type: Array
- Description: Specify the TikTok or YouTube video URLs to scrape the transcript.
- Editor: requestListSources
- Example:

```json
[
  { "url": "https://www.tiktok.com/@stoolpresidente/video/7451747413649263915" },
  { "url": "https://www.youtube.com/watch?v=aAkMkVFwAoo" }
]
```

---

### 2. Max Concurrency

- Title: Max Concurrency
- Type: Integer
- Description: Maximum number of pages that can be processed simultaneously.
- Default: 10

---

### 3. Min Concurrency

- Title: Min Concurrency
- Type: Integer
- Description: Minimum number of pages that will be processed simultaneously.
- Default: 1

---

### 4. Max Request Retries

- Title: Max Request Retries
- Type: Integer
- Description: Number of retries for a failed request before the scraper gives up.
- Default: 100

---

### 5. Include YouTube Video Details

- includeYoutubeVideosDetails: Boolean
- Description: If true, includes extra metadata for YouTube videos.

---

### 6. YouTube Transcript Language

- youtubeTranscriptLanguage: String
- Description: Language code for YouTube transcript extraction (e.g., "en").

---

### 7. Proxy Configuration

- proxy: Object
- Description: Configure proxy servers for secure and anonymous scraping.
- Default:

```json
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"]
}
```

- apifyProxyCountry: String
- Description: Custom proxy configuration with proxyUrls array.
- Details: For more information on proxy configuration, refer to Apify Proxy Configuration.

---

## Example Input

```json
{
  "startUrls": [
    { "url": "https://www.tiktok.com/@stoolpresidente/video/7451747413649263915" },
    { "url": "https://www.youtube.com/watch?v=aAkMkVFwAoo" }
  ],
  "maxConcurrency": 10,
  "minConcurrency": 1,
  "maxRequestRetries": 100,
  "maxItems": 20,
  "includeYoutubeVideosDetails": true,
  "youtubeTranscriptLanguage": "en",
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

---

## Output

The scraper produces an output containing the transcript. The TikTok sample below returns it as a WebVTT string:

### Output TikTok sample

```json
{
  "transcript": "WEBVTT\n\n\n00:00:00.260 --> 00:00:01.500\nWatch out for the snow storm,\n\n00:00:01.501 --> 00:00:02.621\npresident. Oh,\n\n00:00:02.622 --> 00:00:04.061\nhe said watch out for. No,\n\n00:00:04.062 --> 00:00:05.541\nI didn't know what the hell you were talking about.\n\n..."
}
```

- transcript: Contains the WebVTT format captions from the TikTok video.

### Transcript Format (WebVTT)

The captions are structured in WebVTT format with time codes and text: WEBVTT 00:00:00.260 --> 00:00:01.500 Watch out for the snow storm, 00:00:01.501 --> 00:00:02.621 president.
Oh, --- ### Output Youtube sample json { "transcript": [ { "text": "(light cheerful music)", "startMs": "3760", "endMs": "7010", "startTimeText": "0:03" }, { "text": "♪ I don't want a lot for Christmas ♪", "startMs": "10482", "endMs": "15482", "startTimeText": "0:10" }, { "text": "♪ There is just one thing I need ♪", "startMs": "16724", "endMs": "21100", "startTimeText": "0:16" }, { "text": "♪ I don't care about the presents ♪", "startMs": "21100", "endMs": "24788", "startTimeText": "0:21" }, { "text": "♪ Underneath the Christmas tree ♪", "startMs": "24788", "endMs": "28210", "startTimeText": "0:24" }, { "text": "♪ I just want you for my own ♪", "startMs": "28210", "endMs": "32479", "startTimeText": "0:28" }, ... ], "transcript_only_text": "(light cheerful music) ♪ I don't want a lot for Christmas ♪ ♪ There is just one thing I need ♪ ♪ I don't care about the presents ♪ ♪ Underneath the Christmas tree ♪ ♪ I just want you for my own ♪ ♪ More than you could ever know ♪ ♪ Make my wish come true ♪ ♪ All I want for Christmas is you ♪ ♪ Yeah ♪ ♪ I don't want a lot for Christmas ♪ ♪ There is just one thing I need ♪ ♪ And I don't care about the presents ♪ ♪ Underneath the Christmas tree ♪ ♪ I don't need to hang my stocking\nthere upon the fireplace ♪ ♪ Santa Claus won't make me happy ♪ ♪ With a toy on Christmas Day ♪ ♪ I just want you for my own ♪ ♪ More than you could ever know ♪ ♪ Make my wish come true ♪ ♪ All I want for Christmas is you ♪ ♪ You, baby ♪ ♪ Oh, I won't ask for\nmuch this Christmas ♪ ♪ I won't even wish for snow ♪ ♪ And I, I'm just gonna keep on waiting ♪ ♪ Underneath the mistletoe ♪ ♪ I won't make a list and send it ♪ ♪ To the North Pole for Saint Nick ♪ ♪ I won't even stay awake to\nhear those magic reindeer click ♪ ♪ 'Cause I just want you here tonight ♪ ♪ Holding on to me so tight ♪ ♪ What more can I do ♪ ♪ Oh, baby, all I want\nfor Christmas is you ♪ ♪ You, baby ♪ ♪ Oh oh, all the lights are\nshining so brightly everywhere ♪ ♪ So brightly, baby ♪ ♪ And 
the sound of children's\nlaughter fills the air ♪ ♪ And everyone is singing ♪ ♪ I hear those sleigh bells ringing ♪ ♪ Santa, won't you bring\nme the one I really need ♪ ♪ Won't you please bring my baby to me ♪ ♪ Oh, I don't want a lot for Christmas ♪ ♪ This is all I'm asking for ♪ ♪ I just wanna see my baby\nstanding right outside my door ♪ ♪ Oh, I just want you for my own ♪ ♪ More than you could ever know ♪ ♪ Make my wish come true ♪ ♪ Oh, baby, all I want\nfor Christmas is you ♪ ♪ You, baby ♪ ♪ All I want for Christmas is you, baby ♪ ♪ All I want for Christmas is you, baby ♪ ♪ All I want for Christmas is you, baby ♪ ♪ All I want for Christmas is you, baby ♪ ♪ All I want for Christmas is you, baby ♪", "videoId": "aAkMkVFwAoo", "title": "Mariah Carey - All I Want for Christmas Is You (Make My Wish Come True Edition)", "lengthSeconds": "243", "keywords": [ "all want for christmas is you", "christmas songs", "mariah carey all want for christmas", "all want for christmas", "merry christmas", "mariah carey christmas", "mariah carey", "christmas", "christmas song", "last christmas", "christmas music", "always be my baby", "Mariah carey hero", "oh santa", "holy night", "we belong together", "Jennifer Lopez", "beyonce", "Whitney Houston", "pop music", "90s pop", "2000s pop", "tiktok", "tiktok trend", "Holiday", "Pop" ], "channelId": "UClS0wn3LPs9jdX_yt2g1k8w", "isOwnerViewing": false, "shortDescription": "\"All I Want For Christmas Is You\" by Mariah Carey (Make My Wish Come True Edition)\nListen to Mariah Carey: https://MariahCarey.lnk.to/listenYD \nSubscribe to the official Mariah Carey YouTube channel: https://MariahCarey.lnk.to/subscribe_YD\n \nWatch more Mariah Carey videos: https://MariahCarey.lnk.to/listen_YC/youtube\n \nFollow Mariah Carey\nFacebook: https://MariahCarey.lnk.to/followFI\nInstagram: https://MariahCarey.lnk.to/followII\nTwitter: https://MariahCarey.lnk.to/followTI\nWebsite: https://MariahCarey.lnk.to/followWI\nYouTube: 
https://MariahCarey.lnk.to/subscribeYD\nSpotify: https://MariahCarey.lnk.to/followSI\n \nLyrics: \nI just want you for my own (Ooh)\nMore than you could ever know (Ooh)\nMake my wish come true\nAll I want for Christmas is you\nYou, baby\n \n#MariahCarey #AllIWantForChristmasIsYou #MakeMyWishComeTrue", "isCrawlable": true, "thumbnail": { "thumbnails": [ { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/hqdefault.jpg?sqp=-oaymwEmCKgBEF5IWvKriqkDGQgBFQAAiEIYAdgBAeIBCggYEAIYBjgBQAE=&rs=AOn4CLBVNtG-3snews4SkuVgHePgFV50bA", "width": 168, "height": 94 }, { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/hqdefault.jpg?sqp=-oaymwEmCMQBEG5IWvKriqkDGQgBFQAAiEIYAdgBAeIBCggYEAIYBjgBQAE=&rs=AOn4CLCRGI_2lb-QxPufhBHOrgB6n2zOYw", "width": 196, "height": 110 }, { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/hqdefault.jpg?sqp=-oaymwEnCPYBEIoBSFryq4qpAxkIARUAAIhCGAHYAQHiAQoIGBACGAY4AUAB&rs=AOn4CLB97lSkpMJKiz4s8WQTKJIEkd8tdw", "width": 246, "height": 138 }, { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/hqdefault.jpg?sqp=-oaymwEnCNACELwBSFryq4qpAxkIARUAAIhCGAHYAQHiAQoIGBACGAY4AUAB&rs=AOn4CLBFav8lBGECh5aDt2erS3N_siXOTQ", "width": 336, "height": 188 }, { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/maxresdefault.jpg?v=5dfa53d4", "width": 1920, "height": 1080 } ] }, "allowRatings": true, "viewCount": "739849592", "author": "MariahCareyVEVO", "isLowLatencyLiveStream": false, "isPrivate": false, "isUnpluggedCorpus": false, "latencyClass": "MDE_STREAM_OPTIMIZATIONS_RENDERER_LATENCY_NORMAL", "isLiveContent": false, "microformat": { "playerMicroformatRenderer": { "thumbnail": { "thumbnails": [ { "url": "https://i.ytimg.com/vi/aAkMkVFwAoo/maxresdefault.jpg", "width": 1280, "height": 720 } ] }, "embed": { "iframeUrl": "https://www.youtube.com/embed/aAkMkVFwAoo?start=985", "width": 1280, "height": 720 }, "title": { "simpleText": "Mariah Carey - All I Want for Christmas Is You (Make My Wish Come True Edition)" }, "description": { "simpleText": "\"All I Want For Christmas Is You\" by Mariah Carey (Make My Wish 
Come True Edition)\nListen to Mariah Carey: https://MariahCarey.lnk.to/listenYD \nSubscribe to the official Mariah Carey YouTube channel: https://MariahCarey.lnk.to/subscribe_YD\n \nWatch more Mariah Carey videos: https://MariahCarey.lnk.to/listen_YC/youtube\n \nFollow Mariah Carey\nFacebook: https://MariahCarey.lnk.to/followFI\nInstagram: https://MariahCarey.lnk.to/followII\nTwitter: https://MariahCarey.lnk.to/followTI\nWebsite: https://MariahCarey.lnk.to/followWI\nYouTube: https://MariahCarey.lnk.to/subscribeYD\nSpotify: https://MariahCarey.lnk.to/followSI\n \nLyrics: \nI just want you for my own (Ooh)\nMore than you could ever know (Ooh)\nMake my wish come true\nAll I want for Christmas is you\nYou, baby\n \n#MariahCarey #AllIWantForChristmasIsYou #MakeMyWishComeTrue" }, "lengthSeconds": "243", "ownerProfileUrl": "http://www.youtube.com/@MariahCareyVEVO", "externalChannelId": "UClS0wn3LPs9jdX_yt2g1k8w", "isFamilySafe": true, "isUnlisted": false, "hasYpcMetadata": false, "viewCount": "739849592", "category": "Music", "publishDate": "2019-12-19T21:00:11-08:00", "ownerChannelName": "MariahCareyVEVO", "liveBroadcastDetails": { "isLiveNow": false, "startTimestamp": "2019-12-20T05:00:11+00:00", "endTimestamp": "2019-12-20T05:06:09+00:00" }, "uploadDate": "2019-12-19T21:00:11-08:00", "isShortsEligible": false, "externalVideoId": "aAkMkVFwAoo", "likeCount": "6483757", "canonicalUrl": "https://www.youtube.com/watch?v=aAkMkVFwAoo" } }, "captions": { "playerCaptionsTracklistRenderer": { "captionTracks": [ { "baseUrl": "https://www.youtube.com/api/timedtext?v=aAkMkVFwAoo&ei=f681aKTUNaH31sQP8a-4YQ&opi=112496729&xoaf=5&hl=hr&ip=0.0.0.0&ipbits=0&expire=1748373999&sparams=ip,ipbits,expire,v,ei,opi,xoaf&signature=6D4061B1E750F9D72358BC2212EBFF26AA7423D7.A12F4FD68C7DCADDC114730E6F19F75884B0A787&key=yt8&lang=en&name=en", "name": { "simpleText": "Engleski - en" }, "vssId": ".en.nP7-2PuUl7o", "languageCode": "en", "isTranslatable": true, "trackName": "en" } ], "audioTracks": [ { 
"captionTrackIndices": [ 0 ] } ], "defaultAudioTrackIndex": 0 } } }

### Output Field Explanations (YouTube)

The YouTube transcript extractor provides detailed information about the video along with its transcript in a structured format. Here's a breakdown of the output fields:

#### Transcript Data

- transcript: Array of objects containing the timed transcript segments
  - text: The actual text of the transcript segment
  - startMs: Start time of the segment in milliseconds
  - endMs: End time of the segment in milliseconds
  - startTimeText: Formatted start time (e.g., "0:03")
- transcript_only_text: Complete transcript text as a single string, concatenating all segments

#### Video Metadata

- videoId: YouTube's unique 11-character video identifier
- title: Full title of the video
- lengthSeconds: Total duration of the video in seconds
- keywords: Array of SEO keywords/tags associated with the video
- channelId: YouTube's unique identifier for the channel
- isOwnerViewing: Boolean indicating if the video owner is viewing (typically false)
- shortDescription: Full video description including links and metadata
- isCrawlable: Boolean indicating if search engines can index this video
- allowRatings: Boolean indicating if the video allows likes/dislikes
- viewCount: Total number of video views (as a string)
- author: Display name of the channel that uploaded the video
- isLiveContent: Boolean indicating if this is a live stream

#### Thumbnails

- thumbnail: Contains different resolution thumbnails
  - thumbnails: Array of thumbnail objects with different sizes
    - url: Image URL
    - width: Image width in pixels
    - height: Image height in pixels

#### Microformat Data

- microformat: Detailed metadata in YouTube's microformat
  - playerMicroformatRenderer: Contains structured data about the video
    - publishDate: When the video was published (ISO 8601 format)
    - uploadDate: When the video was uploaded (ISO 8601 format)
    - viewCount: Number of views (redundant with root viewCount)
    - likeCount: Number of likes (if available)
    - category: Video category (e.g., "Music")
    - isFamilySafe: Boolean indicating if the content is family-friendly

#### Captions Information

- captions: Details about available caption tracks
  - playerCaptionsTracklistRenderer: Contains caption track information
    - captionTracks: Array of available caption tracks
      - languageCode: BCP-47 language code (e.g., "en")
      - name: Display name of the language
      - isTranslatable: Boolean indicating if machine translation is available
    - audioTracks: Information about audio tracks and their associated captions

#### Additional Field Details

- author: Name of the channel or content creator.
- isPrivate: Boolean indicating if the video is set to private.
- isUnpluggedCorpus: Boolean related to YouTube's content classification (usually false).
- isLiveContent: Boolean indicating if the video is or was a live stream.
- microformat: Object containing additional metadata about the video.
  - playerMicroformatRenderer: Contains details such as:
    - thumbnail: Main thumbnail image.
    - embed: Embed information (iframe URL, width, height).
    - title: Video title.
    - description: Video description.
    - lengthSeconds: Video duration in seconds.
    - ownerProfileUrl: URL to the channel's profile.
    - externalChannelId: Channel ID.
    - isFamilySafe: Boolean indicating if the video is family-friendly.
    - availableCountries: Array of country codes where the video is available.
    - isUnlisted: Boolean indicating if the video is unlisted.
    - hasYpcMetadata: Boolean for YouTube Premium content.
    - viewCount: Video view count.
    - category: Video category (e.g., "People & Blogs").
    - publishDate: Date and time when the video was published.
    - ownerChannelName: Name of the channel owner.
    - uploadDate: Date and time when the video was uploaded.
    - isShortsEligible: Boolean indicating if the video is eligible for YouTube Shorts.
    - externalVideoId: Video ID.
    - likeCount: Number of likes.
- captions: Object containing information about available captions and translation languages.
  - playerCaptionsTracklistRenderer: Contains:
    - captionTracks: Array of available caption tracks (subtitles), each with:
      - baseUrl: URL to fetch the captions.
      - name: Name of the caption track.
      - vssId: Caption track ID.
      - languageCode: Language code of the captions.
      - kind: Type of captions (e.g., "asr" for auto-generated).
      - isTranslatable: Boolean indicating if the captions can be auto-translated.
      - trackName: Name of the track (if any).
    - audioTracks: Array of audio track objects.
    - translationLanguages: Array of available translation languages, each with:
      - languageCode: Language code.
      - languageName: Object with simpleText (full language name).
    - defaultAudioTrackIndex: Index of the default audio track.
- transcript: Array of transcript segments, each with:
  - text: The spoken text for that segment.
  - startMs: Start time in milliseconds.
  - endMs: End time in milliseconds.
  - startTimeText: Human-readable start time (e.g., "0:02").
- transcript_only_text: The full transcript as a single plain text string, with all segments concatenated.

Note: Some fields (like microformat, captions, thumbnail) are nested objects and may contain additional subfields for advanced use cases.

## Why Use This Scraper?

- Ease of Use: Just provide the TikTok or YouTube video URLs and extract captions effortlessly.
- Customizable Settings: Adjust concurrency, retries, and proxy settings to fit your needs.
- Accurate and Reliable: Automatically retries failed requests to minimize data loss.
- Time-Saving: Automates the manual effort of retrieving TikTok and YouTube captions for multiple videos.

---

## Notes

- Ensure valid TikTok or YouTube video URLs are provided in the startUrls field.
- Proxies are recommended for large-scale scraping to prevent rate-limiting or IP bans.

---

## Explore More Scrapers

If you found this scraper useful, be sure to check out our other powerful scrapers and actors at memo23's Apify profile. We offer a wide range of tools to enhance your web scraping and automation needs across various platforms and use cases.

## Support

- For issues or feature requests, please use the Issues section of this actor.
- If you need customization or have questions, feel free to contact the author:
  - Author's website: https://muhamed-didovic.github.io/
  - Email: muhamed.didovic@gmail.com

## Additional Services

- Request customization or whole dataset: muhamed.didovic@gmail.com
- If you need anything else scraped, or this actor customized, email: muhamed.didovic@gmail.com
- For API services of this scraper (no Apify fee, just usage fee for the API), contact: muhamed.didovic@gmail.com
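
To work with the two transcript shapes shown in the output samples of the documentation above, something like the following could be used. It is a minimal sketch under stated assumptions: `parse_webvtt` only reads the simple cue layout seen in the TikTok sample (an `hh:mm:ss.mmm --> hh:mm:ss.mmm` timing line followed by text lines) and does not implement the full WebVTT specification, while `segments_to_webvtt` renders YouTube-style segments (`startMs`, `endMs`, `text`) as WebVTT text. Both function names are hypothetical helpers and are not provided by the actor.

```python
import re

# Matches a cue timing line such as "00:00:00.260 --> 00:00:01.500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def _to_ms(hours, minutes, seconds, millis):
    """Convert timestamp parts (as strings) to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def _format_ts(ms):
    """Format integer milliseconds as hh:mm:ss.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def parse_webvtt(vtt):
    """Parse a simple WebVTT string into a list of {startMs, endMs, text} dicts."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n{2,}", vtt.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                parts = match.groups()
                text = " ".join(l.strip() for l in lines[index + 1:] if l.strip())
                cues.append(
                    {"startMs": _to_ms(*parts[:4]), "endMs": _to_ms(*parts[4:]), "text": text}
                )
                break  # blocks without a timing line (header, "...") are skipped
    return cues


def segments_to_webvtt(segments):
    """Render YouTube-style segments (string or int millisecond fields) as WebVTT."""
    parts = ["WEBVTT"]
    for seg in segments:
        start, end = int(seg["startMs"]), int(seg["endMs"])
        parts.append(f"{_format_ts(start)} --> {_format_ts(end)}\n{seg['text']}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    tiktok = (
        "WEBVTT\n\n\n00:00:00.260 --> 00:00:01.500\nWatch out for the snow storm,"
        "\n\n00:00:01.501 --> 00:00:02.621\npresident. Oh,\n\n..."
    )
    for cue in parse_webvtt(tiktok):
        print(cue)

    youtube = [{"text": "(light cheerful music)", "startMs": "3760", "endMs": "7010"}]
    print(segments_to_webvtt(youtube))
```

Normalizing both platforms to the same millisecond-based cue structure makes downstream steps (keyword search with timestamps, subtitle export) independent of where the video came from.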

Common Use Cases

  • Market Research: Gather competitive intelligence and market data
  • Lead Generation: Extract contact information for sales outreach
  • Price Monitoring: Track competitor pricing and product changes
  • Content Aggregation: Collect and organize content from multiple sources

Ready to Get Started?

Try TikTok & YouTube Transcript Extractor Scraper now on Apify. Free tier available with no credit card required.


Actor Information

Developer: memo23
Pricing: Paid
Total Runs: 8,401
Active Users: 473

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

Learn more about Apify
