Youtube Scraper Pro

by coregent

250 runs
15 users

About Youtube Scraper Pro

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built on Apify + Puppeteer for reliable, scalable web scraping that returns complete metadata, engagement stats, caption tracks for transcripts, hashtags/keywords, description links, and channel insights — fast.

What does this actor do?

Youtube Scraper Pro is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

# YouTube Scraper Pro (Apify Actor)

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built with YouTube Data API v3 + Puppeteer for reliable, scalable hybrid extraction that returns complete metadata, engagement stats, hashtags/keywords, description links, and channel insights, fast and efficient.

---

## At a Glance

- Hybrid Architecture: YouTube Data API v3 + web scraping for 100% field coverage
- Scrape by keywords, channel handles (e.g. @mkbhd), channel IDs, playlists, or direct video URLs
- Returns: title, description, duration, publish date, views, likes, comments count, category, features (HD/4K/HDR), hashtags, keywords, description links, and rich channel data
- Optimized: API for core fields + selective scraping for advanced fields → ~4–6s/video without proxy, all fields populated
- Localization: region + language options
- Optional Proxy: Residential proxy available for enhanced reliability

> Best for: content research, SEO, competitive intelligence, trend analysis, brand monitoring, influencer discovery, and data‑driven content strategy.

---

## 🚀 New: Hybrid API + Scraping Architecture

Version 2.6 introduces a revolutionary hybrid approach:

| Method | Fields Covered | Speed |
|--------|---------------|-------|
| YouTube Data API v3 | 16 core fields (title, views, likes, duration, etc.) | Instant |
| Web Scraping | 11 advanced fields (hashtags, links, monetization, etc.) | ~4-6s/video |
| Residential Proxy (optional) | Enhanced reliability | ~10-15s/video |

Benefits:

- ✅ 100% field coverage - All 27 fields populated
- ✅ No proxy required - Works reliably without residential proxy
- ✅ Dual fallback system - API → Scraping → API Description extraction
- ✅ Production ready - Proven reliability with complete data extraction

---

## Why this scraper?

- Hybrid extraction: YouTube Data API v3 for core data + web scraping for advanced fields = best of both worlds
- Complete fields: visits each video page to populate all available metadata (not just search snippets)
- Scalable & reliable: automatic multi-project rotation for consistent performance
- Triple fallback: API → Browser scraping → API description parsing ensures maximum field population
- Stable & fast: ~4-6 seconds per video with 100% field coverage
- Consistent schema: predictable JSON keys designed for analytics pipelines
- Enterprise‑ready: optional residential proxies, localization, rate limiting, and error handling

---

## Key Features

- Hybrid Data Extraction: YouTube Data API v3 (16 fields) + Web Scraping (11 advanced fields)
- Intelligent API Management: Automatic multi-project rotation for optimal performance
- Multiple Input Methods: keywords, channel handles/IDs, playlists, direct URLs, bulk URL upload via text file or remote file link
- Content Coverage: standard videos, Shorts, live/live‑replay
- Date Filtering: filter search results by publish date range (applies to search keywords only)
- Comprehensive Metadata: title, description, duration, publish date, views, likes, comment count, category
- Rich Media & SEO Signals: hashtags, keywords/tags, description outbound links
- Channel Intelligence: id, name, handle, URL, subscribers, (optionally) totals and profile data
- Performance: 3 concurrent page visits; resource/ads/fonts/video blocking for speed
- Localization: regionCode + language
- Optional Proxy: Residential proxy available for enhanced reliability (not required)

---

## Quick Start (Apify)

1. Create an Actor task and paste one of the JSON inputs below
2. (Optional) Enable residential proxy for maximum reliability - controlled via code setting
3. Run. Export JSON/CSV/Excel to your datastore, Google Sheets, or S3

---

## Input Schema

The input form is organized into collapsible sections for better usability:

- Search Settings: Configure search behavior, localization, and result limits
- Direct URLs: Scrape specific YouTube URLs directly

### Input Parameters

| # | Field | Key | Type | Required | Default | Description |
|---|-------|-----|------|----------|---------|-------------|
| 1 | Search Keywords | searchQueries | Array | No | [] | Search for keywords, video topics, or channels. Accepts channel handles (@name) or channel IDs (UC...) |
| 2 | Include Shorts | includeShorts | boolean | No | false | If true, include Shorts in search/results |
| 3 | Max videos per search term | maxResultsPerQuery | integer | No | 10 | Applies per search keyword and per list source. Min 1 |
| 4 | Country | regionCode | string | No | "US" | ISO-3166-1 alpha-2 code. Options: US, GB, CA, AU, IN, DE, FR, JP, BR, MX |
| 5 | Language | language | string | No | "en" | IETF BCP-47 code. Options: en, es, de, fr, pt, ja, hi, zh |
| 6 | From Date | dateFrom | string | No | "" | Filter videos published after this date (YYYY-MM-DD). Only applies to Search Keywords |
| 7 | To Date | dateTo | string | No | "" | Filter videos published before this date (YYYY-MM-DD). Only applies to Search Keywords |
| 8 | YouTube URLs | startUrls | Array | No | [] | Accepts video/shorts/channel/playlist/search URLs. Supports manual entry, text file upload, or remote file link for bulk processing |

Important Notes:

- Date Filtering: Date filters (dateFrom and dateTo) only apply to Search Keywords. Direct URLs are not filtered by date.
- Bulk URL Upload: The YouTube URLs field supports uploading a text file (one URL per line) or linking to a remote text file for batch processing.
- Residential Proxy: Configured via code (not user input) for testing purposes. Disabled by default for production.

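The constraints in the Input Parameters table can be checked locally before a run is started. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, the allowed region and language values are copied from the table, and the rule that dateFrom must not be later than dateTo is an assumption rather than documented behavior.

```python
import json
import re
from datetime import date

# Allowed values copied from the Input Parameters table.
REGION_CODES = {"US", "GB", "CA", "AU", "IN", "DE", "FR", "JP", "BR", "MX"}
LANGUAGES = {"en", "es", "de", "fr", "pt", "ja", "hi", "zh"}
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_input(search_queries=(), start_urls=(), include_shorts=False,
                max_results_per_query=10, region_code="US", language="en",
                date_from="", date_to=""):
    """Return an input dict using the documented keys; raise ValueError on bad values.

    Hypothetical helper for illustration only.
    """
    if max_results_per_query < 1:
        raise ValueError("maxResultsPerQuery must be at least 1")
    if region_code not in REGION_CODES:
        raise ValueError(f"unsupported regionCode: {region_code}")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")

    parsed = {}
    for key, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value:
            if not DATE_RE.fullmatch(value):
                raise ValueError(f"{key} must use YYYY-MM-DD")
            parsed[key] = date.fromisoformat(value)  # rejects dates like 2025-02-30
    # Assumption: an inverted range is treated as a mistake.
    if len(parsed) == 2 and parsed["dateFrom"] > parsed["dateTo"]:
        raise ValueError("dateFrom must not be later than dateTo")

    return {
        "searchQueries": list(search_queries),
        "startUrls": [{"url": url} for url in start_urls],
        "includeShorts": include_shorts,
        "maxResultsPerQuery": max_results_per_query,
        "regionCode": region_code,
        "language": language,
        "dateFrom": date_from,
        "dateTo": date_to,
    }


run_input = build_input(
    search_queries=["AI tools", "@veritasium"],
    max_results_per_query=20,
    date_from="2025-01-01",
    date_to="2025-12-31",
)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input form. Because date filters only apply to Search Keywords, setting dateFrom/dateTo together with only startUrls is likely to have no effect.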
> Tip: Start with smaller maxResultsPerQuery to validate your setup, then scale up.

### Input Example - 01: Search with date filtering

```json
{
  "searchQueries": ["AI tools", "machine learning", "@veritasium"],
  "maxResultsPerQuery": 20,
  "includeShorts": false,
  "dateFrom": "2025-01-01",
  "dateTo": "2025-12-31",
  "regionCode": "US",
  "language": "en"
}
```

### Input Example - 02: Direct video URLs

```json
{
  "searchQueries": [],
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=7Sx0o-41r2k"},
    {"url": "https://www.youtube.com/watch?v=5oAnKSCP4do"},
    {"url": "https://www.youtube.com/watch?v=QJBP2uy8LcU"},
    {"url": "https://www.youtube.com/watch?v=DOtJEwVsJic"}
  ],
  "includeShorts": true,
  "maxResultsPerQuery": 10,
  "regionCode": "US",
  "language": "en"
}
```

### Input Example - 03: Bulk URL upload from remote file

```json
{
  "searchQueries": [],
  "startUrls": [
    {
      "requestsFromUrl": "https://raw.githubusercontent.com/coregentdevspace/youtube-scraper-assets/main/youtube-scraper-pro-direct-url-text-file.txt"
    }
  ],
  "dateFrom": "2025-10-01",
  "dateTo": "2025-10-31",
  "includeShorts": true,
  "maxResultsPerQuery": 5,
  "regionCode": "US",
  "language": "en"
}
```

---

## Output Schema

### Core Output Fields

| # | Field | Type | Description |
|---|-------|------|-------------|
| 1 | type | String | One of video, shorts, live, stream |
| 2 | VideoId | String | YouTube video ID (e.g., 7Sx0o-41r2k) |
| 3 | PageURL | String | Canonical YouTube watch URL |
| 4 | title | String | Video title |
| 5 | thumbnailUrl | String or null | Primary/hero thumbnail URL |
| 6 | publishDate | String (ISO) or null | When the video was published |
| 7 | duration | String or null | HH:MM:SS format (e.g., 00:22:43) |
| 8 | durationSeconds | Integer or null | Duration in seconds |
| 9 | viewCount | Integer or null | Total views |
| 10 | likeCount | Integer or null | Total likes |
| 11 | commentCount | Integer or null | Public comments count |
| 12 | category | String or null | Video category |

### Video Properties

| # | Field | Type | Description |
|---|-------|------|-------------|
| 13 | isLive | Boolean | Whether the item is/was a live stream |
| 14 | isMembersOnly | Boolean | Members-only gated flag |
| 15 | isPrivate | Boolean | Private/unavailable to public |
| 16 | isFamilySafe | Boolean or null | Family-safe flag if exposed |
| 17 | isMonetized | Boolean or null | Monetization detectable flag |
| 18 | isRatingsAllowed | Boolean or null | Whether likes/dislikes are enabled |
| 19 | commentsTurnedOff | Boolean or null | Whether comments are disabled |
| 20 | commentsAllowed | Boolean or null | Convenience mirror |

### Content & Metadata

| # | Field | Type | Description |
|---|-------|------|-------------|
| 21 | description | String or null | Creator-written description |
| 22 | descriptionLinks | Array | Links parsed from description |
| 23 | keywords | Array | Tags/keywords |
| 24 | hashtags | Array | Hashtags from title/description |

### Media Features

| # | Field | Type | Description |
|---|-------|------|-------------|
| 25 | features | Object | Media feature flags |
| - | features.isHD | Boolean or null | HD available |
| - | features.is4K | Boolean or null | 4K available |
| - | features.isHDR | Boolean or null | HDR available |
| - | features.isVR180 | Boolean or null | VR180 available |
| - | features.is360 | Boolean or null | 360° available |

### Channel Information

| # | Field | Type | Description |
|---|-------|------|-------------|
| 26 | channel | Object | Channel metadata |
| - | channel.id | String or null | Channel ID |
| - | channel.name | String or null | Display name |
| - | channel.handle | String or null | @handle |
| - | channel.url | String or null | Channel URL |
| - | channel.subscriberCount | Integer, String, or null | Subscriber count |
| - | channel.totalViews | Integer, String, or null | Lifetime channel views |
| - | channel.totalVideos | Integer or null | Number of uploads |
| - | channel.country | String or null | Channel country/region |
| - | channel.profileImage | String or null | Avatar URL |
| - | channel.description | String or null | About text |
| - | channel.links | Object | Social/website links map |

### Provenance

| # | Field | Type | Description |
|---|-------|------|-------------|
| 27 | provenance | Object | Source/ordering metadata |
| - | provenance.order | Integer or null | Position in results |
| - | provenance.source | String or null | One of search, channel, playlist, trending, startUrl |
| - | provenance.collectedAt | String (ISO) | Timestamp when collected |

### Sample Output

#### Output Example - Overview Fields (Key fields only)

```json
[
  {
    "type": "live",
    "PageURL": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
    "viewCount": 1707551805,
    "likeCount": 18606140,
    "duration": "00:03:33",
    "publishDate": "2009-10-24",
    "channel": {
      "id": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "name": "Rick Astley",
      "handle": null,
      "url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
      "subscriberCount": "4.42M subscribers"
    }
  }
]
```

#### Output Example - Complete Record (All 27 fields)

```json
[
  {
    "type": "live",
    "VideoId": "dQw4w9WgXcQ",
    "PageURL": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
    "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
    "publishDate": "2009-10-24",
    "duration": "00:03:33",
    "durationSeconds": 213,
    "viewCount": 1707551805,
    "likeCount": 18606140,
    "commentCount": 2406204,
    "category": "Music",
    "isLive": false,
    "isMembersOnly": false,
    "isPrivate": false,
    "isFamilySafe": true,
    "isMonetized": true,
    "isRatingsAllowed": true,
    "commentsAllowed": true,
    "description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
    "descriptionLinks": [
      {
        "url": "https://linktr.ee/rickastleynever",
        "text": "https://linktr.ee/rickastleynever"
      },
      {
        "url": "https://RickAstley.lnk.to/YTSubID",
        "text": "https://RickAstley.lnk.to/YTSubID"
      }
    ],
    "keywords": [
      "rick astley",
      "Never Gonna Give You Up",
      "rick rolled"
    ],
    "hashtags": [
      "RickAstleyNever",
      "RickAstley",
      "NeverGonnaGiveYouUp"
    ],
    "features": {
      "isHD": true,
      "is4K": false,
      "isHDR": false,
      "isVR180": null,
      "is360": null
    },
    "channel": {
      "id": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "name": "Rick Astley",
      "handle": "@RickAstleyYT",
      "url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
      "subscriberCount": "4.42M subscribers"
    },
    "provenance": {
      "collectedAt": "2025-10-30T12:00:00.000Z"
    }
  }
]
```

---

## Performance & Reliability

### Without Residential Proxy (Default)

- Speed: ~4–6 seconds per video with all fields populated
- Parallelism: up to 3 concurrent video page visits
- Throughput: ~10-15 videos/minute
- Field Coverage: 100% (all 27 fields populated via hybrid API + scraping)

### With Residential Proxy (Optional)

- Speed: ~10–15 seconds per video
- Reliability: Maximum (bypasses all detection)
- Field Coverage: 100% (all fields guaranteed)

### Architecture Benefits

- YouTube Data API v3: Instant core data (duration, views, likes, category)
- Smart Web Scraping: Advanced fields (hashtags, description links, monetization)
- Dual Fallback System:
  1. Browser scraping from ytInitialPlayerResponse
  2. API description extraction for links/hashtags
- Stability: smart retries, exponential backoff, resource blocking (video streams, ads, fonts)

> For large jobs, prefer batching by topic/channel and consider residential proxies for maximum reliability.

---

## Popular Use Cases

- SEO research: surface keywords, tags, hashtags, linking practices
- Content strategy: analyze formats, titles, thumbnails, and posting cadence
- Competitive intelligence: benchmark creators and track launches
- Market/academic research: study trends by niche, region, or language
- Brand monitoring: find mentions and categorize sentiment downstream
- Influencer discovery: filter by views/engagement within your topics

---

## FAQ

How does the hybrid approach work?

The scraper uses YouTube Data API v3 for instant core data (duration, views, likes, category, keywords) and web scraping for advanced fields (hashtags, description links, monetization flags). This ensures 100% field coverage with optimal speed and reliability.

Do I need a residential proxy?

No. The hybrid architecture works reliably without a residential proxy. All 27 fields populate correctly using the API + scraping approach. A residential proxy is optional and only recommended for maximum reliability in production environments with very high volumes.

Can I target a country or language?

Yes. Set regionCode and language for localization.

What about Shorts and live videos?

includeShorts controls Shorts. Live/live‑replay is auto‑detected via the type and isLive flags.

Can I filter by date?

Yes. Use dateFrom and dateTo (YYYY-MM-DD format) to filter videos published within a date range. Note: Date filtering only applies to Search Keywords, not Direct URLs.

Are dislikes available?

No (YouTube no longer exposes them publicly). The field is returned as null.

Any limits or restrictions?

The scraper respects YouTube's rate limits and uses intelligent throttling. Avoid abusive rates. Some content is age‑restricted or members‑only.

---

## Best Practices

1. Start small: validate with maxResultsPerQuery: 5–10
2. Filter early: set includeShorts: false if not needed; use dateFrom/dateTo to narrow search results by publish date
3. Bulk processing: use text file upload or remote file link for large URL lists
4. Batch thoughtfully: group by topic/channel to improve cache locality
5. Schema‑first: build downstream models against the stable keys listed above

---

## Technical Details

### Runtime

- Node.js: 18+
- Puppeteer: Headless Chrome with stealth mode
- APIs: YouTube Data API v3 for hybrid extraction
- Architecture: Optimized for scalability and reliability

---

## Compliance

- Intended for legitimate research & business intelligence
- Collects only public information
- Respects YouTube Data API Terms of Service
- Respect YouTube Terms of Service & applicable laws in your jurisdiction

---

## Changelog

- v2.7 (Current):
  - 🚀 Removed captions field (requires residential proxy, always null on datacenter IPs)
  - ✅ Improved performance by eliminating caption API calls
  - ✅ Cleaner output schema with 27 fields
  - ✅ Removed hasSubtitles/hasCC from features (dependent on captions)
- v2.6:
  - 🚀 NEW: Hybrid YouTube Data API v3 + web scraping architecture
  - ✅ Intelligent multi-project API management for optimal performance
  - ✅ 100% field coverage without residential proxy
  - ✅ Dual fallback system for maximum reliability
  - ✅ ~4-6 seconds per video (without proxy)
  - ✅ Enhanced hashtags and description links extraction from API fallback
  - ✅ Improved duration extraction with 4 fallback methods
  - ✅ Residential proxy now optional
- v2.5: Added date range filtering for search queries (dateFrom/dateTo), bulk URL upload support via text file or remote file link
- v2.0: Performance tuning, richer channel fields, improved localization & proxy options

---

## Get Help

- Issues & feature requests → GitHub / Apify support
- Need something custom? Open an issue describing your dataset needs
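The Output Schema lists channel.subscriberCount as an integer or a display string (the sample record shows "4.42M subscribers") and duration as HH:MM:SS, so downstream pipelines may want to normalize both. Below is a minimal post-processing sketch under stated assumptions: `parse_count`, `duration_to_seconds`, and `normalize_record` are hypothetical helpers, not part of the actor, and only the K/M/B suffixes are handled because the documentation does not enumerate the string formats that can occur.

```python
import re
from decimal import Decimal, InvalidOperation

# Assumption: display strings use K/M/B suffixes, as in "4.42M subscribers".
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_COUNT_RE = re.compile(r"^\s*([\d.,]+)([KMB])?\b", re.IGNORECASE)


def parse_count(value):
    """Convert 4420000, "4.42M subscribers" or "1,234 subscribers" to an int (or None)."""
    if value is None or isinstance(value, int):
        return value
    match = _COUNT_RE.match(value)
    if not match:
        return None
    try:
        # Decimal avoids float rounding on values such as 4.42 * 1_000_000.
        number = Decimal(match.group(1).replace(",", ""))
    except InvalidOperation:
        return None
    suffix = (match.group(2) or "").upper()
    return int(number * _MULTIPLIERS[suffix])


def duration_to_seconds(value):
    """Convert an HH:MM:SS string to seconds; pass None through."""
    if value is None:
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def normalize_record(record):
    """Return a shallow copy with a numeric subscriber count and durationSeconds filled in."""
    out = dict(record)
    channel = dict(out.get("channel") or {})
    channel["subscriberCount"] = parse_count(channel.get("subscriberCount"))
    out["channel"] = channel
    if out.get("durationSeconds") is None:
        out["durationSeconds"] = duration_to_seconds(out.get("duration"))
    return out


sample = {
    "duration": "00:03:33",
    "channel": {"subscriberCount": "4.42M subscribers"},
}
print(normalize_record(sample))
```

For the sample above, the subscriber count becomes 4420000 and durationSeconds becomes 213, which matches the durationSeconds value in the complete sample record.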

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Youtube Scraper Pro now on Apify. Free tier available with no credit card required.

Actor Information

Developer: coregent
Pricing: Paid
Total Runs: 250
Active Users: 15

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
