Youtube Scraper Pro

by coregent

250 runs

15 users

About Youtube Scraper Pro

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built on Apify + Puppeteer for reliable, scalable web scraping that returns complete metadata, engagement stats, caption tracks for transcripts, hashtags/keywords, description links, and channel insights — fast.

What does this actor do?

Youtube Scraper Pro is a web scraping Actor that runs in the cloud on the Apify platform. It extracts metadata, engagement stats, and channel data from YouTube videos, Shorts, live streams, and channels.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
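
The same run-and-download flow can also be driven through the Apify HTTP API. The sketch below builds a request for Apify's `run-sync-get-dataset-items` endpoint using only the Python standard library. The actor ID `coregent~youtube-scraper-pro` is an assumption (this page does not state it), and `build_run_request` and `run_actor` are hypothetical helper names, so check the actor's page on Apify for the real ID before use.

```python
import json
import urllib.request

# Assumption: the actor ID is not given on this page. Apify's API accepts
# IDs in the form "username~actor-name"; replace this with the real one.
ACTOR_ID = "coregent~youtube-scraper-pro"


def build_run_request(token, run_input, actor_id=ACTOR_ID):
    """Build (but do not send) a POST request that runs the actor
    synchronously and asks for the resulting dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, run_input, timeout=300):
    """Send the request and return the parsed JSON response (network call)."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_actor(my_token, {"searchQueries": ["AI tools"], "maxResultsPerQuery": 5})` would block until the run finishes, so this synchronous endpoint is best suited to small runs. Larger jobs are usually started asynchronously and their dataset fetched afterwards.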

Documentation

# YouTube Scraper Pro (Apify Actor)

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built with YouTube Data API v3 + Puppeteer for reliable, scalable hybrid extraction that returns complete metadata, engagement stats, hashtags/keywords, description links, and channel insights, fast and efficient.

---

## At a Glance

- Hybrid Architecture: YouTube Data API v3 + web scraping for 100% field coverage
- Scrape by keywords, channel handles (e.g. `@mkbhd`), channel IDs, playlists, or direct video URLs
- Returns: title, description, duration, publish date, views, likes, comments count, category, features (HD/4K/HDR), hashtags, keywords, description links, and rich channel data
- Optimized: API for core fields + selective scraping for advanced fields → ~4–6s/video without proxy, all fields populated
- Localization: region + language options
- Optional Proxy: Residential proxy available for enhanced reliability

> Best for: content research, SEO, competitive intelligence, trend analysis, brand monitoring, influencer discovery, and data‑driven content strategy.

---

## 🚀 New: Hybrid API + Scraping Architecture

Version 2.6 introduces a revolutionary hybrid approach:

| Method | Fields Covered | Speed |
|--------|---------------|-------|
| YouTube Data API v3 | 16 core fields (title, views, likes, duration, etc.) | Instant |
| Web Scraping | 11 advanced fields (hashtags, links, monetization, etc.) | ~4-6s/video |
| Residential Proxy (optional) | Enhanced reliability | ~10-15s/video |

Benefits:

- ✅ 100% field coverage - All 27 fields populated
- ✅ No proxy required - Works reliably without residential proxy
- ✅ Dual fallback system - API → Scraping → API Description extraction
- ✅ Production ready - Proven reliability with complete data extraction

---

## Why this scraper?

- Hybrid extraction: YouTube Data API v3 for core data + web scraping for advanced fields = best of both worlds
- Complete fields: visits each video page to populate all available metadata (not just search snippets)
- Scalable & reliable: automatic multi-project rotation for consistent performance
- Triple fallback: API → Browser scraping → API description parsing ensures maximum field population
- Stable & fast: ~4-6 seconds per video with 100% field coverage
- Consistent schema: predictable JSON keys designed for analytics pipelines
- Enterprise‑ready: optional residential proxies, localization, rate limiting, and error handling

---

## Key Features

- Hybrid Data Extraction: YouTube Data API v3 (16 fields) + Web Scraping (11 advanced fields)
- Intelligent API Management: Automatic multi-project rotation for optimal performance
- Multiple Input Methods: keywords, channel handles/IDs, playlists, direct URLs, bulk URL upload via text file or remote file link
- Content Coverage: standard videos, Shorts, live/live‑replay
- Date Filtering: filter search results by publish date range (applies to search keywords only)
- Comprehensive Metadata: title, description, duration, publish date, views, likes, comment count, category
- Rich Media & SEO Signals: hashtags, keywords/tags, description outbound links
- Channel Intelligence: id, name, handle, URL, subscribers, (optionally) totals and profile data
- Performance: 3 concurrent page visits; resource/ads/fonts/video blocking for speed
- Localization: `regionCode` + `language`
- Optional Proxy: Residential proxy available for enhanced reliability (not required)

---

## Quick Start (Apify)

1. Create an Actor task and paste one of the JSON inputs below
2. (Optional) Enable residential proxy for maximum reliability - controlled via code setting
3. Run. Export JSON/CSV/Excel to your datastore, Google Sheets, or S3

---

## Input Schema

The input form is organized into collapsible sections for better usability:

- Search Settings: Configure search behavior, localization, and result limits
- Direct URLs: Scrape specific YouTube URLs directly

### Input Parameters

| # | Field | Key | Type | Required | Default | Description |
|---|-------|-----|------|----------|---------|-------------|
| 1 | Search Keywords | `searchQueries` | Array | No | `[]` | Search for keywords, video topics, or channels. Accepts channel handles (`@name`) or channel IDs (`UC...`) |
| 2 | Include Shorts | `includeShorts` | boolean | No | `false` | If true, include Shorts in search/results |
| 3 | Max videos per search term | `maxResultsPerQuery` | integer | No | `10` | Applies per search keyword and per list source. Min 1 |
| 4 | Country | `regionCode` | string | No | `"US"` | ISO-3166-1 alpha-2 code. Options: US, GB, CA, AU, IN, DE, FR, JP, BR, MX |
| 5 | Language | `language` | string | No | `"en"` | IETF BCP-47 code. Options: en, es, de, fr, pt, ja, hi, zh |
| 6 | From Date | `dateFrom` | string | No | `""` | Filter videos published after this date (YYYY-MM-DD). Only applies to Search Keywords |
| 7 | To Date | `dateTo` | string | No | `""` | Filter videos published before this date (YYYY-MM-DD). Only applies to Search Keywords |
| 8 | YouTube URLs | `startUrls` | Array | No | `[]` | Accepts video/shorts/channel/playlist/search URLs. Supports manual entry, text file upload, or remote file link for bulk processing |

Important Notes:

- Date Filtering: Date filters (`dateFrom` and `dateTo`) only apply to Search Keywords. Direct URLs are not filtered by date.
- Bulk URL Upload: The YouTube URLs field supports uploading a text file (one URL per line) or linking to a remote text file for batch processing.
- Residential Proxy: Configured via code (not user input) for testing purposes. Disabled by default for production.

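The input parameters above can be checked client-side before a run is started. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, the option sets are copied from the table, and whether the actor itself rejects values outside those lists is an assumption. The check that `dateFrom` does not fall after `dateTo` is likewise an assumption rather than a documented rule.

```python
import re
from datetime import date

# Option lists copied from the Input Parameters table.
REGION_CODES = {"US", "GB", "CA", "AU", "IN", "DE", "FR", "JP", "BR", "MX"}
LANGUAGES = {"en", "es", "de", "fr", "pt", "ja", "hi", "zh"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_input(search_queries=(), start_urls=(), *, include_shorts=False,
                max_results_per_query=10, region_code="US", language="en",
                date_from="", date_to=""):
    """Return an input dict with the documented keys; raise ValueError on bad values."""
    if max_results_per_query < 1:
        raise ValueError("maxResultsPerQuery must be at least 1")
    if region_code not in REGION_CODES:
        raise ValueError(f"unsupported regionCode: {region_code!r}")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    for key, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value:
            if not DATE_PATTERN.fullmatch(value):
                raise ValueError(f"{key} must be YYYY-MM-DD, got {value!r}")
            date.fromisoformat(value)  # rejects impossible dates such as 2025-02-30
    # Assumption: a range that ends before it starts is treated as an error here.
    if date_from and date_to and date_from > date_to:
        raise ValueError("dateFrom must not be after dateTo")
    return {
        "searchQueries": list(search_queries),
        # Direct URLs are wrapped as {"url": ...} objects, as in the input examples.
        "startUrls": [{"url": url} for url in start_urls],
        "includeShorts": include_shorts,
        "maxResultsPerQuery": max_results_per_query,
        "regionCode": region_code,
        "language": language,
        "dateFrom": date_from,
        "dateTo": date_to,
    }
```

For example, `build_input(["AI tools"], date_from="2025-01-01", date_to="2025-12-31")` yields a dict that can be pasted into an Actor task as JSON. Because date filters apply only to search keywords, passing dates together with only `start_urls` is accepted by this sketch but has no filtering effect.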
> Tip: Start with smaller `maxResultsPerQuery` to validate your setup, then scale up.

### Input Example - 01: Search with date filtering

```json
{
  "searchQueries": ["AI tools", "machine learning", "@veritasium"],
  "maxResultsPerQuery": 20,
  "includeShorts": false,
  "dateFrom": "2025-01-01",
  "dateTo": "2025-12-31",
  "regionCode": "US",
  "language": "en"
}
```

### Input Example - 02: Direct video URLs

```json
{
  "searchQueries": [],
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=7Sx0o-41r2k"},
    {"url": "https://www.youtube.com/watch?v=5oAnKSCP4do"},
    {"url": "https://www.youtube.com/watch?v=QJBP2uy8LcU"},
    {"url": "https://www.youtube.com/watch?v=DOtJEwVsJic"}
  ],
  "includeShorts": true,
  "maxResultsPerQuery": 10,
  "regionCode": "US",
  "language": "en"
}
```

### Input Example - 03: Bulk URL upload from remote file

```json
{
  "searchQueries": [],
  "startUrls": [
    {
      "requestsFromUrl": "https://raw.githubusercontent.com/coregentdevspace/youtube-scraper-assets/main/youtube-scraper-pro-direct-url-text-file.txt"
    }
  ],
  "dateFrom": "2025-10-01",
  "dateTo": "2025-10-31",
  "includeShorts": true,
  "maxResultsPerQuery": 5,
  "regionCode": "US",
  "language": "en"
}
```

---

## Output Schema

### Core Output Fields

| # | Field | Type | Description |
|---|-------|------|-------------|
| 1 | type | String | One of `video`, `shorts`, `live`, `stream` |
| 2 | VideoId | String | YouTube video ID (e.g., `7Sx0o-41r2k`) |
| 3 | PageURL | String | Canonical YouTube watch URL |
| 4 | title | String | Video title |
| 5 | thumbnailUrl | String \| null | Primary/hero thumbnail URL |
| 6 | publishDate | String (ISO) \| null | When the video was published |
| 7 | duration | String \| null | `HH:MM:SS` format (e.g., `00:22:43`) |
| 8 | durationSeconds | Integer \| null | Duration in seconds |
| 9 | viewCount | Integer \| null | Total views |
| 10 | likeCount | Integer \| null | Total likes |
| 11 | commentCount | Integer \| null | Public comments count |
| 12 | category | String \| null | Video category |

### Video Properties

| # | Field | Type | Description |
|---|-------|------|-------------|
| 13 | isLive | Boolean | Whether the item is/was a live stream |
| 14 | isMembersOnly | Boolean | Members-only gated flag |
| 15 | isPrivate | Boolean | Private/unavailable to public |
| 16 | isFamilySafe | Boolean \| null | Family-safe flag if exposed |
| 17 | isMonetized | Boolean \| null | Monetization detectable flag |
| 18 | isRatingsAllowed | Boolean \| null | Whether likes/dislikes are enabled |
| 19 | commentsTurnedOff | Boolean \| null | Whether comments are disabled |
| 20 | commentsAllowed | Boolean \| null | Convenience mirror |

### Content & Metadata

| # | Field | Type | Description |
|---|-------|------|-------------|
| 21 | description | String \| null | Creator-written description |
| 22 | descriptionLinks | Array