Instagram Video Scraper and Downloader

by neuro-scraper

39 runs
3 users

About Instagram Video Scraper and Downloader

🚀 Unlock Instagram content like never before! Scrape, download, & explore reels, posts & videos with AI-powered fallback, smart proxies, and hidden media links. Perfect for creators & researchers seeking full control & insights.

What does this actor do?

Instagram Video Scraper and Downloader is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

🌟 Instagram Video Scraper and Downloader

One-line hero: Instantly fetch metadata and download media from public Reels or posts. Production-ready, privacy-safe, and built for fast batch runs in Apify Console.

---

## 📖 Short summary

This Actor extracts clean metadata for Instagram Reels/posts and (optionally) fetches downloadable media. It returns structured records to Dataset / Key-Value store and is designed for reliability, proxy-safety, and enterprise-scale runs.

---

## 💡 Use cases: when to use

* Bulk-collect metadata (title, author, upload date, views, likes) for analytics or feeds.
* Attach best-effort download links and optionally download media for archival or processing.
* Quick HTML metadata fallback when the primary scraper hits rate limits.
* Privacy-sensitive workflows where raw media links should be redacted.

---

## ⚡ Quick Start (Console, one-click)

Hero screenshot (Console run): (Add a screenshot/GIF of the Console run here for best conversion.)

One-liner: Paste a list of startUrls into the Input pane and click Run. Results appear in Dataset and Key-Value Store in seconds.

---

## ⚙️ Quick Start (CLI + API)

CLI (one-liner)

```bash
apify run --token=<APIFY_TOKEN>
```

Python (apify-client), minimal example

```python
from apify_client import ApifyClient

client = ApifyClient(token="<APIFY_TOKEN>")
run_input = {
    "mode": "both",
    "startUrls": [{"url": "https://www.instagram.com/reel/SHORTCODE/"}],
    "desired_resolution": "1080p"
}
run = client.actor("your-username/your-actor").call(run_input=run_input)
print(run)
```

---

## Inputs (fields & schema)

Console JSON input example (also saved as input.example.json):

```json
{
  "mode": "scrape",
  "startUrls": [
    {"url": "https://www.instagram.com/reel/SHORTCODE/"}
  ],
  "desired_resolution": "1080p",
  "download": false,
  "merge_if_ffmpeg": false,
  "cookie_file": "<COOKIE_FILE_STORE_KEY_OR_PATH>",
  "hide_media_links": true,
  "preserve_thumbnails": true,
  "maxConcurrency": 3,
  "preferred_proxy_type": "auto",
  "diagnostic": false
}
```

> Tip: The Platform can validate inputs with an input schema. Provide startUrls as an array of objects {"url": "..."} for the Console UI.

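
The tip above asks for startUrls as an array of {"url": "..."} objects. As a minimal sketch, the helper below turns a plain list of URL strings into that shape. The function name `build_run_input` and its defaults are hypothetical and chosen for illustration only; the field names are taken from the input example.

```python
from typing import Any


def build_run_input(urls: list[str], mode: str = "scrape", download: bool = False) -> dict[str, Any]:
    """Wrap plain URL strings in the {"url": ...} objects used by startUrls."""
    # Drop blanks and duplicates while keeping the original order.
    seen: set[str] = set()
    start_urls = []
    for raw in urls:
        url = raw.strip()
        if url and url not in seen:
            seen.add(url)
            start_urls.append({"url": url})
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    return {
        "mode": mode,
        "startUrls": start_urls,
        "download": download,
        "hide_media_links": True,  # mirrors the documented privacy default
    }


run_input = build_run_input([
    "https://www.instagram.com/reel/SHORTCODE/",
    "https://www.instagram.com/reel/SHORTCODE/",  # duplicate, dropped
])
```

The resulting dictionary can be passed as `run_input` to the apify-client call from the Quick Start.
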
---

## ⚙️ Configuration (actor inputs)

| Name | Type | Required | Default | Example | Notes |
| -------------------- | ------- | -------- | -------- | ----------------------- | --------------------------------------------------- |
| mode | string | Yes | "scrape" | "scrape" / "download" | Choose what to run (metadata vs media) |
| startUrls | array | Yes | None | [{"url":"https://..."}] | List of target post/reel URLs |
| proxyConfiguration | object | Optional | {} | {"useApifyProxy": true} | Override actor proxy settings |
| preferred_proxy_type | string | Optional | "auto" | "residential" | Preferred proxy type for sessions |
| force_residential | boolean | Optional | false | true | Alias to force residential proxy |
| download | boolean | Optional | false | true | Whether to download media files |
| desired_resolution | string | Optional | "1080p" | "720p" | Preferred media resolution (UI: string) |
| merge_if_ffmpeg | boolean | Optional | false | true | Use system merger to combine audio+video (optional) |
| cookie_file | string | Optional | None | "" | Cookie file key if authenticated access is needed |
| hide_media_links | boolean | Optional | true | false | Redact raw media URLs in output (privacy-safe) |
| preserve_thumbnails | boolean | Optional | true | false | If false, thumbnails are redacted from output |
| maxConcurrency | integer | Optional | 3 | 5 | Concurrency cap (1–10) |
| diagnostic | boolean | Optional | false | true | Enable verbose logs for debugging |

> Example Console setup: Paste https://www.instagram.com/reel/SHORTCODE/ into startUrls input and click Run Actor.

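
The constraints in the configuration table (required fields, the 1–10 concurrency cap, boolean flags) can be checked locally before a run is started. The sketch below is one possible pre-flight check, not the Actor's own validation. `validate_input` and `ALLOWED_MODES` are hypothetical names, and the set of allowed modes is an assumption pieced together from the table ("scrape", "download") and the Quick Start example ("both").

```python
from typing import Any

# Assumption: the table lists "scrape"/"download"; the Quick Start example uses "both".
ALLOWED_MODES = {"scrape", "download", "both"}
BOOLEAN_FLAGS = (
    "download", "merge_if_ffmpeg", "hide_media_links",
    "preserve_thumbnails", "diagnostic", "force_residential",
)


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems: list[str] = []

    if run_input.get("mode") not in ALLOWED_MODES:
        problems.append("mode must be one of: " + ", ".join(sorted(ALLOWED_MODES)))

    start_urls = run_input.get("startUrls")
    if not isinstance(start_urls, list) or not start_urls:
        problems.append("startUrls must be a non-empty array")
    elif not all(isinstance(item, dict) and item.get("url") for item in start_urls):
        problems.append('each startUrls entry must look like {"url": "..."}')

    concurrency = run_input.get("maxConcurrency", 3)  # 3 is the documented default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(concurrency, bool) or not isinstance(concurrency, int) or not 1 <= concurrency <= 10:
        problems.append("maxConcurrency must be an integer from 1 to 10")

    for flag in BOOLEAN_FLAGS:
        if flag in run_input and not isinstance(run_input[flag], bool):
            problems.append(f"{flag} must be a boolean")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.
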
---

## 📄 Outputs (Dataset / KV examples)

Example output (one record)

```json
{
  "original_url": "https://www.instagram.com/reel/SHORTCODE/",
  "id": "SHORTCODE",
  "ownerUsername": "creator_handle",
  "description": "Post caption text",
  "likesCount": 1234,
  "likesDisplay": "1.2k",
  "commentsCount": 12,
  "commentsDisplay": "12",
  "videoViewCount": 45678,
  "viewsDisplay": "45.7k",
  "upload_date_iso": "2025-03-01T12:34:56Z",
  "upload_date": "1st March 2025",
  "thumbnail": "https://.../thumbnail.jpg",
  "download_links": {"merged_video": "https://..."},
  "_scraped_at": "2025-11-13T12:00:00Z",
  "_source_index": 1
}
```

Notes: Records are written to Dataset (rows) and a full array is stored in Key-Value under key OUTPUT.

---

## 🔑 Environment Variables

* APIFY_TOKEN: use in CLI / API calls. Use placeholder <APIFY_TOKEN> in examples.
* HTTP_PROXY / HTTPS_PROXY: optional when providing a custom proxy like <PROXY_USER:PASS@HOST:PORT>.

> ⚠️ Always store credentials as Secrets in Console (do not paste plaintext into input fields).

---

## ▶️ How to Run (Console, CLI, API)

1. Apify Console: open the Actor, paste startUrls JSON, choose mode, click Run.
2. CLI: apify run --token=<APIFY_TOKEN> (ensure Actor is published or run from project folder).
3. API / apify-client: call the Actor run endpoint with run_input JSON (see snippet above).

Quick checklist before running

* Provide startUrls (required).
* If you need consistent sessions, enable proxyConfiguration or set preferred_proxy_type.
* Toggle hide_media_links to redact raw media URLs for privacy.

---

## ⏰ Scheduling & Webhooks

* Schedule recurring runs from the Console (Runs → Schedule): pick frequency and input.
* Webhooks: configure a webhook on successful run completion to get run payloads (Dataset / Key-Value links) for automation.

---

## Logs & Troubleshooting

* Check Run logs in Console for step-by-step messages.
* Common issues:
  * No startUrls: actor exits early; supply startUrls array.
  * Rate limits / access errors: enable Proxy or try preferred_proxy_type: "residential".
  * Download fails: ensure download is enabled and proxy/cookie settings are correct.

Quick fixes: enable diagnostic: true for verbose logs, or reduce maxConcurrency to avoid bursts.

---

## 🔒 Permissions & Storage Notes

* Output storage: Dataset (records) and Key-Value (OUTPUT key) for full run JSON.
* Privacy-first defaults: hide_media_links = true, preserve_thumbnails = true.
* Do not store secrets in plain input; use Console Secrets or environment variables.

---

## 🔟 Changelog / Versioning (example)

* v1.0.0: Initial public release: metadata-first scraper, HTML fallback, optional downloader, privacy defaults.

---

## 🖌 Notes / TODOs

* TODO: confirm output schema. It is inferred from the Actor, but a formal schema.json will improve the Console UI.
* TODO: add demo GIF/screenshots (provide images or Console screenshots for best conversion).

---

## 🌍 Proxy configuration

Enable Apify Proxy (quick): In Console → Actor run Options → toggle Use Apify proxy.

Custom proxy (example env vars):

```bash
export HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
export HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
```

Notes

* Store proxy credentials as Console Secrets, not plaintext in inputs.
* The Actor supports session-aware proxy URLs for consistent sessions.
* TODO: Consider proxy rotation for large-scale scraping.

---

## 📚 References (official docs)

* How to create an Actor README: https://docs.apify.com/academy/actor-marketing-playbook/actor-basics/how-to-create-an-actor-readme
* Actor input schema: https://docs.apify.com/platform/actors/development/actor-definition/input-schema
* Apify CLI: https://docs.apify.com/cli/

---

## 🤔 What I inferred from main.py

* Primary behavior: metadata-first scraper for public Reels/posts with an HTML fallback when the primary scraper is rate-limited.
* Optional media extraction/download flow that selects best-resolution streams and can merge audio+video using a system merger when enabled.
* Uses a proxy configuration (session-aware) and exposes flags to prefer residential proxies.
* Outputs are written to Dataset and the Key-Value store under key OUTPUT.
* Defaults are privacy-focused: hide_media_links: true, preserve_thumbnails: true, and maxConcurrency capped.

---

Why this Actor?

Quick benefits: production-ready, privacy-safe defaults, plug-and-play in Console, and robust fallback for stable metadata collection.

Run it now and get instant insights in seconds.

Run this Actor on Apify Console and get results instantly.

```json
{
  "mode": "scrape",
  "startUrls": [
    {"url": "https://www.instagram.com/reel/SHORTCODE/"}
  ],
  "desired_resolution": "1080p",
  "download": false,
  "merge_if_ffmpeg": false,
  "cookie_file": "",
  "hide_media_links": true,
  "preserve_thumbnails": true,
  "maxConcurrency": 3,
  "preferred_proxy_type": "auto",
  "diagnostic": false
}
```

# CONFIG.md: Advanced configuration & proxy notes

This optional config file explains advanced options and recommended Console setup for high-volume or sensitive runs.

## Proxy & session hygiene

* Prefer using the Actor's Proxy configuration option in Console (actor run Options) for session-aware URLs.
* If you provide a custom proxy, store credentials as a Console Secret and reference them via environment variables or proxyConfiguration input.

Example env vars

```bash
HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
```

## Large-scale / reliability tips

* Use preferred_proxy_type: "residential" for heavy runs when access errors occur.
* Lower maxConcurrency to reduce bursts when you encounter rate limits.
* Enable diagnostic: true to collect detailed logs for support triage.

## Security & privacy

* hide_media_links defaults to true. Keep it enabled if you must not expose direct media URLs.
* preserve_thumbnails defaults to true. Set it to false to redact thumbnails as well.

## TODOs

* Add an INPUT_SCHEMA.json to the repo for Console UI form validation.
* Add demo screenshots/GIFs to README for higher conversion.

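
The reliability tips in CONFIG.md (lower maxConcurrency, prefer residential proxies, enable diagnostic) can be folded into a simple retry policy. The sketch below only derives a more conservative input for a second attempt; it does not call the Actor. `next_attempt_input` is a hypothetical helper, and halving the concurrency is an arbitrary choice made for illustration, not something the README prescribes.

```python
from typing import Any


def next_attempt_input(run_input: dict[str, Any]) -> dict[str, Any]:
    """Return a more conservative copy of run_input for a retry."""
    retry = dict(run_input)  # shallow copy; the caller's dict is not modified
    current = retry.get("maxConcurrency", 3)  # 3 is the documented default
    retry["maxConcurrency"] = max(1, current // 2)  # halve, but stay within 1-10
    retry["preferred_proxy_type"] = "residential"  # tip for access errors
    retry["diagnostic"] = True  # verbose logs for support triage
    return retry


first_try = {
    "mode": "scrape",
    "startUrls": [{"url": "https://www.instagram.com/reel/SHORTCODE/"}],
    "maxConcurrency": 6,
}
second_try = next_attempt_input(first_try)
```

A caller would pass `second_try` as the run input only after the first run failed with rate-limit or access errors.
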

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Instagram Video Scraper and Downloader now on Apify. Free tier available with no credit card required.


Actor Information

Developer: neuro-scraper
Pricing: Paid
Total Runs: 39
Active Users: 3

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

