Metadata Scraper

by autofacts

64,340 runs
57 users

About Metadata Scraper

A powerful web scraper that extracts various types of structured metadata from web pages, including JSON-LD, Microdata, Open Graph, Twitter Cards, and more. Perfect for SEO analysis, content aggregation, and research purposes.

What does this actor do?

Metadata Scraper is an Actor on the Apify platform. It crawls the start URLs you provide and extracts structured metadata from each page, running entirely in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
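
For step 3, the input can be prepared as a JSON object. The sketch below is a minimal illustration, not the Actor's own code: it uses the parameter names and defaults from the Input Parameters table in the documentation, while the helper name `build_run_input` and the `{"url": ...}` wrapping of each start URL are assumptions (the Actor may accept plain strings instead).

```python
import json
from typing import Any


def build_run_input(
    urls: list[str],
    max_requests_per_crawl: int = 100,
    max_concurrency: int = 10,
    extract_meta_tags: bool = True,
) -> dict[str, Any]:
    """Assemble an input object using the documented names and defaults."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    if max_requests_per_crawl < 1 or max_concurrency < 1:
        raise ValueError("limits must be positive integers")
    return {
        # Assumption: each start URL is wrapped as {"url": ...}.
        "startUrls": [{"url": u} for u in urls],
        "maxRequestsPerCrawl": max_requests_per_crawl,
        "maxConcurrency": max_concurrency,
        "extractMetaTags": extract_meta_tags,
    }


run_input = build_run_input(["https://example.com"], max_requests_per_crawl=20)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Actor's input editor; lowering `maxRequestsPerCrawl` for a first run keeps the crawl small while you check the output.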

Documentation

## Features

- 🔍 Comprehensive Metadata Extraction:
  - JSON-LD structured data
  - Microdata structured data (schema.org)
  - Open Graph metadata
  - Twitter Card metadata
  - Website icons/favicons
  - Standard meta tags
- ⚙️ Advanced Configuration:
  - Configurable crawling depth
  - Adjustable concurrency
  - Request limits
  - Proxy support
- 🚀 Robust Performance:
  - Efficient HTML parsing
  - Handles multiple JSON-LD formats
  - Support for various icon formats

## Input Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| startUrls | Array | URLs to start crawling from | (required) |
| maxRequestsPerCrawl | Integer | Maximum number of pages to crawl | 100 |
| maxConcurrency | Integer | Maximum number of pages processed in parallel | 10 |
| extractMetaTags | Boolean | Whether to extract all meta tags | true |

## Output Data Structure

For each crawled page, the scraper outputs a JSON object with the following fields:

| Field | Type | Description |
|-------|------|-------------|
| url | String | The URL of the crawled page |
| title | String | The page title |
| icon | String | URL of the website's icon/favicon |
| linkedData | Array | JSON-LD structured data found on the page |
| microdata | Array | Microdata structured data (schema.org) found on the page |
| openGraph | Object | Open Graph metadata (used by Facebook and other platforms) |
| twitterCard | Object | Twitter Card metadata |
| metaTags | Object | Other meta tags from the page (when extractMetaTags is enabled) |

## Example Use Cases

### E-commerce Research

Extract product information, pricing, availability, and reviews from various online stores for competitive analysis or price monitoring.
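
As a rough sketch of the e-commerce case, the snippet below picks JSON-LD nodes of a given `@type` out of a record's `linkedData` array. The record is hypothetical (no product page appears in the example outputs), the helper names `iter_nodes` and `find_by_type` are made up for illustration, and descending into `@graph` containers is an assumption about how some sites nest their JSON-LD, not documented scraper behavior.

```python
from typing import Any, Iterator


def iter_nodes(linked_data: list[Any]) -> Iterator[dict[str, Any]]:
    """Yield every JSON-LD object, descending into "@graph" containers."""
    for node in linked_data:
        if not isinstance(node, dict):
            continue
        yield node
        graph = node.get("@graph")
        if isinstance(graph, list):
            yield from iter_nodes(graph)


def find_by_type(record: dict[str, Any], wanted: str) -> list[dict[str, Any]]:
    """Return JSON-LD nodes in record["linkedData"] whose @type matches."""
    matches = []
    for node in iter_nodes(record.get("linkedData") or []):
        types = node.get("@type")
        if isinstance(types, str):
            types = [types]  # "@type" may be a single string or a list
        if isinstance(types, list) and wanted in types:
            matches.append(node)
    return matches


# Hypothetical record, shaped like the documented output fields.
record = {
    "url": "https://shop.example/widget",
    "linkedData": [
        {"@type": "BreadcrumbList"},
        {
            "@type": "Product",
            "name": "Widget",
            "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "EUR"},
        },
    ],
}

products = find_by_type(record, "Product")
for product in products:
    offer = product.get("offers")
    if isinstance(offer, dict):  # real pages may also use a list of offers
        print(product["name"], offer.get("price"), offer.get("priceCurrency"))
```
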
### Content Aggregation

Build a news aggregator or content recommendation engine by extracting article metadata from different sources.

### SEO Analysis

Analyze websites' structured data implementation for SEO optimization recommendations.

### Social Media Preview Testing

Test how your content will appear when shared on social media platforms by extracting Open Graph and Twitter Card data.

## Example Outputs

### Medium Article

```json
{
  "url": "https://medium.com/coding-beauty/new-google-project-idx-fae1fdd079c7",
  "title": "This new IDE from Google is an absolute game changer | by Tari Ibaba | Coding Beauty | Mar, 2025 | Medium",
  "linkedData": [
    {
      "@context": "http://schema.org",
      "@type": "NewsArticle",
      "image": [
        "https://miro.medium.com/v2/resize:fit:1200/1*f-1HQQng85tbA7kwgECqoQ.png"
      ],
      "url": "https://medium.com/coding-beauty/new-google-project-idx-fae1fdd079c7",
      "dateCreated": "2025-03-11T19:45:26.427Z",
      "datePublished": "2025-03-11T19:45:26.427Z",
      "dateModified": "2025-04-15T14:25:56.263Z",
      "headline": "This new IDE from Google is an absolute game changer",
      "name": "This new IDE from Google is an absolute game changer",
      "description": "I was not surprised to see this sort of thing coming from Google — with their deep-seated hatred for local desktop apps. Loading your projects from GitHub and then install dependencies instantly…",
      "identifier": "fae1fdd079c7",
      "author": {
        "@type": "Person",
        "name": "Tari Ibaba",
        "url": "https://medium.com/@tariibaba"
      },
      "creator": [
        "Tari Ibaba"
      ],
      "publisher": {
        "@type": "Organization",
        "name": "Coding Beauty",
        "url": "https://medium.com/coding-beauty",
        "logo": {
          "@type": "ImageObject",
          "width": 272,
          "height": 60,
          "url": "https://miro.medium.com/v2/resize:fit:544/7*V1_7XP4snlmqrc_0Njontw.png"
        }
      },
      "mainEntityOfPage": "https://medium.com/coding-beauty/new-google-project-idx-fae1fdd079c7",
      "isAccessibleForFree": "False",
      "hasPart": {
        "@type": "WebPageElement",
        "isAccessibleForFree": "False",
        "cssSelector": ".meteredContent"
      }
    }
  ],
  "microdata": [],
  "openGraph": {
    "site_name": "Medium",
    "type": "article",
    "title": "This new IDE from Google is an absolute game changer",
    "description": "This new IDE from Google is seriously revolutionary.",
    "url": "https://medium.com/coding-beauty/new-google-project-idx-fae1fdd079c7",
    "image": "https://miro.medium.com/v2/resize:fit:1200/1*f-1HQQng85tbA7kwgECqoQ.png"
  },
  "twitterCard": {
    "app:name:iphone": "Medium",
    "app:id:iphone": "828256236",
    "site": "@CodingBeautyDev",
    "app:url:iphone": "medium://p/fae1fdd079c7",
    "image:src": "https://miro.medium.com/v2/resize:fit:1200/1*f-1HQQng85tbA7kwgECqoQ.png",
    "card": "summary_large_image",
    "creator": "@tariibabadev",
    "label1": "Reading time",
    "data1": "5 min read",
    "title": "This new IDE from Google is an absolute game changer",
    "description": "This new IDE from Google is seriously revolutionary.",
    "image": "https://miro.medium.com/v2/resize:fit:1200/1*f-1HQQng85tbA7kwgECqoQ.png",
    "has_large_image": "true"
  },
  "metaTags": {
    "viewport": "width=device-width,minimum-scale=1,initial-scale=1,maximum-scale=1",
    "theme-color": "#000000",
    "al:ios:app_name": "Medium",
    "al:ios:app_store_id": "828256236",
    "al:android:package": "com.medium.reader",
    "fb:app_id": "542599432471018",
    "article:published_time": "2025-04-10T09:50:11.344Z",
    "title": "This new IDE from Google is an absolute game changer | by Tari Ibaba | Coding Beauty | Mar, 2025 | Medium",
    "al:android:url": "medium://p/fae1fdd079c7",
    "al:ios:url": "medium://p/fae1fdd079c7",
    "al:android:app_name": "Medium",
    "description": "I was not surprised to see this sort of thing coming from Google — with their deep-seated hatred for local desktop apps. Loading your projects from GitHub and then install dependencies instantly…",
    "al:web:url": "https://medium.com/coding-beauty/new-google-project-idx-fae1fdd079c7",
    "article:author": "https://medium.com/@tariibaba",
    "author": "Tari Ibaba",
    "robots": "index,noarchive,follow,max-image-preview:large",
    "referrer": "unsafe-url"
  },
  "icon": "https://miro.medium.com/v2/resize:fill:304:304/10fd5c419ac61637245384e7099e131627900034828f4f386bdaa47a74eae156"
}
```

### YouTube Video

```json
{
  "url": "https://www.youtube.com/watch?v=YCgnccJW_O0",
  "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide) - YouTube",
  "linkedData": [],
  "microdata": [
    {
      "@type": "http://schema.org/VideoObject",
      "url": [
        "https://www.youtube.com/watch?v=YCgnccJW_O0",
        "http://www.youtube.com/@Semicolon10",
        "https://i.ytimg.com/vi/YCgnccJW_O0/maxresdefault.jpg"
      ],
      "name": [
        "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide)",
        "Semicolon"
      ],
      "description": "Want to deploy blazing fast serverless functions using Rust? In this video, I'll show you how to run Rust on Vercel with zero hassle. We'll build a simple AP...",
      "requiresSubscription": "False",
      "identifier": "YCgnccJW_O0",
      "duration": "PT12M15S",
      "position": "1",
      "thumbnailUrl": "https://i.ytimg.com/vi/YCgnccJW_O0/maxresdefault.jpg",
      "width": [
        "1280",
        "1280"
      ],
      "height": [
        "720",
        "720"
      ],
      "embedUrl": "https://www.youtube.com/embed/YCgnccJW_O0",
      "playerType": "HTML5 Flash",
      "isFamilyFriendly": "true",
      "regionsAllowed": "AD,AE,AF,AG,AI,AL,AM,AO,AQ,AR,AS,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BJ,BL,BM,BN,BO,BQ,BR,BS,BT,BV,BW,BY,BZ,CA,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EE,EG,EH,ER,ES,ET,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,IO,IQ,IR,IS,IT,JE,JM,JO,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MF,MG,MH,MK,ML,MM,MN,MO,MP,MQ,MR,MS,MT,MU,MV,MW,MX,MY,MZ,NA,NC,NE,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,PA,PE,PF,PG,PH,PK,PL,PM,PN,PR,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,SS,ST,SV,SX,SY,SZ,TC,TD,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TR,TT,TV,TW,TZ,UA,UG,UM,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,YE,YT,ZA,ZM,ZW",
      "interactionType": [
        "https://schema.org/LikeAction",
        "https://schema.org/WatchAction"
      ],
      "userInteractionCount": [
        "28",
        "525"
      ],
      "datePublished": "2025-04-18T05:31:11-07:00",
      "uploadDate": "2025-04-18T05:31:11-07:00",
      "genre": "Science & Technology"
    }
  ],
  "openGraph": {
    "site_name": "YouTube",
    "url": "https://www.youtube.com/watch?v=YCgnccJW_O0",
    "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide)",
    "image": "https://i.ytimg.com/vi/YCgnccJW_O0/maxresdefault.jpg",
    "image:width": "1280",
    "image:height": "720",
    "description": "Want to deploy blazing fast serverless functions using Rust? In this video, I'll show you how to run Rust on Vercel with zero hassle. We'll build a simple AP...",
    "type": "video.other",
    "video:url": "https://www.youtube.com/embed/YCgnccJW_O0",
    "video:secure_url": "https://www.youtube.com/embed/YCgnccJW_O0",
    "video:type": "text/html",
    "video:width": "1280",
    "video:height": "720"
  },
  "twitterCard": {
    "card": "player",
    "site": "@youtube",
    "url": "https://www.youtube.com/watch?v=YCgnccJW_O0",
    "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide)",
    "description": "Want to deploy blazing fast serverless functions using Rust? In this video, I'll show you how to run Rust on Vercel with zero hassle. We'll build a simple AP...",
    "image": "https://i.ytimg.com/vi/YCgnccJW_O0/maxresdefault.jpg",
    "app:name:iphone": "YouTube",
    "app:id:iphone": "544007664",
    "app:name:ipad": "YouTube",
    "app:id:ipad": "544007664",
    "app:url:iphone": "vnd.youtube://www.youtube.com/watch?v=YCgnccJW_O0&feature=applinks",
    "app:url:ipad": "vnd.youtube://www.youtube.com/watch?v=YCgnccJW_O0&feature=applinks",
    "app:name:googleplay": "YouTube",
    "app:id:googleplay": "com.google.android.youtube",
    "app:url:googleplay": "https://www.youtube.com/watch?v=YCgnccJW_O0",
    "player": "https://www.youtube.com/embed/YCgnccJW_O0",
    "player:width": "1280",
    "player:height": "720"
  },
  "metaTags": {
    "theme-color": "rgba(255, 255, 255, 0.98)",
    "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide)",
    "description": "Want to deploy blazing fast serverless functions using Rust? In this video, I'll show you how to run Rust on Vercel with zero hassle. We'll build a simple AP...",
    "keywords": "video, sharing, camera phone, video phone, free, upload",
    "al:ios:app_store_id": "544007664",
    "al:ios:app_name": "YouTube",
    "al:ios:url": "vnd.youtube://www.youtube.com/watch?v=YCgnccJW_O0&feature=applinks",
    "al:android:url": "vnd.youtube://www.youtube.com/watch?v=YCgnccJW_O0&feature=applinks",
    "al:web:url": "http://www.youtube.com/watch?v=YCgnccJW_O0&feature=applinks",
    "al:android:app_name": "YouTube",
    "al:android:package": "com.google.android.youtube",
    "fb:app_id": "87741124305"
  },
  "icon": "https://www.youtube.com/s/desktop/c722ba88/img/logos/favicon_32x32.png"
}
```

## Getting Help

If you need help or have questions about using this Actor, please don't hesitate to submit issues.
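
For the Social Media Preview Testing use case, one output record can be reduced to the handful of fields a share card typically needs. This is a minimal sketch under stated assumptions: the helper names `social_preview` and `missing_fields` are hypothetical, and the precedence used here (Twitter Card values first, then Open Graph, then the page title) is an illustrative choice, since each platform applies its own fallback rules. The sample record is trimmed from the YouTube example output.

```python
from typing import Any, Optional


def social_preview(record: dict[str, Any]) -> dict[str, Optional[str]]:
    """Resolve card type, title, description and image for one record."""
    og = record.get("openGraph") or {}
    tw = record.get("twitterCard") or {}

    def pick(*candidates: Optional[str]) -> Optional[str]:
        # The first non-empty candidate wins; None if all are missing.
        return next((c for c in candidates if c), None)

    return {
        "card": tw.get("card"),
        "title": pick(tw.get("title"), og.get("title"), record.get("title")),
        "description": pick(tw.get("description"), og.get("description")),
        # Some pages (see the Medium example) use "image:src" for the card image.
        "image": pick(tw.get("image"), tw.get("image:src"), og.get("image")),
    }


def missing_fields(preview: dict[str, Optional[str]]) -> list[str]:
    """List preview fields that could not be resolved from any source."""
    return sorted(key for key, value in preview.items() if value is None)


# Trimmed from the YouTube example output.
record = {
    "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide) - YouTube",
    "openGraph": {
        "title": "Rust on Vercel | Build and Deploy Blazing Fast Serverless Functions (Full Guide)",
        "image": "https://i.ytimg.com/vi/YCgnccJW_O0/maxresdefault.jpg",
    },
    "twitterCard": {"card": "player", "site": "@youtube"},
}

preview = social_preview(record)
print(preview)
print("missing:", missing_fields(preview))
```

A non-empty result from `missing_fields` flags pages whose share cards may render incompletely.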

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Metadata Scraper now on Apify. Free tier available with no credit card required.

Actor Information

Developer: autofacts
Pricing: Paid
Total Runs: 64,340
Active Users: 57

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
