Hugging Face Model Scraper
by parseforge
About Hugging Face Model Scraper
Collect models from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames.
What does this actor do?
Hugging Face Model Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
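Once a run's results are downloaded as JSON, a few lines of Python are enough to start exploring them. The sketch below assumes the export is a list of records carrying the `downloads`, `likes`, and `license` fields named in the description above. The helper name `top_models` and the sample records are purely illustrative and are not part of the Actor.

```python
import json


def top_models(records, n=5, license_filter=None):
    """Return the n most-downloaded records, optionally restricted to one license."""
    if license_filter is not None:
        records = [r for r in records if r.get("license") == license_filter]
    # Missing or null download counts are treated as 0 so sorting never fails.
    return sorted(records, key=lambda r: r.get("downloads") or 0, reverse=True)[:n]


# Illustrative records shaped like the exported dataset (all values are made up).
sample_json = """[
  {"id": "org-a/model-1", "downloads": 1200, "likes": 10, "license": "mit"},
  {"id": "org-b/model-2", "downloads": 5400, "likes": 42, "license": "apache-2.0"},
  {"id": "org-c/model-3", "downloads": 300, "likes": 3, "license": "apache-2.0"}
]"""

records = json.loads(sample_json)
ranked = top_models(records, n=2, license_filter="apache-2.0")
print([r["id"] for r in ranked])
```

With a real export, the `sample_json` string would be replaced by the contents of the downloaded file.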
Documentation
🤗 Hugging Face Model Scraper

Collect models from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames. Built for analysts, researchers, and developers who need fast insights with no browser automation.

## 🎯 What does it collect?

- ✅ Model id, name, URL
- ✅ Author
- ✅ Downloads, likes
- ✅ Last modified, createdAt
- ✅ Task (pipeline tag), library
- ✅ License, tags

## How to use

Example run: query "bert", 20 items, sorted by downloads.

## Input

Fields supported:

- query (string): free-text search
- task (string): e.g., text-classification, image-classification, text-generation
- library (string): e.g., transformers, diffusers, timm
- license (string): e.g., apache-2.0, mit, cc-by-4.0
- language (string): e.g., en, zh, multi
- sort (enum): downloads | likes | lastModified | trending
- direction (enum): asc | desc
- maxItems (integer): maximum number of models to return (up to 1,000,000). Free users are limited to 100; for paid users the field is optional, with a maximum of 1,000,000. Prefill value: 10.
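The fields above can be assembled into an input object in code. The following is a minimal sketch, not the Actor's own validation logic: the helper `build_input`, its parameter names, and the choice to clamp `maxItems` on the client side are all assumptions made for illustration. Only the field names, enum values, and limits come from the list above.

```python
ALLOWED_SORT = {"downloads", "likes", "lastModified", "trending"}
ALLOWED_DIRECTION = {"asc", "desc"}
FREE_LIMIT = 100          # free-plan cap stated in the field list
PAID_LIMIT = 1_000_000    # maximum stated in the field list


def build_input(query=None, task=None, library=None, license_id=None,
                language=None, sort="downloads", direction="desc",
                max_items=10, paid=False):
    """Assemble an input dict from the documented fields, omitting unset filters."""
    if sort not in ALLOWED_SORT:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORT)}")
    if direction not in ALLOWED_DIRECTION:
        raise ValueError("direction must be 'asc' or 'desc'")
    if max_items < 1:
        raise ValueError("max_items must be a positive integer")

    # Assumption: clamp locally instead of relying on the Actor to enforce the cap.
    limit = PAID_LIMIT if paid else FREE_LIMIT
    run_input = {
        "sort": sort,
        "direction": direction,
        "maxItems": min(max_items, limit),
    }
    filters = {
        "query": query,
        "task": task,
        "library": library,
        "license": license_id,
        "language": language,
    }
    # Only include filters the caller actually set.
    run_input.update({k: v for k, v in filters.items() if v is not None})
    return run_input


example = build_input(query="bert", task="text-classification",
                      library="transformers", max_items=500)
print(example)
```

The returned dictionary uses the same keys as the Actor's JSON input, so it can be serialized with `json.dumps` and pasted into the input editor.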
Here is an example input written in JSON:

```json
{
  "query": "bert",
  "sort": "downloads",
  "direction": "desc",
  "maxItems": 100
}
```

Pro Tip: Combine multiple filters to narrow down results. For example, search for "bert" models with task "text-classification" and library "transformers" for highly targeted results.

## Output

After the Actor finishes its run, you'll get a dataset with the output. The number of records in the dataset depends on the maxItems value you set. You can download the results as CSV, Excel, or JSON. Here's an example of scraped Hugging Face model data:
```json
{
  "imageUrl": "https://huggingface.co/google-bert/avatar",
  "id": "google-bert/bert-base-uncased",
  "name": "google-bert/bert-base-uncased",
  "url": "https://huggingface.co/google-bert/bert-base-uncased",
  "author": "google-bert",
  "downloads": 54018364,
  "likes": 2423,
  "private": false,
  "gated": false,
  "disabled": false,
  "sha": "86b5e0934494bd15c9632b12f734a8a67f723594",
  "lastModified": "2024-02-19T11:06:12.000Z",
  "createdAt": "2022-03-02T23:29:04.000Z",
  "task": "fill-mask",
  "library": "transformers",
  "license": "apache-2.0",
  "language": ["en"],
  "datasets": ["bookcorpus", "wikipedia"],
  "tags": ["exbert"],
  "files": [
    ".gitattributes",
    "LICENSE",
    "README.md",
    "config.json",
    "model.safetensors",
    "pytorch_model.bin",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.txt"
  ]
}
```

What You Get: Complete model metadata including popularity metrics (downloads, likes), technical details (task, library, license), training information (datasets, language), and available model files.

Download Options: CSV, Excel, or JSON formats for easy analysis in your business tools.

## ⚡ Why choose this scraper?

- ✅ API-first, fast: Uses Hugging Face public API endpoints (no browser)
- ✅ Flexible filtering: query, task, library, license, language, sorting
- ✅ Comprehensive data: Get downloads, likes, tasks, licenses, files, and more
- ✅ User-Friendly: No coding needed, just set filters and go
- ⏰ Time Savings: Save hours compared to manual model research and tracking
- 💰 Cost Efficiency: Fraction of the cost of maintaining custom tracking infrastructure

## 🔧 How to use

1. Sign Up: Create a free account with $5 credit (takes 2 minutes)
2. Find the Scraper: Visit the Hugging Face Model Scraper page
3. Set Input: Add your filters and max items
4. Run It: Click "Start" and let it collect your data
5. Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON

Total Time: 5 minutes setup, 10-30 minutes for data collection

No Technical Skills Required: Everything is point-and-click

## Business Use Cases

AI/ML Researchers:

- Track trending models in your research area
- Monitor model popularity metrics (downloads, likes)
- Identify popular architectures and libraries
- Discover datasets used for training

ML Engineers:

- Find production-ready models for specific tasks
- Compare models by popularity and recency
- Identify licensing requirements before deployment
- Track model updates and new releases

Data Scientists:

- Build comprehensive model catalogs
- Analyze AI/ML trends and adoption patterns
- Identify suitable pre-trained models for projects
- Monitor emerging techniques and libraries

Product Managers:

- Track competitive AI/ML landscape
- Monitor adoption of different model types
- Identify popular solutions for product features
- Support AI strategy with market intelligence

## Integrate with any app and automate your workflow

Hugging Face Model Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform. These include:

- Make
- Zapier
- Slack
- Airbyte
- GitHub
- Google Drive
- and many more.

Alternatively, you can use webhooks to carry out an action whenever an event occurs, e.g. get a notification whenever a run successfully finishes.
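To make the webhook option concrete, here is a minimal sketch of the logic a receiving service might run when a notification arrives. It assumes the payload follows Apify's default webhook template (an `eventType` field plus a `resource` object describing the run, including `defaultDatasetId`); a customized payload template would need different field lookups. The function name and the IDs in the sample payload are hypothetical.

```python
def dataset_url_from_webhook(payload):
    """Return the dataset items URL for a succeeded run, or None otherwise."""
    # Only react to successful runs; other events (failed, aborted) are ignored.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # Assumption: "resource" holds the run object, as in the default payload template.
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"


# Hypothetical payload with placeholder IDs, shaped like the default template.
sample_payload = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorId": "ACTOR_ID", "actorRunId": "RUN_ID"},
    "resource": {"id": "RUN_ID", "defaultDatasetId": "DATASET_ID"},
}
print(dataset_url_from_webhook(sample_payload))
```

Fetching the returned URL generally requires an API token unless the dataset is shared publicly, so a real handler would add authentication before downloading the items.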
## Using with the Apify API

For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing business tools.

- Node.js: Install the apify-client NPM package
- Python: Use the apify-client PyPI package
- See the Apify API reference for full details

## 💰 Pricing

- Start price: $0.005 per run
- Price per 1,000 results: $5.00 (i.e., $0.005 per result)

Free users are automatically limited to 100 items. Paid users can process up to 1,000,000 items; if maxItems is not set, no item limit is applied.

## Frequently Asked Questions

Q: How accurate is the data?
A: We collect data directly from Hugging Face's public API in real time, so results reflect what the API reports at the time of the run.

Q: Can I schedule regular runs?
A: Yes. Use the Apify scheduler or API to schedule daily, weekly, or monthly runs automatically. This is useful for tracking model trends over time.

Q: What's the rate limit?
A: We respect Hugging Face's API limits. The scraper handles rate limiting automatically.

Q: Can I get model descriptions and READMEs?
A: Currently, the scraper focuses on metadata. For full READMEs, you can use the model URLs provided in the output.

Q: What if I need help?
A: Our support team is available. Contact us through the Apify platform.

Q: Is my data secure?
A: All data is encrypted in transit and at rest. We never share your data with third parties.

## Recommended Actors

Looking for more data collection tools?
Check out these related actors:

| Actor | Description | Link |
|-------|-------------|------|
| Hubspot Marketplace Scraper | Extracts business app data from HubSpot marketplace | https://apify.com/parseforge/hubspot-marketplace-scraper |
| PR Newswire Scraper | Extracts press release and news content from PR Newswire | https://apify.com/parseforge/pr-newswire-scraper |
| Smart Apify Actor Scraper (+70 Fields + Actor Quality Metrics) | Collects comprehensive actor data from Apify store | https://apify.com/parseforge/smart-apify-actor-scraper |
| AWS Marketplace Scraper | Extracts business app data from AWS marketplace | https://apify.com/parseforge/aws-marketplace-scraper |
| Stripe App Marketplace Scraper | Collects app data from Stripe marketplace | https://apify.com/parseforge/stripe-marketplace-scraper |

Pro Tip: 💡 Browse our complete collection of data collection actors to find the perfect tool for your business needs.

Need Help? Our support team is here to help you get the most out of this tool.

---

> ⚠️ Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Hugging Face or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try Hugging Face Model Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: parseforge
- Pricing: Paid
- Total Runs: 62
- Active Users: 5