NPM Scraper
by epctex
Scrape NPM packages for titles, maintainers, downloads, dependencies & more. Export clean data to JSON, Excel, or XML for analysis, with no scraping limits.
About NPM Scraper
Ever needed to pull structured data from NPM for an audit, a research project, or to analyze a package's ecosystem? I've been there, manually copying details from pages, and it's a chore. This actor lets you scrape NPM directly, turning that messy public data into clean, ready-to-use datasets.

You can extract everything you'd expect: the package title, maintainer details, the full README, download counts broken down by version, and a list of dependent packages. In short, it pulls all the metadata NPM exposes.

I use it to gather competitive intel on similar packages or to build a directory of tools for a specific niche. Because it outputs to JSON, Excel, or XML, I can feed the data straight into my analysis scripts or share a formatted report with my team without any manual reformatting. There are no artificial limits on the scraping, so you can run it for a single package or batch-process a whole list. It handles the data collection so you can focus on the insights. If you work with the JavaScript/Node.js ecosystem and need data, this automates the tedious part.
What does this actor do?
NPM Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Overview
This actor scrapes data from npmjs.com, providing an alternative to npm's limited official API. It can retrieve package details, search results, and user profiles.
Key Features
- Keyword Search: Search npm with any keyword. Supports npm's special search syntax (e.g., keywords:front-end).
- Package Details: Scrape structured, detailed data for any package, including versions, maintainers, and historical metadata.
- User Packages: Retrieve all packages published by a specific user.
- Performance: Optimized for speed with efficient request handling.
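Search URLs that use the special syntax need the query percent-encoded before they go into startUrls. A minimal sketch using the standard URL API (the base URL is npm's public search page; everything else here is plain platform JavaScript, not part of the actor):

```javascript
// Build an npm search URL that uses the special "keywords:" syntax,
// in a form suitable for the startUrls input field.
const searchUrl = new URL("https://www.npmjs.com/search");
searchUrl.searchParams.set("q", "keywords:front-end");

// The URL API percent-encodes the query for us...
console.log(searchUrl.toString());
// ...and the value round-trips back to the original syntax:
console.log(searchUrl.searchParams.get("q")); // "keywords:front-end"
```

This avoids hand-encoding characters like the colon, which npm's search syntax relies on.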
How to Use
Configure the actor using a JSON input object. The scraper runs on Apify and stores its results in a dataset for later export via the Apify API.
Input Parameters
Provide input as a JSON object with the following fields:
- search: (Optional, String) A keyword to search for on npmjs.com.
- startUrls: (Optional, Array) A list of npm URLs to scrape. Accepts package detail, user profile, or search/list URLs.
- endPage: (Optional, Number) The final page number to scrape for search requests and paginated startUrls. Default is to scrape all pages.
- maxItems: (Optional, Number) Limit the total number of items scraped.
- proxy: (Required, Object) Proxy configuration. You must use a proxy, such as Apify Proxy.
- customMapFunction: (Optional, String) A JavaScript function (as a string) to transform each scraped object.
Tip: To scrape a specific list or page range, use startUrls. For example, providing a link to page 5 and setting "endPage": 6 will scrape only pages 5 and 6.
Input Example
{
  "startUrls": [
    "https://www.npmjs.com/package/lodash",
    "https://www.npmjs.com/~ljharb",
    "https://www.npmjs.com/search?q=keywords:front-end"
  ],
  "endPage": 5,
  "maxItems": 100,
  "proxy": {
    "useApifyProxy": true
  }
}
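The customMapFunction parameter is supplied as a string, per the input spec. The sketch below shows one way to write and locally preview such a function; the field names on the scraped item (name, description, maintainers) are assumptions based on the Output section, not a documented schema:

```javascript
// A customMapFunction, written as a string as the input spec requires.
// The fields read from `item` are illustrative assumptions.
const customMapFunction = `(item) => ({
  name: item.name,
  description: item.description,
  maintainerCount: (item.maintainers || []).length,
})`;

// The full input object would then look like:
const input = {
  startUrls: ["https://www.npmjs.com/package/lodash"],
  maxItems: 100,
  proxy: { useApifyProxy: true },
  customMapFunction,
};

// Locally, the string can be evaluated to preview its behavior
// on a sample item before submitting the input:
const mapFn = eval(customMapFunction);
const preview = mapFn({
  name: "lodash",
  description: "Lodash modular utilities.",
  maintainers: [{}, {}],
});
console.log(preview); // maintainerCount is 2 for the two sample maintainers
```

Keeping the function a pure item-to-object mapping makes it easy to test locally like this before a paid run.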
Output
During the run, the actor outputs progress logs. Results are stored in the run's dataset. Each item is a structured JSON object. For package details, this typically includes:
* url
* name
* description
* versions
* maintainers
* Other package metadata
You can access results using the Apify API in any language. See the Apify API reference for details.
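In JavaScript, results are typically read with the official apify-client package (npm i apify-client). A minimal sketch, with the client passed in so the function works with any object exposing the same dataset(id).listItems() shape; the token and dataset ID below are placeholders:

```javascript
// Read scraped package names from a run's dataset.
// `client` is expected to follow the apify-client shape:
// client.dataset(id).listItems() resolves to a paginated
// list whose `items` array holds the scraped objects.
async function fetchPackageNames(client, datasetId) {
  const { items } = await client.dataset(datasetId).listItems();
  return items.map((item) => item.name);
}

// Usage (token and dataset ID are placeholders):
//   const { ApifyClient } = require("apify-client");
//   const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
//   fetchPackageNames(client, "MY_DATASET_ID").then(console.log);
```

Injecting the client also makes the function trivial to unit-test with a stub, without network access.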
Notes
- A proxy is required for operation.
- Compute unit consumption is typically low (~0.01-0.03 units for 100 listings).
- For bug reports or feature requests, create an issue on the GitHub repository.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try NPM Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer
- epctex
- Pricing
- Paid
- Total Runs
- 1,875
- Active Users
- 14
Related Actors
Web Scraper
by apify
Cheerio Scraper
by apify
Website Content Crawler
by apify
Legacy PhantomJS Crawler
by apify
Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.