NPM Scraper

by epctex

Scrape NPM packages for titles, maintainers, downloads, dependencies & more. Export clean data to JSON, Excel, or XML for analysis, with no scraping limits.

1,875 runs
14 users

About NPM Scraper

Ever needed to pull structured data from NPM for an audit, a research project, or to analyze a package's ecosystem? I've been there, manually copying details from pages, and it's a chore. This actor lets you scrape NPM directly, turning that messy public data into clean, ready-to-use datasets.

You can extract everything you'd expect: the package title, maintainer details, the full README, download counts broken down by version, and a list of dependent packages. It pulls all the metadata NPM exposes.

I use it to gather competitive intel on similar packages or to build a directory of tools for a specific niche. The fact that it outputs to JSON, Excel, or XML means I can feed the data straight into my analysis scripts or share a formatted report with my team without any manual reformatting.

There are no artificial limits on the scraping, so you can run it for a single package or batch-process a whole list. It just handles the data collection, so you can focus on the insights. If you work with the JavaScript/Node.js ecosystem and need data, this automates the tedious part.

What does this actor do?

NPM Scraper is a web scraping tool available on the Apify platform. It extracts package details, search results, and user profiles from npmjs.com and runs entirely in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Open the NPM Scraper actor page on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Overview

This actor scrapes data from npmjs.com, providing an alternative to npm's limited official API. It can retrieve package details, search results, and user profiles.

Key Features

  • Keyword Search: Search npm with any keyword. Supports npm's special search syntax (e.g., keywords:front-end).
  • Package Details: Scrape structured, detailed data for any package, including versions, maintainers, and historical metadata.
  • User Packages: Retrieve all packages published by a specific user.
  • Performance: Optimized for speed with efficient request handling.
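The three kinds of page listed above map onto three URL shapes, all of which appear in the input example in this document. As a rough sketch, the helpers below build those URLs with Python's standard library. The function names are made up for illustration and are not part of the actor.

```python
from urllib.parse import quote

NPM_BASE = "https://www.npmjs.com"


def package_url(name: str) -> str:
    """Package detail page. '@' and '/' are kept so scoped names stay intact."""
    return f"{NPM_BASE}/package/{quote(name, safe='@/')}"


def user_url(username: str) -> str:
    """User profile page, which lists the packages that user has published."""
    return f"{NPM_BASE}/~{quote(username, safe='')}"


def search_url(query: str) -> str:
    """Search page. ':' is kept so qualifiers like keywords:front-end survive."""
    return f"{NPM_BASE}/search?q={quote(query, safe=':')}"


if __name__ == "__main__":
    print(package_url("lodash"))
    print(user_url("ljharb"))
    print(search_url("keywords:front-end"))
```

Any of these strings can go straight into startUrls.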

How to Use

Configure the actor using a JSON input object. The scraper runs on Apify and stores its results in a dataset for later export via the Apify API.

Input Parameters

Provide input as a JSON object with the following fields:

  • search: (Optional, String) A keyword to search for on npmjs.com.
  • startUrls: (Optional, Array) A list of npm URLs to scrape. Accepts package detail, user profile, or search/list URLs.
  • endPage: (Optional, Number) The last page number to scrape for search requests and paginated startUrls. By default, all pages are scraped.
  • maxItems: (Optional, Number) Limit the total number of items scraped.
  • proxy: (Required, Object) Proxy configuration. You must use a proxy, such as Apify Proxy.
  • customMapFunction: (Optional, String) A JavaScript function (as a string) to transform each scraped object.

Tip: To scrape a specific list or page range, use startUrls. For example, providing a link to page 5 and setting "endPage": 6 will scrape only pages 5 and 6.
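If you generate inputs from a script, a small builder keeps the field names consistent. The sketch below uses only the parameters listed above. The build_input helper is hypothetical, the rule that a run needs either search or startUrls is an assumption, and the page=5 query parameter in the usage example is a guess at what a link to page 5 looks like, so copy the real link from your browser.

```python
import json
from typing import Any, Dict, List, Optional


def build_input(
    start_urls: Optional[List[str]] = None,
    search: Optional[str] = None,
    end_page: Optional[int] = None,
    max_items: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an actor input using only the documented field names."""
    # Assumption (not stated in the docs): a run needs a search keyword,
    # at least one start URL, or both.
    if not search and not start_urls:
        raise ValueError("provide search, start_urls, or both")

    # proxy is the only required field, so it is always present.
    run_input: Dict[str, Any] = {"proxy": {"useApifyProxy": True}}
    if search:
        run_input["search"] = search
    if start_urls:
        run_input["startUrls"] = list(start_urls)
    # Optional limits are left out when unset, so the actor's defaults
    # (for endPage: scrape all pages) still apply.
    if end_page is not None:
        run_input["endPage"] = end_page
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


if __name__ == "__main__":
    # Hypothetical link to page 5 of a search result list.
    page_5 = "https://www.npmjs.com/search?q=keywords:front-end&page=5"
    # Per the tip above, endPage 6 should limit the run to pages 5 and 6.
    print(json.dumps(build_input(start_urls=[page_5], end_page=6), indent=2))
```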

Input Example

{
  "startUrls": [
    "https://www.npmjs.com/package/lodash",
    "https://www.npmjs.com/~ljharb",
    "https://www.npmjs.com/search?q=keywords:front-end"
  ],
  "endPage": 5,
  "maxItems": 100,
  "proxy": {
    "useApifyProxy": true
  }
}
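The example above does not use customMapFunction. Since that parameter is a JavaScript function passed as a string, here is one way it might be attached to an input from Python. This is a sketch under an assumption: the function is taken to receive one scraped item and return the object to store, which this section does not spell out. The with_custom_map helper is hypothetical, and the name and url fields come from the Output list in this document.

```python
import json

# Assumption: the function receives one scraped item and returns the object
# to store. The exact calling convention is not documented in this section.
CUSTOM_MAP_FUNCTION = """(object) => {
  return { name: object.name, url: object.url };
}"""


def with_custom_map(run_input: dict, function_source: str) -> dict:
    """Return a copy of run_input with customMapFunction set to the JS source."""
    updated = dict(run_input)
    updated["customMapFunction"] = function_source
    return updated


if __name__ == "__main__":
    base = {
        "startUrls": ["https://www.npmjs.com/package/lodash"],
        "proxy": {"useApifyProxy": True},
    }
    # json.dumps escapes the newlines inside the function string.
    print(json.dumps(with_custom_map(base, CUSTOM_MAP_FUNCTION), indent=2))
```

Keeping the function in a triple-quoted string and letting json.dumps do the escaping avoids hand-writing an escaped one-liner.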

Output

During the run, the actor outputs progress logs. Results are stored in the run's dataset. Each item is a structured JSON object. For package details, this typically includes:

  • url
  • name
  • description
  • versions
  • maintainers
  • Other package metadata

You can access results using the Apify API in any language. See the Apify API reference for details.
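As a minimal sketch of that access, the snippet below builds the URL for the Apify API's dataset items endpoint and downloads the items as JSON using only the standard library. The dataset ID and token are placeholders you would take from your own run and account, and the helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"
# The three export formats mentioned in this document, as API format values.
FORMATS = {"json", "xlsx", "xml"}


def dataset_items_url(dataset_id: str, token: str, fmt: str = "json") -> str:
    """Build the URL that returns a dataset's items in the chosen format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt, "token": token})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def fetch_items(dataset_id: str, token: str) -> list:
    """Download all items of a dataset as a list of dicts (network call)."""
    with urlopen(dataset_items_url(dataset_id, token, "json")) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values: substitute the dataset ID of a finished run
    # and your own API token before calling fetch_items.
    print(dataset_items_url("YOUR_DATASET_ID", "YOUR_API_TOKEN", "xml"))
```

Switching fmt to "xlsx" or "xml" gives a download link for the Excel or XML export.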

Notes

  • A proxy is required for operation.
  • Compute unit consumption is typically low: roughly 0.01 to 0.03 compute units per 100 listings.
  • For bug reports or feature requests, create an issue on the GitHub repository.
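To get a feel for the compute unit figure in the notes above, the sketch below scales the quoted range to a larger run. It assumes consumption grows linearly with the number of listings, which the notes do not guarantee, so treat the result as a ballpark rather than a quote.

```python
from typing import Tuple

# Range quoted in the notes: compute units per 100 listings.
LOW_PER_100 = 0.01
HIGH_PER_100 = 0.03


def estimate_compute_units(listings: int) -> Tuple[float, float]:
    """Return a (low, high) estimate, assuming linear scaling with listings."""
    if listings < 0:
        raise ValueError("listings must be non-negative")
    scale = listings / 100
    return (LOW_PER_100 * scale, HIGH_PER_100 * scale)


if __name__ == "__main__":
    low, high = estimate_compute_units(5000)
    print(f"5,000 listings: about {low:.2f} to {high:.2f} compute units")
```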

Common Use Cases

  • Market Research: Gather competitive intelligence and market data
  • Lead Generation: Extract contact information for sales outreach
  • Price Monitoring: Track competitor pricing and product changes
  • Content Aggregation: Collect and organize content from multiple sources

Actor Information

  • Developer: epctex
  • Pricing: Paid
  • Total Runs: 1,875
  • Active Users: 14

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
