Amazon Product Scrapper

Name: Amazon Product Scrapper
Author: happitap

397 runs

80 users

About Amazon Product Scrapper

Amazon Product Scraper - extracts product details from Amazon product pages, search results, and category pages with structured data including title, ASIN, price, ratings, reviews, availability, seller, and more.

What does this actor do?

Amazon Product Scrapper is a web scraping actor on the Apify platform. It runs in the cloud and extracts structured product data from Amazon product, search, and category pages.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

# Amazon Product Scraper

An Apify actor that extracts product details from Amazon product pages, search results, and category pages with comprehensive structured data including availability, seller information, and more.

## What It Does

This scraper extracts structured data from various Amazon pages including:

| Field | Description |
|-------|-------------|
| `title` | Product name/title |
| `asin` | Amazon Standard Identification Number |
| `price` | Current product price |
| `rating` | Average user rating (e.g., 4.5) |
| `reviewCount` | Total number of user reviews |
| `availability` | In stock / out of stock status |
| `seller` | Sold by (seller name) |
| `category` | Main category or breadcrumb |
| `url` | Direct product URL on Amazon |
| `imageUrl` | Product image thumbnail |

## Supported Page Types

- Product Pages: Individual product detail pages (e.g., `/dp/B08J8KJ9T3`)
- Search Results: Search query results (e.g., `/s?k=wireless+earbuds`)
- Category Pages: Category browsing pages (e.g., `/Best-Sellers-Electronics`)
- Best Sellers: Best sellers pages (e.g., `/Best-Sellers/zgbs`)

## Use Cases

- Product Research: Extract detailed product information for market analysis
- Price Monitoring: Track product prices and availability across categories
- Competitor Analysis: Monitor competitor products and pricing strategies
- Inventory Tracking: Check product availability and seller information
- E-commerce Data Collection: Gather comprehensive product catalogs

## Input

The actor accepts the following input format:

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}
```

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `startUrls` | Array | Yes | - | Array of objects with `url` property pointing to Amazon pages |
| `maxItems` | Number | No | `50` | Maximum number of products to extract per page |

### Supported Amazon URLs

The scraper works with various Amazon page types:

Product Pages:

- `https://www.amazon.com/dp/B08J8KJ9T3`
- `https://www.amazon.com/gp/product/B08J8KJ9T3`

Search Results:

- `https://www.amazon.com/s?k=wireless+earbuds`
- `https://www.amazon.com/s?k=laptop&ref=sr_pg_1`

Category Pages:

- `https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics`
- `https://www.amazon.com/Best-Sellers/zgbs`

## Output

The actor outputs structured data for each product found:

```json
{
  "title": "Apple AirPods (3rd Generation)",
  "asin": "B08J8KJ9T3",
  "price": "$169.00",
  "rating": 4.7,
  "reviewCount": 15600,
  "availability": "In Stock",
  "seller": "Amazon.com Services LLC",
  "category": "Electronics > Headphones",
  "url": "https://www.amazon.com/dp/B08J8KJ9T3",
  "imageUrl": "https://m.media-amazon.com/images/I/61SUj2aKoEL._AC_UL320_.jpg",
  "scrapedAt": "2024-01-01T00:00:00.000Z"
}
```

## Example Usage

### Search for Wireless Earbuds

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}
```

### Extract from Product Page

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/dp/B08J8KJ9T3" }
  ],
  "maxItems": 1
}
```

### Multiple Sources

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=laptop" },
    { "url": "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics" }
  ],
  "maxItems": 25
}
```

## How It Works

1. Page Type Detection: Automatically detects whether the URL is a product page, search results, or category page
2. Appropriate Handler: Routes to the correct scraping handler based on page type
3. Data Extraction: Uses specialized selectors for each page type to extract product information
4. Comprehensive Fields: Extracts all required fields including availability and seller information
5. Data Validation: Ensures only products with valid ASINs and titles are included
6. Structured Output: Returns clean, structured data ready for analysis

## Features

- Multi-Page Support: Handles product pages, search results, and category pages
- Robust Extraction: Multiple fallback selectors to handle Amazon's changing page structure
- Stealth Mode: Uses Puppeteer with stealth plugins to avoid detection
- Proxy Support: Built-in proxy rotation for reliable scraping
- Error Handling: Graceful error handling with detailed logging
- Data Validation: Ensures data quality with validation checks
- Availability Tracking: Extracts stock status and seller information

## Installation

1. Clone this repository
2. Install dependencies: `npm install`
3. Run the actor: `npm start`

## Development

- `npm start` - Run the actor
- `npm run format` - Format code with Prettier
- `npm run lint` - Run ESLint
- `npm run lint:fix` - Fix ESLint issues

## Architecture

- `src/main.js` - Main entry point and input validation
- `src/routes.js` - Request routing and page type detection
- `src/handlers/amazonProductPage.js` - Individual product page scraping logic
- `src/handlers/amazonSearchResults.js` - Search results and category page scraping logic
- `src/puppeteerLauncher.js` - Puppeteer browser configuration with stealth mode

## Notes

- The scraper is designed to be respectful of Amazon's servers and includes appropriate delays
- Results may vary based on Amazon's page structure changes
- The scraper automatically handles different Amazon page layouts and product formats
- All extracted data is timestamped for tracking purposes
- Product pages return single items, while search/category pages return multiple products
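
## Sketch: Page Type Detection and Validation

Steps 1, 2, and 5 under How It Works (page type detection, routing, and ASIN/title validation) can be illustrated with a minimal sketch. This is not the actor's actual code: the contents of `src/routes.js` are not shown here, so the function names `detectPageType`, `extractAsin`, and `isValidProduct`, the URL patterns, and the assumption that an ASIN is 10 uppercase alphanumeric characters are all hypothetical and inferred only from the example URLs in this document.

```javascript
// Hypothetical helpers for illustration only; the actor's real routing
// logic (src/routes.js) is not shown in this document.

// Assumption: an ASIN is a 10-character uppercase alphanumeric identifier.
const ASIN_PATTERN = /^[A-Z0-9]{10}$/;

/** Classify an Amazon URL as 'product', 'search', 'category', or 'unknown'. */
function detectPageType(rawUrl) {
    const { pathname } = new URL(rawUrl);
    // Product pages: /dp/<ASIN> or /gp/product/<ASIN>
    if (/\/(dp|gp\/product)\/[A-Z0-9]{10}(\/|$)/.test(pathname)) return 'product';
    // Search results: /s?k=...
    if (pathname === '/s' || pathname.startsWith('/s/')) return 'search';
    // Category and best-seller pages: .../zgbs/... or /Best-Sellers...
    if (/\/zgbs(\/|$)/.test(pathname) || /^\/Best-Sellers/i.test(pathname)) return 'category';
    return 'unknown';
}

/** Pull the ASIN out of a product URL, or return null when none is present. */
function extractAsin(rawUrl) {
    const { pathname } = new URL(rawUrl);
    const match = pathname.match(/\/(?:dp|gp\/product)\/([A-Z0-9]{10})(?:\/|$)/);
    return match ? match[1] : null;
}

/** Keep only records that carry a valid ASIN and a non-empty title. */
function isValidProduct(item) {
    return typeof item.asin === 'string'
        && ASIN_PATTERN.test(item.asin)
        && typeof item.title === 'string'
        && item.title.trim().length > 0;
}

console.log(detectPageType('https://www.amazon.com/dp/B08J8KJ9T3'));        // product
console.log(detectPageType('https://www.amazon.com/s?k=wireless+earbuds')); // search
console.log(detectPageType('https://www.amazon.com/Best-Sellers/zgbs'));    // category
console.log(extractAsin('https://www.amazon.com/gp/product/B08J8KJ9T3'));   // B08J8KJ9T3
```

A router built on this idea would call `detectPageType` once per start URL and hand the request to the product handler or the search/category handler accordingly, then filter scraped records through `isValidProduct` before saving them.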

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: happitap
Pricing: Paid
Total Runs: 397
Active Users: 80
