Amazon Product Scrapper
by happitap
About Amazon Product Scrapper
Amazon Product Scraper - extracts product details from Amazon product pages, search results, and category pages with structured data including title, ASIN, price, ratings, reviews, availability, seller, and more.
What does this actor do?
Amazon Product Scrapper is an actor on the Apify platform. It runs in the cloud and extracts structured product data from Amazon product pages, search results, and category pages.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
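For the input configuration step, the actor's README documents a JSON input with a required `startUrls` array of `{ "url": ... }` objects and an optional `maxItems` (default 50). The following is a minimal sketch of building that object in JavaScript before pasting it into the Apify input editor. The helper name `buildActorInput` and the loose host check are assumptions for illustration, not part of the actor.

```javascript
// Hypothetical helper (not part of the actor): builds an input object in the
// documented shape { startUrls: [{ url }], maxItems }.
function buildActorInput(urls, maxItems = 50) {
  if (!Array.isArray(urls) || urls.length === 0) {
    throw new Error("startUrls is required and needs at least one URL");
  }
  if (!Number.isInteger(maxItems) || maxItems < 1) {
    throw new Error("maxItems must be a positive integer");
  }
  const startUrls = urls.map((raw) => {
    const parsed = new URL(raw); // throws a TypeError on malformed URLs
    // Assumption: a loose check that the host looks like an Amazon domain.
    if (!/(^|\.)amazon\.[a-z.]+$/i.test(parsed.hostname)) {
      throw new Error(`Not an Amazon URL: ${raw}`);
    }
    return { url: raw };
  });
  return { startUrls, maxItems };
}

// Usage: one search URL with the default item limit.
const input = buildActorInput(["https://www.amazon.com/s?k=wireless+earbuds"]);
console.log(JSON.stringify(input, null, 2));
```

Validating locally catches malformed URLs before a paid run is started; the actor itself may apply different or stricter checks.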
Documentation
# Amazon Product Scraper

An Apify actor that extracts product details from Amazon product pages, search results, and category pages with comprehensive structured data including availability, seller information, and more.

## What It Does

This scraper extracts structured data from various Amazon pages including:

| Field | Description |
|-------|-------------|
| title | Product name/title |
| asin | Amazon Standard Identification Number |
| price | Current product price |
| rating | Average user rating (e.g., 4.5) |
| reviewCount | Total number of user reviews |
| availability | In stock / out of stock status |
| seller | Sold by (seller name) |
| category | Main category or breadcrumb |
| url | Direct product URL on Amazon |
| imageUrl | Product image thumbnail |

## Supported Page Types

- **Product Pages**: Individual product detail pages (e.g., `/dp/B08J8KJ9T3`)
- **Search Results**: Search query results (e.g., `/s?k=wireless+earbuds`)
- **Category Pages**: Category browsing pages (e.g., `/Best-Sellers-Electronics`)
- **Best Sellers**: Best sellers pages (e.g., `/Best-Sellers/zgbs`)

## Use Cases

- **Product Research**: Extract detailed product information for market analysis
- **Price Monitoring**: Track product prices and availability across categories
- **Competitor Analysis**: Monitor competitor products and pricing strategies
- **Inventory Tracking**: Check product availability and seller information
- **E-commerce Data Collection**: Gather comprehensive product catalogs

## Input

The actor accepts the following input format:

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}
```

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| startUrls | Array | Yes | - | Array of objects with a `url` property pointing to Amazon pages |
| maxItems | Number | No | 50 | Maximum number of products to extract per page |

### Supported Amazon URLs

The scraper works with various Amazon page types:

**Product Pages:**

- https://www.amazon.com/dp/B08J8KJ9T3
- https://www.amazon.com/gp/product/B08J8KJ9T3

**Search Results:**

- https://www.amazon.com/s?k=wireless+earbuds
- https://www.amazon.com/s?k=laptop&ref=sr_pg_1

**Category Pages:**

- https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics
- https://www.amazon.com/Best-Sellers/zgbs

## Output

The actor outputs structured data for each product found:

```json
{
  "title": "Apple AirPods (3rd Generation)",
  "asin": "B08J8KJ9T3",
  "price": "$169.00",
  "rating": 4.7,
  "reviewCount": 15600,
  "availability": "In Stock",
  "seller": "Amazon.com Services LLC",
  "category": "Electronics > Headphones",
  "url": "https://www.amazon.com/dp/B08J8KJ9T3",
  "imageUrl": "https://m.media-amazon.com/images/I/61SUj2aKoEL._AC_UL320_.jpg",
  "scrapedAt": "2024-01-01T00:00:00.000Z"
}
```

## Example Usage

### Search for Wireless Earbuds

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}
```

### Extract from Product Page

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/dp/B08J8KJ9T3" }
  ],
  "maxItems": 1
}
```

### Multiple Sources

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=laptop" },
    { "url": "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics" }
  ],
  "maxItems": 25
}
```

## How It Works

1. **Page Type Detection**: Automatically detects whether the URL is a product page, search results, or category page
2. **Appropriate Handler**: Routes to the correct scraping handler based on page type
3. **Data Extraction**: Uses specialized selectors for each page type to extract product information
4. **Comprehensive Fields**: Extracts all required fields including availability and seller information
5. **Data Validation**: Ensures only products with valid ASINs and titles are included
6. **Structured Output**: Returns clean, structured data ready for analysis

## Features

- **Multi-Page Support**: Handles product pages, search results, and category pages
- **Robust Extraction**: Multiple fallback selectors to handle Amazon's changing page structure
- **Stealth Mode**: Uses Puppeteer with stealth plugins to avoid detection
- **Proxy Support**: Built-in proxy rotation for reliable scraping
- **Error Handling**: Graceful error handling with detailed logging
- **Data Validation**: Ensures data quality with validation checks
- **Availability Tracking**: Extracts stock status and seller information

## Installation

1. Clone this repository
2. Install dependencies: `npm install`
3. Run the actor: `npm start`

## Development

- `npm start` - Run the actor
- `npm run format` - Format code with Prettier
- `npm run lint` - Run ESLint
- `npm run lint:fix` - Fix ESLint issues

## Architecture

- `src/main.js` - Main entry point and input validation
- `src/routes.js` - Request routing and page type detection
- `src/handlers/amazonProductPage.js` - Individual product page scraping logic
- `src/handlers/amazonSearchResults.js` - Search results and category page scraping logic
- `src/puppeteerLauncher.js` - Puppeteer browser configuration with stealth mode

## Notes

- The scraper is designed to be respectful of Amazon's servers and includes appropriate delays
- Results may vary based on Amazon's page structure changes
- The scraper automatically handles different Amazon page layouts and product formats
- All extracted data is timestamped for tracking purposes
- Product pages return single items, while search/category pages return multiple products
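
The README describes page type detection, routing to a handler, and validation of ASINs and titles, but does not show the code. The sketch below illustrates one way those steps could look, using only the URL patterns listed in the README. The function names `detectPageType` and `isValidProduct` are hypothetical, and the 10-character alphanumeric ASIN pattern is an assumption; the actor's real logic in `src/routes.js` may differ.

```javascript
// Hypothetical sketch, not the actor's actual code.
// Classifies a URL using the URL shapes documented in the README.
function detectPageType(rawUrl) {
  const { pathname, searchParams } = new URL(rawUrl);
  // Product pages: /dp/<ASIN> or /gp/product/<ASIN>, optionally after a title slug.
  if (/\/(dp|gp\/product)\/[A-Z0-9]{10}(\/|$)/i.test(pathname)) return "product";
  // Search results: /s with a k= query parameter.
  if (pathname === "/s" && searchParams.has("k")) return "search";
  // Category and best-seller pages: paths containing zgbs or Best-Sellers.
  if (/zgbs|best-sellers/i.test(pathname)) return "category";
  return "unknown";
}

// Assumption: an ASIN is 10 uppercase letters or digits.
function isValidProduct(record) {
  const hasAsin =
    typeof record.asin === "string" && /^[A-Z0-9]{10}$/.test(record.asin);
  const hasTitle =
    typeof record.title === "string" && record.title.trim().length > 0;
  return hasAsin && hasTitle;
}

// Usage: classify the README's example URLs and filter scraped records.
const exampleUrls = [
  "https://www.amazon.com/dp/B08J8KJ9T3",
  "https://www.amazon.com/s?k=wireless+earbuds",
  "https://www.amazon.com/Best-Sellers/zgbs",
];
for (const url of exampleUrls) {
  console.log(detectPageType(url), url);
}

const records = [
  { asin: "B08J8KJ9T3", title: "Apple AirPods (3rd Generation)" },
  { asin: "", title: "Missing ASIN" },
];
console.log(records.filter(isValidProduct).length);
```

Checking the product pattern first matters because a product URL can carry extra path segments or query parameters; the `unknown` result gives a caller a place to log and skip URLs that match none of the documented shapes.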
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
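For the price monitoring use case, note that the README's sample output carries `price` as a display string such as `"$169.00"`, so comparing runs needs a numeric conversion first. The sketch below assumes US-style formatting (comma thousands separator, dot decimal) and two arrays of records from separate runs; `parseUsdPrice` and `priceChanges` are hypothetical helper names, not functions provided by the actor.

```javascript
// Hypothetical helpers for comparing two runs of the actor's output.
// Assumption: prices are US-formatted strings such as "$1,299.99".
function parseUsdPrice(text) {
  if (typeof text !== "string") return null;
  const cleaned = text.replace(/[^0-9.]/g, ""); // drop "$", commas, spaces
  const value = Number.parseFloat(cleaned);
  return Number.isFinite(value) ? value : null;
}

// Returns one entry per ASIN whose parsed price differs between the runs.
function priceChanges(previous, current) {
  const before = new Map(previous.map((p) => [p.asin, parseUsdPrice(p.price)]));
  const changes = [];
  for (const item of current) {
    const oldPrice = before.get(item.asin);
    const newPrice = parseUsdPrice(item.price);
    // Skip items that are new or whose price could not be parsed.
    if (oldPrice != null && newPrice != null && oldPrice !== newPrice) {
      changes.push({ asin: item.asin, oldPrice, newPrice });
    }
  }
  return changes;
}

// Usage with two small illustrative snapshots (the second price is made up).
const lastRun = [{ asin: "B08J8KJ9T3", price: "$169.00" }];
const thisRun = [{ asin: "B08J8KJ9T3", price: "$149.99" }];
console.log(priceChanges(lastRun, thisRun));
```

Keying on `asin` rather than `title` or `url` is a design choice: the ASIN is the field the README says is validated on every record, so it is the most stable join key across runs.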
Actor Information
- Developer: happitap
- Pricing: Paid
- Total Runs: 397
- Active Users: 80
Related Actors
Google Maps Reviews Scraper
by compass
Facebook Ads Scraper
by apify
Google Ads Scraper
by silva95gustavo
Facebook marketplace scraper
by curious_coder
Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.