Tokopedia Product Search Scraper
by ecomscrape
About Tokopedia Product Search Scraper
The Tokopedia Product Search Scraper extracts detailed product data from Tokopedia including name, price, brand, etc., using search query URLs. It's perfect for market research, trend analysis, lead generation, and campaign planning.
What does this actor do?
Tokopedia Product Search Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Contact

If you encounter any issues or need to exchange information, please feel free to contact us through the following link: My profile

# Advanced Tokopedia Marketplace Data Extraction Solution

Tokopedia stands as the largest eCommerce marketplace in the Indonesian region, serving millions of consumers and merchants across the archipelago. With Indonesia's e-commerce market expected to reach $200 billion by 2025, accessing and analyzing product data from this platform has become crucial for businesses, researchers, and market analysts. Tokopedia provides a customer-to-customer (C2C) platform that is free to use for merchants and buyers, hosting products across at least 25 categories, making it a goldmine of market intelligence and competitive data.

## Overview of Tokopedia Product Search Scraper

Our Tokopedia scraper is designed to efficiently extract comprehensive product information from search results and category pages. This powerful tool navigates through product listings, collecting detailed data including pricing, ratings, seller information, and product specifications. The scraper handles Tokopedia's dynamic loading system and anti-bot measures through advanced proxy rotation and intelligent request management.

The scraper is ideal for e-commerce businesses conducting competitive analysis, market researchers tracking price trends, dropshippers identifying profitable products, and data analysts studying Indonesian consumer behavior. With its robust error handling and scalable architecture, it ensures reliable data extraction even from large product catalogs.

## Input and Output Details

- Example URL 1: https://www.tokopedia.com/p/fashion-anak-bayi/baju-sepatu-bayi/piyama-bayi
- Example URL 2: https://www.tokopedia.com/p/audio-kamera-elektronik-lainnya/aksesoris-kamera/tripod-kamera
- Example URL 3: https://www.tokopedia.com/p/audio-kamera-elektronik-lainnya/kamera-pengintai
### Input Format

The scraper accepts configuration through a JSON object with several key parameters.

#### Scrape with URLs:

```json
{
  "max_retries_per_url": 2,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "ID"
  },
  "max_items_per_url": 20,
  "urls": [
    "https://www.tokopedia.com/p/fashion-anak-bayi/baju-sepatu-bayi/piyama-bayi",
    "https://www.tokopedia.com/p/audio-kamera-elektronik-lainnya/aksesoris-kamera/tripod-kamera",
    "https://www.tokopedia.com/p/audio-kamera-elektronik-lainnya/kamera-pengintai"
  ],
  "ignore_url_failures": true
}
```

The urls parameter: List of product list page URLs that you want to scrape. You can add URLs one by one, or use the Bulk edit section to add a prepared list.

The ignore_url_failures parameter: If set to true, the scraper keeps running even if some URLs still fail after reaching the maximum number of retries. This ensures that one problematic URL doesn't stop your entire scraping job.

When you provide a list of URLs for scraping, all options in the "Scrape with search filters" section are disabled. The system only collects data from the URLs you specified.
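As a rough illustration of the URL-based input, the following Python sketch assembles the same JSON object programmatically. The helper name `build_url_input` and its host check are assumptions made for this example and are not part of the Actor; only the field names and default values are taken from the configuration shown above.

```python
import json
from urllib.parse import urlparse

# Hosts accepted by this sketch (an assumption, based on the example URLs).
ALLOWED_HOSTS = {"www.tokopedia.com", "tokopedia.com"}


def build_url_input(urls, max_items_per_url=20, max_retries_per_url=2,
                    ignore_url_failures=True):
    """Build an input object for URL mode (hypothetical helper)."""
    cleaned = []
    for url in urls:
        url = url.strip()
        host = urlparse(url).netloc.lower()
        if host not in ALLOWED_HOSTS:
            raise ValueError(f"not a Tokopedia URL: {url!r}")
        if url not in cleaned:  # drop exact duplicates, keep order
            cleaned.append(url)
    if not cleaned:
        raise ValueError("URL mode needs at least one URL")
    return {
        "max_retries_per_url": max_retries_per_url,
        "proxy": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "ID",
        },
        "max_items_per_url": max_items_per_url,
        "urls": cleaned,
        "ignore_url_failures": ignore_url_failures,
    }


run_input = build_url_input([
    "https://www.tokopedia.com/p/fashion-anak-bayi/baju-sepatu-bayi/piyama-bayi",
    "https://www.tokopedia.com/p/audio-kamera-elektronik-lainnya/kamera-pengintai",
])
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Actor's input editor. Rejecting foreign hosts up front is a design choice of this sketch, so that a mistyped URL fails before a run is started.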
#### Scrape with Search Filters:

```json
{
  "max_retries_per_url": 2,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "ID"
  },
  "max_items_per_url": 20,
  "keyword": "laptop",
  "sort_by": "4",
  "posted_date": "30"
}
```

The keyword parameter: The search keyword to find products (e.g., "laptop", "smartphone", "baju", "sepatu").

The sort_by parameter: Sort products by various criteria:
- "23" - Best Match (most relevant results)
- "5" - Latest Reviews (newest reviews first)
- "9" - Highest Price (most expensive first)
- "4" - Lowest Price (cheapest first)
- "3" - Most Sold (best sellers)

The posted_date parameter: Filter products by posting date:
- "7" - Last 7 days
- "14" - Last 14 days
- "30" - Last 30 days
- "90" - Last 90 days

When using search filters for scraping, you need to leave the urls field empty (or set it to null) in the "Scrape with URLs" configuration.

#### General Options:

The max_items_per_url parameter: Limits the number of products extracted from each product list page or search results page. The default value is 20, which keeps each batch manageable.

The max_retries_per_url parameter: Sets the maximum number of retry attempts for each URL or search filter when the scraper is detected as a bot or the page fails to load. The default value is 2, a balance between thoroughness and efficiency.

The proxy parameter: Proxy configuration helps avoid bot detection. The residential proxy option makes your requests look like ordinary browsing, reducing the risk of being blocked or rate-limited.
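The sort and date codes listed above are easy to mistype, so a small check before a run can help. The sketch below is only an illustration: `build_search_input` is a hypothetical helper, and omitting posted_date when no date filter is wanted is an assumption about how the Actor treats a missing field. The accepted codes and the empty urls field come from the description above.

```python
# Codes copied from the sort_by and posted_date lists above.
SORT_BY_CODES = {
    "23": "Best Match",
    "5": "Latest Reviews",
    "9": "Highest Price",
    "4": "Lowest Price",
    "3": "Most Sold",
}
POSTED_DATE_CODES = ("7", "14", "30", "90")


def build_search_input(keyword, sort_by="23", posted_date=None,
                       max_items_per_url=20, max_retries_per_url=2):
    """Build an input object for search-filter mode (hypothetical helper)."""
    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword must not be empty")
    if sort_by not in SORT_BY_CODES:
        raise ValueError(f"sort_by must be one of {sorted(SORT_BY_CODES)}")
    if posted_date is not None and posted_date not in POSTED_DATE_CODES:
        raise ValueError(f"posted_date must be one of {POSTED_DATE_CODES}")
    run_input = {
        "max_retries_per_url": max_retries_per_url,
        "proxy": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "ID",
        },
        "max_items_per_url": max_items_per_url,
        "urls": [],  # left empty so the search filters are used
        "keyword": keyword,
        "sort_by": sort_by,
    }
    if posted_date is not None:
        # Assumption: leaving the key out means "no date filter".
        run_input["posted_date"] = posted_date
    return run_input


print(build_search_input("laptop", sort_by="4", posted_date="30"))
```

Both codes are kept as strings because the example configuration passes them as strings ("4", "30") rather than numbers.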
You should choose a country that matches the location of the website you're scraping (e.g., Indonesia/ID for tokopedia.com).

### Output Format

The output of the Tokopedia Product Search Scraper is stored in a tab. The output is a list of product objects; the following example shows one item with the fields collected after running the Actor.

```json
[
  {
    "id": "3867655430",
    "url": "https://www.tokopedia.com/minicottons/mini-cottons-baby-sleepsuit-jumper-bayi-piyama-bayi-mustard-18-24-bulan?extParam=ivf%3Dfalse",
    "image_url": "https://images.tokopedia.net/img/cache/700/VqbcmM/2023/10/17/77b4cfcc-7bb3-48b2-a1b0-0fe55f699810.jpg",
    "name": "MINI COTTONS BABY Sleepsuit Jumper Bayi Piyama bayi",
    "price": 64900.0,
    "original_price": 230000.0,
    "currency": "Rp",
    "discount_percentage": 72,
    "is_preorder": false,
    "rating": 5.0,
    "count_review": 458,
    "category": "Fashion Anak & Bayi",
    "shop_info": {
      "id": 9560455,
      "name": "MINI COTTONS",
      "url": "https://www.tokopedia.com/minicottons",
      "location": "Bandung"
    }
  }
]
```

### Output Fields

The scraper returns structured data for each product with the following fields:

Core Product Information:
- ID: Tokopedia's unique product identifier - essential for tracking and referencing specific items
- URL: Direct product page link - enables easy access to full product details and customer reviews
- Image URL: Primary product image link - useful for visual product catalogs and marketing materials
- Name: Product title as shown in the listing

Pricing Data:
- Price: Current selling price in Indonesian Rupiah - the actual price customers pay
- Original Price: Initial price before discounts - helps calculate deal attractiveness
- Currency: Currency label for the price ("Rp", Indonesian Rupiah, in the example above) - ensures proper price formatting in applications
- Discount Percentage: Discount relative to the original price, in percent - valuable for identifying promotional opportunities

Performance Metrics:
- Rating: Average customer rating (1-5 stars) - indicates product quality and customer satisfaction
- Count Review: Total number of reviews - shows product popularity and provides confidence in ratings

Product Classification:
- Category: Product category classification - useful for organizing data and market segmentation
- Is Preorder: Boolean indicating whether the product is sold on pre-order - helps identify inventory constraints

Seller Information:
- Shop Info: Merchant details including shop ID, name, URL, and location - useful for supplier analysis and trust assessment

This output structure helps businesses quickly identify profitable products, track competitor pricing, and analyze market trends across Indonesia's diverse product landscape.

## How to Use the Scraper

### Step 1: Choose Your Scraping Approach

Option A - Prepare URLs: Navigate to Tokopedia and use the website's category navigation and search features. Copy category page URLs or search result URLs. Ensure URLs contain product listings rather than individual product pages.
Option B - Use Search Filters: Define your search criteria using the built-in filters:
- Set keyword for specific products (e.g., "laptop", "smartphone", "baju")
- Select sort_by to organize results (best match, price, reviews, sales)
- Choose posted_date to filter by product listing recency (7, 14, 30, or 90 days)

### Step 2: Configure Settings

Set appropriate retry limits (max_retries_per_url: 2-3 recommended), enable residential proxies from Indonesia (ID) or Southeast Asian countries, and limit items per URL (max_items_per_url) based on your processing needs. Enable ignore_url_failures for robust scraping.

### Step 3: Run Extraction

Execute the scraper and monitor progress. The tool handles pagination automatically and respects rate limits to avoid blocking.

### Best Practices

Method Selection:
- Use URL-based scraping for specific category pages or complex filtered searches
- Use filter-based scraping for keyword searches with sorting and date filtering
- Combine both approaches: use filters for product discovery, then URLs for targeted category extraction

Scraping Strategy:
- Use Indonesian (ID) residential proxies to maintain access reliability
- Rotate between different category URLs to diversify data
- Monitor for CAPTCHA challenges and adjust retry settings accordingly
- Schedule runs during off-peak Indonesian hours (late night/early morning WIB) for better performance

Filter Optimization:

Sorting Strategy:
- Use sort_by: "23" (Best Match) for most relevant products to your keyword
- Use sort_by: "4" (Lowest Price) for budget-focused research or finding deals
- Use sort_by: "9" (Highest Price) to identify premium products
- Use sort_by: "3" (Most Sold) to find trending/popular items
- Use sort_by: "5" (Latest Reviews) to track recently reviewed products

Date Filtering Strategy:
- Use posted_date: "7" (Last 7 days) for newest product listings
- Use posted_date: "14" (Last 14 days) for recent additions
- Use posted_date: "30" (Last 30 days) for monthly product tracking
- Use posted_date: "90" (Last 90 days) for quarterly market analysis
- Leave empty for all-time product listings

Advanced Filter Combinations:
- New budget products: keyword: "laptop", sort_by: "4", posted_date: "7"
- Trending items: keyword: "smartphone", sort_by: "3", posted_date: "30"
- Premium products: keyword: "kamera", sort_by: "9", posted_date: "14"
- Recently reviewed: keyword: "sepatu", sort_by: "5", posted_date: "7"
- Best matches: keyword: "baju", sort_by: "23", posted_date: "30"

### Common Issues

Connection Timeouts:
- Handle by increasing retry limits (max_retries_per_url: 3-5)
- Ensure using Indonesian (ID) proxies for better connectivity
- Enable ignore_url_failures to continue despite timeouts

Blocked Requests:
- Resolve by switching to Indonesian (ID) residential proxies
- Rotate proxy settings if one location is blocked
- Reduce max_items_per_url to lower request frequency

Large Datasets:
- Manage by processing URLs in batches
- Use pagination wisely with smaller max_items_per_url values
- Implement delays between batch runs

Empty Results:
- For filter-based: Verify keyword is in Indonesian or commonly used on Tokopedia
- Try broader posted_date ranges if results are limited
- Test different sort_by options to see if results appear

CAPTCHA Challenges:
- Use high-quality Indonesian residential proxies
- Adjust retry settings and implement delays
- Reduce scraping frequency during peak hours

Sort Code Issues:
- Ensure sort_by values match exactly: "23", "5", "9", "4", "3"
- Default to "23" (Best Match) if uncertain

Date Filter Issues:
- Ensure posted_date values match: "7", "14", "30", "90"
- Leave empty for all products regardless of posting date
- Note that newer filters may exclude older popular products

### Use Cases

- Price Monitoring: Use sort_by: "4" or "9" to track price ranges
- Trend Analysis: Use sort_by: "3" (Most Sold) to identify trending products
- New Product Tracking: Use posted_date: "7" or "14" to monitor latest listings
- Market Research: Combine keyword with different sort options to analyze market segments
- Competitor Analysis: Track specific product categories with posted_date filters

## Benefits and Applications

This scraper delivers significant time savings by automating data collection that would take hours manually. Market researchers gain instant access to pricing trends, product performance metrics, and competitor analysis across Indonesia's largest marketplace. E-commerce businesses can identify trending products, monitor competitor pricing strategies, and discover new suppliers efficiently.

The extracted data enables price comparison services, supports dropshipping product research, powers market intelligence dashboards, and facilitates academic research into Southeast Asian e-commerce trends. With comprehensive seller information included, users can also analyze merchant performance and identify reliable suppliers for business partnerships.

## Conclusion

The Tokopedia Product Search Scraper provides essential market intelligence from Indonesia's dominant e-commerce platform. Whether you're conducting competitive research, identifying market opportunities, or building data-driven applications, this tool delivers accurate, comprehensive product data efficiently. Start extracting valuable insights from millions of products and stay ahead in the rapidly growing Indonesian e-commerce market.

# Related Actors

- Tokopedia Product Details Page Scraper: A specialized data extraction tool engineered to harvest comprehensive product information from Tokopedia's dominant Indonesian marketplace.

# Your feedback

We are always working to improve Actors' performance. So, if you have any technical feedback about Tokopedia Product Search Scraper or simply found a bug, please create an issue on the Actor's Issues tab in Apify Console.
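For the price monitoring use case mentioned in this README, a downloaded result set can be post-processed with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `discount_percentage` and `price_changes` are hypothetical, the rounding rule is a guess that happens to reproduce the example record (64900 against 230000 gives 72), and the "current" price in the demo is made up for illustration. Only the field names id, price, and original_price come from the output example.

```python
def discount_percentage(price, original_price):
    """Percent saved relative to the original price, rounded to an integer.

    Assumption: the Actor's discount_percentage is rounded this way.
    """
    if original_price <= 0:
        raise ValueError("original_price must be positive")
    return round((original_price - price) / original_price * 100)


def price_changes(previous_items, current_items):
    """Map product id -> (old_price, new_price) for prices that changed.

    Products that appear in only one of the two runs are ignored.
    """
    previous = {item["id"]: item["price"] for item in previous_items}
    changes = {}
    for item in current_items:
        old_price = previous.get(item["id"])
        if old_price is not None and old_price != item["price"]:
            changes[item["id"]] = (old_price, item["price"])
    return changes


# Record values taken from the output example in this README.
earlier_run = [{"id": "3867655430", "price": 64900.0, "original_price": 230000.0}]
# Hypothetical later run with a lower price, for illustration only.
later_run = [{"id": "3867655430", "price": 59900.0, "original_price": 230000.0}]

print(discount_percentage(64900.0, 230000.0))
print(price_changes(earlier_run, later_run))
```

Matching on id rather than name is deliberate in this sketch, since product names can repeat across shops while the ID is described as Tokopedia's unique product identifier.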
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try Tokopedia Product Search Scraper now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: ecomscrape
- Pricing: Paid
- Total Runs: 372
- Active Users: 25