Pull&Bear

Name: Pull&Bear
Author: datasaurus

Scrape Pull&Bear's global product catalog by country, category, or individual item. Get fast, structured data on prices, inventory, and details for e-commerce analysis.

294 runs

11 users

About Pull&Bear

Need to pull product data from Pull&Bear's online stores? This actor gives you access to the full catalog across Pull&Bear's international sites, so you can gather data from the Spanish, German, or US storefronts in the same way. You can scrape an entire site for a full competitive analysis, target a specific category such as 'New In' or 'Jeans', or fetch details for a handful of individual product URLs. I've used it to track pricing changes and monitor inventory for my own projects, and the speed is a real advantage: you get structured JSON or CSV data quickly without bogging down your own systems. Setup is straightforward: pick the target country and scraping scope through your start URLs, set the limits, and run it. The output gives you clean data on product names, prices, images, variants, and descriptions, ready to feed into your database, price tracking tool, or analytics dashboard.

What does this actor do?

The Pull&Bear actor is a web scraping tool that runs on the Apify platform. It extracts product data from Pull&Bear's online stores in the cloud, so nothing needs to be installed locally.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation
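
As a rough illustration of the API access mentioned above, the sketch below builds (but does not send) an HTTP request against Apify's public REST endpoint for starting an actor run. The actor ID and token are hypothetical placeholders, and the `{"url": ...}` shape of each start URL is an assumption; check the actor's input schema on Apify before relying on it.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    url = f"{APIFY_BASE}/acts/{actor_id}/runs"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical placeholders: substitute the real actor ID and your own API token.
request = build_run_request(
    actor_id="USERNAME~ACTOR-NAME",
    token="YOUR_APIFY_TOKEN",
    # Assumed input shape: a list of {"url": ...} objects under startUrls.
    actor_input={"startUrls": [{"url": "https://www.pullandbear.com/gb/"}]},
)

# Sending is left to the caller, e.g. urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```

The request is only constructed here so the example can be inspected without network access or credentials.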

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

Pull&Bear Products Scraper

Scrapes product data from Pull&Bear's global websites. Supports all countries and languages. Part of a suite of scrapers for Inditex brands, including Zara, Stradivarius, Bershka, Massimo Dutti, Oysho, and Zara Home.

Key Features

Comprehensive Data: Extracts detailed product information including name, description, price, SKUs, colors, sizes, category, images, availability date, promotions, composition, and sustainability details.
Flexible Scraping: Start from the homepage, a specific category page, or an individual product page.
Multi-URL Support: Process multiple start URLs in a single run, even from different country sites (e.g., .com/gb/ and .com/es/).
Deduplication: Returns unique products, minimizing duplicates when the same item appears across multiple categories.
Cost-Efficient: Uses residential proxies to avoid blocking and is optimized to keep data transfer low. Scraping about 1,000 products costs roughly $0.24 and takes about 5 minutes.
Structured Output: Results are in JSON format. Key fields (colors, sizes, category, mainImage) are flattened for easy export to CSV/Excel, while full detail remains in the nested colorsSizesImagesJSON field.
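
The cost figure quoted in the features above (roughly $0.24 and 5 minutes per 1,000 products) can be turned into a back-of-the-envelope estimate for larger runs. The sketch below assumes cost and duration scale linearly with product count, which the documentation does not guarantee; treat the result as an order-of-magnitude guide only.

```python
# Reference point taken from the feature list: ~1000 products, ~$0.24, ~5 minutes.
REFERENCE_PRODUCTS = 1000
REFERENCE_COST_USD = 0.24
REFERENCE_MINUTES = 5.0


def estimate_run(products: int) -> tuple[float, float]:
    """Return (cost_usd, minutes), assuming linear scaling from the reference point."""
    scale = products / REFERENCE_PRODUCTS
    return REFERENCE_COST_USD * scale, REFERENCE_MINUTES * scale


cost, minutes = estimate_run(25_000)
print(f"~${cost:.2f}, ~{minutes:.0f} min")
```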

How to Use

Provide one or more Start URLs in the actor's input. You can control the scope using these limits:

* maxProducts: Maximum number of products to scrape per start URL.
* maxCategories: Maximum number of categories to explore per start URL.

Example Start URLs:
* Entire site: https://www.pullandbear.com/gb/
* Category: https://www.pullandbear.com/gb/woman/sale/clothing/t-shirts-and-tops-n7097
* Single Product: https://www.pullandbear.com/gb/midrise-capri-jeans-l07685309
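
To check a batch of start URLs before a run, the three example shapes above can be told apart with a small helper. The patterns used here (a country code as the first path segment, a `-n<digits>` suffix for categories, a `-l<digits>` suffix for products) are inferred only from these three examples and are an assumption, not a documented URL scheme.

```python
import re
from urllib.parse import urlparse


def classify_start_url(url: str) -> tuple[str, str]:
    """Return (country_code, scope); scope is 'site', 'category', 'product' or 'unknown'."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return "", "unknown"
    country = segments[0]  # assumed: first path segment is the country code
    if len(segments) == 1:
        return country, "site"
    last = segments[-1]
    if re.search(r"-n\d+$", last):  # assumed category suffix, e.g. -n7097
        return country, "category"
    if re.search(r"-l\d+$", last):  # assumed product suffix, e.g. -l07685309
        return country, "product"
    return country, "unknown"


examples = [
    "https://www.pullandbear.com/gb/",
    "https://www.pullandbear.com/gb/woman/sale/clothing/t-shirts-and-tops-n7097",
    "https://www.pullandbear.com/gb/midrise-capri-jeans-l07685309",
]
for example in examples:
    print(classify_start_url(example), example)
```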

Input & Output

Input Configuration:
The primary input is the startUrls array. Configure maxProducts and maxCategories to set scraping limits.
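
A minimal sketch of assembling this input as JSON follows. The field names startUrls, maxProducts, and maxCategories come from the description above; wrapping each URL in a `{"url": ...}` object is an assumption based on a common Apify convention, so confirm it against the actor's input form. The helper name `build_input` is illustrative.

```python
import json


def build_input(start_urls, max_products=None, max_categories=None):
    """Assemble an actor input dict; limits are omitted when not given."""
    # Assumed shape: each start URL wrapped in a {"url": ...} object.
    actor_input = {"startUrls": [{"url": u} for u in start_urls]}
    if max_products is not None:
        actor_input["maxProducts"] = max_products
    if max_categories is not None:
        actor_input["maxCategories"] = max_categories
    return actor_input


# Two country sites in one run, with per-start-URL limits.
actor_input = build_input(
    ["https://www.pullandbear.com/gb/", "https://www.pullandbear.com/es/"],
    max_products=200,
    max_categories=10,
)
print(json.dumps(actor_input, indent=2))
```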

Output:
The dataset contains JSON objects for each unique product. Each item represents a product bundle (all color/size variants). The structure includes both flat fields for convenience and a detailed colorsSizesImagesJSON object containing granular color and size data.
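
For a spreadsheet export, one option is to keep the flat fields and drop the nested colorsSizesImagesJSON object. The sketch below does this generically for a list of downloaded items; the sample record is made-up placeholder data, and its `name` key is an assumption, since the exact output field names are not listed here.

```python
import csv
import io

NESTED_FIELD = "colorsSizesImagesJSON"


def items_to_csv(items: list[dict]) -> str:
    """Serialize items to CSV text, skipping the nested detail field."""
    flat_items = [{k: v for k, v in item.items() if k != NESTED_FIELD} for item in items]
    # Collect the union of keys in first-seen order so columns are stable.
    fieldnames = []
    for item in flat_items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(flat_items)
    return buffer.getvalue()


# Made-up sample record for illustration only.
sample_items = [
    {
        "name": "Example tee",
        "colors": "Black, White",
        "sizes": "S, M, L",
        "category": "T-shirts",
        "mainImage": "https://example.com/tee.jpg",
        "colorsSizesImagesJSON": {"placeholder": True},
    }
]
csv_text = items_to_csv(sample_items)
print(csv_text)
```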

Notes & Known Issues

Output Format: JSON is used due to the nested nature of the data (multiple colors/sizes per product).
Result Count: The final count may be lower than maxProducts because:
- The scraper filters out dummy/empty products returned by the site.
- It returns product bundles. Five separate color listings on a webpage might be consolidated into one result item containing all five colors.
Blocking: The actor uses residential proxies, but occasional request blocking (e.g., 403 errors) can occur. If a scrape fails to start or stops prematurely, re-running it usually resolves the issue.
Deduplication: While the scraper deduplicates products, in highly concurrent runs a small number of duplicates may occasionally slip through.
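
If occasional duplicates matter for your analysis, a post-processing pass over the downloaded items can remove them. The sketch below keeps the first occurrence of each identifier; the key name `url` is hypothetical, so substitute whichever field uniquely identifies a product in your dataset.

```python
def dedupe_items(items, key="url"):
    """Keep the first item seen for each value of `key`; items lacking the key are kept."""
    seen = set()
    unique = []
    for item in items:
        identifier = item.get(key)
        if identifier is None:
            unique.append(item)  # cannot judge duplicates without an identifier
            continue
        if identifier not in seen:
            seen.add(identifier)
            unique.append(item)
    return unique


# Made-up sample data: the second record repeats the first identifier.
sample = [
    {"url": "https://example.com/p/1", "name": "A"},
    {"url": "https://example.com/p/1", "name": "A (duplicate)"},
    {"url": "https://example.com/p/2", "name": "B"},
]
deduped = dedupe_items(sample)
print(len(sample), "->", len(deduped))
```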