Booking Scraper
by runtime
About Booking Scraper
This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.
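The interaction between batch processing and the extraction limit can be pictured with a small sketch. This is not the actor's actual implementation: `planBatches` is a hypothetical helper, and the only behavior taken from the documentation is that `batchSize` sets the hotels per batch and that `maxHotels` of 0 means unlimited.

```javascript
// Hypothetical helper (not part of the actor): split collected items into
// batches while honoring a global cap on the total number of items.
// maxHotels = 0 is treated as "unlimited", matching the documented input.
function planBatches(items, batchSize, maxHotels) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  // Apply the cap once, up front, so no batch can exceed it.
  const limit = maxHotels > 0 ? Math.min(maxHotels, items.length) : items.length;
  const batches = [];
  for (let start = 0; start < limit; start += batchSize) {
    batches.push(items.slice(start, Math.min(start + batchSize, limit)));
  }
  return batches;
}

// 23 placeholder hotel links, batches of 10, cap of 25.
const links = Array.from({ length: 23 }, (_, i) => `hotel-${i + 1}`);
const batches = planBatches(links, 10, 25);
console.log(batches.map((b) => b.length)); // [ 10, 10, 3 ]
```

Applying the cap before slicing keeps the final batch short instead of overshooting the limit.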
What does this actor do?
Booking Scraper is an Apify actor that extracts hotel data from Booking.com, including names, prices, ratings, addresses, coordinates, and direct booking links. It runs in the cloud on the Apify platform, so no local setup is required.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
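Downloaded results usually need a little cleanup before analysis. In the sample output in the documentation, `rating`, `reviewCount`, and `locationScore` arrive as strings, while `price`, `latitude`, and `longitude` can be `null`. The sketch below shows one way to normalize such records; `normalizeHotel` and the output field names `name`, `hasCoordinates`, and `link` are illustrative assumptions, not part of the actor.

```javascript
// Hypothetical post-processing step for downloaded dataset items.
// Input field names follow the sample output in the documentation.
function normalizeHotel(item) {
  // Convert numeric strings such as "9.3" or "1,204" to numbers; keep gaps as null.
  const toNumber = (value) => {
    if (value === null || value === undefined || value === '') return null;
    const n = Number(String(value).replace(/,/g, ''));
    return Number.isFinite(n) ? n : null;
  };
  return {
    name: item.hotelName,
    price: item.price ?? null, // left as-is: the currency format is not specified
    rating: toNumber(item.rating),
    reviewCount: toNumber(item.reviewCount),
    locationScore: toNumber(item.locationScore),
    hasCoordinates: item.latitude != null && item.longitude != null,
    link: item.hotelLink,
  };
}

// Record trimmed from the standard-mode sample output.
const sample = {
  hotelName: "Hôtel l'Inattendu",
  price: null,
  rating: '9.3',
  reviewCount: '81',
  locationScore: '9.5',
  latitude: null,
  longitude: null,
  hotelLink: 'https://www.booking.com/hotel/fr/hotel-inattendu.html',
};
console.log(normalizeHotel(sample));
```

Records with `hasCoordinates: false` are candidates for a second run with `getDetails` enabled.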
Documentation
# Booking.com Hotel Scraper - Advanced Web Scraping Actor

> Professional-grade Booking.com scraper built with Playwright and Apify SDK. Extract comprehensive hotel data including prices, ratings, addresses, coordinates, and direct booking links with anti-detection measures.

## Features

### Core Functionality

- Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
- Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting
- Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
- Batch Processing: Process hotels in configurable batches for optimal performance
- Detailed Mode: Visit individual hotel pages for complete addresses and precise coordinates

### Technical Capabilities

- Robust Navigation: Multiple fallback selectors for the dynamic Booking.com interface
- Proxy Support: Apify Proxy integration with residential IPs and country selection
- Error Handling: Comprehensive retry logic and graceful failure recovery
- Real-time Logging: Detailed progress tracking and debugging information
- Memory Efficient: Optimized for large-scale scraping operations

### Data Quality

- Address Validation: Clean and validate extracted addresses
- Coordinate Extraction: Multiple methods to find precise GPS coordinates
- Price Normalization: Consistent price formatting when available; gracefully handles unavailable prices
- Rating Accuracy: Extract guest ratings, qualitative review labels, review counts, and location scores
- Rich Snippet Capture: Captures Booking.com's headline description so you can display contextual property summaries

## Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `destination` | string | `"Paris"` | City or location to search for hotels |
| `maxHotels` | number | `100` | Maximum hotels to extract (0 = unlimited) |
| `batchSize` | number | `10` | Hotels per batch for processing |
| `getDetails` | boolean | `false` | Visit individual hotel pages for detailed data |
| `startUrls` | array | `["https://www.booking.com"]` | Starting URLs for crawling |
| `proxyConfiguration` | object | See below | Proxy settings for anti-detection |
| `newUrlFunction` | string | - | Custom proxy URL function (advanced) |

### Proxy Configuration

```json
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}
```

## Example Input

### Basic Usage

```json
{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}
```

### Advanced Usage with Proxy

```json
{
  "destination": "Tokyo",
  "maxHotels": 100,
  "batchSize": 5,
  "getDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}
```

### Custom Start URLs

```json
{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
```

## Output Format

### Standard Mode Output

```json
{
  "hotelName": "Hôtel l'Inattendu",
  "price": null,
  "rating": "9.3",
  "reviewText": "Wonderful",
  "reviewCount": "81",
  "locationScore": "9.5",
  "description": "Hôtel l'Inattendu 6th arr., Paris 2.2 Subway AccessThe hotel Chaplain Rive Gauche is located in central Paris, 1148 feet from Jardin du Luxembourg and 1969 feet from Montparnasse. Scored 9.3 9.3Wonderful 81 reviewsLocation 9.5",
  "imageUrl": "https://cf.bstatic.com/xdata/images/hotel/square240/739999722.webp?k=dfdbee513e9c35d594db2ef2817546074d7737cdcff7d3f5ca4ee6be4ea7b3da&o=",
  "address": "6th arr., Paris",
  "latitude": null,
  "longitude": null,
  "hotelLink": "https://www.booking.com/hotel/fr/hotel-inattendu.html",
  "scrapedAt": "2025-10-02T04:37:24.053Z",
  "pageType": "search_results"
}
```

The `description` field preserves Booking.com's headline blurb (with light cleanup), so it may include location context, transport hints, and review snapshots exactly as visitors see them on the card.

### Detailed Mode Output (with `getDetails: true`)

```json
{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "detailedAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "hotel_details",
  "hotelIndex": 1
}
```

## Usage Guide

### 1. Deploy on Apify Platform

1. Upload this actor to your Apify account
2. Configure input parameters in the web interface
3. Run the actor and monitor progress
4. Download results from the dataset

### 2. Local Development

```bash
# Install dependencies
npm install

# Run locally with Apify CLI
apify run

# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'
```

### 3. API Integration

```javascript
const { Actor } = require('apify');

// Actor.main() initializes the SDK and provides the async context that `await` needs.
Actor.main(async () => {
    const input = {
        destination: "Barcelona",
        maxHotels: 30,
        getDetails: true
    };

    // Actor.call() takes the input object itself as its second argument.
    const run = await Actor.call('your-actor-id', input);
    console.log(`Run finished with status: ${run.status}`);
});
```

## Configuration Tips

### Performance Optimization

- Batch Size: Use 5-10 for a good balance between speed and reliability
- Max Hotels: Set realistic limits to avoid timeouts
- Proxy Groups: Use `RESIDENTIAL` for best success rates

### Data Quality

- `getDetails`: Enable for precise addresses and coordinates
- Destination: Use specific city names for better results
- Proxy Country: Match the destination country for local results

### Anti-Detection

- Residential Proxies: Essential for reliable scraping
- Batch Processing: Helps avoid rate limiting
- Human-like Behavior: Built-in scrolling and mouse movements

## Advanced Features

### Custom Proxy Functions

```javascript
// Custom proxy URL function, passed to the actor as a string
const newUrlFunction = `
  return 'http://username:password@proxy.example.com:8080';
`;
```

### Multiple Destinations

```json
{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
```

## Performance Metrics

- Success Rate: >95% with proper proxy configuration
- Speed: 10-50 hotels per minute (depending on settings)
- Memory Usage: Optimized for large-scale operations
- Reliability: Built-in retry logic and error recovery

## Important Notes

### Rate Limiting

- Booking.com has anti-bot measures
- Always use residential proxies
- Respect reasonable request rates
- Monitor for IP blocks

### Data Accuracy

- Prices may vary based on availability
- Ratings are real-time from Booking.com
- Addresses are validated and cleaned
- Coordinates are extracted from multiple sources

### Legal Compliance

- Respect Booking.com's Terms of Service
- Use data responsibly and ethically
- Consider rate limiting and delays
- Monitor for policy changes

## Troubleshooting

### Common Issues

**"Destination input not found"**

- Booking.com's interface changes frequently
- The actor includes multiple fallback selectors
- Try refreshing or using a different proxy

**"No hotels extracted"**

- Check destination spelling
- Verify proxy configuration
- Increase timeout values if needed

**"Incomplete data"**

- Enable `getDetails` for full addresses
- Check that the proxy country matches the destination
- Verify network connectivity

### Debug Mode

Enable detailed logging by setting the environment variable:

```bash
APIFY_LOG_LEVEL=DEBUG
```

## Use Cases

### Market Research

- Hotel price analysis
- Competitive intelligence
- Market trend monitoring

### Travel Applications

- Hotel comparison tools
- Travel planning platforms
- Booking aggregators

### Data Analysis

- Geographic distribution analysis
- Price correlation studies
- Rating analysis

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Related Actors

- CNN Top Headlines - Scrape the latest top news headlines from CNN's homepage and article pages with optional full article content extraction.

---

Built with ❤️ using Apify and Playwright

For support, feature requests, or bug reports, please open an issue in the repository.
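The documented input parameters and their defaults can be assembled and sanity-checked before a run. The sketch below is only an illustration: `buildInput` and `DEFAULT_INPUT` are hypothetical names, the validation rules are assumptions inferred from the parameter table, and the actor may validate its input differently.

```javascript
// Defaults copied from the input parameter table (startUrls, proxy settings,
// and newUrlFunction are omitted to keep the sketch short).
const DEFAULT_INPUT = Object.freeze({
  destination: 'Paris',
  maxHotels: 100,
  batchSize: 10,
  getDetails: false,
});

// Hypothetical helper: merge overrides onto the defaults and reject
// values that contradict the documented types.
function buildInput(overrides = {}) {
  const input = { ...DEFAULT_INPUT, ...overrides };
  if (typeof input.destination !== 'string' || input.destination.trim() === '') {
    throw new TypeError('destination must be a non-empty string');
  }
  // 0 is allowed because the table documents it as "unlimited".
  if (!Number.isInteger(input.maxHotels) || input.maxHotels < 0) {
    throw new RangeError('maxHotels must be an integer >= 0');
  }
  if (!Number.isInteger(input.batchSize) || input.batchSize < 1) {
    throw new RangeError('batchSize must be an integer >= 1');
  }
  if (typeof input.getDetails !== 'boolean') {
    throw new TypeError('getDetails must be a boolean');
  }
  return input;
}

console.log(JSON.stringify(buildInput({ destination: 'Tokyo', batchSize: 5 }), null, 2));
```

The resulting object can be saved as the run input or passed to an API call.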
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Actor Information
- Developer: runtime
- Pricing: Paid
- Total Runs: 327
- Active Users: 37
Related Actors
Google Maps Reviews Scraper
by compass
Facebook Ads Scraper
by apify
Google Ads Scraper
by silva95gustavo
Facebook marketplace scraper
by curious_coder