Scrape And Bypass Any Url Using Scrappey
by dormic
About Scrape And Bypass Any Url Using Scrappey
A template for scraping data from web pages using the Scrappey.com API service integrated with an Apify Actor. This actor is intended for complex web scraping scenarios, including sites with anti-bot protection such as Cloudflare, Datadome, PerimeterX, and similar systems.
What does this actor do?
Scrape And Bypass Any Url Using Scrappey is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
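To make the "configure the input parameters" step concrete, here is a minimal sketch of assembling an input object before a run. The field names come from the Input Options section of the README further down this page; the helper name `buildActorInput` is hypothetical, and treating the example values shown there as defaults is an assumption, not something the actor documents.

```javascript
// Hypothetical helper: fills in the fields listed under "Input Options".
// ASSUMPTION: the example values in the README are used as defaults here;
// the actor's real defaults may differ.
function defaultInput() {
  // Returned fresh on every call so callers never share the same
  // customHeaders / browserActions objects.
  return {
    requestType: "browser",
    customHeaders: {},
    browserActions: [],
    session: null,
    proxyCountry: null,
    cookiejar: null,
    includeImages: false,
    includeLinks: false,
  };
}

function buildActorInput(options) {
  const { scrappeyApiKey, url } = options;
  if (typeof scrappeyApiKey !== "string" || scrappeyApiKey.length === 0) {
    throw new Error("scrappeyApiKey is required");
  }
  // The URL constructor throws a TypeError for missing or malformed URLs.
  new URL(url);

  const input = { ...defaultInput(), ...options };
  if (!["browser", "request"].includes(input.requestType)) {
    throw new Error('requestType must be "browser" or "request"');
  }
  return input;
}
```

The resulting object can be pasted into the actor's input form as JSON or saved as a local input file.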
Documentation
# Apify Scrappey Actor

A powerful web scraping solution that combines Apify's actor infrastructure with Scrappey's advanced anti-detection capabilities. This actor helps you scrape any website while bypassing common anti-bot protections like Cloudflare, Datadome, and PerimeterX.

## 🚀 Key Features

- Advanced Protection Bypass - Handles Cloudflare, Datadome, PerimeterX, and other anti-bot systems
- Session Management - Maintains persistent browser sessions for efficient scraping
- Smart Proxy Rotation - Automatic proxy management with country-specific options
- Browser Fingerprint Randomization - Prevents detection through browser fingerprinting
- Comprehensive Data Extraction - Captures HTML, cookies, headers, and more
- Error Handling - Robust error handling with detailed error codes and messages

## 📋 Input Options

```javascript
{
  "scrappeyApiKey": "your-api-key",
  "url": "https://example.com",
  "requestType": "browser",   // "browser" or "request"
  "customHeaders": {},        // Custom HTTP headers
  "browserActions": [],       // Automated browser actions
  "session": null,            // Session ID for persistent browsing
  "proxyCountry": null,       // Specific country for proxy
  "cookiejar": null,          // Pre-set cookies
  "includeImages": false,     // Include image URLs in response
  "includeLinks": false       // Include link URLs in response
}
```

## 📦 Output Data Structure

The actor stores the following data in the Apify dataset:

```javascript
{
  "url": "scraped-url",
  "verified": true/false,           // Request verification status
  "cookieString": "cookie-string",  // Formatted cookie string
  "responseHeaders": {},            // Response HTTP headers
  "requestHeaders": {},             // Request HTTP headers
  "html": "page-html",              // Raw HTML content
  "innerText": "page-text",         // Page text content
  "cookies": [],                    // Array of cookies
  "ipInfo": {},                     // IP information
  "status": 200,                    // HTTP status code
  "timeElapsed": "1.2s",            // Request duration
  "session": "session-id",          // Session identifier
  "localStorage": {},               // Browser localStorage data
  "timestamp": "ISO-date"           // Timestamp of scrape
}
```

## 🛠️ Common Use Cases

1. E-commerce Scraping
   - Product details from protected stores
   - Price monitoring
   - Inventory tracking
2. Login-Protected Content
   - Session management for authenticated scraping
   - Cookie handling for maintaining login state
3. Anti-Bot Protected Sites
   - Cloudflare challenge bypass
   - Datadome protection handling
   - PerimeterX mitigation

## 💡 Usage Examples

### Basic Scraping

```javascript
{
  "scrappeyApiKey": "your-api-key",
  "url": "https://example.com",
  "requestType": "browser"
}
```

### Session-Based Scraping

```javascript
{
  "scrappeyApiKey": "your-api-key",
  "url": "https://example.com",
  "requestType": "browser",
  "session": "my-session-id",
  "cookiejar": [
    {
      "name": "sessionId",
      "value": "abc123",
      "domain": "example.com",
      "path": "/"
    }
  ]
}
```

### Geo-Targeted Scraping

```javascript
{
  "scrappeyApiKey": "your-api-key",
  "url": "https://example.com",
  "proxyCountry": "UnitedStates",
  "includeImages": true,
  "includeLinks": true
}
```

## ⚠️ Error Handling

The actor handles common error scenarios:

| Code | Description | Solution |
|------|-------------|----------|
| CODE-0001 | Server overload | Retry with backoff |
| CODE-0002 | Cloudflare blocked | Try different proxy |
| CODE-0010 | Datadome blocked | Change proxy country |
| CODE-0029 | Too many sessions | Wait for session cleanup |

## 🚦 Best Practices

1. Session Management
   - Use persistent sessions for related requests
   - Clean up sessions when done using `sessions.destroy`
2. Proxy Usage
   - Rotate proxies for high-volume scraping
   - Use country-specific proxies for geo-restricted content
3. Error Handling
   - Implement exponential backoff for retries
   - Monitor error rates by URL

## 📚 Getting Started

1. Setup

   ```bash
   git clone https://github.com/yourusername/apify-scrappey
   cd apify-scrappey
   npm install
   ```

2. Configuration
   - Get your Scrappey API key from scrappey.com
   - Set up your input.json in the Apify console or locally

3. Running Locally

   ```bash
   apify run
   ```

4. Deployment

   ```bash
   apify login
   apify push
   ```

## 🔗 Resources

- Scrappey API Documentation
- Apify SDK Documentation
- Example Scraping Scripts

## 🆘 Support

- Technical issues: Create GitHub Issue
- Scrappey API: Scrappey Support
- Apify Platform: Apify Discord

## 📄 License

ISC License - Feel free to use this actor for your scraping needs!
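The error table and the "exponential backoff" advice in the README can be sketched as a small retry wrapper. This is an illustration under stated assumptions rather than the actor's own logic: `scrapeWithRetry`, `backoffDelayMs`, and the `scrapeOnce` callback are hypothetical names, and the sketch assumes a failed attempt throws an error whose `code` property carries one of the listed codes, which the source does not specify.

```javascript
// Suggested reactions, copied from the error table in the README.
const ERROR_ACTIONS = {
  "CODE-0001": "retry-with-backoff",
  "CODE-0002": "try-different-proxy",
  "CODE-0010": "change-proxy-country",
  "CODE-0029": "wait-for-session-cleanup",
};

// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical wrapper. `scrapeOnce(attempt, action)` is an async function you
// supply; `action` is the suggested reaction to the previous failure (or null
// on the first attempt), so the callback can switch proxy or country itself.
// ASSUMPTION: failures are thrown as errors with a `code` property.
async function scrapeWithRetry(scrapeOnce, options = {}) {
  const { maxAttempts = 4, baseMs = 500, sleep = defaultSleep } = options;
  let lastError = null;
  let action = null;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await scrapeOnce(attempt, action);
    } catch (err) {
      lastError = err;
      action = ERROR_ACTIONS[err.code];
      // Errors outside the table are rethrown immediately, not retried.
      if (!action) throw err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

With the default settings the waits between attempts are 500 ms, 1000 ms, and 2000 ms. Passing a custom `sleep` makes the wrapper easy to exercise without real delays.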
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Actor Information
- Developer: dormic
- Pricing: Paid
- Total Runs: 58,186
- Active Users: 142