Indeed Job Scraper

Name: Indeed Job Scraper
Author: shahidirfan

438 runs

48 users

About Indeed Job Scraper

A simple Indeed Job Scraper for minimalist, essential data. Uses residential proxies and cookies to prevent blocks, ensuring smooth and reliable runs. Perfect for getting targeted job data without the clutter.

What does this actor do?

Indeed Job Scraper is a web scraping tool on the Apify platform. It extracts job listings from Indeed.com and runs in the cloud, so no local setup is needed.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

# Indeed Jobs Scraper

A powerful and configurable scraper for extracting job listings from Indeed.com. Ideal for job market analysis, recruitment automation, and data-driven insights. This actor efficiently collects job metadata and full descriptions, supporting advanced search parameters, pagination, and anti-detection measures for reliable, large-scale scraping.

## What is this actor?

The Indeed Jobs Scraper is designed to automate the extraction of job postings from Indeed's search results. Whether you're building a job board, conducting market research, or aggregating employment data, this tool provides a seamless way to gather structured job information without manual effort. It mimics human browsing to avoid blocks, making it suitable for production use.

## Key Features

- Comprehensive Data Extraction: Scrape job titles, companies, locations, salaries, posting dates, and detailed descriptions (both HTML and plain text).
- JSON-first & API fallbacks: Job cards are parsed from Indeed's provider JSON before HTML selectors; job details try the lightweight `rpc/jobdescs` API before DOM parsing for speed and resilience.
- Flexible Search Options: Input full Indeed URLs, keywords, locations, or date filters to target specific job searches.
- Pagination Support: Automatically handles multiple pages of results to collect extensive datasets.
- Performance Controls: Configure concurrency, proxies, and cookies to optimize speed and bypass rate limits.
- Output to Dataset: Results are stored in a structured JSON format for easy integration with downstream tools.
- Anti-Bot Measures: Built-in support for proxies and session management to ensure uninterrupted scraping.

## Use Cases

- Job Market Research: Analyze trends in job postings by location, industry, or salary ranges.
- Recruitment Platforms: Feed job data into custom job boards or matching algorithms.
- Data Analytics: Collect and process job listings for reports on employment opportunities.
- Competitive Intelligence: Monitor competitor hiring patterns or industry-specific roles.
- Automation Workflows: Integrate with tools like Zapier or custom scripts for automated job alerts.

## Inputs

Configure the actor with a JSON input object. Defaults are applied for unspecified fields to ensure ease of use.

### Search Parameters

| Field | Type | Description |
|-------|------|-------------|
| `searchUrl` | string | Full Indeed search URL (e.g., `https://www.indeed.com/jobs?q=developer&l=New+York`). Overrides `keyword` and `location` if provided. |
| `startUrls` | string[] | Array of Indeed search URLs to scrape multiple queries in one run. |
| `startUrl` | string | Alias for a single start URL. |
| `keyword` | string | Job search keywords (e.g., "software engineer"). Used to build search URLs. |
| `location` | string | Geographic filter (e.g., "Remote" or "San Francisco, CA"). |
| `posted_date` | string | Date filter: Options include "Last 24 hours", "Last 7 days", "Last 30 days". |
| `maxPages` | number | Stop after this many pages; leave empty to rely on `results_wanted`. |

### Scraping Options

| Field | Type | Description |
|-------|------|-------------|
| `maxItems` | number | Limit the number of jobs collected (default: 100). Set to 0 for unlimited. |
| `collectDetails` | boolean | Enable to fetch full job descriptions from detail pages (default: false). |
| `maxConcurrency` | number | Number of parallel requests (default: 10). Reduce to avoid IP bans. |
| `cookies` / `cookiesJson` | object or string | Cookies for authenticated sessions or to mimic real users. Provide the same UA if you pass cookies. |
| `userAgent` | string | Override UA to match your cookie session. |
| `proxyConfiguration` | object | Proxy settings (e.g., `{ "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }`). Use residential proxies for better success rates. |

### Example Input

```json
{
  "startUrls": [
    "https://www.indeed.com/jobs?q=software+engineer&l=Remote",
    "https://www.indeed.com/jobs?q=data+scientist&l=New+York"
  ],
  "maxItems": 500,
  "collectDetails": true,
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

This example scrapes up to 500 remote software engineer and New York data scientist jobs, including full descriptions, using residential proxies for reliability.

## Output

Data is saved to the default dataset in JSON format. Each record contains:

- `title` (string): Job title.
- `company` (string): Hiring company.
- `location` (string): Job location.
- `postedAt` (string): Posting date (e.g., "2 days ago").
- `salary` (string): Salary details if available.
- `description_html` (string): Full job description in HTML.
- `description_text` (string): Plain text version of the description.
- `url` (string): Direct link to the job posting.
- `source` (string): Always "indeed".
- `search_url` (string): The search page URL where the job was found.

Example output item:

```json
{
  "title": "Senior Software Engineer",
  "company": "Tech Corp",
  "location": "Remote",
  "postedAt": "1 day ago",
  "salary": "$120,000 - $150,000 a year",
  "description_html": "<p>We are looking for...</p>",
  "description_text": "We are looking for a skilled engineer...",
  "url": "https://www.indeed.com/viewjob?jk=12345",
  "source": "indeed",
  "search_url": "https://www.indeed.com/jobs?q=software+engineer&l=Remote"
}
```

## How to Run

1. Set Up Input: Use the JSON schema above. Test with small `maxItems` first.
2. Launch the Actor: Run via Apify Console, API, or CLI. Monitor logs for progress.
3. Retrieve Results: Access the dataset after completion. Export to CSV/JSON for analysis.
4. Optimize for Scale: Adjust `maxConcurrency` and proxies based on run feedback.
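
As a rough illustration of step 1, the sketch below assembles a small trial input from a keyword and a location. It is not the actor's own code: the helper names `build_search_url` and `build_input` are hypothetical, the `q` and `l` query parameters are inferred from the example URLs in this document, and the trial size of 20 items is an arbitrary choice for a first test run.

```python
import json
from urllib.parse import urlencode


def build_search_url(keyword: str, location: str) -> str:
    """Build a search URL using the `q` and `l` parameters seen in the example URLs."""
    # urlencode encodes spaces as "+", matching the documented example URLs.
    return "https://www.indeed.com/jobs?" + urlencode({"q": keyword, "l": location})


def build_input(keyword: str, location: str, max_items: int = 20) -> dict:
    """Assemble a small trial input using only field names from the input tables."""
    return {
        "startUrls": [build_search_url(keyword, location)],
        "maxItems": max_items,      # keep the first run small
        "collectDetails": False,    # skip detail pages for a quick test
        "maxConcurrency": 5,        # start low, as the documentation suggests
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


trial_input = build_input("software engineer", "Remote")
input_json = json.dumps(trial_input, indent=2)
print(input_json)
```

The printed JSON can be saved to a file such as `input.json` or pasted into the Apify Console input editor. Once a small run succeeds, raise `maxItems` and enable `collectDetails` as needed.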

For CLI: `apify run your-actor-id --input input.json`

## Best Practices & Troubleshooting

### Optimizing Performance

- Proxies: Always use residential proxies for high-volume scrapes to avoid CAPTCHAs.
- Concurrency: Start low (e.g., 5) and increase gradually. Add delays if needed.
- Cookies: Provide session cookies from a real browser to reduce detection.
- Rate Limits: If errors occur, pause runs or rotate IPs.

### Common Issues

- Incomplete Data: Check `searchUrl` validity or increase `maxItems`.
- HTTP Errors (429/403): Lower concurrency, enable proxies, or add cookies.
- CAPTCHAs: Switch to residential proxies and ensure cookies are fresh.
- No Results: Verify keywords/location; Indeed may have regional restrictions.

### Limitations

- Scraping is subject to Indeed's terms of service; use responsibly.
- Results may vary based on Indeed's layout changes or geo-blocking.
- Full descriptions require additional requests, increasing run time.

## SEO Keywords

Indeed job scraper, scrape Indeed jobs, Indeed API alternative, job data extraction, automated job scraping, Indeed crawler, job market scraper, recruitment data tool, Indeed job listings scraper, extract Indeed jobs data.

## Support

For issues or feature requests, check Apify's documentation or community forums. Ensure your runs comply with legal guidelines.
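
The troubleshooting advice to start with low concurrency, raise it gradually, and lower it when 429/403 responses appear can be turned into a simple rule for choosing `maxConcurrency` between runs. The sketch below is one possible heuristic, not something the actor does itself: the function name `next_concurrency`, the 5% blocked-request threshold, and the halving step are all assumptions made for illustration. The ceiling of 10 mirrors the documented default for `maxConcurrency`.

```python
def next_concurrency(
    current: int,
    total_requests: int,
    blocked_requests: int,
    threshold: float = 0.05,  # assumed tolerance for 429/403 responses
    ceiling: int = 10,        # documented default for maxConcurrency
) -> int:
    """Suggest a maxConcurrency value for the next run based on the last run's block rate."""
    if total_requests <= 0:
        # Nothing to learn from an empty run; keep the current setting.
        return current
    blocked_ratio = blocked_requests / total_requests
    if blocked_ratio > threshold:
        # Too many blocks: halve concurrency, but never go below 1.
        return max(1, current // 2)
    # Clean run: increase by one step, up to the ceiling.
    return min(ceiling, current + 1)


# Example: a run at concurrency 5 where 20 of 100 requests were blocked.
print(next_concurrency(5, total_requests=100, blocked_requests=20))
```

The request counts would come from your own reading of the run logs. The additive-increase, halve-on-trouble shape is a common conservative choice, and the threshold should be tuned to what you observe.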

Actor Information

Developer: shahidirfan
Pricing: Paid
Total Runs: 438
Active Users: 48
