✨ Y Combinator Scraper Apify

Name: ✨ Y Combinator Scraper Apify
Author: damilo

391 runs

61 users

About ✨ Y Combinator Scraper Apify

⚡ Scrape Y Combinator’s startup directory with rich data: company info, founders, job postings, batch, team size, social links, and more. Filter by industry, region, or hiring status. Ideal for lead gen, VC scouting, and recruiting. Clean JSON output ready for Airtable, Notion, or Excel.

What does this actor do?

✨ Y Combinator Scraper Apify is a web scraping actor on the Apify platform. It extracts company, founder, and job-posting data from Y Combinator's public startup directory and runs in the cloud, so no local setup is needed.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
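
The "configure and run" steps can also be scripted. The following is a minimal sketch, not the author's own tooling: it builds a filtered directory URL for the `directUrls` input documented below and prepares a request against Apify's general run-actor endpoint without sending it. The actor ID `username~actor-name` and the token are placeholders, and the helper names are made up for illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

YC_DIRECTORY = "https://www.ycombinator.com/companies"
APIFY_API = "https://api.apify.com/v2"


def build_directory_url(filters):
    """Encode (name, value) filter pairs as a YC directory search URL."""
    # quote (not quote_plus) encodes spaces as %20, as in the README's URLs.
    return f"{YC_DIRECTORY}?{urlencode(filters, quote_via=quote)}"


def build_run_request(actor_id, token, actor_input):
    """Prepare (but do not send) a POST request that would start a run."""
    return Request(
        url=f"{APIFY_API}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


actor_input = {
    "directUrls": [
        build_directory_url([
            ("batch", "Winter 2025"),
            ("industry", "B2B"),
            ("isHiring", "true"),
        ])
    ],
    "sort": "launch_date",
}

# Placeholder actor ID and token: replace both with real values before sending.
request = build_run_request("username~actor-name", "YOUR_APIFY_TOKEN", actor_input)
print(request.full_url)
print(actor_input["directUrls"][0])
```

Passing `request` to `urllib.request.urlopen` would actually start a run; the sketch stops short of that so it can be inspected safely first.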

Documentation

# Y Combinator Scraper

Easily extract structured data on thousands of startups from YCombinator.com, including company descriptions, websites, founders, job postings, batch info, team size, LinkedIn, Crunchbase links, and more.

> 🚀 Whether you're a VC, recruiter, founder, journalist, or researcher, this scraper saves hours of manual work.

---

## ✅ What This Scraper Does

This Apify actor lets you scrape Y Combinator startup profiles in bulk using their public company directory. With just a few clicks, you'll get:

- ✅ Company name, website, and YC page link
- ✅ Short and long descriptions
- ✅ Batch (e.g. Summer 2024), stage, status
- ✅ Team size, location, founding year
- ✅ Founders with social media profiles
- ✅ Active job postings (title, location, type, salary)
- ✅ Social links: LinkedIn, X (Twitter), Facebook, Crunchbase
- ✅ Industry tags and advanced filters (like isHiring, region, nonprofit)

---

## 📦 Example Use Cases

- Lead generation: Build targeted B2B lead lists based on industry, team size, and region
- Recruiting: Find hiring startups and their open roles
- Investor research: Track new YC companies, founder bios, and product pitches
- Market intelligence: Analyze trends across batches, industries, or geographies
- News & media: Source verified startup info for journalism or newsletters

---

## 🛠️ Input Options

The scraper is flexible. You can control the data you want via the following inputs:

### `directUrls` (array of URLs)

Paste search result URLs from YCombinator.com, with filters applied. Example:

```json
[
  "https://www.ycombinator.com/companies?batch=Winter%202025&industry=B2B&isHiring=true"
]
```

### `searchQuery` (optional, string)

Search for specific keywords in the company name, description, or tags.

### `sort` (optional, string)

Choose how results are sorted:

- `default`: Y Combinator’s default sort
- `launch_date`: Newest companies first

---

JSON example:

```json
{
  "directUrls": [
    "https://www.ycombinator.com/companies?batch=Winter%202025&batch=Summer%202024&industry=B2B&isHiring=true&query=arv&team_size=%5B%225%22%2C%22500%22%5D"
  ],
  "searchQuery": "ar",
  "sort": "default"
}
```

---

## 📤 Output Data Format

Each scraped company will be saved as a JSON object in the Apify dataset. Here's a breakdown of the fields you’ll receive:

| Field | Type | Description |
| ------------------- | ------- | ---------------------------------------------------------------------- |
| `id` | string | Unique identifier of the company from Y Combinator |
| `logo` | string | URL to the company's logo image |
| `name` | string | Company name |
| `yc_url` | string | Direct link to the company's Y Combinator profile page |
| `website` | string | Company’s official website URL |
| `short_description` | string | One-liner description from YC |
| `long_description` | string | Detailed description of what the company does |
| `batch` | string | Y Combinator batch (e.g. "Winter 2025") |
| `status` | string | Startup status in the YC ecosystem (e.g. "active", "dead", "acquired") |
| `stage` | string | Stage of the company, if available (e.g. "pre-launch", "launched") |
| `tags` | array | Array of industry tags assigned by YC |
| `location` | string | HQ or main location of the company |
| `year_founded` | integer | Year the company was founded |
| `team_size` | integer | Reported team size (min/max if available) |
| `linkedin` | string | Link to company’s LinkedIn page |
| `x` | string | Link to company’s X (Twitter) profile |
| `facebook` | string | Link to company’s Facebook page |
| `crunchbase` | string | Link to Crunchbase profile |
| `job_postings` | array | List of current job openings posted on YC |
| → `title` | string | Job title |
| → `url` | string | Direct link to the job posting |
| → `location` | string | Job location |
| → `type` | string | Employment type (e.g. full-time, intern) |
| → `salary` | string | Salary range (if available) |
| → `experience` | string | Required experience (if listed) |
| `founders` | array | List of founders and their public profiles |
| → `full_name` | string | Founder’s full name |
| → `title` | string | Title or role at the company |
| → `avatar` | string | Thumbnail/avatar image of the founder |
| → `linkedin` | string | Founder’s LinkedIn profile |
| → `x` | string | Founder’s Twitter (X) profile |

You can download the dataset in JSON, CSV, or XLSX format, or use the Apify API to fetch the data programmatically.
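
Because `job_postings` is a nested array, a spreadsheet-friendly view usually needs one row per job rather than one row per company. The sketch below shows one way to flatten a downloaded JSON export under that assumption; the helper names and the second sample record ("No Jobs Inc") are hypothetical, and the column selection is only an example.

```python
import csv
import io


def job_rows(companies):
    """Yield one flat dict per job posting, tagged with its company."""
    for company in companies:
        # Assumption: job_postings may be missing, null, or empty.
        for job in company.get("job_postings") or []:
            yield {
                "company": company.get("name"),
                "batch": company.get("batch"),
                "title": job.get("title"),
                "location": job.get("location"),
                "salary": job.get("salary"),
                "url": job.get("url"),
            }


def to_csv(rows):
    """Serialize flat dicts to CSV text with a header row."""
    fields = ["company", "batch", "title", "location", "salary", "url"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hand-written sample shaped like the field table, trimmed to a few fields.
sample = [
    {
        "name": "Arva AI",
        "batch": "Summer 2024",
        "job_postings": [
            {
                "title": "GTM Lead",
                "location": "London, England, GB",
                "salary": "£75K - £95K GBP",
                "url": "http://ycombinator.com/companies/arva-ai/jobs/dT2oqNZ-gtm-lead",
            },
        ],
    },
    # Hypothetical company with no open roles: it contributes no rows.
    {"name": "No Jobs Inc", "batch": "Winter 2025", "job_postings": []},
]

rows = list(job_rows(sample))
print(to_csv(rows))
```

In practice `sample` would be replaced by the result of `json.load` on the exported dataset file.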
---

Example JSON Output:

```json
[
  {
    "id": 29973,
    "logo": "https://bookface-images.s3.us-west-2.amazonaws.com/logos/4e428f0bdd367f8e923ff5d314b4a4394e11ccd7.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAQC4NIECAHGNCKMI3%2F20250716%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250716T214732Z&X-Amz-Expires=3600&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEE0aCXVzLXdlc3QtMiJGMEQCIDjs%2FQg4zO30Z%2BYa80v4ekkNTPAWdSYtLHkZmV8YlhG0AiB89MskitEQd1Z6AZJ0LYx%2BGRpTDbY%2B63hu02UyMfFukSrlAwhmEAAaDDAwNjIwMTgxMTA3MiIM8AFe%2F52HPppDdE%2BoKsIDquMtiUdKBV9U9j7zhu3XXuenlRfreSsmF6kZPvt%2F7qxiHkskqyqOMLPQ1zAHH5E7YmonqIVMDwHMSXNOQsy4iYU7q%2B%2FhxjhpK39iHA51ZzL08hg4eB%2Fo5viH%2FxtbNBqCbikbOV%2FTWIeKaQcGqaiKFrSyKla6MPwtKSDQkAIhYeo4raPIM7Uw8thv6nXZfv2B4o97kPRoh7L41lHMUnzzthJdlfpXnZ3RUcB75kGM%2Bz9CKcaKrYM33%2F2FmePZdgPrbWM7uia8zGoZJ6My9wUQe%2BW33ar4R78lY2FvRdHjHNlLjhTJQnNe0qBfUBXRYx3BBs6v20j9XZVvIMR4LiFqY9JlmGpf4I4FPnUFF6v4Sowrjqar7Rz35KywqLLwIYsy%2F9pA6SpUNGsmtKhCJIW3%2BuwUyljiuzjDjehnPfjQeLgUMwte60FpC3eVPCMYkczCdLB%2BM7IwIJqsXBKk6QzF99hp9Oge3hQlyx43CVmthA2TgAtme1cnKbVrHy5tsM8QDNFIJB1%2BcVT7rjBxFu8EAVR5y2cEza0%2FbEavD1ccrqPySVTgpfjEASeAF1QTdVcaRmB%2F5dLMUxAkceEUxlVaxHNjMLeW4MMGOqYByse2D31Es9Uff1lmcvouVTicar2hp9fzsZecqKh46D%2FF3KA2FVj8HCx6gYPpx93RUukCJiG2gOmIcMysXAaATJ%2FInCvqGYFpAFycCekMDvbLsC3O%2FpTLNKsbatGD%2FSrRs7wqIC4pFC5llKnNItpgV4kCKElrFocxBWZ07o%2B5ZWAXYr247Kvfy%2BIQq9plaID%2Fn2D14TeFMDTM6HIkAeONzC9liPZpOQ%3D%3D&X-Amz-SignedHeaders=host&X-Amz-Signature=d007f9865e5e9c7f404243aeffa7c9bf0c8b9f5fd919fc8b6929ef4e4b3fe7e9",
    "name": "Arva AI",
    "yc_url": "https://www.ycombinator.com/companies/arva-ai",
    "website": "https://www.arva.ai",
    "short_description": "AI Agents to scale AML, KYB and KYC operations",
    "long_description": "Banks and fintechs have large teams of human analysts who conduct manual checks and reviews. We replace those human compliance analysts with AI agents that automate 80% of the manual work for faster and more compliant financial crime (AML) reviews.\r\n\r\nOur first three products are: Screening AI (discount 91% of screening alerts), KYB/C AI (speed up onboarding), Transaction Monitoring AI (handle AML transaction monitoring alerts)\r\n\r\nOur end goal is to have a whole suite of AI workers that can handle manual compliance work for banks and fintechs. A $24B opportunity in the US alone.",
    "batch": "Summer 2024",
    "status": "Active",
    "stage": "Early",
    "tags": [
      "Artificial Intelligence",
      "Fintech",
      "B2B",
      "Compliance",
      "Regtech"
    ],
    "location": null,
    "year_founded": 2024,
    "team_size": 8,
    "linkedin": "https://www.linkedin.com/company/arva-ai/",
    "x": "https://x.com/arva_ai",
    "facebook": null,
    "crunchbase": "https://www.crunchbase.com/organization/arva-ai",
    "job_postings": [
      {
        "title": "AI Research Engineer",
        "url": "http://ycombinator.com/companies/arva-ai/jobs/f6APYpN-ai-research-engineer",
        "location": "London, England, GB",
        "type": "Full-time",
        "salary": "£80K - £120K GBP",
        "experience": "3+ years"
      },
      {
        "title": "GTM Lead",
        "url": "http://ycombinator.com/companies/arva-ai/jobs/dT2oqNZ-gtm-lead",
        "location": "London, England, GB",
        "type": "Full-time",
        "salary": "£75K - £95K GBP",
        "experience": "3+ years"
      }
    ],
    "founders": [
      {
        "avatar": "https://bookface-images.s3.us-west-2.amazonaws.com/avatars/560201fdbc53ae709c42ebe943c4888d076e5e7a.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAQC4NIECAHGNCKMI3%2F20250716%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250716T214732Z&X-Amz-Expires=3600&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEE0aCXVzLXdlc3QtMiJGMEQCIDjs%2FQg4zO30Z%2BYa80v4ekkNTPAWdSYtLHkZmV8YlhG0AiB89MskitEQd1Z6AZJ0LYx%2BGRpTDbY%2B63hu02UyMfFukSrlAwhmEAAaDDAwNjIwMTgxMTA3MiIM8AFe%2F52HPppDdE%2BoKsIDquMtiUdKBV9U9j7zhu3XXuenlRfreSsmF6kZPvt%2F7qxiHkskqyqOMLPQ1zAHH5E7YmonqIVMDwHMSXNOQsy4iYU7q%2B%2FhxjhpK39iHA51ZzL08hg4eB%2Fo5viH%2FxtbNBqCbikbOV%2FTWIeKaQcGqaiKFrSyKla6MPwtKSDQkAIhYeo4raPIM7Uw8thv6nXZfv2B4o97kPRoh7L41lHMUnzzthJdlfpXnZ3RUcB75kGM%2Bz9CKcaKrYM33%2F2FmePZdgPrbWM7uia8zGoZJ6My9wUQe%2BW33ar4R78lY2FvRdHjHNlLjhTJQnNe0qBfUBXRYx3BBs6v20j9XZVvIMR4LiFqY9JlmGpf4I4FPnUFF6v4Sowrjqar7Rz35KywqLLwIYsy%2F9pA6SpUNGsmtKhCJIW3%2BuwUyljiuzjDjehnPfjQeLgUMwte60FpC3eVPCMYkczCdLB%2BM7IwIJqsXBKk6QzF99hp9Oge3hQlyx43CVmthA2TgAtme1cnKbVrHy5tsM8QDNFIJB1%2BcVT7rjBxFu8EAVR5y2cEza0%2FbEavD1ccrqPySVTgpfjEASeAF1QTdVcaRmB%2F5dLMUxAkceEUxlVaxHNjMLeW4MMGOqYByse2D31Es9Uff1lmcvouVTicar2hp9fzsZecqKh46D%2FF3KA2FVj8HCx6gYPpx93RUukCJiG2gOmIcMysXAaATJ%2FInCvqGYFpAFycCekMDvbLsC3O%2FpTLNKsbatGD%2FSrRs7wqIC4pFC5llKnNItpgV4kCKElrFocxBWZ07o%2B5ZWAXYr247Kvfy%2BIQq9plaID%2Fn2D14TeFMDTM6HIkAeONzC9liPZpOQ%3D%3D&X-Amz-SignedHeaders=host&X-Amz-Signature=b249eba9cdcf43f1ab2627cbbb53171996338006b2c69fcf2bba16c68154f618",
        "full_name": "Rhim Shah",
        "title": "CEO",
        "linkedin": "https://linkedin.com/in/rhim-shah",
        "x": ""
      },
      {
        "avatar": "https://bookface-images.s3.us-west-2.amazonaws.com/avatars/58517dc8081e6d7ecd4040bf285d2d77cd205ed1.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAQC4NIECAHGNCKMI3%2F20250716%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250716T214732Z&X-Amz-Expires=3600&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEE0aCXVzLXdlc3QtMiJGMEQCIDjs%2FQg4zO30Z%2BYa80v4ekkNTPAWdSYtLHkZmV8YlhG0AiB89MskitEQd1Z6AZJ0LYx%2BGRpTDbY%2B63hu02UyMfFukSrlAwhmEAAaDDAwNjIwMTgxMTA3MiIM8AFe%2F52HPppDdE%2BoKsIDquMtiUdKBV9U9j7zhu3XXuenlRfreSsmF6kZPvt%2F7qxiHkskqyqOMLPQ1zAHH5E7YmonqIVMDwHMSXNOQsy4iYU7q%2B%2FhxjhpK39iHA51ZzL08hg4eB%2Fo5viH%2FxtbNBqCbikbOV%2FTWIeKaQcGqaiKFrSyKla6MPwtKSDQkAIhYeo4raPIM7Uw8thv6nXZfv2B4o97kPRoh7L41lHMUnzzthJdlfpXnZ3RUcB75kGM%2Bz9CKcaKrYM33%2F2FmePZdgPrbWM7uia8zGoZJ6My9wUQe%2BW33ar4R78lY2FvRdHjHNlLjhTJQnNe0qBfUBXRYx3BBs6v20j9XZVvIMR4LiFqY9JlmGpf4I4FPnUFF6v4Sowrjqar7Rz35KywqLLwIYsy%2F9pA6SpUNGsmtKhCJIW3%2BuwUyljiuzjDjehnPfjQeLgUMwte60FpC3eVPCMYkczCdLB%2BM7IwIJqsXBKk6QzF99hp9Oge3hQlyx43CVmthA2TgAtme1cnKbVrHy5tsM8QDNFIJB1%2BcVT7rjBxFu8EAVR5y2cEza0%2FbEavD1ccrqPySVTgpfjEASeAF1QTdVcaRmB%2F5dLMUxAkceEUxlVaxHNjMLeW4MMGOqYByse2D31Es9Uff1lmcvouVTicar2hp9fzsZecqKh46D%2FF3KA2FVj8HCx6gYPpx93RUukCJiG2gOmIcMysXAaATJ%2FInCvqGYFpAFycCekMDvbLsC3O%2FpTLNKsbatGD%2FSrRs7wqIC4pFC5llKnNItpgV4kCKElrFocxBWZ07o%2B5ZWAXYr247Kvfy%2BIQq9plaID%2Fn2D14TeFMDTM6HIkAeONzC9liPZpOQ%3D%3D&X-Amz-SignedHeaders=host&X-Amz-Signature=e1fba3c71b330c4637371d1dd5138f5384a2dcda5a3aa0d8eabcbdc430840dcd",
        "full_name": "Oli Wales",
        "title": "Founder/CTO",
        "linkedin": "https://linkedin.com/in/oliverfwales",
        "x": ""
      }
    ]
  }
  ...
]
```

---

## ⚙️ How It Works

1. Uses Apify's proxy infrastructure to handle Y Combinator traffic cleanly
2. Parses YC search data for batch filtering
3. Crawls each startup’s profile page to extract job postings, founders, and full metadata
4. Outputs structured JSON data you can use in Airtable, Notion, CRMs, or spreadsheets

---

## ⚡ Performance Tips

- For large-scale scraping, filter by region, industry, or batch to reduce load
- Schedule the actor to run weekly or monthly to track new YC startups over time

---

## 🧠 Built With ❤️ by Damilo

I build fast, scalable Apify scrapers for serious use cases: B2B lead gen, research, job mining, content aggregation, and beyond.

If this actor saved you time, please consider leaving a ⭐⭐⭐⭐⭐ review to help others find it!

---

Happy scraping!
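
The scheduling tip under Performance Tips pairs naturally with a small diff step: compare the latest export against the previous one and keep only companies that are new. This is a minimal sketch that assumes the `id` field stays stable between runs; the function name and the second sample record are hypothetical.

```python
def new_companies(previous, current):
    """Return records in `current` whose `id` was absent from `previous`."""
    seen = {company["id"] for company in previous}
    return [company for company in current if company["id"] not in seen]


# Hypothetical, trimmed records standing in for two dataset exports.
last_week = [{"id": 29973, "name": "Arva AI"}]
this_week = [
    {"id": 29973, "name": "Arva AI"},
    {"id": 30001, "name": "Example Startup"},
]

added = new_companies(last_week, this_week)
print([company["name"] for company in added])
```

Using a set of seen IDs keeps the comparison linear in the number of records, which matters once exports cover thousands of companies.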

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try ✨ Y Combinator Scraper Apify now on Apify. Free tier available with no credit card required.

Actor Information

Developer: damilo
Pricing: Paid
Total Runs: 391
Active Users: 61

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
