Y Combinator Scraper

by futurizerush

Automatically extract company, founder, and job data from the Y Combinator directory. Get structured JSON/CSV for lead gen, recruitment, and market research.

54 runs
6 users

About Y Combinator Scraper

Need fresh, structured data from the Y Combinator directory without the manual hassle? I built this scraper because I got tired of copying and pasting details for my own lead lists and job boards. It extracts the core details you're after: company profiles, founder information, and active job postings from YC batches. The output is clean JSON or CSV that you can plug straight into a CRM, a recruitment database, or your own market analysis.

I use it weekly to track emerging startups, identify potential investment or partnership opportunities, and source technical roles that aren't always listed on the big job sites. It handles pagination and page parsing for you, so you can focus on building your list or feeding your analysis pipeline. Configure your run, hit start, and come back to a dataset. It has become an essential part of my workflow for staying on top of the startup ecosystem, and it can save you hours of manual work.

What does this actor do?

Y Combinator Scraper is a web scraping actor that runs on the Apify platform. It extracts company, founder, and job data from the Y Combinator companies directory in the cloud, so nothing has to be installed locally.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Y Combinator Scraper

Extracts structured startup data from Y Combinator's public companies directory. Useful for investors, researchers, and recruiters needing company, founder, and job listing information.

Overview

This actor scrapes the Y Combinator companies directory based on a provided URL. It collects all companies matching the search filters embedded in that URL, with options to include founder profiles and open job postings. Data is exported in JSON, CSV, or Excel format.

Key Features

  • Comprehensive Scraping: Extracts all companies matching the criteria in your provided directory URL.
  • Flexible Filtering: Use Y Combinator's own URL filters (batch, industry tags, status, location) to narrow results.
  • Detailed Data: Captures company profiles, founder information (with social links), and open job listings (including salary and experience where available).
  • Structured Output: Exports clean data in JSON, CSV, or Excel formats.
  • Reliable Execution: Automatically saves progress after each company is processed.

Input Parameters

Configure the actor run using these input fields.

Parameter      | Type    | Required | Description
url            | String  | Yes      | The Y Combinator companies directory URL. Include filters (batch, tags, etc.) in this URL to target specific companies.
scrapeFounders | Boolean | No       | Set to true to include founder profiles. Default is false.
scrapeOpenJobs | Boolean | No       | Set to true to include open job postings. Default is false.
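
The input described above can be assembled programmatically. The following is a minimal sketch, not part of the actor itself: the helper names `build_directory_url` and `build_input` are hypothetical, and the only filter name taken from this documentation is `batch`. Any other query parameter names would need to be copied from a URL produced by the Y Combinator directory's own filter controls.

```python
from urllib.parse import quote, urlencode

# Base directory URL, taken from the example in this documentation.
BASE_URL = "https://www.ycombinator.com/companies"


def build_directory_url(**filters: str) -> str:
    """Append filters as query parameters; spaces are encoded as %20."""
    if not filters:
        return BASE_URL
    return f"{BASE_URL}?{urlencode(filters, quote_via=quote)}"


def build_input(url: str, scrape_founders: bool = False,
                scrape_open_jobs: bool = False) -> dict:
    """Assemble the actor input; both flags default to false, as documented."""
    return {
        "url": url,
        "scrapeFounders": scrape_founders,
        "scrapeOpenJobs": scrape_open_jobs,
    }


run_input = build_input(build_directory_url(batch="Winter 2024"),
                        scrape_founders=True)
print(run_input)
```

Keeping the URL construction separate from the input dictionary makes it easy to reuse one filter set across several runs with different flags.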

Output

The actor returns an array of company objects. Each object contains the following fields.

Company Data (Always Included)

  • company_image, company_name, url, short_description, long_description
  • batch, status, tags (array), company_location
  • year_founded, team_size, website
  • company_linkedin, company_x, is_hiring, number_of_open_jobs

Founder Data (If scrapeFounders: true)

  • founders: An array of objects with name, linkedin, and x fields.

Job Data (If scrapeOpenJobs: true)

  • open_jobs: An array of objects with title, description_url, location, salary, and years_experience fields.
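
Because jobs are nested inside each company object, a flat one-row-per-job table usually needs a small transformation. The sketch below shows one way to do that with the field names listed above. The sample record is made up for illustration, the helper names are hypothetical, and the value types of `salary` and `years_experience` are assumptions, since this documentation does not specify them.

```python
import csv
import io

# Made-up sample records using the documented field names; values are illustrative only.
SAMPLE = [
    {
        "company_name": "Example Co",
        "batch": "Winter 2024",
        "open_jobs": [
            {
                "title": "Backend Engineer",
                "description_url": "https://example.com/jobs/1",
                "location": "Remote",
                "salary": None,
                "years_experience": "3+ years",
            },
        ],
    },
    # A run without scrapeOpenJobs may omit open_jobs entirely (assumption).
    {"company_name": "No Jobs Inc", "batch": "Winter 2024"},
]

JOB_FIELDS = ["title", "description_url", "location", "salary", "years_experience"]


def job_rows(companies):
    """Yield one flat dict per open job, tagged with its company and batch."""
    for company in companies:
        for job in company.get("open_jobs") or []:
            row = {
                "company_name": company.get("company_name"),
                "batch": company.get("batch"),
            }
            row.update({field: job.get(field) for field in JOB_FIELDS})
            yield row


def jobs_to_csv(companies) -> str:
    """Render the flattened job rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["company_name", "batch", *JOB_FIELDS])
    writer.writeheader()
    writer.writerows(job_rows(companies))
    return buffer.getvalue()


print(jobs_to_csv(SAMPLE))
```

Companies without an `open_jobs` array simply contribute no rows, so the same function works on runs with and without job scraping enabled.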

How to Use

  1. Configure Input: Create a task with your desired parameters. For example, to scrape the Winter 2024 batch with founders and jobs:
     {
       "url": "https://www.ycombinator.com/companies?batch=Winter%202024",
       "scrapeFounders": true,
       "scrapeOpenJobs": true
     }
  2. Run the Actor: Start the task. It will process all companies matching the URL.
  3. Get Results: Download the dataset from the actor run in your preferred format (JSON, CSV, or Excel).
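
The same three steps can also be driven from code instead of the Apify web console. The sketch below assumes the third-party `apify-client` Python package and an API token in an `APIFY_TOKEN` environment variable. The actor ID is a placeholder, because this documentation does not state it, and `run_scraper` and `main` are hypothetical helper names.

```python
import os

# Placeholder: copy the real ID from the actor's page on Apify.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client, actor_id: str, run_input: dict) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported lazily so the helper above can be used without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    companies = run_scraper(client, ACTOR_ID, {
        "url": "https://www.ycombinator.com/companies?batch=Winter%202024",
        "scrapeFounders": True,
        "scrapeOpenJobs": True,
    })
    print(f"Fetched {len(companies)} companies")
```

Calling `main()` performs a real, billable run, so it is left uncalled here. Passing the client in as an argument keeps `run_scraper` easy to exercise with a stub.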

Limitations & Notes

  • Public Data Only: Scrapes only information visible on the public Y Combinator website.
  • Language: All extracted data is in English.
  • Data Freshness: Information reflects the state of the website at the time of the scrape.
  • Rate Limiting: Sending a very high volume of requests in a short time may trigger temporary access limits.

Troubleshooting Tip: If no results are returned, double-check that your input URL is a valid Y Combinator companies directory link and that your filters match existing companies.
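
A quick pre-flight check can catch malformed URLs before a run is started. This is a rough sketch that assumes the directory lives at the `/companies` path on `ycombinator.com`, which is inferred from the single example URL in this documentation. The function name is hypothetical, and the check cannot tell whether the filters actually match any companies.

```python
from urllib.parse import urlparse

# Hosts assumed to serve the directory, inferred from the documented example URL.
ALLOWED_HOSTS = ("ycombinator.com", "www.ycombinator.com")


def looks_like_directory_url(url: str) -> bool:
    """Return True if the URL points at the companies directory listing page."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    return (
        parts.scheme in ("http", "https")
        and host in ALLOWED_HOSTS
        and parts.path.rstrip("/") == "/companies"
    )


print(looks_like_directory_url(
    "https://www.ycombinator.com/companies?batch=Winter%202024"))
```

Links to a single company's page or to another site fail this check, which covers the most likely copy-and-paste mistakes.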

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources


Actor Information

Developer: futurizerush
Pricing: Paid
Total Runs: 54
Active Users: 6
