Y Combinator Extractor

by jupri

Extract structured startup and job data from YCombinator.com. Perfect for investors, recruiters, and founders needing reliable, automated data collection.

1,230 runs
148 users
About Y Combinator Extractor

Need to track startups, find investment opportunities, or analyze Y Combinator's latest batches? This actor pulls structured data directly from YCombinator.com. I built it because I got tired of manually copying company details and job listings for my own research.

It extracts key information from the main directory, including company names, descriptions, founding years, team members, and job postings. Results come back as a clean JSON or CSV file, ready to import into a spreadsheet or your own database. The setup is straightforward: input the YC batch or company list you want to target, and it handles the rest.

It suits recruiters sourcing talent from top startups, investors scouting for early-stage companies, and founders conducting competitive analysis. I use it quarterly to keep my own market maps updated, and it saves me hours of tedious work. It runs on the Apify platform, so you don't have to worry about proxies or getting blocked.

What does this actor do?

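Y Combinator Extractor is a web scraping actor that runs in the cloud on the Apify platform. It extracts structured data from Y Combinator's sites (news, companies, founders, launches, jobs, and library resources) and returns it as a dataset you can download or query through the API.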

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Open the actor's page on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Y Combinator Extractor

An Apify actor for scraping structured data from Y Combinator's platforms, including job listings, company profiles, and news.

Overview

This actor extracts data from several sections of Y Combinator's ecosystem: Hacker News, company directories, founder profiles, launch announcements, job boards, and the resource library. It's designed for lead generation, market research, and job searching.

Note: Accessing job listings requires an authenticated session cookie.

Key Features

  • Multi-Source Scraping: Extracts data from:
    • News (Hacker News)
    • Companies
    • Founders
    • Launches
    • Jobs (requires authentication)
    • Library (resources and posts)
  • Structured Output: Returns clean, parsed data in JSON format.
  • Configurable Search: Allows filtering and targeting specific search queries or listing types.

How to Use

Run the actor on Apify with the required input configuration. For most searches (News, Companies, etc.), no authentication is needed.
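
For readers who prefer to start a run programmatically, here is a minimal sketch that builds (but does not send) a request to Apify's REST endpoint for starting an actor run. The actor ID and the token placeholder are assumptions, since this page does not state the actor's real identifier; the input fields are the ones listed under Input Configuration.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

# Hypothetical actor ID: the real identifier is not given on this page.
ACTOR_ID = "jupri~ycombinator-extractor"


def build_run_request(token: str, run_input: dict,
                      actor_id: str = ACTOR_ID) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# A news search needs no cookie, so the input stays small.
request = build_run_request(
    "APIFY_TOKEN_PLACEHOLDER",
    {"searchType": "news", "maxItems": 10},
)
# To actually start the run: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before spending any platform credits.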

Authentication for Job Search

To scrape Jobs, you must provide a session cookie from a logged-in browser session on workatastartup.com or ycombinator.com.

Steps to obtain the cookie (Google Chrome):

  1. Log in to www.workatastartup.com.
  2. Open Developer Tools (Ctrl+Shift+I or Cmd+Opt+I).
  3. Go to the Application tab.
  4. In the left panel, navigate to: Storage > Cookies > https://www.workatastartup.com.
  5. Find the cookie named _sso.key.
  6. Copy its Value and paste it into the actor's input field for the cookie.
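
If you copied a whole Cookie request header (for example from the Network tab) rather than the single value, the sketch below pulls out just the `_sso.key` value using Python's standard library. The header string and the helper name `extract_sso_key` are made up for illustration; only the cookie name comes from the steps above.

```python
from http.cookies import SimpleCookie


def extract_sso_key(cookie_header: str, name: str = "_sso.key") -> str:
    """Return the value of the named cookie from a raw Cookie header string."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    if name not in jar:
        raise KeyError(f"cookie {name!r} not found in header")
    return jar[name].value


# Made-up header for illustration only; a real value is much longer.
header = "theme=dark; _sso.key=abc123; lang=en"
sso_value = extract_sso_key(header)
```

Treat the extracted value like a password: it grants access to your logged-in session.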

Input / Output

Input Configuration

Configure the actor run via a JSON input object. Key parameters include:

  • searchType: The category to scrape (e.g., "jobs", "companies", "news").
  • searchQuery: (Optional) Keywords to filter results.
  • maxItems: (Optional) Limit the number of results.
  • cookie: The _sso.key cookie value required for job searches.

Example Input (for Jobs):

{
  "searchType": "jobs",
  "searchQuery": "software engineer",
  "maxItems": 50,
  "cookie": "your_copied_sso.key_value_here"
}
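
As a rough sketch, the input object can also be assembled in code so that a jobs search without a cookie fails early instead of at run time. The function name `build_input` is hypothetical, and the check that maxItems is positive is an assumption, not documented behavior of the actor.

```python
from typing import Optional


def build_input(search_type: str,
                search_query: Optional[str] = None,
                max_items: Optional[int] = None,
                cookie: Optional[str] = None) -> dict:
    """Assemble an input object, omitting optional fields that were not given."""
    # Jobs is the only search type documented as needing authentication.
    if search_type == "jobs" and not cookie:
        raise ValueError('searchType "jobs" requires the _sso.key cookie value')
    # Assumption: a non-positive limit is never meaningful.
    if max_items is not None and max_items <= 0:
        raise ValueError("maxItems must be a positive integer")

    run_input = {"searchType": search_type}
    if search_query:
        run_input["searchQuery"] = search_query
    if max_items is not None:
        run_input["maxItems"] = max_items
    if cookie:
        run_input["cookie"] = cookie
    return run_input


jobs_input = build_input("jobs", "software engineer", 50,
                         "your_copied_sso.key_value_here")
news_input = build_input("news", max_items=10)
```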

Output

The actor outputs a dataset of items, each representing a scraped entity (e.g., a job posting, company profile). The data structure varies by searchType but typically includes titles, URLs, descriptions, metadata, and timestamps.

Example Output Item (Job):

{
  "title": "Senior Backend Engineer",
  "company": "Example Startup",
  "url": "https://www.workatastartup.com/jobs/12345",
  "description": "Job description text...",
  "location": "Remote",
  "postedDate": "2023-10-26"
}
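
A minimal sketch of post-processing, assuming every job item carries exactly the six fields shown in the example and that postedDate is an ISO `YYYY-MM-DD` string. Real items may contain more or fewer fields, so the `JobItem` class and helper names here are illustrative only.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class JobItem:
    title: str
    company: str
    url: str
    description: str
    location: str
    postedDate: date


def parse_job(item: dict) -> JobItem:
    """Convert one raw job item; postedDate is assumed to be ISO YYYY-MM-DD."""
    return JobItem(
        title=item["title"],
        company=item["company"],
        url=item["url"],
        description=item["description"],
        location=item["location"],
        postedDate=date.fromisoformat(item["postedDate"]),
    )


def to_csv(jobs: list) -> str:
    """Render parsed jobs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(JobItem)])
    writer.writeheader()
    for job in jobs:
        writer.writerow(asdict(job))
    return buffer.getvalue()


raw_item = {
    "title": "Senior Backend Engineer",
    "company": "Example Startup",
    "url": "https://www.workatastartup.com/jobs/12345",
    "description": "Job description text...",
    "location": "Remote",
    "postedDate": "2023-10-26",
}
job = parse_job(raw_item)
csv_text = to_csv([job])
```

Parsing the date up front means a malformed postedDate raises immediately rather than surfacing later in a spreadsheet.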

Support

For issues or feature suggestions, you can reach out via the Apify actor issue page.

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Y Combinator Extractor now on Apify. Free tier available with no credit card required.

Actor Information

Developer: jupri
Pricing: Paid
Total Runs: 1,230
Active Users: 148

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
