Data Go Kr Scraper

by parseforge

Automatically extract and filter datasets from Korea's data.go.kr portal. Get metadata, API specs, and organization details for 50,000+ public datasets, narrowed by keyword and dataset type.

15 runs
2 users

About Data Go Kr Scraper

Need to pull data from Korea's official open data portal? This scraper is a direct line to the data.go.kr catalog. It systematically extracts the list of available datasets, capturing everything from titles and descriptions to API endpoint details and publisher information. It handles navigating the portal and parsing the details, so you don't have to.

The real advantage is in the filtering. You can narrow the library of more than 50,000 datasets by issuing organization, dataset format, or specific keywords. For example, you can automate the discovery of all new CSV files from the Ministry of Environment, or find every dataset related to "public transportation" in Seoul. A manual browsing task becomes a precise, repeatable data pipeline.

Whether you're building an application that relies on this public data, conducting market research, or keeping a local mirror of available resources, this actor saves a significant amount of time. It delivers clean, structured data ready for your database or analysis, straight from the source.

What does this actor do?

Data Go Kr Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Data Go Kr Scraper

An Apify actor that programmatically extracts dataset listings from South Korea's official open data portal (data.go.kr). It collects metadata from thousands of public datasets, which include APIs, downloadable files, and linked data from various government agencies.

Target Users: Developers, data analysts, and researchers who need to discover or catalog Korean public data for applications, integration, or analysis.

Key Features

The scraper collects comprehensive dataset information, including:
* Core Metadata: Dataset ID, title, description, categories, and tags.
* Source Info: Providing organization and publisher contact details.
* Access Details: API endpoints, documentation URLs, file formats, and direct download links.
* Technical Data: Data fields, parameters (for APIs), and service type.
* Usage Stats: View counts, download statistics, and license information.
* Timestamps: Dataset creation and last update dates.

How to Use

Configure the actor via its input settings to define your search scope. The actor will navigate the portal, scrape the specified datasets, and output the structured results to an Apify dataset.
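
As a rough illustration of starting a run programmatically, the sketch below builds (but does not send) a POST request to the Apify API's "run actor" endpoint using only the Python standard library. The actor ID "parseforge~data-go-kr-scraper" is an assumption based on the developer and actor names shown on this page, and the token is a placeholder; check both in the Apify console before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

# Assumption: the actor ID is guessed from the developer and actor names on
# this page. Confirm the real ID in the Apify console.
ACTOR_ID = "parseforge~data-go-kr-scraper"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that would start an actor run with the given input."""
    url = f"{API_BASE}/acts/{actor_id}/runs"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    # "YOUR_APIFY_TOKEN" is a placeholder, not a working credential.
    request = build_run_request(ACTOR_ID, "YOUR_APIFY_TOKEN", {"keyword": "weather"})
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit it. The sketch stops short of that so it can be inspected without credentials.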

Input Configuration

Configure the scrape using the following input parameters. Provide them via the Apify console or as a JSON object.

  • startUrl: A direct URL to a pre-filtered search results page from data.go.kr. Useful for replicating a specific browser search.
  • keyword: A search term to query across dataset titles, descriptions, and keywords (e.g., 'weather').
  • svcType: Filter by dataset type.
    • FILE: Downloadable files (CSV, XML, Excel, etc.).
    • API: REST API endpoints.
    • STD: Standardized format datasets.
    • LINKED: Semantic web linked data.
  • recmSe: Filter by portal recommendation status (Y for recommended/high-quality datasets, N for all).
  • conditionType: Search condition mode ('init' for default state, 'search' for active filtering).
  • kwrdArray: Comma-separated list for an advanced keyword search (e.g., 'temperature,precipitation').
  • maxItems: The maximum number of datasets to scrape. Free users are limited to 100. Paid users can set a limit of up to 1,000,000, or leave the field empty for no limit.

Example Input:

{
  "keyword": "weather",
  "svcType": "API",
  "maxItems": 100
}
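
When inputs are generated by a script rather than typed into the console, it can help to validate them first. The following is a minimal sketch, assuming the allowed values are exactly those listed above; the helper name build_input and its Python-style argument names are hypothetical and not part of the actor.

```python
from typing import Any, Dict, Iterable, Optional

# Allowed values as listed in the Input Configuration section.
ALLOWED_SVC_TYPES = {"FILE", "API", "STD", "LINKED"}
ALLOWED_RECM_SE = {"Y", "N"}
ALLOWED_CONDITION_TYPES = {"init", "search"}


def build_input(
    keyword: Optional[str] = None,
    svc_type: Optional[str] = None,
    recm_se: Optional[str] = None,
    condition_type: Optional[str] = None,
    keywords: Optional[Iterable[str]] = None,
    max_items: Optional[int] = None,
    start_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble an input object, omitting parameters that were not set."""
    if svc_type is not None and svc_type not in ALLOWED_SVC_TYPES:
        raise ValueError(f"svcType must be one of {sorted(ALLOWED_SVC_TYPES)}")
    if recm_se is not None and recm_se not in ALLOWED_RECM_SE:
        raise ValueError("recmSe must be 'Y' or 'N'")
    if condition_type is not None and condition_type not in ALLOWED_CONDITION_TYPES:
        raise ValueError("conditionType must be 'init' or 'search'")
    if max_items is not None and max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    run_input: Dict[str, Any] = {}
    if start_url:
        run_input["startUrl"] = start_url
    if keyword:
        run_input["keyword"] = keyword
    if svc_type:
        run_input["svcType"] = svc_type
    if recm_se:
        run_input["recmSe"] = recm_se
    if condition_type:
        run_input["conditionType"] = condition_type
    if keywords is not None:
        # kwrdArray is described as a comma-separated list.
        joined = ",".join(k.strip() for k in keywords if k.strip())
        if joined:
            run_input["kwrdArray"] = joined
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


if __name__ == "__main__":
    # Reproduces the example input shown above.
    print(build_input(keyword="weather", svc_type="API", max_items=100))
```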

Output

The actor stores its results in an Apify dataset. You can download the data in multiple formats: JSON, CSV, Excel, XML, or HTML.

Example Output Item:

{
  "dataset_id": "15059093",
  "title": "JSON AsosDaIyInfoService",
  "organization": "Science and technology research",
  "dataset_type": "API",
  "description": "Provides daily weather observation data.",
  "api_endpoint": "http://apis.data.go.kr/...",
  "formats": ["JSON"],
  "tags": ["weather", "observation", "daily"],
  "view_count": 1250,
  "modified_date": "2023-11-15"
}
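
Once the results are exported as JSON, they can be filtered and summarized with a few lines of code. The sketch below assumes every item uses the field names from the example above (dataset_type, formats, tags, organization); the helper names are hypothetical, and the second sample item is made up purely for illustration.

```python
from collections import Counter
from typing import Any, Dict, List, Optional

Item = Dict[str, Any]


def filter_items(
    items: List[Item],
    dataset_type: Optional[str] = None,
    fmt: Optional[str] = None,
    tag: Optional[str] = None,
) -> List[Item]:
    """Keep only the items that match every filter that was provided."""
    selected = []
    for item in items:
        if dataset_type and item.get("dataset_type") != dataset_type:
            continue
        # Missing or null list fields are treated as empty lists.
        if fmt and fmt not in (item.get("formats") or []):
            continue
        if tag and tag not in (item.get("tags") or []):
            continue
        selected.append(item)
    return selected


def count_by_organization(items: List[Item]) -> Counter:
    """Count how many datasets each organization published."""
    return Counter(item.get("organization", "unknown") for item in items)


SAMPLE_ITEMS: List[Item] = [
    {   # Taken from the example output item above.
        "dataset_id": "15059093",
        "title": "JSON AsosDaIyInfoService",
        "organization": "Science and technology research",
        "dataset_type": "API",
        "formats": ["JSON"],
        "tags": ["weather", "observation", "daily"],
    },
    {   # Hypothetical item, invented for this illustration only.
        "dataset_id": "00000000",
        "title": "Example file dataset",
        "organization": "Example organization",
        "dataset_type": "FILE",
        "formats": ["CSV"],
        "tags": ["example"],
    },
]

if __name__ == "__main__":
    api_items = filter_items(SAMPLE_ITEMS, dataset_type="API", tag="weather")
    print([item["dataset_id"] for item in api_items])
    print(count_by_organization(SAMPLE_ITEMS))
```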

Actor Information

Developer: parseforge
Pricing: Paid
Total Runs: 15
Active Users: 2

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.