Data Gov UK Scraper
by parseforge
Automate your UK open data research. This scraper collects structured dataset info from Data.gov.uk for daily updates, analytics, and streamlined workflows.
About Data Gov UK Scraper
Need to pull data from the UK's official open data portal, but tired of manual exports and inconsistent formats? I built this scraper because I was in the same spot. It automates the tedious work of collecting dataset details from Data.gov.uk, turning a messy research task into a scheduled, hands-off process. You get clean, structured data on everything from dataset titles and descriptions to publishers and update frequencies, ready to drop into a spreadsheet or database.
I use it primarily for two things: keeping a local repository of UK public data automatically updated, and feeding fresh dataset metadata into analytics dashboards. It saves a ton of time if you're in research, policy analysis, or building data-driven applications that rely on current UK government statistics, geospatial data, or transport info. You can set it to run daily, so you're always working with the latest information without having to manually check the portal. The results come out formatted (I typically use JSON or CSV), which makes it straightforward to integrate them into your existing workflows. It’s basically a dedicated assistant for UK open data.
What does this actor do?
Data Gov UK Scraper is a web scraping and automation tool available on the Apify platform. It extracts dataset metadata from data.gov.uk and runs in the cloud, so there is nothing to install locally.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Data.gov.uk Scraper
An Apify actor for scraping dataset metadata from the UK government's open data portal, data.gov.uk. It automates the collection of detailed information, supporting both direct URL scraping and search-based discovery.
Overview
This tool extracts structured data from data.gov.uk, eliminating the need for manual research. It's designed for developers, data analysts, and researchers who need to systematically gather UK open data intelligence. You can scrape specific dataset pages directly or use search filters to find relevant datasets based on keywords, publishers, topics, and formats.
Key Features
The scraper collects comprehensive metadata for each dataset, including:
- Dataset Titles & Descriptions: Full names and detailed summaries.
- Publisher Information: The originating government department or organization.
- Temporal Data: Last updated dates.
- Categorization: Topics (e.g., Business and economy, Transport, Health).
- Technical Details: Available file formats (CSV, JSON, XML, PDF) and licensing info (primarily the Open Government Licence).
- Access Links: Direct URLs to dataset pages and download links for the actual data files.
- Contact Information: Enquiry links provided for each dataset.
How to Use
Configure the actor run via input JSON. You can use it in two primary ways.
Option 1: Direct URL Scraping
Provide the exact URLs of the dataset pages you want to scrape.
{
  "startUrl": [
    "https://www.data.gov.uk/dataset/economic-review",
    "https://www.data.gov.uk/dataset/regional-economic-indicators"
  ],
  "maxItems": 10
}
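For longer URL lists, it can help to check each entry before starting a run. The following is a minimal Python sketch of that idea. The helper name `build_url_input` is hypothetical, and the rules it enforces (HTTPS, a data.gov.uk host, a path under `/dataset/`) are assumptions inferred from the example URLs above, not documented behavior of the actor.

```python
import json
from urllib.parse import urlparse


def build_url_input(urls, max_items=10):
    """Build a direct-URL input dict (hypothetical helper, not part of the actor)."""
    checked = []
    for url in urls:
        parts = urlparse(url)
        # Assumption: dataset pages live on data.gov.uk under /dataset/<slug>,
        # as in the example URLs.
        host_ok = parts.netloc in ("data.gov.uk", "www.data.gov.uk")
        path_ok = (parts.path.startswith("/dataset/")
                   and len(parts.path) > len("/dataset/"))
        if parts.scheme != "https" or not host_ok or not path_ok:
            raise ValueError(f"not a data.gov.uk dataset URL: {url}")
        if url not in checked:  # drop duplicates while keeping order
            checked.append(url)
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return {"startUrl": checked, "maxItems": max_items}


if __name__ == "__main__":
    run_input = build_url_input([
        "https://www.data.gov.uk/dataset/economic-review",
        "https://www.data.gov.uk/dataset/regional-economic-indicators",
    ])
    print(json.dumps(run_input, indent=2))
```

The printed JSON has the same shape as the example input and can be pasted into the actor's input configuration.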
Option 2: Search with Filters
Use search parameters to find and scrape datasets dynamically.
- searchQuery: Keywords (e.g., "transport", "health").
- publisher: Specific government department.
- topic: Category like Transport or Environment.
- format: File format (CSV, JSON, etc.).
- oglOnly: Set to true to return only datasets under the Open Government Licence.
- sort: Order results by "best" match or "recent" updates.
- maxItems: Limit the number of datasets scraped (required for free users).
Example: Basic Search
{
  "searchQuery": "economics",
  "sort": "best",
  "maxItems": 50
}
Example: Advanced Filtered Search
{
  "searchQuery": "transport",
  "publisher": "Department for Transport",
  "topic": "Transport",
  "format": "CSV",
  "oglOnly": true,
  "sort": "recent",
  "maxItems": 100
}
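If you generate search inputs from a script, a small builder can catch mistakes before a run starts. The sketch below is one possible approach in Python. `build_search_input` is a hypothetical helper; the JSON keys come from the parameter list and examples above, and it assumes unset filters can simply be left out, as the basic search example does.

```python
import json

# The two sort values named in the parameter list.
ALLOWED_SORTS = {"best", "recent"}


def build_search_input(search_query, max_items, publisher=None, topic=None,
                       file_format=None, ogl_only=False, sort="best"):
    """Assemble a search input dict (hypothetical helper, not part of the actor)."""
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    candidate = {
        "searchQuery": search_query,
        "publisher": publisher,
        "topic": topic,
        "format": file_format,
        "oglOnly": True if ogl_only else None,
        "sort": sort,
        "maxItems": max_items,
    }
    # Leave out filters that were not set so the input stays minimal.
    return {key: value for key, value in candidate.items() if value is not None}


if __name__ == "__main__":
    run_input = build_search_input(
        "transport",
        max_items=100,
        publisher="Department for Transport",
        topic="Transport",
        file_format="CSV",
        ogl_only=True,
        sort="recent",
    )
    print(json.dumps(run_input, indent=2))
```

Called with only a query and a limit, the helper produces the same three keys as the basic search example; with every filter set, it reproduces the advanced one.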
Input/Output
Input: Configure the scraper using the JSON input schema as shown in the examples above.
Output: The actor stores the scraped dataset metadata in the Apify dataset associated with the run. You can then download this structured data in multiple formats including JSON, CSV, Excel, XML, or HTML for further processing and analysis.
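If you download the results as JSON and want to flatten them into a table yourself, something along these lines may work. This is only a sketch: the field names in the sample records (`title`, `publisher`, `formats`) are placeholders for illustration, since the actor's actual output field names are not listed here, and `records_to_csv` is a hypothetical helper.

```python
import csv
import io


def records_to_csv(records):
    """Flatten a list of metadata dicts into CSV text."""
    # Collect every key seen, in first-seen order, so no column is lost.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {}
        for key in columns:
            value = record.get(key, "")
            # Join list values (e.g. several file formats) into one cell.
            if isinstance(value, list):
                value = "; ".join(map(str, value))
            row[key] = value
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder records; real field names may differ.
    sample = [
        {"title": "Example dataset A", "publisher": "Example Department",
         "formats": ["CSV", "JSON"]},
        {"title": "Example dataset B", "formats": ["PDF"]},
    ]
    print(records_to_csv(sample))
```

Records that lack a field get an empty cell, so datasets with incomplete metadata still line up under the same header row.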
Actor Information
- Developer: parseforge
- Pricing: Paid
- Total Runs: 34
- Active Users: 2