GSTIN Scraper

by codingfrontend

Scrape taxpayer details, filing status, and HSN codes from the GST portal without CAPTCHA

95 runs
25 users

What does this actor do?

GSTIN Scraper is an actor on the Apify platform that takes a list of GSTINs and scrapes the GST portal for each one, returning taxpayer details, return filing status, and HSN/SAC codes as structured JSON. It runs in the cloud, so no local setup is needed.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
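
The same steps can be scripted rather than clicked through. The following is a minimal Python sketch, not the developer's own code: it assumes the `apify-client` package is installed, and the helper names `build_run_input` and `run_actor` are made up for illustration. The actor ID is not given on this page, so it is left as an argument to be copied from the actor's page on Apify. Only the input field names (`gstins`, `extractHsnCodes`, `extractFilingDetails`) come from the Documentation section.

```python
from typing import Any


def build_run_input(
    gstins: list[str],
    extract_hsn_codes: bool = True,
    extract_filing_details: bool = True,
) -> dict[str, Any]:
    """Build the actor input using the documented parameter names."""
    if not gstins:
        raise ValueError("gstins must contain at least one GSTIN")
    return {
        "gstins": list(gstins),
        "extractHsnCodes": extract_hsn_codes,
        "extractFilingDetails": extract_filing_details,
    }


def run_actor(token: str, actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items.

    `actor_id` is a placeholder argument: this page does not state the ID.
    """
    # Imported lazily so build_run_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_actor(token, actor_id, build_run_input(["06AAICK7471J1Z4"]))` would then return one item per GSTIN, assuming the run succeeds.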

Documentation

# GSTIN Scraper

## Features

- Comprehensive GST Data Extraction: Scrapes all available GSTIN information from GST portal
- Taxpayer Details: Extracts complete taxpayer information including legal name, trade name, registration details, business activities, and address
- State Jurisdiction Parsing: Automatically splits state jurisdiction into separate fields (state, division, zone, circle) for easier filtering and analysis
- HSN Codes & Services: Captures goods and services information with HSN codes for goods and SAC codes for services
- Filing Status & History: Extracts complete return filing status for all financial years with detailed monthly/quarterly filing information
- Financial Years Data: Retrieves available financial years and filing frequency preferences
- Flexible Data Extraction: Choose which data to extract (HSN codes, filing details, or both) using input parameters
- Usage-based Billing: Pay only for the data you actually extract with transparent per-item charging

## Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| gstins | Array | Yes | ["06AAICK7471J1Z4"] | Array of GSTIN (Goods and Services Tax Identification Number) strings to scrape |
| extractHsnCodes | Boolean | No | true | Extract goods and services information including HSN/SAC codes |
| extractFilingDetails | Boolean | No | true | Extract filing status and financial year details |

### Input Schema Example

```json
{
  "gstins": ["06AAICK7471J1Z4", "07AABCT1234F1Z5"],
  "extractHsnCodes": true,
  "extractFilingDetails": true
}
```

### Advanced Usage Examples

Extract only taxpayer details (no HSN codes or filing details):

```json
{
  "gstins": ["06AAICK7471J1Z4"],
  "extractHsnCodes": false,
  "extractFilingDetails": false
}
```

Extract taxpayer details and HSN codes only:

```json
{
  "gstins": ["06AAICK7471J1Z4"],
  "extractHsnCodes": true,
  "extractFilingDetails": false
}
```

Extract taxpayer details and filing details only:

```json
{
  "gstins": ["06AAICK7471J1Z4"],
  "extractHsnCodes": false,
  "extractFilingDetails": true
}
```

## Billing and Charges

This actor uses a usage-based billing system. You are charged based on the data you actually extract:

### Billing Events

| Event | Description | Count Basis |
|-------|-------------|-------------|
| taxpayer | Basic taxpayer information | 1 per GSTIN processed |
| hsn | HSN codes and goods/services data | Number of HSN items extracted |
| filing | Filing status and financial year data | Number of filing entries extracted |

## Output Schema

The scraper outputs structured JSON data for each GSTIN. Below is the complete output schema:

| Field | Type | Description |
|-------|------|-------------|
| gstinNumber | String | The GSTIN that was scraped |
| success | Boolean | Whether the scraping was successful |
| goodservice | Object | Goods and services information with SAC codes |
| finanacialYears | Array | Financial year mappings |
| filingStatus | Array | Detailed return filing status for each year |
| filingFrequency | Array | Quarterly filing preferences by year |
| legalName | String | Legal name of the taxpayer |
| tradeName | String | Trade name of the taxpayer |
| registrationDate | String | GST registration date |
| taxpayerType | String | Type of taxpayer (Regular/Composition) |
| businessActivities | Array | Business activities of the taxpayer |
| principalAddress | Object | Principal place of business address |
| stateJurisdiction | String | State tax jurisdiction (full string) |
| state | String | State extracted from stateJurisdiction |
| division | String | Division extracted from stateJurisdiction |
| zone | String | Zone extracted from stateJurisdiction |
| circle | String | Circle extracted from stateJurisdiction |
| taxJurisdiction | String | Central tax jurisdiction |
| status | String | GST registration status |
| natureOfTaxpayer | String | Nature of taxpayer |
| companyType | String | Type of company |
| isAadhaarVerified | String | Aadhaar verification status |
| isEKYCVerified | String | eKYC verification status |
| compositionScheme | String | Composition scheme status |
| eInvoiceEnabled | String | e-Invoice enablement status |
| fieldVisitConducted | String | Field visit conduction status |
| cancellationDate | String | GST cancellation date (if applicable) |

### Detailed Output Structure

#### Good Service Object

```json
{
  "services": [
    { "sacCode": "998361", "description": "Advertising Services" },
    { "sacCode": "998599", "description": "Other support services nowhere else classified" }
  ]
}
```

#### Financial Years Array

```json
[
  { "year": "2021-2022", "value": "2021" },
  { "year": "2022-2023", "value": "2022" }
]
```

#### Filing Status Array (Each year contains)

```json
[
  {
    "year": "2021",
    "returns": [
      {
        "financialYear": "2021-2022",
        "taxPeriod": "March",
        "modeOfFiling": "ONLINE",
        "dateOfFiling": "11/04/2022",
        "returnType": "GSTR1",
        "arn": "NA",
        "status": "Filed"
      }
    ]
  }
]
```

#### Filing Frequency Array

```json
[
  { "quarter": "Q1", "preference": "M", "year": "2021" }
]
```

#### Principal Address Object

```json
{
  "adr": "4th Floor, Unit No. 401 and Unit No. 402, Worldmark 2, Sector 65, Village Maidawas, Gurugram, Haryana, 122001"
}
```

## Sample Output

Here's a sample of the GSTIN data structure:

```json
{
  "goodservice": {
    "services": [
      { "sacCode": "998361", "description": "Advertising Services" },
      { "sacCode": "998599", "description": "Other support services nowhere else classified" }
    ]
  },
  "finanacialYears": [
    { "year": "2021-2022", "value": "2021" },
    { "year": "2022-2023", "value": "2022" },
    { "year": "2023-2024", "value": "2023" },
    { "year": "2024-2025", "value": "2024" },
    { "year": "2025-2026", "value": "2025" }
  ],
  "filingStatus": [
    {
      "year": "2025",
      "returns": [
        {
          "financialYear": "2025-2026",
          "taxPeriod": "April",
          "modeOfFiling": "ONLINE",
          "dateOfFiling": "20/05/2025",
          "returnType": "GSTR3B",
          "arn": "NA",
          "status": "Filed"
        }
      ]
    }
  ],
  "filingFrequency": [
    { "quarter": "Q4", "preference": "M", "year": "2025" }
  ],
  "natureOfTaxpayer": "SPO",
  "isAadhaarVerified": "No",
  "legalName": "KFC INDIA MARKETING PRIVATE LIMITED",
  "stateJurisdiction": "State - Haryana,Range - Gurgaon,District - Gurgaon (South),Ward - Gurgaon (South) Ward 1",
  "state": "Haryana",
  "division": null,
  "zone": null,
  "circle": null,
  "taxpayerType": "Regular",
  "cancellationDate": "",
  "gstinNumber": "06AAICK7471J1Z4",
  "businessActivities": [
    "Export",
    "Supplier of Services",
    "Recipient of Goods or Services",
    "Others"
  ],
  "isEKYCVerified": "No",
  "compositionScheme": "NA",
  "registrationDate": "07/07/2021",
  "companyType": "Private Limited Company",
  "principalAddress": {
    "adr": "4th Floor, Unit No. 401 and Unit No. 402, Worldmark 2, Sector 65, Village Maidawas, Gurugram, Haryana, 122001"
  },
  "status": "Active",
  "tradeName": "KFC INDIA MARKETING PRIVATE LIMITED",
  "fieldVisitConducted": "No",
  "taxJurisdiction": "State - CBIC,Zone - PANCHKULA,Commissionerate - GURUGRAM,Division - DIVISION-SOUTH-1,Range - R-20 (Jurisdictional Office)",
  "eInvoiceEnabled": "Yes"
}
```

## Changelog

For a detailed list of changes and version history, see CHANGELOG.md.
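
## Example: Splitting Jurisdiction Strings

The output schema says that `state`, `division`, `zone`, and `circle` are extracted from `stateJurisdiction`, and the sample output shows that string as comma-separated `Label - Value` pairs. The Python sketch below illustrates one way such a split could work. It is not the actor's actual parser: the function names are hypothetical, and it assumes that values never contain a comma and that the labels are spelled `State`, `Division`, `Zone`, and `Circle`.

```python
from typing import Optional


def parse_jurisdiction(raw: str) -> dict[str, str]:
    """Split 'Label - Value,Label - Value' into a {label: value} dict."""
    fields: dict[str, str] = {}
    for part in raw.split(","):
        # Split on the first ' - ' only, so values like 'R-20' stay intact.
        label, sep, value = part.partition(" - ")
        if sep:  # ignore fragments that are not 'Label - Value'
            fields[label.strip()] = value.strip()
    return fields


def split_state_jurisdiction(raw: str) -> dict[str, Optional[str]]:
    """Return state/division/zone/circle, with None for absent labels."""
    fields = parse_jurisdiction(raw)
    return {
        "state": fields.get("State"),
        "division": fields.get("Division"),
        "zone": fields.get("Zone"),
        "circle": fields.get("Circle"),
    }


# stateJurisdiction value taken from the sample output.
sample = (
    "State - Haryana,Range - Gurgaon,District - Gurgaon (South),"
    "Ward - Gurgaon (South) Ward 1"
)
print(split_state_jurisdiction(sample))
# {'state': 'Haryana', 'division': None, 'zone': None, 'circle': None}
```

For the sample string this reproduces the `state`, `division`, `zone`, and `circle` values shown in the sample output, where the three missing labels come back as `null`. The same `parse_jurisdiction` helper also works on the `taxJurisdiction` string, which uses the same pair layout with different labels.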
## Support

For issues and questions:

- Ensure each GSTIN is in the valid 15-character format
- Verify GST portal accessibility
- Email: lakshmanan.w3dev@gmail.com
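
The format check in the first point can be run locally before starting a run, which avoids submitting obviously malformed GSTINs. The Python sketch below only checks structure. The pattern is an assumption based on the commonly described GSTIN layout (2-digit state code, 10-character PAN, one entity character, the letter `Z`, one check character), it is not taken from the actor, and it does not verify the check character. The function names are hypothetical.

```python
import re

# Assumed layout: state code (2 digits), PAN (5 letters, 4 digits, 1 letter),
# entity character, literal 'Z', check character. The checksum is not verified.
GSTIN_PATTERN = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def normalize_gstin(value: str) -> str:
    """Trim surrounding whitespace and upper-case the value."""
    return value.strip().upper()


def is_well_formed_gstin(value: str) -> bool:
    """Return True if the normalized value matches the assumed 15-character layout."""
    return GSTIN_PATTERN.fullmatch(normalize_gstin(value)) is not None


def partition_gstins(values: list[str]) -> tuple[list[str], list[str]]:
    """Split values into (well-formed, rejected), normalizing the well-formed ones."""
    valid: list[str] = []
    rejected: list[str] = []
    for value in values:
        if is_well_formed_gstin(value):
            valid.append(normalize_gstin(value))
        else:
            rejected.append(value)
    return valid, rejected
```

Passing only the first list as `gstins` keeps malformed entries out of the run. A value that passes this check can still be unregistered or carry a wrong check character.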

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Collect legal names, trade names, and principal business addresses for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try GSTIN Scraper now on Apify. Free tier available with no credit card required.

Actor Information

Developer
codingfrontend
Pricing
Paid
Total Runs
95
Active Users
25
Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
