GSTIN Scraper / Crawler

by mikolabs

67 runs
8 users

About GSTIN Scraper / Crawler

Verify Indian GST numbers and retrieve comprehensive taxpayer information from the official GST portal. Retrieves GSTIN overview, business details, goods & services, HSN, and filing details.

What does this actor do?

GSTIN Scraper / Crawler is an Apify actor that verifies Indian GST identification numbers (GSTINs) against the official GST portal and returns the taxpayer's registration details as structured JSON. It runs in the cloud on the Apify platform, so no local setup is needed.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
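The input parameters in step 3 are the ones listed in the Documentation section: a required `GSTIN` and two optional retry counts with documented ranges. The sketch below shows one way to check an input locally before starting a run. The helper names `gstin_check_char` and `build_run_input` are hypothetical, and the checksum is the commonly described base-36 scheme for GSTINs, which is an assumption here rather than something this actor documents.

```python
import re

# Base-36 alphabet assumed by the commonly described GSTIN checksum scheme.
_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# 2-digit state code, 10-character PAN, entity character, literal Z, check character.
_GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][0-9A-Z]Z[0-9A-Z]")


def gstin_check_char(first14: str) -> str:
    """Compute the check character for the first 14 characters of a GSTIN."""
    total = 0
    for index, char in enumerate(first14):
        # Weights alternate 1, 2, 1, 2, ... starting with the first character.
        product = _ALPHABET.index(char) * (1 if index % 2 == 0 else 2)
        total += product // 36 + product % 36
    return _ALPHABET[(36 - total % 36) % 36]


def build_run_input(gstin: str, captcha_retry_attempts: int = 3,
                    max_retry_attempts: int = 3) -> dict:
    """Validate the values locally and return the actor input dictionary."""
    gstin = gstin.strip().upper()
    if not _GSTIN_RE.fullmatch(gstin):
        raise ValueError(f"{gstin!r} does not match the 15-character GSTIN layout")
    if gstin_check_char(gstin[:14]) != gstin[14]:
        raise ValueError(f"{gstin!r} has an unexpected check character")
    if not 1 <= captcha_retry_attempts <= 5:
        raise ValueError("captchaRetryAttempts must be between 1 and 5")
    if not 1 <= max_retry_attempts <= 10:
        raise ValueError("maxRetryAttempts must be between 1 and 10")
    return {
        "GSTIN": gstin,
        "captchaRetryAttempts": captcha_retry_attempts,
        "maxRetryAttempts": max_retry_attempts,
    }


if __name__ == "__main__":
    print(build_run_input("07AAECD1686J1ZG"))
```

Rejecting a malformed GSTIN before the run starts avoids spending captcha attempts on a value the portal cannot match. A passing checksum only shows the number is well formed, not that it is registered.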

Documentation

# GST Verification API - India GST Portal

An automated Apify actor that verifies Indian GST (Goods and Services Tax) numbers and retrieves comprehensive taxpayer information from the official GST portal. It solves captchas automatically, retries failed attempts, and returns structured output, so verification needs no manual steps.

## 🚀 Key Features

- ✅ Automatic Captcha Solving - Uses an AI-powered Gradio API to solve captchas automatically
- ✅ Intelligent Retry Logic - Automatically retries failed captcha solving and data retrieval
- ✅ Data Validation - Retries verification if data is incomplete (all N/A values)
- ✅ Structured Output - Clean, organized JSON output with taxpayer details, address, and additional info
- ✅ Zero Manual Intervention - Only requires a GSTIN; no manual captcha solving needed
- ✅ Comprehensive Data - Retrieves legal name, trade name, registration date, status, address, jurisdiction, and more
- ✅ Error Handling - Robust error handling with detailed error messages
- ✅ Session Management - Automatic session handling with the GST portal

## 📋 Input

The actor requires only the GSTIN. All other parameters are optional:

```json
{
  "GSTIN": "07AAECD1686J1ZG",
  "captchaRetryAttempts": 3,
  "maxRetryAttempts": 3
}
```

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| GSTIN | string | ✅ Yes | - | The GST Identification Number to verify (15 characters). Format: 07AAECD1686J1ZG |
| captchaRetryAttempts | integer | ❌ No | 3 | Number of times to retry captcha solving if it fails (range: 1-5) |
| maxRetryAttempts | integer | ❌ No | 3 | Maximum number of times to retry GST verification if data is all N/A (range: 1-10) |

### GSTIN Format

- Length: 15 characters
- Format: `[State Code (2 digits)][PAN (10 chars)][Entity Number][Z][Check Digit]`
- Example: `07AAECD1686J1ZG`
  - `07` = State Code (Delhi)
  - `AAECD1686J` = PAN
  - `1` = Entity Number
  - `Z` = Fixed character
  - `G` = Check Digit

## 📤 Output

The actor outputs structured GST verification data in JSON format.

### Success Response

```json
{
  "gstin": "07AAECD1686J1ZG",
  "status": "success",
  "message": "GST verification completed successfully",
  "verificationDate": "16/06/2024",
  "verificationAttempts": 1,
  "taxpayerDetails": {
    "legalName": "NILESH LAXMANRAO NIMBALKAR",
    "tradeName": "SHRI SWAMI SAMARTH CONTRACT WORKS",
    "registrationDate": "16/06/2024",
    "status": "Active",
    "businessType": "Regular",
    "centerJurisdiction": "State - CBIC,Zone - NAGPUR,Commissionerate - NAGPUR II,Division - DIVISION CITY,Range - RANGE -I (Jurisdictional Office)",
    "stateJurisdiction": "State - Maharashtra,Zone - Nagpur,Division - NAGPUR_EAST,Charge - AJNI_701"
  },
  "address": {
    "fullAddress": "PLOT NO 74, ZINGABAI TAKDI, New Mankapur Road, SAI NAGAR, Manak Pur, Nagpur, Nagpur, Maharashtra, 440030",
    "principalPlace": "PLOT NO 74",
    "buildingName": "PLOT NO 74",
    "street": "ZINGABAI TAKDI, New Mankapur Road",
    "location": "ZINGABAI TAKDI",
    "district": "Nagpur",
    "state": "Maharashtra",
    "pincode": "440030"
  },
  "additionalInfo": {
    "centerJurisdiction": "State - CBIC,Zone - NAGPUR,Commissionerate - NAGPUR II,Division - DIVISION CITY,Range - RANGE -I (Jurisdictional Office)",
    "stateJurisdiction": "State - Maharashtra,Zone - Nagpur,Division - NAGPUR_EAST,Charge - AJNI_701",
    "constitutionOfBusiness": "Proprietorship",
    "natureOfBusiness": ["Works Contract", "Others"],
    "aadhaarVerified": "Yes",
    "aadhaarVerificationDate": "20/01/2025",
    "eKYCVerified": "Not Applicable",
    "eInvoiceStatus": "No",
    "cancellationDate": "N/A"
  },
  "rawData": {
    // Complete raw response from GST portal
  }
}
```

### Captcha Required Response

If automatic captcha solving fails after all retry attempts:

```json
{
  "gstin": "07AAECD1686J1ZG",
  "captchaImage": "data:image/png;base64,...",
  "status": "captcha_required",
  "message": "Automatic captcha solving failed after 3 attempts. Last error: ...",
  "error": "Error details",
  "attempts": 3,
  "verificationAttempts": 3
}
```

### Error Response

```json
{
  "gstin": "07AAECD1686J1ZG",
  "status": "error",
  "error": "Error message",
  "message": "GST verification failed: Error message",
  "verificationAttempts": 3
}
```

## 🔄 How It Works

1. Session Initialization - Creates a session with the official GST portal
2. Captcha Fetching - Retrieves a captcha image from the GST portal
3. Automatic Captcha Solving - Uses an AI-powered Gradio API to solve the captcha automatically
4. GST Verification - Submits the GSTIN and solved captcha to the GST portal
5. Data Validation - Checks if retrieved data is valid (not all N/A)
6. Retry Logic - If data is incomplete, automatically retries with a new captcha
7. Structured Output - Formats and returns comprehensive GST details

### Retry Mechanisms

- Captcha Retry: If captcha solving fails, retries up to `captchaRetryAttempts` times (default: 3)
- Data Validation Retry: If retrieved data is all N/A, retries the entire verification process up to `maxRetryAttempts` times (default: 3)
- Automatic Fresh Captcha: Each retry fetches a new captcha for a better success rate

## 💻 Usage

### Using Apify Console

1. Go to Apify Console
2. Search for "GST Verification API - India GST Portal"
3. Click "Run" on the actor
4. Enter your GSTIN in the input form
5. (Optional) Adjust retry attempts if needed
6. Click "Start" to run the actor
7. View results in the dataset tab

### Using Apify API (Python)

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("YOUR_API_TOKEN")

# Prepare the actor input
run_input = {
    "GSTIN": "07AAECD1686J1ZG",
    "captchaRetryAttempts": 3,  # Optional
    "maxRetryAttempts": 3,  # Optional
}

# Run the actor and wait for it to finish
run = client.actor("YOUR_USERNAME/gst-verification-india").call(run_input=run_input)

# Fetch results from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```

### Using Apify API (JavaScript/Node.js)

```javascript
const { ApifyClient } = require('apify-client');

// Initialize the ApifyClient with your API token
const client = new ApifyClient({
    token: 'YOUR_API_TOKEN',
});

// Prepare the actor input
const runInput = {
    GSTIN: "07AAECD1686J1ZG",
    captchaRetryAttempts: 3, // Optional
    maxRetryAttempts: 3, // Optional
};

// CommonJS modules have no top-level await, so wrap the calls in an async function
async function main() {
    // Run the actor and wait for it to finish
    const run = await client.actor("YOUR_USERNAME/gst-verification-india").call(runInput);

    // Fetch results from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
}

main().catch(console.error);
```

### Using cURL

```bash
curl -X POST \
  https://api.apify.com/v2/acts/YOUR_USERNAME~gst-verification-india/runs \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "GSTIN": "07AAECD1686J1ZG",
    "captchaRetryAttempts": 3,
    "maxRetryAttempts": 3
  }'
```

## 📊 Output Schema

The actor outputs data following a structured schema defined in `.actor/dataset_schema.json`. The output includes:

- Basic Info: GSTIN, status, message, verification date
- Taxpayer Details: Legal name, trade name, registration date, status, business type, jurisdiction
- Address: Full address with parsed components (building, street, location, district, state, pincode)
- Additional Info: Constitution, nature of business, verification statuses, eInvoice status
- Raw Data: Complete original response from the GST portal for reference

## ⚠️ Limitations

- Rate Limiting: The GST portal may apply rate limiting based on its policies
- Captcha Success Rate: The success rate of automatic captcha solving depends on image complexity
- Internet Connection: Requires a working internet connection to access the GST portal
- GST Portal Availability: Depends on the GST portal being accessible and operational

## 🔒 Privacy & Compliance

- This actor only accesses publicly available GST verification data
- No sensitive personal data is stored beyond what's necessary for verification
- Users are responsible for ensuring compliance with the GST portal's terms of service
- This actor is for legitimate verification purposes only

## 📝 Changelog

### Version 1.0

- Initial release
- Automatic captcha solving using Gradio API
- Intelligent retry logic for captcha and data retrieval
- Structured output format
- Data validation and retry mechanism

## 🤝 Support

For issues, questions, or contributions:

- Open an issue on the repository
- Contact through Apify Console
- Check the actor's documentation in Apify Store

## 📄 License

This project is provided as-is for educational and legitimate business purposes. Please ensure compliance with the GST portal's terms of service and all applicable laws and regulations.

## ⚖️ Disclaimer

This actor is for legitimate GST verification purposes only. Users are responsible for:

- Ensuring compliance with all applicable laws and regulations
- Respecting the GST portal's terms of service
- Using the actor responsibly and ethically
- Verifying the accuracy of retrieved data independently

The developers and maintainers of this actor are not responsible for any misuse or non-compliance with applicable laws and regulations.
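The three documented response shapes (success, captcha required, error) all carry a `status` field, so a caller can branch on it when reading dataset items. The following is a minimal sketch under that assumption; the function name `summarize_result` is hypothetical, and the field names are taken from the sample responses in this documentation.

```python
def summarize_result(item: dict) -> str:
    """Return a one-line summary for a dataset item produced by the actor."""
    gstin = item.get("gstin", "unknown GSTIN")
    status = item.get("status")

    if status == "success":
        details = item.get("taxpayerDetails", {})
        legal_name = details.get("legalName", "N/A")
        registration_status = details.get("status", "N/A")
        return f"{gstin}: {legal_name} ({registration_status})"

    if status == "captcha_required":
        # The captcha could not be solved automatically within the retry budget.
        attempts = item.get("attempts", "?")
        return f"{gstin}: captcha not solved after {attempts} attempts"

    if status == "error":
        return f"{gstin}: verification failed: {item.get('error', 'no error detail')}"

    # Any other value is outside the three documented statuses.
    return f"{gstin}: unrecognized status {status!r}"


if __name__ == "__main__":
    sample = {
        "gstin": "07AAECD1686J1ZG",
        "status": "error",
        "error": "Error message",
    }
    print(summarize_result(sample))
```

Using `dict.get` with defaults keeps the summary from raising if a field is missing, which seems prudent given that the portal can return incomplete data.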

Actor Information

Developer
mikolabs
Pricing
Paid
Total Runs
67
Active Users
8
Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
