Algolia Website Indexer
by apify
The Indexer recursively crawls a website with the Puppeteer browser (headless Chrome) and indexes the selected pages into an Algolia index.
About Algolia Website Indexer
What does this actor do?
Algolia Website Indexer is an actor on the Apify platform. It recursively crawls a website in headless Chrome, extracts the text content matched by the CSS selectors you configure, and writes the resulting pages to an Algolia index.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
# Algolia Website Indexer

The Indexer crawls a website using the Puppeteer browser (headless Chrome) and indexes the selected pages into an Algolia index. It was designed to run as an Apify actor.

## Usage

You can find instructions on how to run it in the Apify cloud on its Apify Store page. If you want to run it in your own environment, you can use the Apify CLI.

## Input

The input of the actor is JSON with the following parameters.

| Field | Type | Description |
| ----- | ---- | ----------- |
| algoliaAppId | String | Your Algolia Application ID |
| algoliaApiKey | String | Your Algolia API key |
| algoliaIndexName | String | Your Algolia index name |
| crawlerName | String | Crawler name. Pages are added, updated, and removed in the index under this name, so one index can hold several websites. |
| startUrls | Array | URLs where the crawler starts crawling |
| selectors | Array | Selectors for the text content you want to index. The key is the name of the attribute and the value is the CSS selector. |
| waitForElement | String | Selector of an element to wait for on each page |
| additionalPageAttrs | Object | Additional attributes you want to attach to each record in the index |
| skipIndexUpdate | Boolean | Option to switch off updating the Algolia index |

### Advanced

A few parameters are not shown in the UI. They change the crawling behaviour, and you can set them through the API or in a local environment.

| Field | Type | Description |
| ----- | ---- | ----------- |
| pageFunction | String | Overrides the default pageFunction |
| pseudoUrls | Array | Overrides the default pseudoUrls |
| clickableElements | String | Overrides the default clickableElements |
| keepUrlFragment | Boolean | Option to switch on enqueueing URLs with URL fragments |
| omitSearchParamsFromUrl | Boolean | Option to switch off enqueueing URLs with search params |

## Debug indexed pages

You can find all the pages that will be indexed in the default dataset of a specific actor run.
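As an illustration of the input described in the documentation above, the following is a minimal sketch of how one might assemble the input JSON and check it against the documented field names and types before starting a run. The field names and types come from the input tables; everything else is an assumption made for the example: the helper `check_input` is hypothetical, all values are placeholders, and the exact shapes of `startUrls` (objects with a `url` key) and `selectors` (objects with `key` and `value`) are guesses, since the documentation only says both are arrays. The sketch does not contact Apify or Algolia.

```python
import json

# Field names and types copied from the input tables in the documentation.
# String -> str, Array -> list, Object -> dict, Boolean -> bool.
FIELD_TYPES = {
    "algoliaAppId": str,
    "algoliaApiKey": str,
    "algoliaIndexName": str,
    "crawlerName": str,
    "startUrls": list,
    "selectors": list,
    "waitForElement": str,
    "additionalPageAttrs": dict,
    "skipIndexUpdate": bool,
    # Advanced fields, not shown in the UI.
    "pageFunction": str,
    "pseudoUrls": list,
    "clickableElements": str,
    "keepUrlFragment": bool,
    "omitSearchParamsFromUrl": bool,
}


def check_input(actor_input: dict) -> list[str]:
    """Hypothetical helper: report unknown fields and wrongly typed values."""
    problems = []
    for name, value in actor_input.items():
        expected = FIELD_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown field: {name}")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Placeholder values only. The item shapes inside startUrls and selectors
# are assumptions; check the actor's input schema for the real ones.
example_input = {
    "algoliaAppId": "YOUR_APP_ID",
    "algoliaApiKey": "YOUR_API_KEY",
    "algoliaIndexName": "docs",
    "crawlerName": "example-site",
    "startUrls": [{"url": "https://example.com"}],
    "selectors": [
        {"key": "title", "value": "h1"},
        {"key": "content", "value": "article p"},
    ],
    "additionalPageAttrs": {"language": "en"},
    "skipIndexUpdate": True,  # dry run: crawl without updating the index
}

problems = check_input(example_input)
if problems:
    print("\n".join(problems))
else:
    print(json.dumps(example_input, indent=2))
```

Setting `skipIndexUpdate` to true on a first run, then inspecting the default dataset as described under "Debug indexed pages", is one way to review what would be indexed before anything is written to Algolia.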
Actor Information
- Developer: apify
- Pricing: Paid
- Total Runs: 15,209
- Active Users: 23