My Actor

Name: My Actor
Author: david15999

An open-source HTML scraper for developers. Use it as a reliable foundation to extract data from any website for research, monitoring, or building datasets.

766 runs

17 users

About My Actor

Need to pull clean data from a website? This open-source HTML scraper is the straightforward tool I keep coming back to. You give it a URL and some configuration, and it fetches the raw HTML for you to parse and extract exactly what you need. It copes with differing page structures because the extraction logic is yours to edit. Note that the page is fetched with a plain HTTP request, so content rendered client-side by JavaScript may not appear in the returned HTML. It suits developers who want a reliable, no-fuss foundation for their data projects without being locked into a specific data extraction service.

I've used it for everything from monitoring competitor prices and gathering research data to building datasets for machine learning. Because it's open-source, you can inspect the code, tweak it for your specific case, and contribute improvements. It runs on the Apify platform, which provides proxy rotation and request queues so you can focus on the data. If you're comfortable with a parser such as Cheerio or Beautiful Soup and need a dependable scraper to feed it, this actor is a good starting point.

What does this actor do?

My Actor is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation
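
As an illustration of the API access mentioned above, here is a minimal sketch of starting a run from a Node.js application through the Apify REST API. It assumes Node.js 18 or later (for the built-in fetch), an API token that you pass in yourself, and an Actor ID in the username~actor-name form. The helper names buildRunUrl and runActor are hypothetical, and none of these values come from this page.

```javascript
const API_BASE = 'https://api.apify.com/v2';

// Build the endpoint that runs an Actor and returns its dataset items
// in a single synchronous call.
function buildRunUrl(actorId) {
    return `${API_BASE}/acts/${encodeURIComponent(actorId)}/run-sync-get-dataset-items`;
}

// Start a run with the given input object and resolve with the scraped items.
// `actorId` and `token` are placeholders supplied by the caller.
async function runActor(actorId, input, token) {
    const response = await fetch(buildRunUrl(actorId), {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            Authorization: `Bearer ${token}`,
        },
        body: JSON.stringify(input),
    });
    if (!response.ok) {
        throw new Error(`Apify API responded with HTTP ${response.status}`);
    }
    return response.json();
}
```

A call such as runActor('<username>~<actor-name>', { url: 'https://example.com' }, token) would then resolve with the dataset items produced by that run.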

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

My Actor

A JavaScript (Node.js) template for scraping data from a single web page. You provide a URL via the input, and the actor fetches the page, parses it, and stores the extracted data in an Apify dataset. The template is pre-configured to extract page headings but is designed to be easily modified for any scraping task.

Key Features

Apify SDK: The core toolkit for building and running the actor.
Input Schema: A defined schema for validating the actor's input (primarily the target URL).
Structured Storage: Output is saved to an Apify Dataset for easy access and export.
Axios Client: Used for reliable HTTP requests to fetch page HTML.
Cheerio: A fast, jQuery-like library for parsing and extracting data from HTML.

Input / Output

Input: The actor expects an input object containing the url of the page to scrape, as defined by its input schema.

Output: The scraped data is stored as individual items in the actor's default dataset. The default template collects the page headings (h1 through h6) into an array and pushes them to the dataset; replace this with whatever fields your task needs.
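
To make the input contract concrete, here is a minimal sketch of a check you could run on the input object before fetching anything. The helper name parseInput and the restriction to http and https URLs are assumptions for illustration; the source only states that the input carries a url field.

```javascript
// Validate the actor input and return a normalized URL string.
// Hypothetical helper: not part of the template described above.
function parseInput(input) {
    if (!input || typeof input.url !== 'string') {
        throw new Error('Input must be an object with a string "url" field.');
    }

    let parsed;
    try {
        // The built-in URL class throws a TypeError on malformed URLs.
        parsed = new URL(input.url);
    } catch {
        throw new Error(`Not a valid URL: ${input.url}`);
    }

    // Assumption: only web pages served over HTTP(S) are scraped.
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        throw new Error(`Unsupported protocol: ${parsed.protocol}`);
    }
    return parsed.href;
}
```

Failing early with a clear message keeps a bad input from surfacing later as a less readable network error.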

How to Use

Basic Operation

Provide the target page URL in the actor's input.
Run the actor. It will:
- Fetch the page HTML using axios.get(url).
- Load the HTML into Cheerio for parsing (cheerio.load(response.data)).
- Execute the extraction logic (by default, selecting all heading elements).
- Save the results to the dataset via Actor.pushData().

Customization

The main scraping logic is in the Cheerio parsing step. To scrape different data, edit the selector and data extraction code. For example, the default code is:

$("h1, h2, h3, h4, h5, h6").each((_i, element) => {...});

Change the selector (e.g., $(".product-name")) and the extracted properties within the loop to match your target data.

Local Development

To modify the actor locally, use the Apify CLI to pull the source code:

Install the Apify CLI:
npm install -g apify-cli
or
brew install apify-cli
Pull the actor using its unique name or ID (found in the Apify Console):
apify pull <ActorId>

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Actor Information

Developer: david15999
Pricing: Paid
Total Runs: 766
Active Users: 17
