Scraper Results Checker

by drobnikj

Monitor your Apify datasets automatically. Get instant alerts if your scrapers or actors produce errors, so you never miss a data issue again.

22,764 runs
18 users
About Scraper Results Checker

Ever had a scraper fail silently and only noticed hours later, when you needed the data? I've been there, and it's frustrating. That's why I built Scraper Results Checker. It's a straightforward actor that monitors any Apify dataset, whether it comes from a scraper, a custom actor, or an integration, and alerts you when something goes wrong.

You trigger it from a webhook, so it checks your dataset automatically after each job completes. If it finds errors, items that fail your JSON schema, or fewer results than you expect, it sends you a notification. I use it to monitor my production scrapers because catching issues early saves me from downstream headaches. It suits developers who run automated data collection and need reliability without constant manual checking. Think of it as a watchdog for your datasets: simple, effective, and one less thing to worry about in your stack.

Documentation

Overview

This actor automatically checks the results of any Apify actor (or other process) that stores data in a dataset. It validates the output for errors and can send notifications based on the findings. It's designed to be triggered via webhook after a scraper run completes.

Key Features

  • Error Validation: Checks the run status and the dataset results for common issues, such as a run that did not end in SUCCEEDED status.
  • Webhook Integration: Primarily designed to run automatically from an actor/task webhook.
  • Flexible Checks: Supports validation against a JSON schema, comparison with a previous run, and minimum item count thresholds.
  • Notifications: Can send email alerts when errors are found (or when a run is successful).
  • Chained Actions: Can trigger other actors (e.g., send an email) automatically based on the check result (success or error).
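
To make the status and minimum-count checks concrete, here is a minimal sketch of how such checks could collect error strings. It is an illustration only, not the actor's source code: the function name collectErrors and the run.itemCount field are hypothetical, and the message wording is copied from the example output in this README.

```javascript
// Illustrative sketch only: NOT the actor's actual implementation.
// `collectErrors` and `run.itemCount` are hypothetical names.
function collectErrors(run, options = {}) {
  const errors = [];

  // A run that did not finish successfully is reported first.
  if (run.status !== 'SUCCEEDED') {
    errors.push(`Run is not in SUCCEEDED status, act status: ${run.status}`);
  }

  // The threshold check only applies when minOutputtedPages is set.
  const { minOutputtedPages } = options;
  if (typeof minOutputtedPages === 'number' && run.itemCount < minOutputtedPages) {
    errors.push(
      `Crawler returns only ${run.itemCount} outputted pages and minimum is ${minOutputtedPages}`,
    );
  }

  return errors;
}

const sampleErrors = collectErrors(
  { status: 'ABORTED', itemCount: 0 },
  { minOutputtedPages: 100 },
);
console.log(sampleErrors);
```

An empty array from this sketch corresponds to a passing check.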

How to Use

1. Actor/Task Webhook (Recommended)

Set up a webhook for your actor or task with the following URL and payload template.

Webhook URL:

https://api.apify.com/v2/acts/drobnikj~check-crawler-results/runs?token=APIFY_API_TOKEN

Payload Template:

{
  "actId": "{{resource.actId}}",
  "runId": "{{resource.id}}",
  "options": {
    "notifyTo": "your-email@example.com",
    "minOutputtedPages": 10
  }
}
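
If you manage several webhooks, it may help to generate the URL and payload template from one place. The sketch below does that with a hypothetical helper, buildCheckerWebhook. The property names requestUrl and payloadTemplate are my assumption about how you would store the two values; check the Apify webhook documentation before passing them to the API.

```javascript
// Hypothetical helper: builds the webhook URL and payload template shown above.
// The returned property names (requestUrl, payloadTemplate) are assumptions.
function buildCheckerWebhook(token, options) {
  const requestUrl =
    'https://api.apify.com/v2/acts/drobnikj~check-crawler-results/runs' +
    `?token=${encodeURIComponent(token)}`;

  // The {{resource.*}} placeholders are left as literal strings;
  // Apify fills them in when the webhook fires.
  const payloadTemplate = JSON.stringify(
    {
      actId: '{{resource.actId}}',
      runId: '{{resource.id}}',
      options,
    },
    null,
    2,
  );

  return { requestUrl, payloadTemplate };
}

const webhook = buildCheckerWebhook('APIFY_API_TOKEN', {
  notifyTo: 'your-email@example.com',
  minOutputtedPages: 10,
});
console.log(webhook.requestUrl);
console.log(webhook.payloadTemplate);
```

Replace APIFY_API_TOKEN with your own token, and keep the generated URL private because it contains that token.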

2. Direct API Call

You can also call it programmatically from another actor using Apify.call().

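const Apify = require('apify');

Apify.main(async () => {
  // Run the checker against a finished run of another actor.
  await Apify.call('drobnikj/check-crawler-results', {
    actId: 'TARGET_ACTOR_ID',
    runId: 'TARGET_RUN_ID',
    options: {
      minOutputtedPages: 1000, // report an error below 1,000 items
    },
  });
});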

Input

The actor requires an input object with the following fields:

  • actId (String): The ID of the actor whose results you want to check.
  • runId (String): The specific run ID of the actor to check.
  • datasetId (String, Optional): A specific dataset ID to check. If not provided, the actor uses the dataset from the specified run.
  • options (Object, Optional): Configuration for the check.
    • sampleCount (Number): Number of dataset items to sample (default: 100000).
    • minOutputtedPages (Number): Minimum number of items required; triggers an error if not met.
    • jsonSchema (Object): A JSON schema to validate all sampled results against.
    • compareWithPreviousExecution (Boolean): Compares results with the previous execution (legacy crawler only).
    • notifyTo (String): Email address to send error notifications to.
    • runActOnSuccess (Object): Defines an actor to run if no errors are found (e.g., a success notification). Format:
      { "id": "apify/send-mail", "input": { ... } }
    • runActOnError (Object): Defines an actor to run if errors are found (e.g., an alert notification). Uses the same format as runActOnSuccess.
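
As a rough illustration of how these fields fit together, the sketch below assembles an input object and rejects it early when actId or runId is missing. The helper buildCheckerInput is hypothetical, and the example jsonSchema assumes that the schema describes a single dataset item with url and title fields; both the field names and that per-item interpretation are assumptions, so adjust them to your own data.

```javascript
// Hypothetical helper: assembles an input object for the checker.
function buildCheckerInput({ actId, runId, datasetId, options = {} }) {
  if (!actId || !runId) {
    throw new Error('Both actId and runId are required');
  }
  const input = { actId, runId, options };
  // datasetId is optional; without it the run's own dataset is used.
  if (datasetId) input.datasetId = datasetId;
  return input;
}

const checkerInput = buildCheckerInput({
  actId: 'TARGET_ACTOR_ID',
  runId: 'TARGET_RUN_ID',
  options: {
    sampleCount: 500,
    minOutputtedPages: 100,
    // Assumed item shape: an object with string fields url and title.
    jsonSchema: {
      type: 'object',
      required: ['url', 'title'],
      properties: {
        url: { type: 'string' },
        title: { type: 'string' },
      },
    },
    notifyTo: 'your-email@example.com',
  },
});
console.log(JSON.stringify(checkerInput, null, 2));
```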

Output

The actor's output contains an errors array. If the array is empty, the check passed. Any validation failures are listed as strings in this array.

Example Output:

{
  "errors": [
    "Run is not in SUCCEEDED status, act status: ABORTED",
    "Crawler returns only 0 outputted pages and minimum is 100"
  ]
}
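
A caller can treat the output as a pass/fail signal by checking whether errors is empty. The sketch below shows one way to do that; summarizeCheck is a hypothetical helper, and the message format is my own choice.

```javascript
// Hypothetical helper: turns the checker output into a pass/fail summary.
function summarizeCheck(output) {
  // Treat a missing or malformed errors field as "no errors reported".
  const errors = Array.isArray(output.errors) ? output.errors : [];
  const passed = errors.length === 0;
  const message = passed
    ? 'Check passed'
    : `Check failed with ${errors.length} error(s):\n- ${errors.join('\n- ')}`;
  return { passed, message };
}

const exampleOutput = {
  errors: [
    'Run is not in SUCCEEDED status, act status: ABORTED',
    'Crawler returns only 0 outputted pages and minimum is 100',
  ],
};
const summary = summarizeCheck(exampleOutput);
console.log(summary.message);
```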

Actor Information

Developer: drobnikj
Pricing: Paid
Total Runs: 22,764
Active Users: 18

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
