🔥 Power Data Transformer

by wiseek

Clean, transform, and automate your scraped data. Use built-in tools or SQL to prepare datasets, then send them directly to n8n, Make, or Zapier.

2,257 runs
16 users

About 🔥 Power Data Transformer

So you've scraped a mountain of data, but now what? It's probably messy, full of duplicates, and in a dozen different formats. That's where this actor comes in. Think of it as your data workshop.

I use it to clean up scraped datasets by removing duplicates, fixing formatting issues, and filtering out the junk. You can merge data from different sources or split a huge file into manageable chunks. Need to check for valid email addresses or standardize phone numbers? The built-in transformations handle that. For more complex jobs, you can build SQL pipelines to shape the data exactly how you need it for your ETL or ELT processes.

Once your data is polished, the best part is sending it where it needs to go. I regularly pipe cleaned data directly into automation platforms like n8n, Make.com, and Zapier to trigger workflows, update databases, or populate reports. It saves me the headache of manual data wrangling and lets me focus on building things.
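To make the kind of cleanup described above concrete, here is a minimal Python sketch (not the actor's actual implementation, just an illustration of the idea) that deduplicates records by ID and drops rows whose email fails a basic format check:

```python
import re

# Toy scraped dataset: a duplicate row and a malformed email (illustrative only).
records = [
    {"id": 1, "email": "alice@example.com", "phone": "+1 555 0100"},
    {"id": 1, "email": "alice@example.com", "phone": "+1 555 0100"},  # duplicate
    {"id": 2, "email": "not-an-email", "phone": "+1 555 0101"},
    {"id": 3, "email": "bob@example.com", "phone": "+1 555 0102"},
]

# Rough format check, not full RFC 5322 validation.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def clean(rows, key="id"):
    """Drop rows with a duplicate key or an invalid-looking email."""
    seen, out = set(), []
    for row in rows:
        if row[key] in seen or not EMAIL_RE.match(row["email"]):
            continue
        seen.add(row[key])
        out.append(row)
    return out

cleaned = clean(records)  # keeps the rows with id 1 and id 3
```

The actor performs this kind of work declaratively through its transformation steps; the sketch just shows what "removing duplicates and filtering out the junk" means in practice.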

What does this actor do?

🔥 Power Data Transformer is a data transformation and automation tool available on the Apify platform. It's designed to help you clean, reshape, and route scraped data efficiently in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results

Documentation

Power Data Transformer

A unified data transformation tool for cleaning, merging, and enriching scraped datasets on the Apify platform. It automates post-scraping data processing through SQL queries and built-in operations.

Overview

The actor processes data via a pipeline defined as a Directed Acyclic Graph (DAG). You configure Sources, apply Transformations (SQL or built-in functions), and define Outputs. It's designed to integrate directly into your existing Apify actors or run as a standalone transformation job.

Key Features

  • Data Operations: Clean, standardize, merge, and deduplicate datasets.
  • Multi-Source Support: Process data from multiple Apify datasets or key-value stores in a single run.
  • Flexible Transformations: Use SQL (SELECT * FROM {{$0}}) or built-in steps (dedup, ref, merge).
  • Incremental Updates: Handle delta data by merging new records with existing results.
  • Integration-Ready: Designed for easy embedding within other Apify actors.
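The incremental-updates feature can be pictured as a key-based upsert: new (delta) records replace existing records that share the same key. Here is a toy Python sketch of that semantics, assuming a simple "newest wins" policy; the actor's internal merge logic may handle conflicts differently:

```python
def merge_incremental(existing, delta, key="id"):
    """Upsert: rows in delta replace existing rows with the same key."""
    by_key = {row[key]: row for row in existing}
    for row in delta:
        by_key[row[key]] = row  # new data wins on key collisions
    return list(by_key.values())

existing = [{"id": 1, "price": 10}, {"id": 2, "price": 20}]
delta = [{"id": 2, "price": 25}, {"id": 3, "price": 30}]
merged = merge_incremental(existing, delta)
# id 2 is updated to the delta value, id 3 is appended
```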

How to Use

A pipeline consists of three configurable parts in the actor's input: Sources, Transformations, and Outputs.

1. Referencing Data

Use $n to reference sources and #n to reference transformation results:

  • $0: Always refers to the main dataset selected in the Datasets UI field.
  • $1, $2: Refer to additional sources defined in the Sources input field (1-indexed).
  • #1, #2: Refer to the results of previous transformation steps (1-indexed).

Syntax Rule: In SQL queries, you must wrap references in double curly braces: SELECT * FROM {{$0}}. In built-in transformations, the braces are optional for the from parameter.
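Under the hood, references like {{$0}} presumably get substituted with concrete table names before the SQL runs. As a rough mental model (not the actor's actual resolver), a substitution step could look like this, where the mapping from references to table names is hypothetical:

```python
import re

# Hypothetical mapping from pipeline references to concrete table names.
tables = {"$0": "main_dataset", "$1": "lookup", "#1": "step_1_result"}

def resolve_refs(sql, mapping):
    """Replace {{$n}}, {{#n}}, or {{name}} placeholders with table names."""
    return re.sub(r"\{\{([$#]?\w+)\}\}", lambda m: mapping[m.group(1)], sql)

query = resolve_refs("SELECT * FROM {{$0}} JOIN {{#1}} USING (id)", tables)
# query: "SELECT * FROM main_dataset JOIN step_1_result USING (id)"
```

This also makes it clear why the braces are mandatory in SQL: without them, $0 would be left in the query text as-is.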

2. Basic Pipeline Example

This input selects 10 records from a source dataset and saves the result.

```yaml
sources:
  - name: my_source
    value: $0  # Your selected dataset

transformations:
  - name: get_sample
    sql: SELECT * FROM {{my_source}} LIMIT 10

outputs:
  - dataset:   # Saves to the run's default output dataset
```
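The actor executes this SQL against its own engine; purely to illustrate what the example query does, here is the equivalent run locally with SQLite, using a made-up 25-row table in place of the source dataset:

```python
import sqlite3

# Simulate the pipeline: load a source table, then run the transformation
# query. The table name stands in for the {{my_source}} reference.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_source (id INTEGER, url TEXT)")
conn.executemany(
    "INSERT INTO my_source VALUES (?, ?)",
    [(i, f"https://example.com/{i}") for i in range(25)],
)

rows = conn.execute("SELECT * FROM my_source LIMIT 10").fetchall()
# rows now holds the first 10 of the 25 records
```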

3. Built-in Transformations

Use these as steps in your transformations list.

  • Deduplicate (dedup): Removes duplicate rows.

    ```yaml
    dedup:
      from: '#1'           # Input from previous step
      keys: ['id', 'url']  # Columns to check for duplicates
    ```

  • Reference Mapping (ref): Enriches data using a lookup table (e.g., mapping country codes to names).

    ```yaml
    ref:
      from: '#1'               # Input from previous step
      ref_table: '$2'          # Your reference dataset
      keys: ['country_code']   # Column in main data
      ref_keys: ['code']       # Matching column in reference table
      fields: ['country_name'] # Column(s) to pull in
    ```

  • Merge (merge): Combines two datasets.

    ```yaml
    merge:
      from: '#1'
      other: '$2'
      how: inner    # inner, left, right, or outer
      keys: ['id']  # Merge on this column
    ```

Input/Output

Input Configuration

Configure the actor via the input schema (UI) or a JSON object. The core structure is:

```
{
  "sources": [...],         // Optional additional data sources
  "transformations": [...], // List of SQL or built-in steps
  "outputs": [...]          // Where to save results
}
```
  • Primary Source: Select your main input dataset via the Datasets field in the Apify console (referenced as $0).
  • Additional Sources: Define in the sources input using resource IDs (e.g., { "name": "lookup_table", "value": "~kE1s2Rc..." }).
  • Transformations: A list of steps executed in order. Use sql for queries or keys like dedup, ref, merge.
  • Outputs: Typically save to a dataset (dataset: "") or a key-value store.
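Putting the pieces together, a full run input can be built as a plain JSON object. The sketch below assembles one in Python; the field names follow the summary above, and the truncated resource ID is a placeholder from the docs, not a real dataset. Check the actor's input reference before relying on the exact schema:

```python
import json

# Hypothetical run input mirroring the structure described above.
run_input = {
    "sources": [{"name": "lookup_table", "value": "~kE1s2Rc..."}],  # placeholder ID
    "transformations": [
        {"name": "get_sample", "sql": "SELECT * FROM {{$0}} LIMIT 10"},
        {"dedup": {"from": "#1", "keys": ["id"]}},
    ],
    "outputs": [{"dataset": ""}],  # save to the run's default dataset
}

payload = json.dumps(run_input)  # ready to paste into the actor's JSON input
```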

Output

Results are saved to the destinations defined in outputs. The default is the actor's run dataset, accessible via the Apify dataset API or console.
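Reading the results programmatically comes down to calling the Apify dataset items endpoint. A small helper for building that URL might look like this (the dataset ID is a placeholder; consult the Apify API docs for the full set of query parameters):

```python
def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the Apify API URL for reading a dataset's items."""
    url = f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"
    if limit is not None:
        url += f"&limit={limit}"
    return url

# "DATASET_ID" is a placeholder for your run's dataset ID.
url = dataset_items_url("DATASET_ID", limit=100)
```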

For detailed field specifications and advanced examples, refer to the full input reference.

Common Use Cases

  • Market Research: Gather competitive intelligence and market data
  • Lead Generation: Extract contact information for sales outreach
  • Price Monitoring: Track competitor pricing and product changes
  • Content Aggregation: Collect and organize content from multiple sources

Ready to Get Started?

Try 🔥 Power Data Transformer now on Apify. Free tier available with no credit card required.

Actor Information

Developer
wiseek
Pricing
Paid
Total Runs
2,257
Active Users
16
Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

Learn more about Apify
