🔥 Power Data Transformer
by wiseek
Clean, transform, and automate your scraped data. Use built-in tools or SQL to prepare datasets, then send them directly to n8n, Make, or Zapier.
About 🔥 Power Data Transformer
So you've scraped a mountain of data, but now what? It's probably messy, full of duplicates, and in a dozen different formats. That's where this actor comes in. Think of it as your data workshop.

I use it to clean up scraped datasets by removing duplicates, fixing formatting issues, and filtering out the junk. You can merge data from different sources or split a huge file into manageable chunks. Need to check for valid email addresses or standardize phone numbers? The built-in transformations handle that. For more complex jobs, you can build SQL pipelines to really shape the data exactly how you need it for your ETL or ELT processes.

Once your data is polished, the best part is sending it where it needs to go. I regularly pipe cleaned data directly into automation platforms like n8n, Make.com, and Zapier to trigger workflows, update databases, or populate reports. It saves me the headache of manual data wrangling and lets me focus on building things.
What does this actor do?
🔥 Power Data Transformer is a data transformation and automation tool available on the Apify platform. It's designed to help you clean, reshape, and route scraped data efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Power Data Transformer
A unified data transformation tool for cleaning, merging, and enriching scraped datasets on the Apify platform. It automates post-scraping data processing through SQL queries and built-in operations.
Overview
The actor processes data via a pipeline defined as a Directed Acyclic Graph (DAG). You configure Sources, apply Transformations (SQL or built-in functions), and define Outputs. It's designed to integrate directly into your existing Apify actors or run as a standalone transformation job.
Key Features
- Data Operations: Clean, standardize, merge, and deduplicate datasets.
- Multi-Source Support: Process data from multiple Apify datasets or key-value stores in a single run.
- Flexible Transformations: Use SQL (`SELECT * FROM {{$0}}`) or built-in steps (`dedup`, `ref`, `merge`).
- Incremental Updates: Handle delta data by merging new records with existing results.
- Integration Ready: Designed for easy embedding within other Apify actors.
How to Use
A pipeline consists of three configurable parts in the actor's input: Sources, Transformations, and Outputs.
1. Referencing Data
Use `$n` to reference sources and `#n` to reference transformation results.
* `$0`: Always refers to the main dataset selected in the Datasets UI field.
* `$1`, `$2`: Refer to additional sources defined in the Sources input field (1-indexed).
* `#1`, `#2`: Refer to the results of previous transformation steps (1-indexed).
Syntax Rule: In SQL queries, you must wrap references in double curly braces: `SELECT * FROM {{$0}}`. In built-in transformations, the braces are optional for the `from` parameter.
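To make the reference syntax concrete, here is a minimal sketch of how `{{$n}}` / `{{#n}}` tokens could be resolved into real table names before a query runs. This is purely illustrative and is not the actor's actual implementation; the function name and the mapping dictionaries are assumptions.

```python
import re

def resolve_refs(sql: str, sources: dict, results: dict) -> str:
    """Replace {{$n}} / {{#n}} tokens with concrete table names.

    `sources` maps source index -> table name ($n references);
    `results` maps step index -> table name (#n references).
    Hypothetical helper; mirrors the syntax described above only.
    """
    def repl(match):
        kind, idx = match.group(1), int(match.group(2))
        return sources[idx] if kind == "$" else results[idx]

    return re.sub(r"\{\{([$#])(\d+)\}\}", repl, sql)

sql = "SELECT * FROM {{$0}} JOIN {{#1}} USING (id)"
print(resolve_refs(sql, {0: "main_dataset"}, {1: "step_1"}))
```

The same substitution idea explains why braces are optional in built-in steps: there the `from` value is a single reference, so no templating inside a larger string is needed.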
2. Basic Pipeline Example
This input selects 10 records from a source dataset and saves the result.
```yaml
sources:
  - name: my_source
    value: $0                 # Your selected dataset
transformations:
  - name: get_sample
    sql: SELECT * FROM {{my_source}} LIMIT 10
outputs:
  - dataset: ""               # Saves to the run's default output dataset
```
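Steps can also be chained by referencing earlier results with `#n`. The sketch below shows a hypothetical two-step pipeline; the column names (`email`) and the exact nesting of the `dedup` step are assumptions based on the examples in this document, not a verified schema.

```yaml
sources:
  - name: lookup
    value: $1                  # additional source from the Sources field
transformations:
  - name: filter_valid         # step #1
    sql: SELECT * FROM {{$0}} WHERE email IS NOT NULL
  - name: dedup_by_email       # step #2, consumes step #1
    dedup:
      from: '#1'
      keys: ['email']
outputs:
  - dataset: ""                # default run dataset
```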
3. Built-in Transformations
Use these as steps in your transformations list.
- Deduplicate (`dedup`): Removes duplicate rows.

```yaml
dedup:
  from: '#1'                  # Input from previous step
  keys: ['id', 'url']         # Columns to check for duplicates
```

- Reference Mapping (`ref`): Enriches data using a lookup table (e.g., mapping country codes to names).

```yaml
ref:
  from: '#1'
  ref_table: '$2'             # Your reference dataset
  keys: ['country_code']      # Column in main data
  ref_keys: ['code']          # Matching column in reference table
  fields: ['country_name']    # Column(s) to pull in
```

- Merge (`merge`): Combines two datasets.

```yaml
merge:
  from: '#1'
  other: '$2'
  how: inner                  # inner, left, right, or outer
  keys: ['id']                # Merge on this column
```
Input/Output
Input Configuration
Configure the actor via the input schema (UI) or a JSON object. The core structure is:
```
{
  "sources": [...],          // Optional additional data sources
  "transformations": [...],  // List of SQL or built-in steps
  "outputs": [...]           // Where to save results
}
```
- Primary Source: Select your main input dataset via the Datasets field in the Apify console (referenced as `$0`).
- Additional Sources: Define in the `sources` input using resource IDs (e.g., `{ "name": "lookup_table", "value": "~kE1s2Rc..." }`).
- Transformations: A list of steps executed in order. Use `sql` for queries or keys like `dedup`, `ref`, `merge`.
- Outputs: Typically save to a dataset (`dataset: ""`) or a key-value store.
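Putting the pieces together, a complete JSON input might look like the sketch below. The resource ID is a placeholder, and the exact shape of the built-in step objects is inferred from the examples above rather than taken from the official schema, so treat this as a starting point.

```json
{
  "sources": [
    { "name": "lookup_table", "value": "<reference-dataset-id>" }
  ],
  "transformations": [
    { "name": "sample", "sql": "SELECT * FROM {{$0}} LIMIT 100" },
    { "dedup": { "from": "#1", "keys": ["id"] } }
  ],
  "outputs": [
    { "dataset": "" }
  ]
}
```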
Output
Results are saved to the destinations defined in outputs. The default is the actor's run dataset, accessible via the Apify dataset API or console.
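Once a run finishes, the default dataset can be read through Apify's standard dataset-items endpoint (`GET https://api.apify.com/v2/datasets/{datasetId}/items`). The helper below just builds such a URL; the dataset ID is a placeholder, and `format`, `clean`, and `limit` are standard query parameters of that endpoint.

```python
from urllib.parse import urlencode

def dataset_items_url(dataset_id: str, fmt: str = "json", limit: int = 1000) -> str:
    """Build the Apify dataset-items API URL for a run's dataset.

    Hypothetical helper for illustration; <dataset-id> must be replaced
    with the real ID shown in the actor run's Storage tab.
    """
    base = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    query = urlencode({"format": fmt, "clean": "true", "limit": limit})
    return f"{base}?{query}"

print(dataset_items_url("<dataset-id>"))
```

The same URL works with `format=csv` if you want to feed the cleaned data straight into a spreadsheet instead of an automation platform.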
For detailed field specifications and advanced examples, refer to the full input reference.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try 🔥 Power Data Transformer now on Apify. Free tier available with no credit card required.
Actor Information
- Developer
- wiseek
- Pricing
- Paid
- Total Runs
- 2,257
- Active Users
- 16
Related Actors
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq
Linkedin Profile Details Scraper + EMAIL (No Cookies Required)
by apimaestro
Twitter (X.com) Scraper Unlimited: No Limits
by apidojo
Content Checker
by jakubbalada