🔥 Power Data Transformer
by wiseek
Clean, transform, and automate your scraped data. Use built-in tools or SQL to prepare datasets, then send them directly to n8n, Make, or Zapier.
About 🔥 Power Data Transformer
So you've scraped a mountain of data, but now what? It's probably messy, full of duplicates, and in a dozen different formats. That's where this actor comes in. Think of it as your data workshop.

I use it to clean up scraped datasets by removing duplicates, fixing formatting issues, and filtering out the junk. You can merge data from different sources or split a huge file into manageable chunks. Need to check for valid email addresses or standardize phone numbers? The built-in transformations handle that. For more complex jobs, you can build SQL pipelines to really shape the data exactly how you need it for your ETL or ELT processes.

Once your data is polished, the best part is sending it where it needs to go. I regularly pipe cleaned data directly into automation platforms like n8n, Make.com, and Zapier to trigger workflows, update databases, or populate reports. It saves me the headache of manual data wrangling and lets me focus on building things.
What does this actor do?
🔥 Power Data Transformer is a data transformation and automation tool available on the Apify platform. It's designed to help you clean, reshape, and route scraped data efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
Documentation
Power Data Transformer
A unified data transformation tool for cleaning, merging, and enriching scraped datasets on the Apify platform. It automates post-scraping data processing through SQL queries and built-in operations.
Overview
The actor processes data via a pipeline defined as a Directed Acyclic Graph (DAG). You configure Sources, apply Transformations (SQL or built-in functions), and define Outputs. It's designed to integrate directly into your existing Apify actors or run as a standalone transformation job.
Key Features
- Data Operations: Clean, standardize, merge, and deduplicate datasets.
- Multi-Source Support: Process data from multiple Apify datasets or key-value stores in a single run.
- Flexible Transformations: Use SQL (`SELECT * FROM {{$0}}`) or built-in steps (`dedup`, `ref`, `merge`).
- Incremental Updates: Handle delta data by merging new records with existing results.
- Integration Ready: Designed for easy embedding within other Apify actors.
How to Use
A pipeline consists of three configurable parts in the actor's input: Sources, Transformations, and Outputs.
1. Referencing Data
Use `$n` to reference sources and `#n` to reference transformation results.
* `$0`: Always refers to the main dataset selected in the Datasets UI field.
* `$1`, `$2`: Refer to additional sources defined in the Sources input field (1-indexed).
* `#1`, `#2`: Refer to the results of previous transformation steps (1-indexed).
Syntax Rule: In SQL queries, you must wrap references in double curly braces: `SELECT * FROM {{$0}}`. In built-in transformations, the braces are optional for the `from` parameter.
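To make the reference syntax concrete, here is a minimal sketch of how `{{$n}}` / `{{#n}}` tokens could be resolved into real table names before a query runs. This is purely illustrative and is not the actor's actual implementation; the function name and the mapping dictionaries are assumptions.

```python
import re

def resolve_refs(sql: str, sources: dict, results: dict) -> str:
    """Replace {{$n}} / {{#n}} tokens with concrete table names.

    `sources` maps source index -> table name ($n references);
    `results` maps step index -> table name (#n references).
    Hypothetical helper; mirrors the syntax described above only.
    """
    def repl(match):
        kind, idx = match.group(1), int(match.group(2))
        return sources[idx] if kind == "$" else results[idx]

    return re.sub(r"\{\{([$#])(\d+)\}\}", repl, sql)

sql = "SELECT * FROM {{$0}} JOIN {{#1}} USING (id)"
print(resolve_refs(sql, {0: "main_dataset"}, {1: "step_1"}))
```

The same substitution idea explains why braces are optional in built-in steps: there the `from` value is a single reference, so no templating inside a larger string is needed.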
2. Basic Pipeline Example
This input selects 10 records from a source dataset and saves the result.
```yaml
sources:
  - name: my_source
    value: $0                 # Your selected dataset
transformations:
  - name: get_sample
    sql: SELECT * FROM {{my_source}} LIMIT 10
outputs:
  - dataset: ""               # Saves to the run's default output dataset
```
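Steps can also be chained by referencing earlier results with `#n`. The sketch below shows a hypothetical two-step pipeline; the column names (`email`) and the exact nesting of the `dedup` step are assumptions based on the examples in this document, not a verified schema.

```yaml
sources:
  - name: lookup
    value: $1                  # additional source from the Sources field
transformations:
  - name: filter_valid         # step #1
    sql: SELECT * FROM {{$0}} WHERE email IS NOT NULL
  - name: dedup_by_email       # step #2, consumes step #1
    dedup:
      from: '#1'
      keys: ['email']
outputs:
  - dataset: ""                # default run dataset
```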
3. Built-in Transformations
Use these as steps in your transformations list.
- Deduplicate (`dedup`): Removes duplicate rows.

```yaml
dedup:
  from: '#1'                  # Input from previous step
  keys: ['id', 'url']         # Columns to check for duplicates
```

- Reference Mapping (`ref`): Enriches data using a lookup table (e.g., mapping country codes to names).

```yaml
ref:
  from: '#1'
  ref_table: '$2'             # Your reference dataset
  keys: ['country_code']      # Column in main data
  ref_keys: ['code']          # Matching column in reference table
  fields: ['country_name']    # Column(s) to pull in
```

- Merge (`merge`): Combines two datasets.

```yaml
merge:
  from: '#1'
  other: '$2'
  how: inner                  # inner, left, right, or outer
  keys: ['id']                # Merge on this column
```
Input/Output
Input Configuration
Configure the actor via the input schema (UI) or a JSON object. The core structure is:
```
{
  "sources": [...],          // Optional additional data sources
  "transformations": [...],  // List of SQL or built-in steps
  "outputs": [...]           // Where to save results
}
```
- Primary Source: Select your main input dataset via the Datasets field in the Apify console (referenced as `$0`).
- Additional Sources: Define in the `sources` input using resource IDs (e.g., `{ "name": "lookup_table", "value": "~kE1s2Rc..." }`).
- Transformations: A list of steps executed in order. Use `sql` for queries or keys like `dedup`, `ref`, `merge`.
- Outputs: Typically save to a dataset (`dataset: ""`) or a key-value store.
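Putting the pieces together, a complete JSON input might look like the sketch below. The resource ID is a placeholder, and the exact shape of the built-in step objects is inferred from the examples above rather than taken from the official schema, so treat this as a starting point.

```json
{
  "sources": [
    { "name": "lookup_table", "value": "<reference-dataset-id>" }
  ],
  "transformations": [
    { "name": "sample", "sql": "SELECT * FROM {{$0}} LIMIT 100" },
    { "dedup": { "from": "#1", "keys": ["id"] } }
  ],
  "outputs": [
    { "dataset": "" }
  ]
}
```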
Output
Results are saved to the destinations defined in outputs. The default is the actor's run dataset, accessible via the Apify dataset API or console.
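Once a run finishes, the default dataset can be read through Apify's standard dataset-items endpoint (`GET https://api.apify.com/v2/datasets/{datasetId}/items`). The helper below just builds such a URL; the dataset ID is a placeholder, and `format`, `clean`, and `limit` are standard query parameters of that endpoint.

```python
from urllib.parse import urlencode

def dataset_items_url(dataset_id: str, fmt: str = "json", limit: int = 1000) -> str:
    """Build the Apify dataset-items API URL for a run's dataset.

    Hypothetical helper for illustration; <dataset-id> must be replaced
    with the real ID shown in the actor run's Storage tab.
    """
    base = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    query = urlencode({"format": fmt, "clean": "true", "limit": limit})
    return f"{base}?{query}"

print(dataset_items_url("<dataset-id>"))
```

The same URL works with `format=csv` if you want to feed the cleaned data straight into a spreadsheet instead of an automation platform.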
For detailed field specifications and advanced examples, refer to the full input reference.
Common Use Cases
Market Research
Gather competitive intelligence and market data
Lead Generation
Extract contact information for sales outreach
Price Monitoring
Track competitor pricing and product changes
Content Aggregation
Collect and organize content from multiple sources
Ready to Get Started?
Try 🔥 Power Data Transformer now on Apify. Free tier available with no credit card required.
Actor Information
- Developer
- wiseek
- Pricing
- Paid
- Total Runs
- 2,257
- Active Users
- 16
Related Actors
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq
Linkedin Profile Details Scraper + EMAIL (No Cookies Required)
by apimaestro
Twitter (X.com) Scraper Unlimited: No Limits
by apidojo
Content Checker
by jakubbalada