Save To S3

by drinksight

Designed to be run from an ACTOR.RUN.SUCCEEDED webhook, this actor downloads a task run's default dataset and saves it to an S3 bucket.

About Save To S3

What does this actor do?

Save To S3 is an integration actor on the Apify platform. Triggered by an ACTOR.RUN.SUCCEEDED webhook, it downloads the default dataset of the run that fired the webhook and uploads it to an Amazon S3 bucket you specify.

Key Features

  • Cloud-based execution - no local setup required
  • Triggered automatically by an ACTOR.RUN.SUCCEEDED webhook
  • Configurable dataset format, filtering, and S3 object key naming
  • Reusable task input for shared config such as AWS credentials
  • API access for integration with your applications

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Create a task for the actor with your AWS credentials and dataset format options as its input
  4. Add an ACTOR.RUN.SUCCEEDED webhook to the actor or task that produces your data, pointing at the new task's Run Task endpoint

Documentation

save-to-s3

An Apify actor to save the default dataset of a run to an S3 bucket. It is designed to be called from the ACTOR.RUN.SUCCEEDED webhook of the actor that generated the dataset. This actor is compatible with API v2 - I made it because I couldn't get the Crawler Results To S3 actor to work with v2 actors.

## Usage

AWS credentials and options for formatting the dataset are set on this actor's input, which is merged with the webhook's POST data. You'll therefore need to create a task for your uploads so you can save common config such as your AWS credentials and dataset format details.

### 1. Create the task

Create a new task using the save-to-s3 actor. This allows you to specify input to use every time the task is run. The webhook's POST data will be merged with this at runtime - the values are those from the get actor run API endpoint, all grouped under a `resource` property.

The properties you can specify in your input for the task:

| Property | Description |
| --------------- | ----------- |
| accessKeyId | The access key for the AWS user to connect with |
| secretAccessKey | The secret access key for the AWS user to connect with |
| region | The AWS region your bucket is located in (e.g. eu-west-2) |
| bucket | The bucket name to save files to |
| objectKeyFormat | A string specifying the key (i.e. filename) of the S3 object to save. You can reference any property of the input object using dot notation, in a syntax similar to JavaScript template literals. For example, the default value `${resource.id}_${resource.startedAt}.${format}` yields an S3 object named something like `SBNgQGmp87LtspHF1_2019-05-15T07:25:00.414Z.json`. |
| format | Maps to the format parameter of the get dataset items API endpoint and accepts any of its valid string values |
| clean | Maps to the clean parameter of the get dataset items API endpoint |
| datasetOptions | An object that allows you to specify any of the other parameters of the get dataset items API endpoint; for example, `{ "offset": "10" }` is the equivalent of setting `?offset=10` in the API call |
| debugLog | A boolean indicating whether to use debug-level logging |

### 2. Create the webhook

Go to your save-to-s3 task's API tab and copy the URL for the Run Task endpoint, which will be in the format:

https://api.apify.com/v2/actor-tasks/TASK_NAME_HERE/runs?token=YOUR_TOKEN_HERE

Go to either the actor or (more likely) the actor task you want to add save-to-s3 functionality to. In the Webhooks tab, add a webhook with the URL you just copied. For Event types, select ACTOR.RUN.SUCCEEDED. Then Save.

## Security

Because you store your AWS user's key and secret as part of this actor's input, it is strongly recommended that you create an AWS IAM user specifically for Apify and grant it access only to the specific bucket you are using. An example policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:ListAllMyBuckets"],
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::YOUR-BUCKET", "arn:aws:s3:::YOUR-BUCKET/*"]
    }
  ]
}
```
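For orientation, a complete task input combining the properties documented above might look like the following; the key, secret, region, and bucket values are placeholders to replace with your own.

```json
{
  "accessKeyId": "YOUR_ACCESS_KEY_ID",
  "secretAccessKey": "YOUR_SECRET_ACCESS_KEY",
  "region": "eu-west-2",
  "bucket": "YOUR-BUCKET",
  "objectKeyFormat": "${resource.id}_${resource.startedAt}.${format}",
  "format": "json",
  "clean": true,
  "datasetOptions": { "offset": "10" },
  "debugLog": false
}
```

At runtime the webhook's POST data is merged into this object under the `resource` property, which is what the `${resource.*}` placeholders resolve against.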
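The dot-notation templating used by objectKeyFormat can be pictured with a short sketch. This is not the actor's actual source; `resolveObjectKey` is a hypothetical helper that merely illustrates how placeholders could be expanded against the merged input.

```javascript
// Hypothetical sketch of the documented templating behaviour - not the actor's real code.
// Expands ${a.b.c}-style placeholders by walking the merged input with dot notation.
function resolveObjectKey(objectKeyFormat, input) {
  return objectKeyFormat.replace(/\$\{([^}]+)\}/g, (_, path) =>
    path.split('.').reduce((obj, key) => (obj == null ? '' : obj[key]), input)
  );
}

// Values shaped like the task input merged with the webhook's POST data:
const input = {
  format: 'json',
  resource: { id: 'SBNgQGmp87LtspHF1', startedAt: '2019-05-15T07:25:00.414Z' },
};

console.log(resolveObjectKey('${resource.id}_${resource.startedAt}.${format}', input));
// -> SBNgQGmp87LtspHF1_2019-05-15T07:25:00.414Z.json
```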
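If you prefer to wire up step 2 programmatically rather than in the Console UI, something along these lines should work using the apify-client package; the task IDs and token are placeholders, and this is a sketch rather than part of the actor's own documentation.

```javascript
// Sketch: create the ACTOR.RUN.SUCCEEDED webhook via the Apify API
// using the official apify-client package. IDs and token are placeholders.
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_TOKEN_HERE' });

async function addWebhook() {
  await client.webhooks().create({
    eventTypes: ['ACTOR.RUN.SUCCEEDED'],
    // The actor task whose runs should trigger the upload:
    condition: { actorTaskId: 'YOUR_SOURCE_TASK_ID' },
    // The Run Task endpoint of your save-to-s3 task:
    requestUrl:
      'https://api.apify.com/v2/actor-tasks/YOUR_SAVE_TO_S3_TASK_ID/runs?token=YOUR_TOKEN_HERE',
  });
}

addWebhook().catch(console.error);
```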

Common Use Cases

Market Research

Deliver competitive intelligence and market data straight to your own S3 storage

Lead Generation

Ship extracted contact lists to S3 for your sales outreach pipeline

Price Monitoring

Archive competitor pricing and product snapshots in S3 after every run

Content Aggregation

Collect and organize content from multiple sources into a single bucket

Ready to Get Started?

Try Save To S3 now on Apify. Free tier available with no credit card required.

Start Free Trial

Actor Information

Developer: drinksight
Pricing: Paid
Total Runs: 586,598
Active Users: 98

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

Learn more about Apify
