How to configure an Amazon S3 bucket to allow ScrapeHero to write data and images

This guide assumes that you already have an Amazon Web Services (AWS) account. If you do not, create one at https://aws.amazon.com/

At the end of this exercise, we will need two things from you, sent by email or in reply to your ticket:

  1. The Name of the Bucket that you create in Step 1
  2. The CSV file containing the Access Key and the Secret Access Key that you downloaded from AWS in Step 3

STEP 1: Create a new Bucket

A bucket is like a folder that we can write files to and read files from, similar to a Windows or FTP folder.

Go to the S3 area in AWS

https://s3.console.aws.amazon.com/s3/home

Click the Create bucket button

Enter a name for the bucket. The rest of this article uses the name scrapehero for the bucket, so please use that name if it is available.

NOTE: S3 bucket names must be unique across all AWS accounts, so scrapehero may already be taken. If you use a different name, you will need to use that same name in the sections below.

Keep clicking the NEXT button, and DO NOT change or add any values on the subsequent screens, until you see the Create bucket button on the bottom right. Then click the Create bucket button.

Create a folder (or, optionally, several folders) to organize the data.

We need AT LEAST one folder in this bucket. Let’s call it data.

Click Save to create the folder

Please copy the NAME of this BUCKET and email it to us (or tell us that you used the default name, scrapehero).
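
If you would rather script Step 1 than click through the console, the following is a minimal sketch using the boto3 library. It is not part of ScrapeHero's official procedure. The function name create_bucket_with_folder and the default region are our own assumptions for illustration; the bucket name scrapehero and the folder name data come from the steps above.

```python
def create_bucket_with_folder(s3_client, bucket_name="scrapehero",
                              folder="data", region="us-east-1"):
    """Create the bucket, then an empty 'folder' placeholder inside it.

    s3_client is expected to be a boto3 S3 client (or any object that
    offers the same create_bucket / put_object methods).
    """
    params = {"Bucket": bucket_name}
    # Outside us-east-1, S3 requires the region as a LocationConstraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**params)

    # S3 has no real folders: the console displays a zero-byte object
    # whose key ends in "/" as a folder.
    folder_key = folder.rstrip("/") + "/"
    s3_client.put_object(Bucket=bucket_name, Key=folder_key)
    return folder_key


# Hypothetical usage (requires "pip install boto3" and your own AWS login):
#   import boto3
#   client = boto3.client("s3", region_name="us-east-1")
#   create_bucket_with_folder(client)
```

The client is passed in as an argument so the function can be tried against a stand-in object before it touches a real AWS account.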


STEP 2: Create a new Policy

This policy restricts ScrapeHero's access to just this one S3 bucket (not the rest of your S3 storage).

Go to IAM – Policies

https://console.aws.amazon.com/iam/home#/policies

Click the Create Policy Button

Select the Create your own policy button

Give the policy a name such as ScrapeHeroS3AccessPolicy, and type anything you like in the Description field.

Then, in the Policy Document field, copy and paste the following.

NOTE: If you changed the bucket name in Step 1 to anything other than scrapehero, replace scrapehero in both Resource lines of the policy below with the name you used.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::scrapehero",
                "arn:aws:s3:::scrapehero/*"
            ]
        }
    ]
}

Click Validate Policy and, if there are no errors, click the Create Policy button.
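
If you used a different bucket name, it is easy to change one Resource line and miss the other. As a hedged illustration, the short Python sketch below generates the same policy document as above for any bucket name, using only the standard library. The function names build_bucket_policy and policy_json are hypothetical and chosen only for this example.

```python
import json


def build_bucket_policy(bucket_name="scrapehero"):
    """Return the Step 2 policy as a dict, with bucket_name in both Resource lines."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:*"],
                "Resource": [
                    # The bucket itself (needed for listing its contents).
                    f"arn:aws:s3:::{bucket_name}",
                    # Every object inside the bucket.
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


def policy_json(bucket_name="scrapehero"):
    """Return the policy as text, ready to paste into the Policy Document field."""
    return json.dumps(build_bucket_policy(bucket_name), indent=4)
```

Calling policy_json with your own bucket name returns text that you can paste into the Policy Document field in place of the policy shown above.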


STEP 3: Create a new User

For a secure configuration, you will need to create a new user for ScrapeHero (named scrapehero in this guide) with its own access key and secret key, using the IAM service in the AWS console. You will then need to give that user the permissions it needs by attaching the policy you created in Step 2.

Go to the Identity and Access Management (IAM) console in Amazon AWS – Users – Add user

https://console.aws.amazon.com/iam/home#/users$new

Create a user named scrapehero and click the Programmatic Access check box before clicking Next: Permissions

Now click the Attach existing policy directly option on the next screen

Type in ScrapeHeroS3 to filter the policy list and click the CHECKBOX next to it. (DO NOT CLICK ON THE POLICY LINK ITSELF)

Scroll to the bottom of the page and click Next: Review

On the next page click the Create User button

The page that follows is YOUR ONLY CHANCE TO SEE THE CREDENTIALS and DOWNLOAD them to send to us.

Click the Download CSV button

Download this file and email it to us.
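
Before emailing the file, you may want to confirm that the new keys can actually write to the bucket. The following is a rough sketch, not an official ScrapeHero tool. It assumes the downloaded CSV has a header row with columns named "Access key ID" and "Secret access key" (check your own file, since AWS has changed this layout over time), and that the bucket and folder are named scrapehero and data as above. The function names and the test file name are hypothetical.

```python
import csv
import io


def read_access_keys(csv_text):
    """Return (access_key_id, secret_access_key) from the CSV file's text.

    Assumption: the first row is a header containing the columns
    "Access key ID" and "Secret access key".
    """
    # Drop a byte-order mark if the file was saved with one.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")))
    row = next(reader)
    # Normalise header spelling so small case/spacing differences do not matter.
    cleaned = {k.strip().lower(): (v or "").strip() for k, v in row.items() if k}
    return cleaned["access key id"], cleaned["secret access key"]


def check_write_access(s3_client, bucket_name="scrapehero", folder="data"):
    """Upload a tiny test file to the folder, then delete it again."""
    key = f"{folder.strip('/')}/scrapehero-write-test.txt"
    s3_client.put_object(Bucket=bucket_name, Key=key, Body=b"write test")
    s3_client.delete_object(Bucket=bucket_name, Key=key)
    return key


# Hypothetical usage (requires "pip install boto3"; the file name is an example):
#   import boto3
#   with open("credentials.csv", encoding="utf-8-sig") as f:
#       key_id, secret = read_access_keys(f.read())
#   client = boto3.client("s3", aws_access_key_id=key_id,
#                         aws_secret_access_key=secret)
#   check_write_access(client)
```

If check_write_access finishes without raising an error, the user, policy, and bucket are wired together correctly. An access-denied error usually means the policy was not attached to the user or the bucket name in the policy does not match.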

