Skip to main content

Amazon S3

Overview

Amazon S3 is a scalable, fully managed, and highly reliable object storage solution offered by AWS, designed to store and access data from anywhere in the world. It provides a secure and cost-effective way to store data, including common storage formats such as CSV and Parquet. Rill natively supports connecting to S3 using the provided S3 URI of your bucket to retrieve and read files.

Connect to S3

To connect to Amazon S3, you need to provide authentication credentials. You have four options:

  1. Use Access Key/Secret Key (recommended for cloud deployment)
  2. Use IAM Role Assumption (enhanced security with temporary credentials)
  3. Use Anonymous Access (for publicly accessible buckets only)
  4. Use Local AWS credentials (local development only - not recommended for production)

Choose the method that best fits your setup. For production deployments to Rill Cloud, use Access Key/Secret Key or IAM Role Assumption. Local AWS credentials only work for local development and will cause deployment failures.

S3-Compatible Storage

You can also connect to S3-compatible storage services by specifying a custom endpoint in your connector configuration.

Access Key and Secret Key

Create a connector with your credentials to connect to S3. Here's an example connector configuration file you can copy into your connectors directory to get started:

type: connector
driver: s3

aws_access_key_id: "{{ .env.connector.s3.aws_access_key_id }}"
aws_secret_access_key: "{{ .env.connector.s3.aws_secret_access_key }}"

This approach ensures your AWS sources authenticate consistently across both local development and cloud deployment environments.

Did you know?

If this project has already been deployed to Rill Cloud and credentials have been set for this connector, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Please note that this may override any credentials that you have set locally for this source.

IAM Role-Based Authentication

Rill supports AWS IAM role assumption for enhanced security. This method allows Rill to temporarily assume an IAM role to access S3 resources.

Benefits of Using IAM Roles

  • Temporary Credentials: No need to manage long-lived access keys.
  • Enhanced Security: Follows the principle of least privilege.
  • Cross-Account Access: Access S3 resources in different AWS accounts.
  • Centralized Control: Manage permissions through IAM roles and policies.

Anonymous Access

For publicly accessible S3 buckets, you can connect without authentication:

type: connector
driver: s3
# No authentication credentials needed for public buckets
Public Access Only

This method only works with publicly accessible buckets. Most production S3 buckets are private and require authentication.

Local AWS Credentials (Local Development Only)

Not recommended for production

Local AWS credentials only work for local development. If you deploy to Rill Cloud using this method, your dashboards will fail. Use one of the methods above for production deployments.

To check if you already have the AWS CLI installed and authenticated, open a terminal window and run:

aws iam get-user --no-cli-pager
note

The above command only works with AWS CLI version 2 and above.

If it prints information about your user, there is nothing more to do. Rill will be able to connect to any data in S3 that you have access to.

If you do not have the AWS CLI installed and authenticated, follow these steps:

  1. Open a terminal window and install the AWS CLI if it is not already installed on your system.

  2. If your organization has SSO configured, reach out to your admin for instructions on how to authenticate using aws sso login.

  3. If your organization does not have SSO configured:

    a. Follow the steps described under How to create an AWS service account using the AWS Management Console, which you will find below on this page.

    b. Run the following command and provide the access key, access secret, and default region when prompted (you can leave the "Default output format" blank):

    aws configure

You have now configured AWS access from your local environment. Rill will detect and use your credentials the next time you try to ingest a source.

Deploy to Rill Cloud

When deploying your project to Rill Cloud, you must provide an access key and secret key for an AWS service account with appropriate read access/permissions to the S3 buckets used in your project. If these credentials exist in your .env file, they'll be pushed with your project automatically. If you're using inferred credentials only, your deployment will result in errored dashboards.

To manually configure your environment variables, run:

rill env configure

Appendix

How to create an AWS service account using the AWS Management Console

Here is a step-by-step guide on how to create an AWS service account with read-only access to S3:

  1. Log in to the AWS Management Console and navigate to the IAM dashboard.

  2. In the sidebar, select "Users" and click the "Add users" button.

  3. Enter a username for the service account and click "Next".

  4. Select "Attach policies directly" and grant the service account read access to data in S3:

    • To grant access to data in all buckets, search for the "AmazonS3ReadOnlyAccess" policy. Check the box next to the policy to select it.
    • To only grant access to data in a specific bucket, follow these steps:
      1. Click the "Create policy" button in the top right corner of the "Permissions policies" box.
      2. Select the "JSON" tab in the top right corner of the "Policy editor".
      3. Paste the following policy and replace [BUCKET_NAME] with the name of your bucket:
        {
        "Version": "2012-10-17",
        "Statement": [
        {
        "Effect": "Allow",
        "Action": [
        "s3:GetObject",
        "s3:ListBucket"
        ],
        "Resource": [
        "arn:aws:s3:::[BUCKET_NAME]",
        "arn:aws:s3:::[BUCKET_NAME]/*"
        ]
        }
        ]
        }
      4. Click "Next".
      5. Give the policy a name and click "Create policy".
      6. Go back to the service account creation flow. Click the refresh button next to the "Create policy" button.
      7. Search for the policy you just created. Check the box next to the policy to select it.
  5. After attaching a policy, click "Next". Then, under "Set permissions boundaries and tags", click the "Create user" button.

  6. On the "Users" page, navigate to the newly created user and go to the "Security credentials" tab.

  7. Under the "Access keys" section, click "Create access key".

  8. On the "Access key best practices & alternatives" screen, select "Third-party service", confirm the checkbox, and click "Next".

  9. On the "Set description tag" screen, optionally enter a description, and click "Create access key".

  10. Note down the "Access key" and "Secret access key" values for the service account. (Hint: Click the ❐ icon next to the secrets to copy them to the clipboard.)

How to create an AWS service account using the aws CLI

Here is a step-by-step guide on how to create an AWS service account with read-only access to S3 using the AWS CLI:

  1. Open a terminal window and install the AWS CLI if it is not already installed on your system.

  2. Run the following command to create a new user (optionally replace rill-service-account with a name of your choice):

    aws iam create-user --no-cli-pager --user-name rill-service-account
  3. Grant the user read access to data in S3:

    • To grant access to data in all buckets, run the following command:

      aws iam attach-user-policy \
      --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
      --user-name rill-service-account
    • To only grant access to data in a specific bucket:

      1. Create a custom policy by running the following command, replacing [POLICY_NAME] with a custom name and [BUCKET_NAME] with the bucket name:

        aws iam create-policy \
        --policy-name [POLICY_NAME] \
        --policy-document \
        '{
        "Version": "2012-10-17",
        "Statement": [
        {
        "Effect": "Allow",
        "Action": [
        "s3:GetObject",
        "s3:ListBucket"
        ],
        "Resource": [
        "arn:aws:s3:::[BUCKET_NAME]",
        "arn:aws:s3:::[BUCKET_NAME]/*"
        ]
        }
        ]
        }'
      2. Attach the custom policy to the user by running the following command, replacing [POLICY_NAME] with the custom name set in the previous step:

        aws iam attach-user-policy \
        --policy-arn arn:aws:iam::aws:policy/[POLICY_NAME] \
        --user-name rill-service-account
  4. Run the following command to create an access key pair for the user:

    aws iam create-access-key --user-name rill-service-account
  5. Note down the AccessKeyId and SecretAccessKey values in the returned JSON object. Press "q" to exit the page.

How to create an IAM role for cross-account access with Rill-provided AWS account

To set up an IAM role that grants Rill's AWS account access to your S3 buckets:

  1. Log in to the AWS Management Console of your account that owns the S3 bucket (your resource account).

  2. Navigate to the IAM dashboard.

  3. In the sidebar, select "Roles" and click the "Create role" button.

  4. For trust relationship, select "AWS account" and choose "Another AWS account".

  5. Enter Rill's AWS account ID that was provided to you by your Rill representative.

  6. Select "Require external ID" and enter the External ID provided by Rill. This helps prevent the confused deputy problem.

  7. Click "Next: Permissions".

  8. Attach policies that grant the necessary S3 access permissions:

    • For read-only access to all buckets, select "AmazonS3ReadOnlyAccess"
    • For more restricted access, create a custom policy similar to the one described in the previous sections, limiting access to specific buckets.
  9. Click "Next: Tags", add optional tags if desired, and then click "Next: Review".

  10. Give the role a descriptive name (e.g., "RillDataAccess") and an optional description.

  11. Click "Create role".

  12. After creating the role, click on it to view its details.

  13. Note the "Role ARN" value which looks like: arn:aws:iam::123456789012:role/RillDataAccess

  14. Share this Role ARN with your Rill representative to complete the setup. Rill will configure their systems to assume this role when accessing your data.