Amazon S3

Overview

Amazon S3 is a scalable, fully managed, and highly reliable object storage solution offered by AWS, designed to store and access data from anywhere in the world. It provides a secure and cost-effective way to store data, including common storage formats such as CSV and Parquet. Rill natively supports connecting to S3 using the provided S3 URI of your bucket to retrieve and read files.

Authentication Methods

To connect to Amazon S3, you can choose from four authentication options:

Access Key/Secret Key (recommended for cloud deployment)
IAM Role Assumption (enhanced security with temporary credentials)
Public (for publicly accessible buckets - no authentication required)
Local AWS credentials (local development only - not recommended for production)

S3-Compatible Storage

You can also connect to S3-compatible storage services by specifying a custom endpoint in your connector configuration.

When you add data from S3 through the Rill UI, you'll see two authentication options:

Access Key/Secret Key: The process follows two steps:
1. Configure Authentication - Set up your S3 connector with credentials
2. Configure Data Model - Define which bucket and objects to ingest
The UI will automatically create both the connector file and model file for you.
Public: For publicly accessible buckets, you skip the connector creation step and go directly to:
1. Configure Data Model - Define which bucket and objects to ingest
The UI will only create the model file (no connector file is needed).

Manual Configuration Only

IAM Role Assumption and Local AWS credentials are only available through manual configuration. See Method 2: IAM Role Assumption and Method 4: Local AWS Credentials for setup instructions.

Method 1: Access Key/Secret Key (Recommended)

Access Key/Secret Key credentials provide reliable authentication for S3. This method works for both local development and Rill Cloud deployments.

Using the UI

Click Add Data in your Rill project
Select Amazon S3 as the data model type
In the authentication step:
- Choose Access Key/Secret Key
- Enter your Access Key ID
- Enter your Secret Access Key
In the data model configuration step, enter your SQL query
Click Create to finalize

After the model YAML is generated, you can add additional model settings directly to the file.

Manual Configuration

If you prefer to configure manually:

Step 1: Create connector configuration

Create connectors/s3.yaml:

type: connector
driver: s3

aws_access_key_id: "{{ .env.AWS_ACCESS_KEY_ID }}"
aws_secret_access_key: "{{ .env.AWS_SECRET_ACCESS_KEY }}"

Step 2: Add credentials to .env

AWS_ACCESS_KEY_ID=your_access_key_id
AWS_SECRET_ACCESS_KEY=your_secret_access_key
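Before starting Rill, it can help to confirm that the .env file actually defines both variables referenced by the connector. The following is a minimal sketch, not part of Rill itself: the helper names `parse_env` and `missing_keys` are hypothetical, and the parser only handles simple KEY=value lines.

```python
from pathlib import Path

# The two variables referenced by connectors/s3.yaml in this guide.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    print("missing:", missing_keys(text))
```

An empty list means both keys are present with non-empty values. The check says nothing about whether the credentials are valid in AWS.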

Did you know?

If this project has already been deployed to Rill Cloud and credentials have been set for this connector, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Please note that this may override any credentials you have set locally for this source.

Then, create your first model.

Method 2: IAM Role Assumption

Rill supports AWS IAM role assumption for enhanced security. This method allows Rill to temporarily assume an IAM role to access S3 resources. This method is only available through manual configuration.

Benefits of Using IAM Roles

Temporary Credentials: No need to manage long-lived access keys.
Enhanced Security: Follows the principle of least privilege.
Cross-Account Access: Access S3 resources in different AWS accounts.
Centralized Control: Manage permissions through IAM roles and policies.

Manual Configuration

Step 1: Create connector configuration

Create connectors/s3_role.yaml:

type: connector
driver: s3

aws_role_arn: "{{ .env.AWS_ROLE_ARN }}"
aws_external_id: "{{ .env.AWS_EXTERNAL_ID }}"

Step 2: Add credentials to .env

AWS_ROLE_ARN=arn:aws:iam::123456789012:role/RillDataAccess
AWS_EXTERNAL_ID=your_external_id
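A malformed role ARN is an easy mistake to make when copying values into .env. The sketch below checks that a value has the general shape of an IAM role ARN like the example above. The pattern and the `parse_role_arn` helper are illustrative assumptions, not a Rill or AWS API, and the check does not confirm that the role exists.

```python
import re

# Assumed shape: arn:<partition>:iam::<12-digit account ID>:role/<path and name>
ROLE_ARN_RE = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::([0-9]{12}):role/([\w+=,.@/-]+)$"
)


def parse_role_arn(arn):
    """Split a role ARN into its parts, or raise ValueError if it looks wrong."""
    match = ROLE_ARN_RE.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    partition, account_id, role = match.groups()
    return {"partition": partition, "account_id": account_id, "role": role}
```

For example, `parse_role_arn("arn:aws:iam::123456789012:role/RillDataAccess")` yields the account ID `123456789012` and the role name `RillDataAccess`, while an S3 bucket ARN is rejected.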

Then, create your first model.

Method 3: Public Buckets

For publicly accessible S3 buckets, you don't need to create a connector. Simply use the S3 URI directly in your model configuration.

Using the UI

Click Add Data in your Rill project
Select Amazon S3 as the data model type
In the authentication step:
- Choose Public
- The UI will skip connector creation and proceed directly to data model configuration
In the data model configuration step, enter your SQL query
Click Create to finalize

After the model YAML is generated, you can add additional model settings directly to the file.

Manual Configuration

No connector configuration is needed for public buckets. Proceed to Create Your First Model below and omit the create_secrets_from_connectors line.

Public Access Only

This method only works with publicly accessible buckets. Most production S3 buckets are private and require authentication.

Method 4: Local AWS Credentials (Local Development Only)

For local development, you can use credentials from the AWS CLI. This method is only available through manual configuration and does not require a connector file. It is not suitable for production or Rill Cloud deployments.

Setup

Install the AWS CLI if not already installed
Authenticate with your AWS account:
- If your organization has SSO configured, reach out to your admin for instructions on how to authenticate using aws sso login
- If your organization does not have SSO configured, follow the steps described under How to create an AWS service account using the AWS Management Console, then run aws configure
Create your model file (no connector needed)

Manual Configuration

No connector is needed when using local credentials. Proceed to Create Your First Model below and omit the create_secrets_from_connectors line.

Rill will automatically detect and use your local AWS CLI credentials when no connector is specified.

Warning

This method only works for local development. Deploying to Rill Cloud with this configuration will fail because the cloud environment doesn't have access to your local credentials. Always use Access Key/Secret Key or IAM Role Assumption for production deployments.

Path Patterns

You can use wildcards to read multiple files:

-- Single file
SELECT * FROM read_parquet('s3://my-bucket/data/file.parquet')

-- All files in a directory
SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet')

-- All files in nested directories
SELECT * FROM read_parquet('s3://my-bucket/data/**/*.parquet')

-- Files matching a pattern
SELECT * FROM read_parquet('s3://my-bucket/data/2024-*.parquet')
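To see which objects each pattern above would pick up, the wildcard rules can be approximated offline. The sketch below assumes that `*` stays within one path segment and that `**/` spans zero or more directory levels, which is how DuckDB documents its glob syntax. The functions `glob_to_regex` and `matching_keys` are hypothetical helpers for illustration and are not how Rill or DuckDB resolve paths internally.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob pattern into a compiled regex (approximation)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directory levels
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # any characters within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # exactly one non-slash character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matching_keys(pattern, keys):
    """Return the keys that the glob pattern would match, in input order."""
    regex = glob_to_regex(pattern)
    return [key for key in keys if regex.match(key)]


# Example object listing (made up for illustration).
KEYS = [
    "s3://my-bucket/data/file.parquet",
    "s3://my-bucket/data/2024-01.parquet",
    "s3://my-bucket/data/2023/part.parquet",
    "s3://my-bucket/data/notes.csv",
]

if __name__ == "__main__":
    for pattern in ("s3://my-bucket/data/*.parquet",
                    "s3://my-bucket/data/**/*.parquet",
                    "s3://my-bucket/data/2024-*.parquet"):
        print(pattern, "->", matching_keys(pattern, KEYS))
```

Under these assumptions, only the `**` pattern reaches `data/2023/part.parquet`, and none of the Parquet patterns match `notes.csv`.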

Create Your First Model

After setting up your authentication above, create a model to define what data to pull.

Create models/s3_data.yaml:

type: model
connector: duckdb
create_secrets_from_connectors: s3
dev:
  sql: SELECT * FROM read_parquet('s3://my-bucket/path/to/data/*.parquet')

sql: SELECT * FROM read_parquet('s3://my-bucket/**/*.parquet')

Replace s3 in create_secrets_from_connectors with the name of the connector you created.
If you're using a public bucket or local credentials (no connector), omit the create_secrets_from_connectors line.
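The rule above (include `create_secrets_from_connectors` only when a connector exists) can be sketched as a small generator for the model file. `render_model` is a hypothetical helper, not a Rill command, and it assumes the SQL is a single line that is safe to write as a plain YAML value.

```python
def render_model(sql, connector=None, dev_sql=None):
    """Render a model YAML string in the layout used in this guide.

    connector: name of the S3 connector, or None for public buckets
               and local credentials (the line is then omitted).
    dev_sql:   optional narrower query used under the dev key.
    """
    lines = ["type: model", "connector: duckdb"]
    if connector:
        lines.append(f"create_secrets_from_connectors: {connector}")
    if dev_sql:
        lines += ["dev:", f"  sql: {dev_sql}"]
    lines += ["", f"sql: {sql}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    query = "SELECT * FROM read_parquet('s3://my-bucket/**/*.parquet')"
    print(render_model(query, connector="s3_role"))
    print(render_model(query))  # public bucket or local credentials
```

The first call names the `s3_role` connector from Method 2; the second produces a file with no `create_secrets_from_connectors` line.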

After creating the model, you can add additional model settings directly to the file.

Deploy to Rill Cloud

When deploying a project to Rill Cloud, Rill requires you to explicitly provide an access key and secret for an AWS service account with access to the S3 buckets used in your project. Please refer to our connector YAML reference docs for more information.

If you subsequently add sources that require new credentials (or if you entered the wrong credentials during the initial deploy), you can update the credentials by clicking the Deploy button to update your project or by running the following command in the CLI:

rill env push

Did you know?

If you've already configured credentials locally (in your <RILL_PROJECT_DIRECTORY>/.env file), you can use rill env push to push these credentials to your Rill Cloud project. This will allow other users to retrieve and reuse the same credentials automatically by running rill env pull.

Appendix

How to create an AWS service account using the AWS Management Console

Here is a step-by-step guide on how to create an AWS service account with read-only access to S3:

Log in to the AWS Management Console and navigate to the IAM dashboard.
In the sidebar, select "Users" and click the "Add users" button.
Enter a username for the service account and click "Next".
Select "Attach policies directly" and grant the service account read access to data in S3:
- To grant access to data in all buckets, search for the "AmazonS3ReadOnlyAccess" policy. Check the box next to the policy to select it.
- To only grant access to data in a specific bucket, follow these steps:
  1. Click the "Create policy" button in the top right corner of the "Permissions policies" box.
  2. Select the "JSON" tab in the top right corner of the "Policy editor".
  3. Paste the following policy and replace [BUCKET_NAME] with the name of your bucket:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::[BUCKET_NAME]",
            "arn:aws:s3:::[BUCKET_NAME]/*"
          ]
        }
      ]
    }
  4. Click "Next".
  5. Give the policy a name and click "Create policy".
  6. Go back to the service account creation flow. Click the refresh button next to the "Create policy" button.
  7. Search for the policy you just created. Check the box next to the policy to select it.
After attaching a policy, click "Next". Then, under "Set permissions boundaries and tags", click the "Create user" button.
On the "Users" page, navigate to the newly created user and go to the "Security credentials" tab.
Under the "Access keys" section, click "Create access key".
On the "Access key best practices & alternatives" screen, select "Third-party service", confirm the checkbox, and click "Next".
On the "Set description tag" screen, optionally enter a description, and click "Create access key".
Note down the "Access key" and "Secret access key" values for the service account. (Hint: Click the ❐ icon next to the secrets to copy them to the clipboard.)

How to create an AWS service account using the `aws` CLI

Here is a step-by-step guide on how to create an AWS service account with read-only access to S3 using the AWS CLI:

Open a terminal window and install the AWS CLI if it is not already installed on your system.
Run the following command to create a new user (optionally replace rill-service-account with a name of your choice):
```
aws iam create-user --no-cli-pager --user-name rill-service-account
```

Grant the user read access to data in S3:

To grant access to data in all buckets, run the following command:

aws iam attach-user-policy \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
    --user-name rill-service-account

To only grant access to data in a specific bucket:

Create a custom policy by running the following command, replacing [POLICY_NAME] with a custom name and [BUCKET_NAME] with the bucket name:

aws iam create-policy \
    --policy-name [POLICY_NAME] \
    --policy-document \
'{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::[BUCKET_NAME]",
                "arn:aws:s3:::[BUCKET_NAME]/*"
            ]
        }
    ]
}'

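Attach the custom policy to the user by running the following command, replacing [ACCOUNT_ID] with your 12-digit AWS account ID and [POLICY_NAME] with the custom name set in the previous step. Customer-managed policies live under your own account ID rather than the `aws` namespace used by AWS-managed policies; the full ARN is also returned in the Arn field of the create-policy output:
```
aws iam attach-user-policy \
    --policy-arn arn:aws:iam::[ACCOUNT_ID]:policy/[POLICY_NAME] \
    --user-name rill-service-account
```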

Run the following command to create an access key pair for the user:
```
aws iam create-access-key --user-name rill-service-account
```
Note down the AccessKeyId and SecretAccessKey values in the returned JSON object. Press "q" to exit the page.
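If you need the bucket-scoped policy for several buckets, generating the JSON avoids hand-editing the [BUCKET_NAME] placeholder. The following sketch builds the same read-only policy document shown in the steps above. The function names are hypothetical, and the output is only the policy text; you would still pass it to the console or to `aws iam create-policy` yourself.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build the read-only policy from this guide for a single bucket."""
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket_name}/*",  # objects inside the bucket
                ],
            }
        ],
    }


def policy_document(bucket_name):
    """Return the policy as a JSON string suitable for a policy document."""
    return json.dumps(read_only_bucket_policy(bucket_name), indent=4)


if __name__ == "__main__":
    print(policy_document("my-bucket"))
```

The two resource entries are both needed: `s3:ListBucket` applies to the bucket ARN, while `s3:GetObject` applies to the object ARNs under it.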

How to create an IAM role for cross-account access with Rill-provided AWS account

To set up an IAM role that grants Rill's AWS account access to your S3 buckets:

Log in to the AWS Management Console of your account that owns the S3 bucket (your resource account).
Navigate to the IAM dashboard.
In the sidebar, select "Roles" and click the "Create role" button.
For trust relationship, select "AWS account" and choose "Another AWS account".
Enter Rill's AWS account ID that was provided to you by your Rill representative.
Select "Require external ID" and enter the External ID provided by Rill. This helps prevent the confused deputy problem.
Click "Next: Permissions".
Attach policies that grant the necessary S3 access permissions:
- For read-only access to all buckets, select "AmazonS3ReadOnlyAccess"
- For more restricted access, create a custom policy similar to the one described in the previous sections, limiting access to specific buckets.
Click "Next: Tags", add optional tags if desired, and then click "Next: Review".
Give the role a descriptive name (e.g., "RillDataAccess") and an optional description.
Click "Create role".
After creating the role, click on it to view its details.
Note the "Role ARN" value which looks like: arn:aws:iam::123456789012:role/RillDataAccess
Share this Role ARN with your Rill representative to complete the setup. Rill will configure their systems to assume this role when accessing your data.
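The console steps above produce a trust policy on the role that lets the other account assume it only when the external ID matches. As a rough sketch, the resulting document typically looks like the output of the function below. The account ID and external ID are placeholders for the values provided by Rill, and `trust_policy` is a hypothetical helper rather than anything Rill or AWS ships.

```python
import json
import re


def trust_policy(trusted_account_id, external_id):
    """Build a cross-account trust policy that requires an external ID."""
    if not re.fullmatch(r"[0-9]{12}", trusted_account_id):
        raise ValueError("AWS account IDs are 12 digits")
    if not external_id:
        raise ValueError("external_id must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account allowed to assume this role.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Reject AssumeRole calls that do not present the agreed external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute the ones you received.
    print(json.dumps(trust_policy("123456789012", "your_external_id"), indent=4))
```

Comparing this shape with the "Trust relationships" tab of the created role is a quick way to confirm that the external ID condition was applied.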
