Amazon Redshift

Overview

Amazon Redshift is a fully managed, petabyte-scale cloud data warehouse service that offers fast query and I/O performance for data analysis applications. It lets users run complex analytical queries against structured data using SQL, ETL processes, and BI tools, and uses massively parallel processing (MPP) to handle large volumes of data efficiently. Its architecture is designed for high performance on large datasets and supports data warehousing and analytics workloads of all sizes. Rill connects to and reads from Redshift data warehouses using the AWS SDK for Go, staging intermediary Parquet files in S3 to improve extraction performance.

Authentication Methods

To connect to Amazon Redshift, you need to provide authentication credentials. Rill supports two methods:

  1. Use Access Key/Secret Key (recommended for production)
  2. Use Local AWS credentials (local development only - not recommended for production)

When you add data from Redshift through the Rill UI, the process follows two steps:

  1. Configure Authentication - Set up your Redshift connector with AWS credentials (Access Key/Secret Key)
  2. Configure Data Model - Define which database, table, or query to execute

This two-step flow ensures your credentials are securely stored in the connector configuration, while your data model references remain clean and portable.

Method 1: Access Key and Secret Key

Access Key and Secret Key credentials provide the most reliable authentication for Redshift. This method works for both local development and Rill Cloud deployments.

Using the UI

  1. Click Add Data in your Rill project
  2. Select Amazon Redshift as the data source type
  3. In the authentication step:
    • Enter your AWS Access Key ID
    • Enter your AWS Secret Access Key
    • Specify your database name
    • Specify the workgroup (for Serverless) or cluster identifier
  4. In the data model configuration step, enter your SQL query
  5. Click Create to finalize

After the model YAML is generated, you can add additional model settings directly to the file.

Manual Configuration

If you prefer to configure manually:

Step 1: Create connector configuration

Create connectors/redshift.yaml:

type: connector
driver: redshift

aws_access_key_id: "{{ .env.AWS_ACCESS_KEY_ID }}"
aws_secret_access_key: "{{ .env.AWS_SECRET_ACCESS_KEY }}"
database: "dev"

Step 2: Add credentials to .env

AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Did you know?

If this project has already been deployed to Rill Cloud and credentials have been set for this connector, you can run rill env pull to pull those cloud credentials into your local .env file. Note that this may overwrite any credentials you have already set locally for this connector.
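
Because the connector file refers to its credentials through `{{ .env.AWS_ACCESS_KEY_ID }}` and `{{ .env.AWS_SECRET_ACCESS_KEY }}`, a missing or misspelled key in .env is easy to overlook, especially after the file has been overwritten. The following is a minimal sketch of a sanity check, assuming a plain KEY=VALUE .env file in the current directory. The helper names `parse_env` and `missing_keys` are hypothetical and are not part of Rill.

```python
from pathlib import Path

# The two variables referenced by connectors/redshift.yaml in this guide.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since secret values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.exists() else ""
    print("missing:", missing_keys(text) or "none")
```

Running the script from the project directory prints the required keys that are not set, without printing any secret values.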

Then, create your first model.

Method 2: Local AWS Credentials

For local development, you can use credentials from the AWS CLI. This method is not suitable for production or Rill Cloud deployments.

Not recommended for production

Local AWS credentials only work for local development. If you deploy to Rill Cloud using this method, your dashboards will fail. Always use Access Key/Secret Key for production deployments.

Setup

  1. Install the AWS CLI if not already installed
  2. Authenticate with your AWS account:
    • If your organization has SSO configured, reach out to your admin for instructions on how to authenticate using aws sso login
    • Otherwise, run aws configure and provide your access key, secret, and default region
  3. Verify your authentication:
    aws iam get-user --no-cli-pager

Manual Configuration

Create connectors/redshift.yaml:

type: connector
driver: redshift

database: "dev"

When no explicit credentials are provided in the connector, Rill will automatically use your local AWS CLI credentials. Then, create your first model.

Create Your First Model

Once your connector is configured using any method above, create a model to define what data to pull.

Create models/redshift_data.yaml:

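type: model
connector: redshift

# Local development: cap the number of rows pulled from Redshift
dev:
  sql: SELECT * FROM my_schema.my_table LIMIT 10000

# Default query, used outside the dev environment
sql: SELECT * FROM my_schema.my_table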

After creating the model, you can add additional model settings directly to the file.

Separating Dev and Prod Environments

When ingesting data locally, consider setting parameters in your connector file to limit how much data is retrieved, since costs can scale with the data source. This also helps other developers clone the project and iterate quickly by reducing ingestion time.

For more details, see our Dev/Prod setup docs.

Deploy to Rill Cloud

When deploying a project to Rill Cloud, Rill requires you to explicitly provide an access key and secret for an AWS service account that has access to the Redshift resources used in your project. Please refer to our connector YAML reference docs for more information.

If you subsequently add sources that require new credentials (or if you entered the wrong credentials during the initial deploy), you can update the credentials by clicking the Deploy button to update your project or by running the following command in the CLI:

rill env push

Did you know?

If you've already configured credentials locally (in your <RILL_PROJECT_DIRECTORY>/.env file), you can use rill env push to push these credentials to your Rill Cloud project. This will allow other users to retrieve and reuse the same credentials automatically by running rill env pull.

Appendix

Check your service account permissions

Your account or service account needs the permissions required to perform the requests listed below.

Redshift Serverless Permissions

When using Redshift Serverless, make sure to associate an IAM role (that has S3 access) with the Serverless namespace or the Redshift cluster.

What happens when Rill is reading from Redshift Serverless?

Our Redshift connector places temporary Parquet files in S3 to speed up extraction. Specifically, the connector makes the following requests while ingesting data from Redshift Serverless:

  1. Redshift Serverless: GetCredentials if you are using a Workgroup name to connect.
  2. Redshift Data API: DescribeStatement, ExecuteStatement to unload data to S3.
  3. S3: ListObjects to identify files unloaded by Redshift.
  4. S3: GetObject to ingest files unloaded by Redshift.
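
The request sequence above can be sketched in Python. This is an illustrative sketch only, not Rill's implementation (Rill uses the AWS SDK for Go). The function names and the `s3_prefix` and `iam_role_arn` parameters are hypothetical, and the client arguments are assumed to behave like `boto3.client("redshift-data")` and `boto3.client("s3")`.

```python
import time


def build_unload_sql(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Wrap a SELECT in an UNLOAD statement that writes Parquet files to S3."""
    # UNLOAD takes the query as a quoted string, so embedded single quotes
    # are escaped by doubling them.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS PARQUET"
    )


def unload_and_wait(data_client, sql: str, database: str, workgroup: str,
                    poll_seconds: float = 1.0) -> str:
    """Submit a statement via the Data API and poll until it finishes.

    Assumption: data_client behaves like boto3.client("redshift-data").
    """
    statement_id = data_client.execute_statement(
        WorkgroupName=workgroup, Database=database, Sql=sql
    )["Id"]
    while True:
        desc = data_client.describe_statement(Id=statement_id)
        status = desc["Status"]
        if status == "FINISHED":
            return statement_id
        if status in ("FAILED", "ABORTED"):
            detail = desc.get("Error", "no error detail")
            raise RuntimeError(f"UNLOAD {status}: {detail}")
        time.sleep(poll_seconds)


def list_unloaded_keys(s3_client, bucket: str, prefix: str) -> list[str]:
    """List the object keys written under the prefix.

    Assumption: s3_client behaves like boto3.client("s3"); this sketch
    uses the paginated ListObjectsV2 call.
    """
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Each returned key would then be fetched with an S3 GetObject request. Passing the clients in as arguments keeps the sketch independent of how credentials are obtained. For a provisioned cluster, the Data API call takes `ClusterIdentifier` in place of `WorkgroupName`.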

Redshift Cluster Permissions

Similarly, when using Redshift Cluster, make sure to associate an IAM role (that has S3 access) with the appropriate Redshift cluster.

What happens when Rill is reading from a Redshift Cluster?

Our Redshift connector places temporary Parquet files in S3 to speed up extraction. Specifically, the connector makes the following requests while ingesting data from a Redshift cluster:

  1. Redshift: GetClusterCredentialsWithIAM if you are using a Cluster Identifier to connect.
  2. Redshift Data API: DescribeStatement, ExecuteStatement to unload data to S3.
  3. S3: ListObjects to identify files unloaded by Redshift.
  4. S3: GetObject to ingest files unloaded by Redshift.
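
The requests listed in the two sections above map roughly onto IAM actions. The sketch below builds a starting-point policy document for either mode. It is an assumption-laden illustration, not an official Rill policy: the function name `build_policy` and the `bucket` parameter are hypothetical, `Resource: "*"` should be narrowed for real use, and additional permissions may be needed in practice. S3 ListObjects requests are authorized by the `s3:ListBucket` action.

```python
import json


def build_policy(serverless: bool, bucket: str) -> dict:
    """Build an IAM policy covering the requests listed in this appendix."""
    # The credential call differs between Serverless and provisioned clusters.
    credential_action = (
        "redshift-serverless:GetCredentials"
        if serverless
        else "redshift:GetClusterCredentialsWithIAM"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    credential_action,
                    "redshift-data:ExecuteStatement",
                    "redshift-data:DescribeStatement",
                ],
                # Assumption: broad for brevity; scope to specific ARNs.
                "Resource": "*",
            },
            {
                # ListObjects on the bucket that holds the unloaded files.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # GetObject on the unloaded Parquet files.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-unload-bucket" is a placeholder bucket name.
    print(json.dumps(build_policy(serverless=True, bucket="my-unload-bucket"), indent=2))
```

The printed JSON can be used as a draft when creating the service account policy. The IAM role associated with the namespace or cluster for unloading to S3 is configured separately and is not covered by this sketch.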

How to Create an AWS Service Account

For detailed instructions on creating an AWS service account with the appropriate permissions, see the S3 connector documentation.