
Connector YAML

Connector YAML files define how Rill connects to external data sources and OLAP engines. Each connector specifies a driver type and its required connection parameters.

Available Connector Types

OLAP Engines

  • ClickHouse - ClickHouse OLAP database
  • Druid - Apache Druid
  • DuckDB - Embedded DuckDB engine
  • Pinot - Apache Pinot

Data Warehouses

  • Athena - Amazon Athena
  • BigQuery - Google BigQuery
  • MotherDuck - MotherDuck serverless DuckDB
  • Redshift - Amazon Redshift
  • Snowflake - Snowflake

Databases

  • MySQL - MySQL
  • Postgres - PostgreSQL
  • SQLite - SQLite

Cloud Storage

  • Azure - Azure Blob Storage
  • GCS - Google Cloud Storage
  • S3 - Amazon S3 storage

Other

  • HTTPS - Remote files over HTTPS
  • OpenAPI - OpenAI API
  • Salesforce - Salesforce
  • Slack - Slack

Security Recommendation

For all credential parameters (passwords, tokens, keys), use environment variables with the syntax {{.env.connector.<connector_driver>.<parameter_name>}}. This keeps sensitive data out of your YAML files and version control. See our credentials documentation for complete setup instructions.
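
For example, a Postgres connector can read its password from the environment rather than hardcoding it (a minimal sketch; the host and username values are illustrative):

# Example: referencing credentials via environment variables
type: connector
driver: postgres
host: "localhost"
user: "myusername"
password: '{{ .env.connector.postgres.password }}' # resolved from your .env file, kept out of version control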

Properties

type

[string] - Refers to the resource type and must be connector (required)

Common Properties

name

[string] - Name is usually inferred from the filename, but can be specified manually.

refs

[array of string] - List of resource references

dev

[object] - Overrides any properties in the development environment.

prod

[object] - Overrides any properties in the production environment.
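
For example, a ClickHouse connector might target a local instance in development and a remote cluster in production (a sketch; the hostnames are illustrative):

# Example: environment-specific overrides via dev and prod
type: connector
driver: clickhouse
host: "localhost" # default used when no override applies
dev:
  host: "localhost" # development-only override
prod:
  host: "clickhouse.internal.example.com" # production-only override
  ssl: true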

Athena

driver

[string] - Refers to the driver type and must be athena (required)

aws_access_key_id

[string] - AWS Access Key ID used for authentication. Required when using static credentials directly or as base credentials for assuming a role.

aws_secret_access_key

[string] - AWS Secret Access Key paired with the Access Key ID. Required when using static credentials directly or as base credentials for assuming a role.

aws_access_token

[string] - AWS session token used with temporary credentials. Required only if the Access Key and Secret Key are part of temporary session credentials.

role_arn

[string] - ARN of the IAM role to assume. When specified, the SDK uses the base credentials to call STS AssumeRole and obtain temporary credentials scoped to this role.

role_session_name

[string] - Session name to associate with the STS AssumeRole session. Used only if 'role_arn' is specified. Useful for identifying and auditing the session.

external_id

[string] - External ID required by some roles when assuming them, typically for cross-account access. Used only if 'role_arn' is specified and the role's trust policy requires it.

workgroup

[string] - Athena workgroup to use for query execution. Defaults to 'primary' if not specified.

output_location

[string] - S3 URI where Athena query results should be stored (e.g., s3://your-bucket/athena/results/). Optional if the selected workgroup has a default result configuration.

aws_region

[string] - AWS region where Athena and the result S3 bucket are located (e.g., us-east-1). Defaults to 'us-east-1' if not specified.

allow_host_access

[boolean] - Allow the Athena client to access host environment configurations such as environment variables or local AWS credential files. Defaults to true, enabling use of credentials and settings from the host environment unless explicitly disabled.

# Example: Athena connector configuration
type: connector # Must be `connector` (required)
driver: athena # Must be `athena` _(required)_
aws_access_key_id: "myawsaccesskey" # AWS Access Key ID for authentication
aws_secret_access_key: "myawssecretkey" # AWS Secret Access Key for authentication
aws_access_token: "mytemporarytoken" # AWS session token for temporary credentials
role_arn: "arn:aws:iam::123456789012:role/MyRole" # ARN of the IAM role to assume
role_session_name: "MySession" # Session name for STS AssumeRole
external_id: "MyExternalID" # External ID for cross-account access
workgroup: "primary" # Athena workgroup (defaults to 'primary')
output_location: "s3://my-bucket/athena-output/" # S3 URI for query results
aws_region: "us-east-1" # AWS region (defaults to 'us-east-1')
allow_host_access: true # Allow host environment access _(default: true)_

Azure

driver

[string] - Refers to the driver type and must be azure (required)

azure_storage_account

[string] - Azure storage account name

azure_storage_key

[string] - Azure storage access key

azure_storage_bucket

[string] - Name of the Azure Blob Storage container (equivalent to an S3 bucket) (required)

azure_storage_sas_token

[string] - Optional Azure SAS token for authentication

azure_storage_connection_string

[string] - Optional Azure connection string for the storage account

allow_host_access

[boolean] - Allow access to host environment configuration

# Example: Azure connector configuration
type: connector # Must be `connector` (required)
driver: azure # Must be `azure` _(required)_
azure_storage_account: "mystorageaccount" # Azure storage account name
azure_storage_key: "credentialjsonstring" # Azure storage access key
azure_storage_sas_token: "optionaltoken" # Optional SAS token for authentication
azure_storage_connection_string: "optionalconnectionstring" # Optional connection string
azure_storage_bucket: "mycontainer" # Azure Blob Storage container name _(required)_
allow_host_access: true # Allow host environment access

BigQuery

driver

[string] - Refers to the driver type and must be bigquery (required)

google_application_credentials

[string] - Raw contents of the Google Cloud service account key (in JSON format) used for authentication.

project_id

[string] - Google Cloud project ID

dataset_id

[string] - BigQuery dataset ID

location

[string] - BigQuery dataset location

allow_host_access

[boolean] - Enable the BigQuery client to use credentials from the host environment when no service account JSON is provided. This includes Application Default Credentials from environment variables, local credential files, or the Google Compute Engine metadata server. Defaults to true, allowing seamless authentication in GCP environments.

# Example: BigQuery connector configuration
type: connector # Must be `connector` (required)
driver: bigquery # Must be `bigquery` _(required)_
google_application_credentials: "credentialjsonstring" # Google Cloud service account JSON
project_id: "my-project-id" # Google Cloud project ID
allow_host_access: true # Allow host environment access _(default: true)_

ClickHouse

driver

[string] - Refers to the driver type and must be clickhouse (required)

managed

[boolean] - true means Rill will provision the connector using the default provisioner. false disables automatic provisioning.

mode

[string] - Controls the operation mode for the ClickHouse connection. Defaults to 'read' for safe operation with external databases. Set to 'readwrite' to enable model creation and table mutations. Note: when 'managed: true', this is automatically set to 'readwrite'.

dsn

[string] - DSN (Data Source Name) for the ClickHouse connection

username

[string] - Username for authentication

password

[string] - Password for authentication

host

[string] - Host where the ClickHouse instance is running

port

[integer] - Port where the ClickHouse instance is accessible

database

[string] - Name of the ClickHouse database within the cluster

ssl

[boolean] - Indicates whether a secured SSL connection is required

cluster

[string] - Cluster name, required for running distributed queries

log_queries

[boolean] - Controls whether to log raw SQL queries

query_settings_override

[string] - Overrides the default settings used in queries. Changing the default settings can lead to incorrect query results and is generally not recommended. If you need to add settings, use query_settings instead.

query_settings

[string] - Query settings applied to dashboard queries. If query_settings_override is set, it takes precedence and these settings are ignored. Separate each setting with a comma, e.g., max_threads = 8, max_memory_usage = 10000000000.

embed_port

[integer] - Port to run ClickHouse locally (0 for random port)

can_scale_to_zero

[boolean] - Indicates if the database can scale to zero

max_open_conns

[integer] - Maximum number of open connections to the database

max_idle_conns

[integer] - Maximum number of idle connections in the pool

dial_timeout

[string] - Timeout for dialing the ClickHouse server

conn_max_lifetime

[string] - Maximum time a connection may be reused

read_timeout

[string] - Maximum time for a connection to read data

# Example: ClickHouse connector configuration
type: connector # Must be `connector` (required)
driver: clickhouse # Must be `clickhouse` _(required)_
managed: false # false connects to an existing ClickHouse; true lets Rill provision it automatically
mode: "readwrite" # Enable model creation and table mutations
username: "myusername" # Username for authentication
password: "mypassword" # Password for authentication
host: "localhost" # Hostname of the ClickHouse server
port: 9000 # Port number of the ClickHouse server
database: "mydatabase" # Name of the ClickHouse database
ssl: true # Enable SSL for secure connection
cluster: "mycluster" # Cluster name

Druid

driver

[string] - Refers to the driver type and must be druid (required)

dsn

[string] - Data Source Name (DSN) for connecting to Druid (required)

username

[string] - Username for authenticating with Druid

password

[string] - Password for authenticating with Druid

host

[string] - Hostname of the Druid coordinator or broker

port

[integer] - Port number of the Druid service

ssl

[boolean] - Enable SSL for secure connection

log_queries

[boolean] - Log raw SQL queries sent to Druid

max_open_conns

[integer] - Maximum number of open database connections (0 = default, -1 = unlimited)

skip_version_check

[boolean] - Skip checking Druid version compatibility

# Example: Druid connector configuration
type: connector # Must be `connector` (required)
driver: druid # Must be `druid` _(required)_
username: "myusername" # Username for authentication
password: "mypassword" # Password for authentication
host: "localhost" # Hostname of the Druid coordinator or broker
port: 8082 # Port number of the Druid service
ssl: true # Enable SSL for secure connection

DuckDB

driver

[string] - Refers to the driver type and must be duckdb (required)

pool_size

[integer] - Number of concurrent connections and queries allowed

allow_host_access

[boolean] - Whether access to the local environment and file system is allowed

cpu

[integer] - Number of CPU cores available to the database

memory_limit_gb

[integer] - Amount of memory in GB available to the database

read_write_ratio

[number] - Ratio of resources allocated to the read database; used to divide CPU and memory

init_sql

[string] - SQL executed during database initialization.

secrets

[string] - Comma-separated list of other connector names to create temporary secrets for in DuckDB before executing a model.

log_queries

[boolean] - Whether to log raw SQL queries executed through OLAP

# Example: DuckDB connector configuration
type: connector # Must be `connector` (required)
driver: duckdb # Must be `duckdb` _(required)_
allow_host_access: true # Whether access to the local environment and file system is allowed
cpu: 4 # Number of CPU cores available to the database
memory_limit_gb: 16 # Amount of memory in GB available to the database
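
Initialization SQL and pool sizing can be added as well; a sketch in which the INSTALL/LOAD statements are illustrative DuckDB SQL:

# Example: DuckDB connector with initialization SQL
type: connector
driver: duckdb
pool_size: 4 # Number of concurrent connections and queries allowed
init_sql: "INSTALL 'httpfs'; LOAD 'httpfs';" # SQL executed during database initialization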

GCS

driver

[string] - Refers to the driver type and must be gcs (required)

google_application_credentials

[string] - Google Cloud credentials JSON string

bucket

[string] - Name of the GCS bucket (required)

allow_host_access

[boolean] - Allow access to host environment configuration

key_id

[string] - Optional S3-compatible Key ID when used in compatibility mode

secret

[string] - Optional S3-compatible Secret when used in compatibility mode

# Example: GCS connector configuration
type: connector # Must be `connector` (required)
driver: gcs # Must be `gcs` _(required)_
google_application_credentials: "credentialjsonstring" # Google Cloud credentials JSON string
bucket: "my-gcs-bucket" # Name of gcs bucket

HTTPS

driver

[string] - Refers to the driver type and must be https (required)

path

[string] - The full HTTPS URI to fetch data from (required)

headers

[object] - HTTP headers to include in the request

# Example: HTTPS connector configuration
type: connector # Must be `connector` (required)
driver: https # Must be `https` _(required)_
path: "https://api.example.com/data.csv" # The full HTTPS URI to fetch data from
headers:
  "Authorization": "Bearer my-token" # HTTP headers to include in the request

MotherDuck

driver

[string] - Refers to the driver type and must be motherduck (required)

path

[string] - Path to your MotherDuck database (required)

schema_name

[string] - Schema to use; defaults to main if not specified

token

[string] - MotherDuck token (required)

init_sql

[string] - SQL executed during database initialization.

# Example: MotherDuck connector configuration
type: connector # Must be `connector` (required)
driver: motherduck # Must be `motherduck` _(required)_
token: '{{ .env.connector.motherduck.token }}' # Set the MotherDuck token from your .env file _(required)_
path: "md:my_database" # Path to your MD database
schema_name: "my_schema" # Define your schema if not main, uses main by default

MySQL

driver

[string] - Refers to the driver type and must be mysql (required)

dsn

[string] - DSN (Data Source Name) for the MySQL connection

host

[string] - Hostname of the MySQL server

port

[integer] - Port number for the MySQL server

database

[string] - Name of the MySQL database

user

[string] - Username for authentication

password

[string] - Password for authentication

ssl_mode

[string] - SSL mode can be DISABLED, PREFERRED or REQUIRED

# Example: MySQL connector configuration
type: connector # Must be `connector` (required)
driver: mysql # Must be `mysql` _(required)_
host: "localhost" # Hostname of the MySQL server
port: 3306 # Port number for the MySQL server
database: "mydatabase" # Name of the MySQL database
user: "myusername" # Username for authentication
password: "mypassword" # Password for authentication
ssl_mode: "DISABLED" # SSL mode can be DISABLED, PREFERRED or REQUIRED

OpenAPI

driver

[string] - Refers to the driver type and must be openapi (required)

api_key

[string] - API key for connecting to OpenAI (required)

model

[string] - The OpenAI model to use (e.g., 'gpt-4o')

base_url

[string] - The base URL for the OpenAI API (e.g., 'https://api.openai.com/v1')

api_type

[string] - The type of OpenAI API to use

api_version

[string] - The version of the OpenAI API to use (e.g., '2023-05-15'). Required when API Type is AZURE or AZURE_AD

# Example: OpenAPI connector configuration
type: connector # Must be `connector` (required)
driver: openapi # Must be `openapi` _(required)_
api_key: "my-api-key" # API key for connecting to OpenAI
model: "gpt-4o" # The OpenAI model to use (e.g., 'gpt-4o')
base_url: "https://api.openai.com/v1" # The base URL for the OpenAI API (e.g., 'https://api.openai.com/v1')
api_type: "openai" # The type of OpenAI API to use
api_version: "2023-05-15" # The version of the OpenAI API to use (e.g., '2023-05-15'). Required when API Type is AZURE or AZURE_AD

Pinot

driver

[string] - Refers to the driver type and must be pinot (required)

dsn

[string] - DSN (Data Source Name) for the Pinot connection (required)

username

[string] - Username for authenticating with Pinot

password

[string] - Password for authenticating with Pinot

broker_host

[string] - Hostname of the Pinot broker (required)

broker_port

[integer] - Port number for the Pinot broker

controller_host

[string] - Hostname of the Pinot controller (required)

controller_port

[integer] - Port number for the Pinot controller

ssl

[boolean] - Enable SSL connection to Pinot

log_queries

[boolean] - Log raw SQL queries executed through Pinot

max_open_conns

[integer] - Maximum number of open connections to the Pinot database

# Example: Pinot connector configuration
type: connector # Must be `connector` (required)
driver: pinot # Must be `pinot` _(required)_
username: "myusername" # Username for authentication
password: "mypassword" # Password for authentication
broker_host: "localhost" # Hostname of the Pinot broker
broker_port: 9000 # Port number for the Pinot broker
controller_host: "localhost" # Hostname of the Pinot controller
controller_port: 9000 # Port number for the Pinot controller
ssl: true # Enable SSL connection to Pinot
log_queries: true # Log raw SQL queries executed through Pinot
max_open_conns: 100 # Maximum number of open connections to the Pinot database

Postgres

driver

[string] - Refers to the driver type and must be postgres (required)

dsn

[string] - DSN (Data Source Name) for the Postgres connection

host

[string] - Hostname of the Postgres server

port

[string] - Port number for the Postgres server

dbname

[string] - Name of the Postgres database

user

[string] - Username for authentication

password

[string] - Password for authentication

sslmode

[string] - SSL mode can be disable, allow, prefer or require

# Example: Postgres connector configuration
type: connector # Must be `connector` (required)
driver: postgres # Must be `postgres` _(required)_
host: "localhost" # Hostname of the Postgres server
port: 5432 # Port number for the Postgres server
dbname: "mydatabase" # Name of the Postgres database
user: "myusername" # Username for authentication
password: "mypassword" # Password for authentication
sslmode: "disable" # SSL mode can be disable, allow, prefer or require

Redshift

driver

[string] - Refers to the driver type and must be redshift (required)

aws_access_key_id

[string] - AWS Access Key ID used for authenticating with Redshift. (required)

aws_secret_access_key

[string] - AWS Secret Access Key used for authenticating with Redshift. (required)

aws_access_token

[string] - AWS Session Token for temporary credentials (optional).

region

[string] - AWS region where the Redshift cluster or workgroup is hosted (e.g., 'us-east-1').

database

[string] - Name of the Redshift database to query. (required)

workgroup

[string] - Workgroup name for Redshift Serverless. For provisioned Redshift clusters, use 'cluster_identifier' instead.

cluster_identifier

[string] - Cluster identifier for provisioned Redshift clusters. For Redshift Serverless, use 'workgroup' instead.

# Example: Redshift connector configuration
type: connector # Must be `connector` (required)
driver: redshift # Must be `redshift` _(required)_
aws_access_key_id: "my-access-key-id" # AWS Access Key ID used for authenticating with Redshift.
aws_secret_access_key: "my-secret-access-key" # AWS Secret Access Key used for authenticating with Redshift.
aws_access_token: "my-access-token" # AWS Session Token for temporary credentials (optional).
region: "us-east-1" # AWS region where the Redshift cluster or workgroup is hosted (e.g., 'us-east-1').
database: "mydatabase" # Name of the Redshift database to query.
workgroup: "my-workgroup" # Workgroup name for Redshift Serverless, in case of provisioned Redshift clusters use 'cluster_identifier'.
cluster_identifier: "my-cluster-identifier" # Cluster identifier for provisioned Redshift clusters, in case of Redshift Serverless use 'workgroup' .

S3

driver

[string] - Refers to the driver type and must be s3 (required)

aws_access_key_id

[string] - AWS Access Key ID used for authentication

aws_secret_access_key

[string] - AWS Secret Access Key used for authentication

aws_access_token

[string] - Optional AWS session token for temporary credentials

bucket

[string] - Name of the S3 bucket (required)

endpoint

[string] - Optional custom endpoint URL for S3-compatible storage

region

[string] - AWS region of the S3 bucket

allow_host_access

[boolean] - Allow access to host environment configuration

retain_files

[boolean] - Whether to retain intermediate files after processing

# Example: S3 connector configuration
type: connector # Must be `connector` (required)
driver: s3 # Must be `s3` _(required)_
aws_access_key_id: "my-access-key-id" # AWS Access Key ID used for authentication
aws_secret_access_key: "my-secret-access-key" # AWS Secret Access Key used for authentication
aws_access_token: "my-access-token" # Optional AWS session token for temporary credentials
bucket: "my-s3-bucket" # Name of s3 bucket
endpoint: "https://my-s3-endpoint.com" # Optional custom endpoint URL for S3-compatible storage
region: "us-east-1" # AWS region of the S3 bucket

Salesforce

driver

[string] - Refers to the driver type and must be salesforce (required)

username

[string] - Salesforce account username (required)

password

[string] - Salesforce account password (secret)

key

[string] - Authentication key for Salesforce (secret)

endpoint

[string] - Salesforce API endpoint URL (required)

client_id

[string] - Client ID used for Salesforce OAuth authentication (required)

# Example: Salesforce connector configuration
type: connector # Must be `connector` (required)
driver: salesforce # Must be `salesforce` _(required)_
username: "myusername" # Salesforce account username
password: "mypassword" # Salesforce account password (secret)
endpoint: "https://login.salesforce.com" # Salesforce API endpoint URL
client_id: "my-client-id" # Client ID used for Salesforce OAuth authentication

Slack

driver

[string] - Refers to the driver type and must be slack (required)

bot_token

[string] - Bot token used for authenticating Slack API requests (required)

# Example: Slack connector configuration
type: connector # Must be `connector` (required)
driver: slack # Must be `slack` _(required)_
bot_token: "xoxb-my-bot-token" # Bot token used for authenticating Slack API requests

Snowflake

driver

[string] - Refers to the driver type and must be snowflake (required)

account

[string] - Snowflake account identifier. To find your Snowflake account identifier, look at your Snowflake account URL. The account identifier is everything before .snowflakecomputing.com

user

[string] - Username for the Snowflake connection.

password

[string] - Password for the Snowflake connection. (deprecated, use privateKey instead)

privateKey

[string] - Private key for JWT authentication.

tip

Private key must be generated as a PKCS#8 (nocrypt) key, since the Snowflake Go driver only supports unencrypted private keys. After generating, it must be base64 URL encoded.

Example commands to generate and encode:

# Generate a 2048-bit unencrypted PKCS#8 private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

# Convert the key to URL-safe base64 format for Snowflake
cat rsa_key.p8 | grep -v "\----" | tr -d '\n' | tr '+/' '-_'

See: https://docs.snowflake.com/en/user-guide/key-pair-auth

authenticator

[string] - Optional authenticator type (e.g., SNOWFLAKE_JWT).

database

[string] - Name of the Snowflake database.

schema

[string] - Schema within the database to use.

warehouse

[string] - Compute warehouse to use for queries.

role

[string] - Snowflake role to use.

dsn

[string] - DSN (Data Source Name) for the Snowflake connection.

This is intended for advanced configuration where you want to specify properties that are not explicitly defined above.
It can only be used when the other connection fields (account, user, password, database, schema, warehouse, role, authenticator, privateKey) are not used.

For details on private key generation and encoding, see the privateKey property.

parallel_fetch_limit

[integer] - Maximum number of concurrent fetches during query execution.

# Example: Snowflake connector basic configuration
type: connector
driver: snowflake
account: my_account_identifier
user: my_user
privateKey: '{{ .env.SNOWFLAKE_PRIVATE_KEY }}' # define SNOWFLAKE_PRIVATE_KEY in .env file
database: my_db
schema: my_schema
warehouse: my_wh
role: my_role
parallel_fetch_limit: 2

# Example: Snowflake connector advanced configuration
type: connector
driver: snowflake
dsn: '{{ .env.SNOWFLAKE_DSN }}' # define SNOWFLAKE_DSN in .env file like SNOWFLAKE_DSN='my_username@my_account/my_db/my_schema?warehouse=my_wh&role=my_role&authenticator=SNOWFLAKE_JWT&privateKey=my_private_key'
parallel_fetch_limit: 2

SQLite

driver

[string] - Refers to the driver type and must be sqlite (required)

dsn

[string] - DSN (Data Source Name) for the SQLite connection (required)

# Example: SQLite connector configuration
type: connector # Must be `connector` (required)
driver: sqlite # Must be `sqlite` _(required)_
dsn: "file:mydatabase.db" # DSN for the sqlite connection