Databricks
Overview
Databricks is a unified data and AI platform built on top of Apache Spark and the lakehouse architecture. Rill connects to Databricks SQL warehouses using the Databricks SQL Go driver to ingest data into Rill's embedded engine.
Authentication
Rill authenticates to Databricks using a personal access token. The token must have access to the SQL warehouse and to the catalogs and schemas you intend to query.
When you add data from Databricks through the Rill UI, the process follows two steps:
- Configure Authentication - Set up your Databricks connector with the SQL warehouse host, HTTP path, and personal access token
- Configure Data Model - Define which table or query to ingest
This two-step flow ensures your credentials are securely stored in the connector configuration, while your data model references remain clean and portable.
Using the UI
- Click Add Data in your Rill project
- Select Databricks as the data source type
- In the authentication step:
  - Enter the Server hostname of your SQL warehouse (e.g. dbc-xxxxxxxx-xxxx.cloud.databricks.com)
  - Enter the HTTP path of your SQL warehouse (e.g. /sql/1.0/warehouses/xxxxxxxxxxxxxxxx)
  - Paste your Personal Access Token
  - Optionally specify a default Catalog and Schema
- In the data model configuration step, enter your SQL query
- Click Create to finalize
After the model YAML is generated, you can add additional model settings directly to the file.
Manual Configuration
If you prefer to configure manually:
Step 1: Create connector configuration
Create connectors/databricks.yaml:
type: connector
driver: databricks
host: "dbc-xxxxxxxx-xxxx.cloud.databricks.com"
http_path: "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"
token: "{{ .env.DATABRICKS_TOKEN }}"
catalog: "main" # optional
schema: "default" # optional
Step 2: Add credentials to .env
DATABRICKS_TOKEN=dapi...
If this project has already been deployed to Rill Cloud and credentials have been set for this connector, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Please note that this may override any credentials you have set locally for this source.
Then, create your first model.
Connection String (DSN)
For advanced configuration, you can specify a single DSN instead of the individual fields above. The DSN cannot be combined with host, http_path, token, catalog, or schema.
type: connector
driver: databricks
dsn: "{{ .env.DATABRICKS_DSN }}"
DATABRICKS_DSN=token:dapi...@dbc-xxxxxxxx-xxxx.cloud.databricks.com:443/sql/1.0/warehouses/xxxxxxxxxxxxxxxx?catalog=main&schema=default
See the Databricks SQL Go driver documentation for the full list of supported DSN parameters.
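To make the DSN shape above concrete, here is a minimal sketch of assembling it from the individual connection details. The helper function, its name, and the port default are our own illustration, not part of Rill or the Databricks driver; always check the driver documentation for the authoritative parameter list.

```python
from urllib.parse import urlencode

def build_databricks_dsn(token, host, http_path, port=443, catalog=None, schema=None):
    # Base shape: token:<token>@<host>:<port><http_path>
    dsn = f"token:{token}@{host}:{port}{http_path}"
    # Optional catalog/schema become query parameters, as in the example above.
    params = {k: v for k, v in (("catalog", catalog), ("schema", schema)) if v}
    if params:
        dsn += "?" + urlencode(params)
    return dsn

# Example with placeholder values:
dsn = build_databricks_dsn(
    token="dapi-example",
    host="dbc-xxxxxxxx-xxxx.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/xxxxxxxxxxxxxxxx",
    catalog="main",
    schema="default",
)
```

The resulting string can be stored as DATABRICKS_DSN in your .env file.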
Create Your First Model
Once your connector is configured, create a model to define what data to pull.
Create models/databricks_data.yaml:
type: model
connector: databricks
dev:
  sql: SELECT * FROM main.my_schema.my_table LIMIT 10000
sql: SELECT * FROM main.my_schema.my_table
After creating the model, you can add additional model settings directly to the file.
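For example, one common addition is a refresh schedule so the model re-ingests on a cadence. This is a sketch only; the table name is a placeholder, and you should consult the model YAML reference for the exact fields supported by your Rill version:

```yaml
type: model
connector: databricks
sql: SELECT * FROM main.my_schema.my_table
refresh:
  cron: "0 8 * * *" # re-ingest daily at 08:00
```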
Separating Dev and Prod Environments
When ingesting data locally, consider setting parameters in your connector file to limit how much data is retrieved, since costs can scale with the data source. This also helps other developers clone the project and iterate quickly by reducing ingestion time.
For more details, see our Dev/Prod setup docs.
Deploy to Rill Cloud
When deploying a project to Rill Cloud, Rill requires you to explicitly provide a Databricks personal access token, along with the SQL warehouse host and HTTP path, for each Databricks connector used in your project. Please refer to our connector YAML reference docs for more information.
If you subsequently add sources that require new credentials (or if you simply entered the wrong credentials during the initial deploy), you can update the credentials by clicking the Deploy button to redeploy your project, or by running the following command in the CLI:
rill env push
If you've already configured credentials locally (in your <RILL_PROJECT_DIRECTORY>/.env file), rill env push will push these credentials to your Rill Cloud project. Other users can then retrieve and reuse the same credentials automatically by running rill env pull.
Appendix
Finding your SQL Warehouse Connection Details
In the Databricks workspace:
- Navigate to SQL Warehouses and select the warehouse you want to use.
- Open the Connection details tab.
- Copy the Server hostname and HTTP path.
To create a personal access token:
- Click your profile icon and open Settings.
- Go to Developer → Access tokens and click Generate new token.
- Provide a name and lifetime, then copy the token. Databricks only shows the token once.
For service principal authentication and other auth methods, see the Databricks SQL driver authentication docs.
Unsupported Data Types
The Databricks ingestion path streams results as Arrow and writes them to Parquet. Columns of type TIME are not currently supported — exclude them from your query or cast them to a different type (for example, STRING or TIMESTAMP).
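One workaround is to perform the cast in the model SQL itself. A sketch, with hypothetical table and column names:

```yaml
type: model
connector: databricks
sql: >
  SELECT id, CAST(opening_time AS STRING) AS opening_time
  FROM main.my_schema.store_hours
```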
Required Permissions
The user or service principal associated with the personal access token needs SELECT on the tables being read and USE CATALOG / USE SCHEMA on the parent catalog and schema. See the Unity Catalog privileges reference for the full list.
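As an illustration, the grants described above could be issued in Databricks SQL roughly as follows. The principal and object names are placeholders; adapt them to your workspace:

```sql
-- Allow the token's principal to resolve the catalog and schema
GRANT USE CATALOG ON CATALOG main TO `user@example.com`;
GRANT USE SCHEMA ON SCHEMA main.my_schema TO `user@example.com`;
-- Allow reading the table itself
GRANT SELECT ON TABLE main.my_schema.my_table TO `user@example.com`;
```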