DuckDB / MotherDuck

Overview

DuckDB is an in-process SQL OLAP database management system designed for analytical workloads, aiming to be fast, reliable, and easy to integrate into data analysis applications. It supports standard SQL and operates directly on data in Pandas DataFrames, CSV files, and Parquet files, which makes it well suited to on-the-fly data analysis and machine learning projects. Rill can connect natively to a persisted DuckDB database that it has access to, and read from it as a source, using the DuckDB Go driver.

MotherDuck, on the other hand, is a managed DuckDB-in-the-cloud service, providing enhanced features for scalability, security, and collaboration within larger organizations. It offers advanced management tools, security features like access control and encryption, and support for concurrent access, enabling teams to leverage DuckDB's analytical capabilities at scale while ensuring data governance and security. Similarly, Rill can connect natively to MotherDuck and read from it as a source using the DuckDB Go driver.


Connecting to External DuckDB as a Source

As noted above, if you wish to connect to a persistent DuckDB database to read existing tables, Rill first needs to be able to access the underlying DuckDB database file. Once access has been established, the data is read from your external database into the built-in DuckDB database in Rill.

Local credentials

If you are creating a new DuckDB source from the UI, pass the path to the DuckDB database file under DB, and the DuckDB SELECT statement that reads the table under SQL.


On the other hand, if you are creating the source YAML file directly, the definition should look something like:

type: "source"
connector: "duckdb"
sql: "SELECT * from <duckdb_table>"
db: "<path_to_duckdb_db_file>"
If deploying to Rill Cloud

If you plan to deploy a project containing a DuckDB source to Rill Cloud, it is recommended that you move the DuckDB database file to a data folder in your Rill project home directory. You can then use the relative path of the db file in your source definition (e.g. data/test_duckdb.db).

Cloud deployment

Once a project with a DuckDB source has been deployed, Rill Cloud needs to be able to access and retrieve the underlying persisted database file. In most cases, this means that the DuckDB database file should be included in a directory within your Git repository, which allows you to specify a path relative to the project root in your source definition.

When Using An External DuckDB Database

If the DuckDB database file is external to your Rill project directory, you can still use its fully qualified path to read the database locally with Rill Developer. However, once the project is deployed to Rill Cloud, this source will throw an error because Rill Cloud cannot reach a file outside the project.
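One way to follow the earlier recommendation of keeping the database file in a data folder is to copy the external file into the project before deploying. The snippet below is only a minimal sketch: `stage_duckdb_file` is a hypothetical helper name, not a Rill command, and both arguments are placeholders for your own paths.

```shell
# Hypothetical helper (not part of the Rill CLI): copy an external DuckDB
# file into <project>/data and print the relative path to use as `db:`.
stage_duckdb_file() {
  src="$1"       # existing DuckDB database file outside the project
  project="$2"   # Rill project home directory
  mkdir -p "$project/data" || return 1
  cp "$src" "$project/data/" || return 1
  # Relative path (from the project root) for the source definition
  printf 'data/%s\n' "$(basename "$src")"
}
```

For example, `stage_duckdb_file /some/where/test_duckdb.db .` run from the project root would print `data/test_duckdb.db`, which can then be used as the db value in the source definition and committed to the Git repository.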

Connecting to MotherDuck

If you are creating the source YAML file directly, the definition should look something like:

type: "source"
connector: "motherduck"
sql: "SELECT * from <my_db>.<duckdb_table>"
db: "md:<my_db>"

Local credentials

When using Rill Developer on your local machine (i.e. rill start), Rill will use the motherduck_token configured in your environment variables to attempt to establish a connection with MotherDuck. If this is not defined, you will need to set this environment variable appropriately.

export motherduck_token='<token>'
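
Before running rill start, it can help to confirm that the variable is actually visible in the current shell. The check below is a small sketch; `have_motherduck_token` is a hypothetical function name introduced here for illustration.

```shell
# Hypothetical helper: succeeds only if motherduck_token is set and non-empty.
have_motherduck_token() {
  [ -n "${motherduck_token:-}" ]
}

if have_motherduck_token; then
  echo "motherduck_token is set"
else
  echo "motherduck_token is not set; export it before running rill start" >&2
fi
```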
tip

An alternative option would be to set this line through your bash profile.
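
As a rough sketch of that approach, the function below appends the export line to a profile file only if the same line is not already present. The name `persist_motherduck_token` and the default of `~/.bashrc` are assumptions made for this example; use whichever profile file your shell actually loads.

```shell
# Hypothetical helper: add the export line to a profile file exactly once.
persist_motherduck_token() {
  token="$1"                       # your MotherDuck service token
  profile="${2:-$HOME/.bashrc}"    # assumed default; pass another file if needed
  line="export motherduck_token='$token'"
  # -x matches the whole line, -F treats it as a fixed string
  grep -qxF -- "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```

After calling `persist_motherduck_token '<token>'`, open a new terminal (or source the profile file) so that the variable is available to rill start.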

info

For more information about authenticating with an appropriate service token, please refer to MotherDuck's documentation.

Did you know?

If this project has already been deployed to Rill Cloud and credentials have been set for this source, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Please note that this may override any credentials that you have set locally for this source.

Cloud deployment

Once a project with a MotherDuck source has been deployed, Rill requires you to explicitly provide the MotherDuck token using the following command:

rill env configure
info

Note that you must cd into the Git repository that your project was deployed from before running rill env configure.

Did you know?

If you've configured credentials locally already (in your <RILL_PROJECT_DIRECTORY>/.env file), you can use rill env push to push these credentials to your Rill Cloud project. This will allow other users to retrieve / reuse the same credentials automatically by running rill env pull.