
2. Import the Source

Let's start at the beginning of every data pipeline: the source.

What is a Source?

In Rill, the source is your data. Whether it lives in a data warehouse, cloud storage, or an RDBMS, Rill can read and import it.

Import Data?

By default, Rill uses DuckDB as its underlying OLAP engine (see Connect OLAP engines).

Please see our docs for the list of supported connectors.

Adding a source is simple!

Select the +Add dropdown and choose Data. This opens a UI listing the supported connectors.


For our tutorial, let's add two GCS sources from our public storage bucket. These datasets are derived from the commit history and modified files of our friends at ClickHouse's GitHub repository. You need to add each one separately. Please refer to the GIF above.

gs://rilldata-public/github-analytics/Clickhouse/*/*/modified_files_*.parquet
gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet

Once imported, you'll see a few changes in the UI:

  1. The source_name.yaml file, created in the file explorer
  2. The DuckDB database, created in the Connectors explorer
  3. Within the DuckDB database, our imported data as a table with a preview
  4. The right panel, giving a summary of the data
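For orientation, here is a minimal sketch of what the generated source file for the commits dataset might contain. The sql key and the read_parquet call appear later in this tutorial; the type and connector keys are assumptions based on common Rill project files and may differ between Rill versions, so treat the file the UI actually generates as authoritative.

```yaml
# Hypothetical contents of the generated source YAML file.
# "type" and "connector" are assumed keys; check your generated file.
type: source
connector: "duckdb"

# Read every matching Parquet file from the public GCS bucket.
sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')"
```

The modified_files source would presumably look the same, with its own path in the read_parquet call.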

Now we're ready to create a model.

Don't see what you're looking for?

We are continually adding new sources and connectors in our releases. For a comprehensive list, refer to our connectors page. Please don't hesitate to reach out if there's a connector you'd like us to add!

If this is your first time, you may need to refresh the browser for DuckDB to appear in the UI.

Too much data?

By default, all environments running locally are considered dev environments. Because Rill Developer is designed for testing purposes, you can use environment-aware templating (such as the dev condition in the example that follows) to filter the input data. For example, you can filter the repository data on the author_date column, or simply add a limit clause (limit ####).

sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')
{{if dev}} where author_date < TIMESTAMPTZ '2015-01-01 00:00:00 Z' {{end}}"

You will need to start Rill with rill start --env dev.
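The limit alternative mentioned above can be sketched with the same template condition. This is an illustration only: the row count of 10000 is an arbitrary placeholder, not a recommended value, and the snippet assumes the same sql key and dev condition used in the date-filter example.

```yaml
# Sketch: cap the rows read in dev instead of filtering by date.
# 10000 is a placeholder; pick whatever sample size suits your machine.
sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')
{{if dev}} limit 10000 {{end}}"
```

A limit is the quicker option when you only need a small sample. The date filter is the better choice when you want a consistent, reproducible slice of the history.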

