2. Import the Source
Let's start at the beginning of every data pipeline: the source.
What is a Source?
In Rill, the source is your data. Whether it comes from a data warehouse, cloud storage, or an RDBMS, Rill can read and import it.
By default, the underlying OLAP engine is DuckDB (see Connect OLAP engines).
Please see our docs for the supported list of connectors.
Adding a source is simple!
Select the +Add dropdown and select Data. This opens a UI listing the supported connectors.
For our tutorial, let's add two GCS sources from our public storage bucket. These datasets are derived from the commit history and modified files of our friends at ClickHouse's GitHub repository. You need to add each one separately. Please refer to the GIF above.
gs://rilldata-public/github-analytics/Clickhouse/*/*/modified_files_*.parquet
gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet
Once imported, you'll see a few changes in the UI:
- The source_name.yaml file, created in the file explorer
- The DuckDB database, created in the Connectors explorer
- Within the DuckDB database, the imported data as a table with a preview
- The right panel, giving a summary of the data
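For reference, the generated source file might look roughly like the sketch below. This is an assumption based on the sql property used later in this tutorial: the file name commits.yaml and the type and connector values are illustrative, and the exact keys can differ between Rill versions, so compare it against the file that Rill actually creates for you.

```yaml
# commits.yaml (hypothetical file name; Rill names the file after the source)
# Assumed layout: the type and connector values may differ in your Rill version.
type: source
connector: "duckdb"

# Read every commits_*.parquet object two folder levels below Clickhouse/
sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')"
```

The second source would follow the same pattern with the modified_files_*.parquet path.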
Now we're ready to create a model.
Don't see what you're looking for?
We are continually adding new sources and connectors in our releases. For a comprehensive list, refer to our connectors page. If there's a connector you'd like us to add, please don't hesitate to reach out!
If this is your first time, you may need to refresh the browser for DuckDB to appear in the UI.
By default, all environments running locally are considered dev environments. Because Rill Developer is designed for testing purposes, you can use environment variables to filter the input data. For example, you can filter the repository data on the author_date column or simply use limit ####.
sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')
  {{if dev}} where author_date < TIMESTAMPTZ '2015-01-01 00:00:00 Z' {{end}}"
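The limit alternative mentioned above can be sketched the same way. This is a minimal sketch, not a recommended setting: the row count of 10000 is an arbitrary placeholder for the #### value, chosen only for illustration.

```yaml
# Dev-only row cap: the limit clause is rendered only when the environment is dev.
# 10000 is a hypothetical value; pick whatever sample size suits your testing.
sql: "select * from read_parquet('gs://rilldata-public/github-analytics/Clickhouse/*/*/commits_*.parquet')
  {{if dev}} limit 10000 {{end}}"
```

A date filter keeps a contiguous slice of history, while a limit simply caps the number of rows, so the better choice depends on whether your dashboards need a meaningful time range during testing.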