Partitions and Increments
To help with data ingestion into Rill, we will introduce the concepts of partitions and incremental models. Before diving into our ClickHouse project, let's understand what each of these does and what it is used for.
While we will go over the main points to get started, there are more customization options, so we recommend reviewing the reference guide and docs while following the tutorial.
Incremental Model
An incremental model is defined using the following key-value pair:
incremental: true
Once this is enabled, Rill will treat the model as an incremental model. In the following examples, we will use both time-based and glob-based increments.
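As a rough sketch of the time-based case, an incremental model might track a high-water mark and select only newer rows on each refresh. The source table `events`, the column `updated_on`, and the exact `state` and templating keys shown here are assumptions for illustration, so verify them against the reference guide before relying on them.

```yaml
# Hypothetical model file, e.g. models/events_incremental.yaml
type: model
incremental: true

# Assumed: a state query that records the latest timestamp already ingested.
state:
  sql: SELECT MAX(updated_on) AS max_updated_on FROM events

# Assumed templating: on incremental runs, only pull rows newer than the stored state.
sql: >
  SELECT * FROM events
  {{ if incremental }} WHERE updated_on > '{{ .state.max_updated_on }}' {{ end }}
```

On the first run the filter is skipped and the full table is loaded; later refreshes would only append rows past the stored timestamp.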
Partitioned Model
Partitions in models are enabled by defining the partitions parameter as seen below:
partitions:
  sql/glob: some partition definition
Depending on your data, this can be defined as a sql: statement or a glob: pattern. Once configured, Rill will try to split your existing data into smaller partitions. When incremental is also enabled, this allows you to refresh specific partitions instead of re-ingesting the whole dataset.
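To make the glob case more concrete, here is a minimal sketch of a partitioned, incremental model. The bucket path is hypothetical, and the `{{ .partition.uri }}` template variable and the use of `read_parquet` are assumptions about how each matched file is exposed to the query, so check the reference guide for the exact keys.

```yaml
# Hypothetical model file, e.g. models/partitioned_events.yaml
type: model
incremental: true

# Assumed: each file matching the glob becomes its own partition.
partitions:
  glob: gs://my-bucket/events/*/*.parquet   # hypothetical path

# Assumed templating: the query runs once per partition against that partition's file.
sql: SELECT * FROM read_parquet('{{ .partition.uri }}')
```

With a layout like this, refreshing a single partition would re-read only the one file behind it rather than the whole bucket.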
By running the following command, you can see all the available partitions:
rill project partitions <model_name>
Let's look at a few simple examples before diving into our ClickHouse project.