Partitions and Increments
To help with data ingestion into Rill, we will introduce the concepts of partitions and incremental models. Before diving into our ClickHouse project, let's understand what each of these does and what it is used for.
While we will go over the main points needed to get started, more customization is possible, so we recommend reviewing the reference guide and docs while following the tutorial.
Incremental Model
An incremental model is defined using the following key-value pair:
incremental: true
Once this is enabled, Rill treats the model YAML as an incremental model. In some of the examples, we will use both time-based and glob-based increments.
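As a minimal sketch of where this key sits in a model file, consider the fragment below. The file name and the placeholder query are assumptions made for illustration only; the reference guide is the authority on the full set of supported keys.

```yaml
# models/incremental_example.yaml (hypothetical file name)
type: model

# The key described above: marks this model as incremental.
incremental: true

# Placeholder query, assumed for illustration. With incremental enabled,
# an incremental refresh is expected to add new rows rather than rebuild
# the whole model.
sql: SELECT now() AS inserted_on
```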
Partitioned Model
Partitions in models are enabled by defining the partitions parameter as seen below:
partitions:
sql/glob: some partition definition
Depending on your data, this can be defined as a sql: statement or a glob: pattern. Once configured, Rill will try to split your existing data into smaller partitions. When incremental is also enabled, this allows you to refresh specific partitions only instead of re-ingesting the whole dataset.
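To make the two forms more concrete, here are two hedged sketches. The bucket path, the column name num, and the template references ({{ .partition.num }} and {{ .partition.uri }}) are assumptions based on our understanding of Rill's model templating, so check them against the reference guide before relying on them.

```yaml
# Sketch 1: SQL-defined partitions (hypothetical model file).
# Each row returned by the partitions query is assumed to become one partition.
type: model
incremental: true
partitions:
  sql: SELECT range AS num FROM range(0, 10)
# The model query is assumed to run once per partition, with the
# partition's columns available through the template.
sql: SELECT {{ .partition.num }} AS num, now() AS inserted_on
```

```yaml
# Sketch 2: glob-defined partitions (hypothetical bucket and file layout).
# Each file matched by the glob is assumed to become one partition.
type: model
incremental: true
partitions:
  glob: gs://example-bucket/data/*/events_*.parquet
sql: SELECT * FROM read_parquet('{{ .partition.uri }}')
```

In both sketches, the intent is that a single partition can be refreshed later without touching the rest of the data.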
You can list all of a model's available partitions by running the following command:
rill project partitions <model_name>
Let's look at a few simple examples before diving into our ClickHouse project.