Model YAML

Properties

`type`

[string] - Refers to the resource type and must be model (required)

`refresh`

[object] - Specifies the refresh schedule that Rill should follow to re-ingest and update the underlying model data

cron - [string] - A cron expression that defines the execution schedule
time_zone - [string] - Time zone to interpret the schedule in (e.g., 'UTC', 'America/Los_Angeles').
disable - [boolean] - If true, disables the resource without deleting it.
ref_update - [boolean] - If true, allows the resource to run when a dependency updates.
run_in_dev - [boolean] - If true, allows the schedule to run in development mode.

`connector`

[string] - Refers to the connector type or named connector for the source.

`sql`

[string] - Raw SQL query to run against source (required)

`timeout`

[string] - The maximum time to wait for model ingestion

`incremental`

[boolean] - whether incremental modeling is required (optional)

`change_mode`

[string] - Configure how changes to the model specifications are applied (optional). 'reset' will drop and recreate the model automatically, 'manual' will require a manual full or incremental refresh to apply changes, and 'patch' will switch to the new logic without re-processing historical data (only applies for incremental models).

`state`

[oneOf] - Refers to the explicitly defined state of your model, cannot be used with partitions (optional)

option 1 - [object] - Executes a raw SQL query against the project's data models.
- sql - [string] - Raw SQL query to run against existing models in the project. (required)
- connector - [string] - specifies the connector to use when running SQL or glob queries.
option 2 - [object] - Executes a SQL query that targets a defined metrics view.
- metrics_sql - [string] - SQL query that targets a metrics view in the project (required)
option 3 - [object] - Calls a custom API defined in the project to compute data.
- api - [string] - Name of a custom API defined in the project. (required)
- args - [object] - Arguments to pass to the custom API.
option 4 - [object] - Uses a file-matching pattern (glob) to query data from a connector.
- glob - [anyOf] - Defines the file path or pattern to query from the specified connector. (required)
  - option 1 - [string] - A simple file path/glob pattern as a string.
  - option 2 - [object] - An object-based configuration for specifying a file path/glob pattern with advanced options.
- connector - [string] - Specifies the connector to use with the glob input.
option 5 - [object] - Uses the status of a resource as data.
- resource_status - [object] - Based on resource status (required)
  - where_error - [boolean] - Indicates whether the condition should trigger when the resource is in an error state.

`partitions`

[oneOf] - Refers to the how your data is partitioned, cannot be used with state. (optional)

option 1 - [object] - Executes a raw SQL query against the project's data models.
- sql - [string] - Raw SQL query to run against existing models in the project. (required)
- connector - [string] - specifies the connector to use when running SQL or glob queries.
option 2 - [object] - Executes a SQL query that targets a defined metrics view.
- metrics_sql - [string] - SQL query that targets a metrics view in the project (required)
option 3 - [object] - Calls a custom API defined in the project to compute data.
- api - [string] - Name of a custom API defined in the project. (required)
- args - [object] - Arguments to pass to the custom API.
option 4 - [object] - Uses a file-matching pattern (glob) to query data from a connector.
- glob - [anyOf] - Defines the file path or pattern to query from the specified connector. (required)
  - option 1 - [string] - A simple file path/glob pattern as a string.
  - option 2 - [object] - An object-based configuration for specifying a file path/glob pattern with advanced options.
- connector - [string] - Specifies the connector to use with the glob input.
option 5 - [object] - Uses the status of a resource as data.
- resource_status - [object] - Based on resource status (required)
  - where_error - [boolean] - Indicates whether the condition should trigger when the resource is in an error state.

`materialize`

[boolean] - models will be materialized in olap

`partitions_watermark`

[string] - Refers to a customizable timestamp that can be set to check if an object has been updated (optional).

`partitions_concurrency`

[integer] - Refers to the number of concurrent partitions that can be read at the same time (optional).

`stage`

[object] - in the case of staging models, where an input source does not support direct write to the output and a staging table is required

connector - [string] - Refers to the connector type for the staging table (required)

`output`

[object] - to define the properties of output

table - [string] - Name of the output table. If not specified, the model name is used.
materialize - [boolean] - Whether to materialize the model as a table or view
connector - [string] - Refers to the connector type for the output table. Can be clickhouse or duckdb and their named connector
incremental_strategy - [string] - Strategy to use for incremental updates. Can be 'append', 'merge' or 'partition_overwrite'
unique_key - [array of string] - List of columns that uniquely identify a row for merge strategy
partition_by - [string] - Column or expression to partition the table by

Additional properties for output when connector is clickhouse

type - [string] - Type to materialize the model into. Can be 'TABLE', 'VIEW' or 'DICTIONARY'
columns - [string] - Column names and types. Can also include indexes. If unspecified, detected from the query.
engine_full - [string] - Full engine definition in SQL format. Can include partition keys, order, TTL, etc.
engine - [string] - Table engine to use. Default is MergeTree
order_by - [string] - ORDER BY clause.
partition_by - [string] - Partition BY clause.
primary_key - [string] - PRIMARY KEY clause.
sample_by - [string] - SAMPLE BY clause.
ttl - [string] - TTL settings for the table or columns.
table_settings - [string] - Table-specific settings.
query_settings - [string] - Settings used in insert/create table as select queries.
distributed_settings - [string] - Settings for distributed table.
distributed_sharding_key - [string] - Sharding key for distributed table.
dictionary_source_user - [string] - User for accessing the source dictionary table (used if type is DICTIONARY).
dictionary_source_password - [string] - Password for the dictionary source user.

Common Properties

`name`

[string] - Name is usually inferred from the filename, but can be specified manually.

`refs`

[array of string] - List of resource references

`dev`

[object] - Overrides any properties in development environment.

`prod`

[object] - Overrides any properties in production environment.

Additional properties when `connector` is `athena` or named connector for athena

`output_location`

[string] - Output location for query results in S3.

`workgroup`

[string] - AWS Athena workgroup to use for queries.

`region`

[string] - AWS region to connect to Athena and the output location.

Additional properties when `connector` is `azure` or named connector of azure

`path`

[string] - Path to the source

`account`

[string] - Account identifier

`uri`

[string] - Source URI

`extract`

[object] - Arbitrary key-value pairs for extraction settings

`glob`

[object] - Settings related to glob file matching.

max_total_size - [integer] - Maximum total size (in bytes) matched by glob
max_objects_matched - [integer] - Maximum number of objects matched by glob
max_objects_listed - [integer] - Maximum number of objects listed in glob
page_size - [integer] - Page size for glob listing

`batch_size`

[string] - Size of a batch (e.g., '100MB')

Additional properties when `connector` is `bigquery` or named connector of bigquery

`project_id`

[string] - ID of the BigQuery project.

Additional properties when `connector` is `duckdb` or named connector of duckdb

`path`

[string] - Path to the data source.

`format`

[string] - Format of the data source (e.g., csv, json, parquet).

`pre_exec`

[string] - refers to SQL queries to run before the main query, available for DuckDB-based models. (optional). Ensure pre_exec queries are idempotent. Use IF NOT EXISTS statements when applicable.

`post_exec`

[string] - refers to a SQL query that is run after the main query, available for DuckDB-based models. (optional). Ensure post_exec queries are idempotent. Use IF EXISTS statements when applicable.

pre_exec: ATTACH IF NOT EXISTS 'dbname=postgres host=localhost port=5432 user=postgres password=postgres' AS postgres_db (TYPE POSTGRES);
sql: SELECT * FROM postgres_query('postgres_db', 'SELECT * FROM USERS')
post_exec: DETACH DATABASE IF EXISTS postgres_db

Additional properties when `connector` is `gcs` or named connector of gcs

`path`

[string] - Path to the source

`uri`

[string] - Source URI

`extract`

[object] - key-value pairs for extraction settings

`glob`

[object] - Settings related to glob file matching.

max_total_size - [integer] - Maximum total size (in bytes) matched by glob
max_objects_matched - [integer] - Maximum number of objects matched by glob
max_objects_listed - [integer] - Maximum number of objects listed in glob
page_size - [integer] - Page size for glob listing

`batch_size`

[string] - Size of a batch (e.g., '100MB')

Additional properties when `connector` is `local_file` or named connector of local_file

`path`

[string] - Path to the data source.

`format`

[string] - Format of the data source (e.g., csv, json, parquet).

Additional properties when `connector` is `redshift` or named connector of redshift

`output_location`

[string] - S3 location where query results are stored.

`workgroup`

[string] - Redshift Serverless workgroup to use.

`database`

[string] - Name of the Redshift database.

`cluster_identifier`

[string] - Identifier of the Redshift cluster.

`role_arn`

[string] - ARN of the IAM role to assume for Redshift access.

`region`

[string] - AWS region of the Redshift deployment.

Additional properties when `connector` is `s3` or named connector of s3

`region`

[string] - AWS region

`endpoint`

[string] - AWS Endpoint

`path`

[string] - Path to the source

`uri`

[string] - Source URI

`extract`

[object] - key-value pairs for extraction settings

`glob`

[object] - Settings related to glob file matching.

max_total_size - [integer] - Maximum total size (in bytes) matched by glob
max_objects_matched - [integer] - Maximum number of objects matched by glob
max_objects_listed - [integer] - Maximum number of objects listed in glob
page_size - [integer] - Page size for glob listing

`batch_size`

[string] - Size of a batch (e.g., '100MB')

Additional properties when `connector` is `salesforce` or named connector of salesforce

`soql`

[string] - SOQL query to execute against the Salesforce instance.

`sobject`

[string] - Salesforce object (e.g., Account, Contact) targeted by the query.

`queryAll`

[boolean] - Whether to include deleted and archived records in the query (uses queryAll API).

Properties
Common Properties
Additional properties when connector is athena or named connector for athena
Additional properties when connector is azure or named connector of azure
Additional properties when connector is bigquery or named connector of bigquery
- project_id
Additional properties when connector is duckdb or named connector of duckdb
Additional properties when connector is gcs or named connector of gcs
Additional properties when connector is local_file or named connector of local_file
- path
- format
Additional properties when connector is redshift or named connector of redshift
Additional properties when connector is s3 or named connector of s3
Additional properties when connector is salesforce or named connector of salesforce

Properties​

type​

refresh​

connector​

sql​

timeout​

incremental​

change_mode​

state​

partitions​

materialize​

partitions_watermark​

partitions_concurrency​

stage​

output​

Common Properties​

name​

refs​

dev​

prod​

Additional properties when connector is athena or named connector for athena​

output_location​

workgroup​

region​

Additional properties when connector is azure or named connector of azure​

path​

account​

uri​

extract​

glob​

batch_size​

Additional properties when connector is bigquery or named connector of bigquery​

project_id​

Additional properties when connector is duckdb or named connector of duckdb​

path​

format​

pre_exec​

post_exec​

Additional properties when connector is gcs or named connector of gcs​

path​

uri​

extract​

glob​

batch_size​

Additional properties when connector is local_file or named connector of local_file​

path​

format​

Additional properties when connector is redshift or named connector of redshift​

output_location​

workgroup​

database​

cluster_identifier​

role_arn​

region​

Additional properties when connector is s3 or named connector of s3​

region​

endpoint​

path​

uri​

extract​

glob​

batch_size​

Additional properties when connector is salesforce or named connector of salesforce​

soql​

sobject​

queryAll​

Properties

`type`

`refresh`

`connector`

`sql`

`timeout`

`incremental`

`change_mode`

`state`

`partitions`

`materialize`

`partitions_watermark`

`partitions_concurrency`

`stage`

`output`

Common Properties

`name`

`refs`

`dev`

`prod`

Additional properties when `connector` is `athena` or named connector for athena

`output_location`

`workgroup`

`region`

Additional properties when `connector` is `azure` or named connector of azure

`path`

`account`

`uri`

`extract`

`glob`

`batch_size`

Additional properties when `connector` is `bigquery` or named connector of bigquery

`project_id`

Additional properties when `connector` is `duckdb` or named connector of duckdb

`path`

`format`

`pre_exec`

`post_exec`

Additional properties when `connector` is `gcs` or named connector of gcs

`path`

`uri`

`extract`

`glob`

`batch_size`

Additional properties when `connector` is `local_file` or named connector of local_file

`path`

`format`

Additional properties when `connector` is `redshift` or named connector of redshift

`output_location`

`workgroup`

`database`

`cluster_identifier`

`role_arn`

`region`

Additional properties when `connector` is `s3` or named connector of s3

`region`

`endpoint`

`path`

`uri`

`extract`

`glob`

`batch_size`

Additional properties when `connector` is `salesforce` or named connector of salesforce

`soql`

`sobject`

`queryAll`