Model YAML
Properties
type
[string] - Refers to the resource type and must be model (required)
refresh
[object] - Specifies the refresh schedule that Rill should follow to re-ingest and update the underlying model data
- cron - [string] - A cron expression that defines the execution schedule.
- time_zone - [string] - Time zone to interpret the schedule in (e.g., 'UTC', 'America/Los_Angeles').
- disable - [boolean] - If true, disables the resource without deleting it.
- ref_update - [boolean] - If true, allows the resource to run when a dependency updates.
- run_in_dev - [boolean] - If true, allows the schedule to run in development mode.
connector
[string] - Refers to the connector type or named connector for the source.
sql
[string] - Raw SQL query to run against source (required)
timeout
[string] - The maximum time to wait for model ingestion
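As a minimal sketch of the properties above, a model file might look like the following. The filename, the table name `raw_events`, the cron schedule, and the choice of the duckdb connector are illustrative assumptions, not values prescribed by this reference.

```yaml
# models/events_model.yaml (hypothetical filename)
type: model
connector: duckdb          # assumed connector; a named connector could be used instead
refresh:
  cron: "0 6 * * *"        # every day at 06:00
  time_zone: "UTC"
  run_in_dev: false        # do not run the schedule in development mode
sql: |
  SELECT *
  FROM raw_events
```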
incremental
[boolean] - Whether the model is processed incrementally (optional)
change_mode
[string] - Configure how changes to the model specifications are applied (optional). 'reset' will drop and recreate the model automatically, 'manual' will require a manual full or incremental refresh to apply changes, and 'patch' will switch to the new logic without re-processing historical data (only applies for incremental models).
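The following sketch shows how incremental and change_mode could be combined. The query is a placeholder chosen only for illustration; 'manual' is one of the three documented values ('reset', 'manual', 'patch').

```yaml
type: model
incremental: true
change_mode: manual        # spec changes apply only after a manual full or incremental refresh
refresh:
  cron: "0 * * * *"        # hourly; an assumed schedule
sql: |
  SELECT now() AS loaded_at
```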
state
[oneOf] - Explicitly defined state of your model. Cannot be used with partitions (optional)
- option 1 - [object] - Executes a raw SQL query against the project's data models.
  - sql - [string] - Raw SQL query to run against existing models in the project. (required)
  - connector - [string] - Specifies the connector to use when running SQL or glob queries.
- option 2 - [object] - Executes a SQL query that targets a defined metrics view.
  - metrics_sql - [string] - SQL query that targets a metrics view in the project. (required)
- option 3 - [object] - Calls a custom API defined in the project to compute data.
  - api - [string] - Name of a custom API defined in the project. (required)
  - args - [object] - Arguments to pass to the custom API.
- option 4 - [object] - Uses a file-matching pattern (glob) to query data from a connector.
  - glob - [anyOf] - Defines the file path or pattern to query from the specified connector. (required)
    - option 1 - [string] - A simple file path/glob pattern as a string.
    - option 2 - [object] - An object-based configuration for specifying a file path/glob pattern with advanced options.
  - connector - [string] - Specifies the connector to use with the glob input.
- option 5 - [object] - Uses the status of a resource as data.
  - resource_status - [object] - Based on resource status. (required)
    - where_error - [boolean] - Indicates whether the condition should trigger when the resource is in an error state.
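A hedged sketch of the first state option (a raw SQL query) is shown below. The model name `orders_model`, the table `orders`, and the column `updated_on` are hypothetical. The template expressions `{{ if incremental }}` and `{{ .state.max_updated_on }}` are assumptions about how the state value is consumed in the main query; this reference does not define them.

```yaml
type: model
incremental: true
state:
  # Runs against an existing model in the project (hypothetical name).
  sql: SELECT MAX(updated_on) AS max_updated_on FROM orders_model
sql: |
  SELECT *
  FROM orders
  {{ if incremental }}
  WHERE updated_on > '{{ .state.max_updated_on }}'
  {{ end }}
```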
partitions
[oneOf] - Defines how your data is partitioned. Cannot be used with state (optional)
- option 1 - [object] - Executes a raw SQL query against the project's data models.
  - sql - [string] - Raw SQL query to run against existing models in the project. (required)
  - connector - [string] - Specifies the connector to use when running SQL or glob queries.
- option 2 - [object] - Executes a SQL query that targets a defined metrics view.
  - metrics_sql - [string] - SQL query that targets a metrics view in the project. (required)
- option 3 - [object] - Calls a custom API defined in the project to compute data.
  - api - [string] - Name of a custom API defined in the project. (required)
  - args - [object] - Arguments to pass to the custom API.
- option 4 - [object] - Uses a file-matching pattern (glob) to query data from a connector.
  - glob - [anyOf] - Defines the file path or pattern to query from the specified connector. (required)
    - option 1 - [string] - A simple file path/glob pattern as a string.
    - option 2 - [object] - An object-based configuration for specifying a file path/glob pattern with advanced options.
  - connector - [string] - Specifies the connector to use with the glob input.
- option 5 - [object] - Uses the status of a resource as data.
  - resource_status - [object] - Based on resource status. (required)
    - where_error - [boolean] - Indicates whether the condition should trigger when the resource is in an error state.
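The glob option for partitions could be sketched as follows, using the simple string form of glob. The bucket path is hypothetical, and the `{{ .partition.uri }}` template expression is an assumption about how each matched file is referenced in the main query; it is not defined in this reference.

```yaml
type: model
incremental: true
partitions:
  glob: "s3://my-bucket/events/**/*.parquet"   # hypothetical path
  connector: s3
partitions_concurrency: 4                       # assumed value
sql: |
  SELECT *
  FROM read_parquet('{{ .partition.uri }}')
```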
materialize
[boolean] - Whether the model is materialized in the OLAP engine
partitions_watermark
[string] - Refers to a customizable timestamp that can be set to check if an object has been updated (optional).
partitions_concurrency
[integer] - Refers to the number of concurrent partitions that can be read at the same time (optional).
stage
[object] - Defines a staging connector for cases where the input source cannot write directly to the output and a staging table is required
- connector - [string] - Refers to the connector type for the staging table. (required)
output
[object] - Defines the properties of the model output
- table - [string] - Name of the output table. If not specified, the model name is used.
- materialize - [boolean] - Whether to materialize the model as a table or view.
- connector - [string] - Refers to the connector type for the output table. Can be clickhouse or duckdb, or a named connector of either type.
- incremental_strategy - [string] - Strategy to use for incremental updates. Can be 'append', 'merge' or 'partition_overwrite'.
- unique_key - [array of string] - List of columns that uniquely identify a row for the merge strategy.
- partition_by - [string] - Column or expression to partition the table by.
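As an illustration of the output properties, the sketch below configures a merge-based incremental model. The table and column names (`orders`, `orders_clean`, `order_id`) are hypothetical.

```yaml
type: model
incremental: true
sql: |
  SELECT order_id, status, updated_on
  FROM orders
output:
  connector: duckdb
  table: orders_clean            # defaults to the model name if omitted
  materialize: true
  incremental_strategy: merge    # rows are matched on unique_key
  unique_key:
    - order_id
```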
Additional properties for output when connector is clickhouse
- type - [string] - Type to materialize the model into. Can be 'TABLE', 'VIEW' or 'DICTIONARY'.
- columns - [string] - Column names and types. Can also include indexes. If unspecified, detected from the query.
- engine_full - [string] - Full engine definition in SQL format. Can include partition keys, order, TTL, etc.
- engine - [string] - Table engine to use. Default is MergeTree.
- order_by - [string] - ORDER BY clause.
- partition_by - [string] - PARTITION BY clause.
- primary_key - [string] - PRIMARY KEY clause.
- sample_by - [string] - SAMPLE BY clause.
- ttl - [string] - TTL settings for the table or columns.
- table_settings - [string] - Table-specific settings.
- query_settings - [string] - Settings used in insert/create table as select queries.
- distributed_settings - [string] - Settings for distributed table.
- distributed_sharding_key - [string] - Sharding key for distributed table.
- dictionary_source_user - [string] - User for accessing the source dictionary table (used if type is DICTIONARY).
- dictionary_source_password - [string] - Password for the dictionary source user.
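A ClickHouse output block might be sketched as below. The table and column names are hypothetical, and the sketch assumes that order_by and partition_by take the expression only, without the leading SQL keyword; this reference does not say which form is expected.

```yaml
type: model
connector: clickhouse
sql: |
  SELECT event_date, user_id, event_name
  FROM events
output:
  connector: clickhouse
  type: TABLE
  engine: MergeTree                    # the documented default
  order_by: (event_date, user_id)      # assumed form: expression without the keyword
  partition_by: toYYYYMM(event_date)   # assumed form: expression without the keyword
```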
Common Properties
name
[string] - Name is usually inferred from the filename, but can be specified manually.
refs
[array of string] - List of resource references
dev
[object] - Overrides any properties in development environment.
prod
[object] - Overrides any properties in production environment.
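The dev and prod overrides can be sketched as follows: a model that reads a full table by default but limits rows and skips scheduled refreshes in development. The table name and row limit are illustrative assumptions.

```yaml
type: model
connector: duckdb
refresh:
  cron: "0 6 * * *"
sql: |
  SELECT * FROM raw_events
dev:
  # Properties here replace the top-level ones in the development environment.
  refresh:
    disable: true
  sql: |
    SELECT * FROM raw_events LIMIT 10000
```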
Additional properties when connector is athena or named connector of athena
output_location
[string] - Output location for query results in S3.
workgroup
[string] - AWS Athena workgroup to use for queries.
region
[string] - AWS region to connect to Athena and the output location.
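For example, an Athena-backed model could be configured roughly like this. The bucket, database, table, and region are placeholders; `primary` is used because it is the name of Athena's default workgroup.

```yaml
type: model
connector: athena
sql: |
  SELECT * FROM my_database.my_table
output_location: s3://my-bucket/athena-results/   # hypothetical bucket
workgroup: primary
region: us-east-1
```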
Additional properties when connector is azure or named connector of azure
path
[string] - Path to the source
account
[string] - Account identifier
uri
[string] - Source URI
extract
[object] - Arbitrary key-value pairs for extraction settings
glob
[object] - Settings related to glob file matching.
- max_total_size - [integer] - Maximum total size (in bytes) matched by glob.
- max_objects_matched - [integer] - Maximum number of objects matched by glob.
- max_objects_listed - [integer] - Maximum number of objects listed in glob.
- page_size - [integer] - Page size for glob listing.
batch_size
[string] - Size of a batch (e.g., '100MB')
Additional properties when connector is bigquery or named connector of bigquery
project_id
[string] - ID of the BigQuery project.
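A BigQuery model might look like the sketch below; the project, dataset, and table names are hypothetical.

```yaml
type: model
connector: bigquery
project_id: my-gcp-project
sql: |
  SELECT *
  FROM `my-gcp-project.my_dataset.events`
```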
Additional properties when connector is duckdb or named connector of duckdb
path
[string] - Path to the data source.
format
[string] - Format of the data source (e.g., csv, json, parquet).
pre_exec
[string] - SQL queries to run before the main query. Available for DuckDB-based models.
post_exec
[string] - SQL query to run after the main query. Available for DuckDB-based models.
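One plausible use of pre_exec and post_exec is attaching an external database for the duration of the main query. The sketch below assumes a local DuckDB file named `other.db` containing a table `customers`; both names are hypothetical.

```yaml
type: model
connector: duckdb
pre_exec: ATTACH IF NOT EXISTS 'other.db' AS other_db (READ_ONLY)
sql: |
  SELECT * FROM other_db.customers
post_exec: DETACH DATABASE IF EXISTS other_db
```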
Additional properties when connector is gcs or named connector of gcs
path
[string] - Path to the source
uri
[string] - Source URI
extract
[object] - key-value pairs for extraction settings
glob
[object] - Settings related to glob file matching.
- max_total_size - [integer] - Maximum total size (in bytes) matched by glob.
- max_objects_matched - [integer] - Maximum number of objects matched by glob.
- max_objects_listed - [integer] - Maximum number of objects listed in glob.
- page_size - [integer] - Page size for glob listing.
batch_size
[string] - Size of a batch (e.g., '100MB')
Additional properties when connector is local_file or named connector of local_file
path
[string] - Path to the data source.
format
[string] - Format of the data source (e.g., csv, json, parquet).
Additional properties when connector is redshift or named connector of redshift
output_location
[string] - S3 location where query results are stored.
workgroup
[string] - Redshift Serverless workgroup to use.
database
[string] - Name of the Redshift database.
cluster_identifier
[string] - Identifier of the Redshift cluster.
role_arn
[string] - ARN of the IAM role to assume for Redshift access.
region
[string] - AWS region of the Redshift deployment.
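A Redshift Serverless configuration could be sketched as follows. All identifiers (bucket, workgroup, database, role ARN, table) are placeholders. The sketch assumes that a provisioned cluster would set cluster_identifier in place of workgroup, which this reference does not state explicitly.

```yaml
type: model
connector: redshift
sql: |
  SELECT * FROM public.orders
output_location: s3://my-bucket/redshift-results/        # hypothetical bucket
workgroup: my-workgroup                                   # Redshift Serverless workgroup
database: dev
role_arn: arn:aws:iam::123456789012:role/my-redshift-role
region: us-east-1
```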
Additional properties when connector is s3 or named connector of s3
region
[string] - AWS region
endpoint
[string] - AWS Endpoint
path
[string] - Path to the source
uri
[string] - Source URI
extract
[object] - key-value pairs for extraction settings
glob
[object] - Settings related to glob file matching.
- max_total_size - [integer] - Maximum total size (in bytes) matched by glob.
- max_objects_matched - [integer] - Maximum number of objects matched by glob.
- max_objects_listed - [integer] - Maximum number of objects listed in glob.
- page_size - [integer] - Page size for glob listing.
batch_size
[string] - Size of a batch (e.g., '100MB')
Additional properties when connector is salesforce or named connector of salesforce
soql
[string] - SOQL query to execute against the Salesforce instance.
sobject
[string] - Salesforce object (e.g., Account, Contact) targeted by the query.
queryAll
[boolean] - Whether to include deleted and archived records in the query (uses queryAll API).
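As a rough sketch, a Salesforce model could combine these properties as shown below. The object and fields are examples only, and the sketch assumes the SOQL query stands in for the general sql property; this reference does not state how the two interact.

```yaml
type: model
connector: salesforce
soql: "SELECT Id, Name, CreatedDate FROM Opportunity"
sobject: Opportunity
queryAll: false            # set to true to include deleted and archived records
```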