Connect Sources
Rill supports a multitude of connectors to ingest data from various sources: local files, S3 or GCS buckets, download using HTTP(S), databases, data warehouses, and the list goes on. Rill can ingest .csv
, .tsv
, .json
,and .parquet
files, which may be compressed (.gz
). This can be done either through the UI directly, when working with Rill Developer, or by pushing the logic into the source YAML definition directly (see Using Code sections below).
To provide a non-exhaustive list, Rill supports the following connectors:
- Google Cloud Storage
- S3
- Azure Blob Storage
- BigQuery
- Athena
- Redshift
- Kafka
- DuckDB and MotherDuck
- PostgreSQL
- MySQL
- SQLite
- Snowflake
- Salesforce
- Google Sheets
Rill is continually adding new sources and connectors in our releases. For a comprehensive list, you can refer to our Connectors page. Please don't hesitate to reach out either if there's a connector you'd like us to add!
Rill works best for slicing and dicing data meaning keeping data closer to raw to retain that granularity for flexible analysis. When loading data - be careful with adding pre-aggregated metrics like averages as that could lead to unintended results like a sum of an average. Instead load the two raw metrics and calculate the derived metric in your model or dashboard.
Adding a local file
Using the UI
To import a file using the UI, click "+" by Sources in the left hand navigation pane, select "Local File", and navigate to the specific file. Alternately, try dragging and dropping the file directly onto the Rill interface.
Using code
When you add a source using the UI, a code definition will automatically be created as a .yaml
file in your Rill project in the sources
directory. However, you can also create sources more directly by creating the artifact.
In your Rill project directory, create a source_name.yaml
file in the sources
directory with the following content:
type: source
connector: local_file
path: /path/to/local/data.csv
Rill will ingest the data next time you run rill start
.
Note that if you provide a relative path, the path should be relative to your Rill project root (where your rill.yaml
file is located), not relative to the sources
directory.
To import data from multiple files, you can use a glob pattern to specify the files you want to include. To learn more about the syntax and details of glob patterns, please refer to the documentation on glob patterns.
For more details about available configurations and properties, check our Source YAML reference page.
Adding a remote source
Using the UI
To add a remote source using the UI, click "+" by Sources in the left hand navigation pane and select the location where your remote files are stored ("Google Cloud Storage", "Amazon S3", or "http(s)"). Enter your file's URI and click "Add Source".
After import, you can reimport your data whenever you want by clicking the "refresh source" button in the Rill UI.
Using code
When you add a source using the UI or CLI, a code definition will automatically be created as a .yaml
file in your Rill project in the sources
directory.
For example, to create a remote http(s) source, create a source_name.yaml
file in the sources
directory with the following contents:
type: source
connector: https
uri: https://data.example.org/path/to/file.parquet
For a full list of connector types available, please see our Connectors and Source YAML reference pages.
You can also push filters to your source definition using inline editing. Common use cases for inline editing:
- Filter data to only use part of source files
- Push transformations for key fields to source (particularly casting time fields, data types)
- Resolve ingestion issues by declaring types (examples: STRUCT with different values to VARCHAR, fields mixed with INT and VARCHAR values)
To import data from multiple files, you can use a glob pattern to specify the files you want to include. To learn more about the syntax and details of glob patterns, please refer to the documentation on glob patterns.
For more details about available configurations and properties, check our Source YAML reference page.
Authenticating remote sources
Rill requires an appropriate set of credentials to connect to remote data sources, whether those are buckets (e.g. S3 or GCS) or data warehouses (e.g. Snowflake). When running Rill locally, Rill Developer attempts to find existing credentials that have been configured on your machine. When deploying projects to Rill Cloud, you must explicitly provide service account credentials with correct access permissions.
Please see our Configuring Credentials and Deployment Credentials for more information about setting up and using credentials in Rill.
External OLAP tables
Rill also has the ability to set up a "live connection" with an OLAP engine to discover existing tables and execute OLAP queries directly on the engine without having to transfer data to another OLAP engine. By default, the embedded OLAP engine that comes with Rill is DuckDB.
For more details about configuring and/or changing the OLAP engine used by Rill, please see our OLAP Engines reference documentation.
Rill Developer vs Rill Cloud
There is a difference between Rill Developer and Rill Cloud and they work hand-in-hand to provide a shared experience. For distributed teams, Rill Developer is primarily meant for local development and modeling purposes while Rill Cloud is where the primary dashboard consumption occurs and helps to enable shared collaboration at scale. For Rill Developer, as the size or volume of source data continues to grow (or reaches a certain size), it is strongly recommended to work with a segment of the data for modeling purposes instead of the full dataset (i.e. think of it as a "dev partition"), which is meant to help the developer validate the model logic and verify that the correct results are being produced. Then, after the model and dashboard configurations have been finalized, the project can be deployed to Rill Cloud against the full range of data and dashboards can be explored by other end users.
We are one Slack, email, or chat message away. Please feel free to contact us - we'd love to help!