BigQuery
Overview
Google BigQuery is a fully managed, serverless data warehouse for scalable, cost-effective analysis of large datasets using SQL. Its architecture supports near-real-time analysis of large data volumes, which makes it well suited to BI and ML applications. Rill connects natively to BigQuery and reads from it as a source using the BigQuery SDK.
Local credentials
When using Rill Developer on your local machine (i.e. rill start), Rill uses the credentials configured in your local environment using the Google Cloud CLI (gcloud). Follow these steps to configure it:
- Open a terminal window and run gcloud auth list to check whether you already have the Google Cloud CLI installed and authenticated.
- If it did not print information about your user, follow the steps on Install the Google Cloud CLI. Make sure to run gcloud init after installation as described in the tutorial.
You have now configured Google Cloud access from your local environment. Rill will detect and use your credentials next time you try to ingest a source.
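The check in the steps above can be scripted. The following is a minimal sketch, not part of Rill itself: the function name gcloud_status is hypothetical, and the --filter and --format flags are standard gcloud options used here only to narrow the output of gcloud auth list to the active account.

```shell
#!/usr/bin/env bash
# Sketch: report whether the Google Cloud CLI is installed and authenticated.
# gcloud_status is a hypothetical helper name, not a Rill or gcloud command.

# Prints "missing", "unauthenticated", or "ok".
gcloud_status() {
  if ! command -v gcloud >/dev/null 2>&1; then
    # The gcloud executable was not found on PATH.
    echo "missing"
  elif [ -z "$(gcloud auth list --filter=status:ACTIVE --format='value(account)' 2>/dev/null)" ]; then
    # No active account was printed, so gcloud is installed but not logged in.
    echo "unauthenticated"
  else
    echo "ok"
  fi
}

# Translate the status into the next step from the list above.
case "$(gcloud_status)" in
  missing)         echo "Install the Google Cloud CLI, then run: gcloud init" ;;
  unauthenticated) echo "Run: gcloud init" ;;
  ok)              echo "gcloud is installed and authenticated." ;;
esac
```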
If this project has already been deployed to Rill Cloud and credentials have been set for this source, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Note that this may override any credentials that you have set locally for this source.
Cloud deployment
When deploying a project to Rill Cloud, Rill requires you to explicitly provide a JSON key file for a Google Cloud service account that has access to the BigQuery resources used in your project.
When you first deploy a project using rill deploy, you will be prompted to provide credentials for the remote sources in your project that require authentication.
If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:
rill env configure
Note that you must cd into the Git repository that your project was deployed from before running rill env configure.
If you've configured credentials locally already (in your <RILL_PROJECT_DIRECTORY>/.env file), you can use rill env push to push these credentials to your Rill Cloud project. This allows other users to retrieve and reuse the same credentials automatically by running rill env pull.
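As a rough illustration of the push workflow, the sketch below checks for a .env file before calling rill env push. The function name push_local_credentials is hypothetical, and running the command from inside the project directory is an assumption made here, not something the steps above require.

```shell
#!/usr/bin/env bash
# Sketch: push locally configured credentials to Rill Cloud, but only
# when the project directory actually contains a .env file.
# push_local_credentials is a hypothetical helper, not a Rill command.

push_local_credentials() {
  project_dir="$1"
  if [ ! -f "$project_dir/.env" ]; then
    echo "No .env file in $project_dir; run 'rill env configure' instead." >&2
    return 1
  fi
  # Assumption: the command is run from inside the project directory.
  (cd "$project_dir" && rill env push)
}

# Example usage (the path is a placeholder):
#   push_local_credentials "$HOME/my-rill-project"
```

Teammates can then run rill env pull in their own checkout to retrieve the same credentials.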
Appendix
How to create a service account using the Google Cloud Console
Here is a step-by-step guide on how to create a Google Cloud service account with access to BigQuery:
- Navigate to the Service Accounts page under "IAM & Admin" in the Google Cloud Console.
- Click the "Create Service Account" button at the top of the page.
- In the "Create Service Account" window, enter a name for the service account, then click "Create and continue".
- In the "Role" field, search for and select the following BigQuery roles:
  - roles/bigquery.dataViewer (Lowest-level resources: Table, View)
    - Provides the ability to read data and metadata from a project's datasets, a dataset's tables, or an individual table or view.
  - roles/bigquery.readSessionUser (Lowest-level resources: Project)
    - Provides the ability to create and use read sessions, which read data from BigQuery managed tables at high speed through the Storage API. The role does not provide any other permissions related to BigQuery datasets, tables, or other resources.
  - roles/bigquery.jobUser (Lowest-level resources: Project)
    - Provides permissions to run BigQuery jobs (including queries) within the project, subject to the limits set by the roles above.
Click "Continue", then click "Done".
Note: BigQuery separates storage from compute, so the lowest-level resource at which compute-specific roles can be granted is a project, while data-specific roles can be granted down to an individual table or view.
- On the "Service Accounts" page, locate the service account you just created and click the three dots on the right-hand side. Select "Manage Keys" from the dropdown menu.
- On the "Keys" page, click the "Add key" button and select "Create new key".
- Choose the "JSON" key type and click "Create".
- Download and save the JSON key file to a secure location on your computer.
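The console steps above can likely also be carried out with the Google Cloud CLI. The following is a sketch under stated assumptions rather than an official Rill procedure: the project ID, service account name, and key file path are placeholders, the helpers sa_email, run, and create_rill_service_account are hypothetical, and all three roles are granted at the project level for simplicity (which is broader than a table-level or view-level grant of roles/bigquery.dataViewer). Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: CLI equivalent of the console steps above.
# PROJECT_ID, SA_NAME and KEY_FILE are placeholder values; replace them.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
SA_NAME="${SA_NAME:-rill-bigquery-reader}"
KEY_FILE="${KEY_FILE:-./rill-bigquery-key.json}"

# Service account emails follow NAME@PROJECT_ID.iam.gserviceaccount.com.
sa_email() { echo "$1@$2.iam.gserviceaccount.com"; }

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

create_rill_service_account() {
  email="$(sa_email "$SA_NAME" "$PROJECT_ID")"

  # Create the service account.
  run gcloud iam service-accounts create "$SA_NAME" \
    --project="$PROJECT_ID" --display-name="Rill BigQuery reader"

  # Grant the three BigQuery roles listed above, here at the project level.
  for role in roles/bigquery.dataViewer \
              roles/bigquery.readSessionUser \
              roles/bigquery.jobUser; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:$email" --role="$role"
  done

  # Create and download a JSON key for the service account.
  run gcloud iam service-accounts keys create "$KEY_FILE" \
    --iam-account="$email"
}

create_rill_service_account
```

Run it once as is to review the printed commands, then rerun with DRY_RUN=0 to execute them. Treat the resulting key file like a password and keep it out of version control.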