Skip to main content

Google Cloud Storage (GCS)

Overview

Google Cloud Storage (GCS) is a scalable, fully managed, and highly reliable object storage service offered by Google Cloud, designed to store and access data from anywhere in the world. It provides a secure and cost-effective way to store data, including in common data storage formats such as CSV and parquet. Rill supports natively connecting to GCS using the provided Google Cloud Storage URI of your bucket to retrieve and read files.

Connecting to GCS

Local credentials

When using Rill Developer on your local machine (i.e. rill start), Rill uses the credentials configured in your local environment using the Google Cloud CLI (gcloud). Follow these steps to configure it:

  1. Open a terminal window and run gcloud auth list to check if you already have the Google Cloud CLI installed and authenticated.

  2. If it did not print information about your user, follow the steps on Install the Google Cloud CLI. Make sure to run gcloud init after installation as described in the tutorial.

tip

Once you have gcloud installed, run this command to set your default credentials via the CLI:

gcloud auth application-default login

You have now configured Google Cloud access from your local environment. Rill will detect and use your credentials next time you try to ingest a source.

info

As an alternative, you can also run locally with assumed credentials such as a service account key. This is useful for when you have multiple profiles or may receive limited access to a bucket.

In that case, run the following command upon start - replacing the path_to_json_key_file with the service account file:

GOOGLE_APPLICATION_CREDENTIALS=<path_to_json_key_file> rill start

Did you know?

If this project has already been deployed to Rill Cloud and credentials have been set for this source, you can use rill env pull to pull these cloud credentials locally (into your local .env file). Please note that this may override any credentials that you have set locally for this source.

Cloud deployment

When deploying a project to Rill Cloud, Rill requires a JSON key file to be explicitly provided for a Google Cloud service account with appropriate read access / permissions to the buckets used in your project.

When you first deploy a project using rill deploy, you will be prompted to provide credentials for the remote sources in your project that require authentication.

If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:

rill env configure
info

Note that you must cd into the Git repository that your project was deployed from before running rill env configure.

Did you know?

If you've configured credentials locally already (in your <RILL_PROJECT_DIRECTORY>/.env file), you can use rill env push to push these credentials to your Rill Cloud project. This will allow other users to retrieve / reuse the same credentials automatically by running rill env pull.

Appendix

How to create a service account using the Google Cloud Console

Here is a step-by-step guide on how to create a Google Cloud service account with read-only access to GCS:

  1. Navigate to the Service Accounts page under "IAM & Admin" in the Google Cloud Console.

  2. Click the "Create Service Account" button at the top of the page.

  3. In the "Create Service Account" window, enter a name for the service account, then click "Create and continue".

  4. In the "Role" field, search for and select the "Storage Object Viewer" role. Click "Continue", then click "Done".

    • This grants the service account access to data in all buckets. To only grant access to data in a specific bucket, leave the "Role" field blank, click "Done", then follow the steps described in Add a principal to a bucket-level policy.
  5. On the "Service Accounts" page, locate the service account you just created and click on the three dots on the right-hand side. Select "Manage Keys" from the dropdown menu.

  6. On the "Keys" page, click the "Add key" button and select "Create new key".

  7. Choose the "JSON" key type and click "Create".

  8. Download and save the JSON key file to a secure location on your computer.

How to create a service account using the gcloud CLI

  1. Open a terminal window and follow the steps on Install the Google Cloud CLI if you haven't already done so.

  2. You will need your Google Cloud project ID to complete this tutorial. Run the following command to show it:

    gcloud config get project
  3. Replace [PROJECT_ID] with your project ID in the following command, and run it to create a new service account (optionally also replace rill-service-account with a name of your choice):

    gcloud iam service-accounts create rill-service-account --project [PROJECT_ID]
  4. Grant the service account access to data in Google Cloud Storage:

    • To grant access to data in all buckets, replace [PROJECT_ID] with your project ID in the following command, and run it:
      gcloud projects add-iam-policy-binding [PROJECT_ID] \
      --member="serviceAccount:rill-service-account@[PROJECT_ID].iam.gserviceaccount.com" \
      --role="roles/storage.objectViewer"
    • To only grant access to data in a specific bucket, replace [BUCKET_NAME] and [PROJECT_ID] with your details in the following command, and run it:
      gcloud storage buckets add-iam-policy-binding gs://[BUCKET_NAME] \
      --member="serviceAccount:rill-service-account@[PROJECT_ID].iam.gserviceaccount.com" \
      --role="roles/storage.objectViewer"
  5. Replace [PROJECT_ID] with your project ID in the following command, and run it to create a key file for the service account:

    gcloud iam service-accounts keys create rill-service-account.json \
    --iam-account rill-service-account@[PROJECT_ID].iam.gserviceaccount.com
  6. You have now created a JSON key file named rill-service-account.json in your current working directory.