Core Concepts

Rill and Druid: Better Together

Overview

Rill lets you use the full power of Druid through a serverless cloud service that is simple, secure, and elastic. Rill is designed to fit into your existing analytics ecosystem: you can read from a wide variety of streaming or batch data sources, such as Kafka and BigQuery, and you can perform analytics using industry-standard tools such as Tableau and Looker.

This section introduces the core concepts for managing your datasources and querying your data, along with the terminology used for application integration.

Rill Cloud Console

Rill Cloud Console (RCC) is Rill's console for your Druid cloud database service. From the console, you can create and access your team workspaces. Within a workspace, you can see all of the datasets available in that workspace.

A quick overview of the main components of RCC will help you get started.

Organization

When your company subscribes to Rill, it is assigned an organization name. Your organization will have one or more administrators.

Workspaces

A workspace is a virtual space for a team of users. User permissions are granted at the workspace level. When a user joins a workspace, they have access to all of the datasets in that workspace.

Workspaces are the perimeter for security. For example, if a workspace is granted permission to read from a BigQuery project, all users in the workspace with Editor privilege will be able to import data from that BigQuery project.

Datasets

A dataset includes any data that is imported from an external datasource such as Google BigQuery, AWS, or a Kafka stream. Data is imported into Druid through a process called ingestion. Ingestion applies optimizations to your data, and to the Druid configuration for that data, to support fast queries.

For example, if your original data has millisecond granularity, ingestion may aggregate it to the hour or day. You may also remove high-cardinality columns, or apply approximate aggregations such as HyperLogLog (HLL) or sketches to provide approximate counts for those columns. Ingestion also specifies how the data is segmented across the physical Druid cluster to optimize access.

All of these actions are specified at ingestion time. If you are viewing a dataset in RCC, they have already been applied to your data.
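To make the idea of aggregating at ingestion time concrete, here is a minimal sketch in plain Python of rolling millisecond events up to hourly counts. It is only an illustration of the concept, not how Druid or Rill implement ingestion; the function names, the `country` dimension, and the sample events are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

MS_PER_HOUR = 3_600_000


def truncate_to_hour(ts_ms: int) -> datetime:
    """Floor a millisecond epoch timestamp to the start of its UTC hour."""
    hour_start_s = (ts_ms // MS_PER_HOUR) * 3600
    return datetime.fromtimestamp(hour_start_s, tz=timezone.utc)


def rollup_hourly(events):
    """Aggregate (ts_ms, country) events into counts per (hour, country)."""
    counts = Counter()
    for ts_ms, country in events:
        counts[(truncate_to_hour(ts_ms), country)] += 1
    return dict(counts)


# Hypothetical raw events at millisecond granularity.
events = [
    (1_700_000_000_000, "US"),
    (1_700_000_000_250, "US"),
    (1_700_000_900_000, "DE"),
    (1_700_003_600_000, "US"),
]

rolled = rollup_hourly(events)
for (hour, country), count in sorted(rolled.items()):
    print(hour.isoformat(), country, count)
```

In this sketch, four raw events collapse into three stored rows. The finer the original granularity and the fewer dimensions you keep, the larger the reduction, which is why these choices affect query speed.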

User Types

Users may have "Admin", "Editor", or "Viewer" privilege.

As an Admin, you can create workspaces and invite users to those workspaces. Alternatively, you may whitelist everyone in a particular email domain so that they can log in to Rill without an invitation. Admins may also modify workspace logos and set up security features such as API keys and Service Accounts.

As an Editor, you'll additionally be able to create new datasets, loading data from common warehouses such as BigQuery or Amazon S3, or from streaming solutions such as Kafka.

As a Viewer, you'll have fast query access via the Druid SQL Console, the command line, programmatic APIs, or visualization tools like Tableau.

Querying your Data

Once your dataset has been created, you can query it through a variety of interfaces.

From within RCC, you can click Druid Console to query the data in the interactive Druid SQL Console. You can find extensive details on query concepts and best practices in the Apache Druid docs.

Beyond native console querying, you can connect other BI tools and applications to your dataset to create fast and interactive dashboards and reports. More details can be found in our application integration docs.

Rill also provides access to Rill Explore, an easy-to-use interface designed specifically for operational analytics and ad hoc data exploration. For more details on Explore, visit our Explore docs section.

You can also query your dataset through the API, which lets you connect your own internal tools directly to your Druid database. More details can be found in the Apache Druid API documentation.
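As a rough sketch of what programmatic access can look like, the Python below builds a request for Druid's SQL endpoint, which accepts a POST to `/druid/v2/sql` with a JSON body containing a `query` field. The base URL, the dataset name `my_dataset`, the API key, and the use of a Bearer token are all assumptions made for illustration; check your own connection details and authentication method before relying on any of them.

```python
import json
import urllib.request


def build_sql_request(base_url: str, sql: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Druid SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: token-based auth; your deployment may differ.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical query: hourly event counts from a dataset named "my_dataset".
SQL = (
    "SELECT TIME_FLOOR(__time, 'PT1H') AS hour_start, COUNT(*) AS events "
    'FROM "my_dataset" GROUP BY 1 ORDER BY 1'
)

# Placeholder host and key, not real values.
req = build_sql_request("https://druid.example.com", SQL, "YOUR_API_KEY")
print(req.get_method(), req.full_url)

# To actually run the query, send the request and decode the JSON rows:
# with urllib.request.urlopen(req) as resp:
#     rows = json.load(resp)
```

Building the request separately from sending it keeps the sketch runnable without a live cluster and makes the payload easy to inspect.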