Apache Druid is a high-performance, real-time analytics database that provides the analytics power behind the Rill service. On its own, Druid exposes more than 100 configuration knobs for managing the cluster and the jobs that run on it. Configuring these knobs correctly is essential to achieving good performance while keeping the solution cost-effective. As the admin of a standalone Apache Druid cluster, you would have to manage performance and scaling yourself: scaling the cluster up to run large ingestion or query jobs, tuning it for each datasource, and then scaling back down (to manage cost) when those jobs are complete.
When you work with Rill, the managed service adjusts these knobs dynamically to keep the solution performant and cost-effective. Our goal is to keep you focused on your data and relieve you of the burden of cluster administration.
However, to help you build some familiarity with Druid, we provide this tutorial to show the knobs available during data ingestion. The tutorial uses the open source Apache Druid project, so you'll notice that it looks slightly different from the Rill Cloud Console.
Creating your dataset provides opportunities for optimization based on your business needs, so we recommend that you become familiar with the ingestion process.
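To give a rough sense of what ingestion knobs look like, here is a minimal sketch of a Druid native batch ingestion spec built as a Python dictionary. The field names follow Druid's native batch (`index_parallel`) spec, but every value is an assumption made for illustration: the datasource name `example_events`, the input directory, the column names, and the tuning numbers do not come from this tutorial and are not recommendations.

```python
import json

# Illustrative only: the datasource, paths, columns, and numbers below are
# hypothetical placeholders, not values used by Rill or by this tutorial.
ingestion_spec = {
    "type": "index_parallel",
    "spec": {
        "dataSchema": {
            "dataSource": "example_events",
            "timestampSpec": {"column": "timestamp", "format": "iso"},
            "dimensionsSpec": {"dimensions": ["country", "device"]},
            "metricsSpec": [{"type": "count", "name": "count"}],
            # Knobs that shape how data is stored and pre-aggregated.
            "granularitySpec": {
                "segmentGranularity": "day",
                "queryGranularity": "hour",
                "rollup": True,
            },
        },
        "ioConfig": {
            "type": "index_parallel",
            "inputSource": {
                "type": "local",
                "baseDir": "/tmp/example",
                "filter": "*.json",
            },
            "inputFormat": {"type": "json"},
        },
        # Knobs that trade memory and parallelism against ingestion speed.
        "tuningConfig": {
            "type": "index_parallel",
            "maxRowsInMemory": 1000000,
            "maxNumConcurrentSubTasks": 2,
            "partitionsSpec": {"type": "dynamic", "maxRowsPerSegment": 5000000},
        },
    },
}

# Print the spec as JSON, the form in which Druid accepts ingestion tasks.
print(json.dumps(ingestion_spec, indent=2))
```

Settings such as `segmentGranularity`, `rollup`, and `maxRowsPerSegment` are examples of the choices that affect storage size and query speed, which is why the right values depend on your data and business needs.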