Apache Druid
Deploy Apache Druid on Kubernetes — a high-performance real-time analytics database for fast slice-and-dice analysis on large datasets, commonly used as the backend for BI dashboards and real-time data exploration.
Druid uses deep storage to share segment data between nodes. The default deepStorage.type: local stores segments
on each historical node’s local disk, making them inaccessible to other nodes. In any multi-node or production
deployment, switch to S3-compatible deep storage so all historical nodes can read segments. Use
deepStorage.s3.existingSecret to keep credentials out of the values file.
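As a minimal sketch, the Secret referenced by deepStorage.s3.existingSecret could look like the manifest below. The Secret name and the placeholder values are illustrative assumptions; the keys access-key and secret-key match the chart defaults for existingSecretAccessKeyKey and existingSecretSecretKeyKey.

```yaml
# Illustrative Secret holding S3 credentials for Druid deep storage.
# Create it in the same namespace as the Helm release.
apiVersion: v1
kind: Secret
metadata:
  name: druid-s3-credentials # must match deepStorage.s3.existingSecret
type: Opaque
stringData:
  access-key: REPLACE_WITH_ACCESS_KEY # placeholder, not a real credential
  secret-key: REPLACE_WITH_SECRET_KEY # placeholder, not a real credential
```

Using stringData lets Kubernetes handle the base64 encoding, so the values can be written in plain text in a manifest that is kept out of version control.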
Druid’s default JVM settings cap the heap at 256 MiB to 1 GiB per component (see the JVM Default column in the
Component Architecture table). The six components together require a minimum of 4 GiB RAM for a functional
deployment. Size the JVM heap per component (javaOpts) based on your actual query and ingestion workload before
going to production.
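The default -Xmx values in the Component Architecture table add up to 512 + 512 + 1024 + 256 + 1024 + 256 = 3584 MiB (3.5 GiB) of maximum heap. A JVM also uses memory outside the heap (metaspace, thread stacks, direct buffers), which is consistent with the 4 GiB floor. The sketch below shows one way to size a single component. It assumes that resources follows the standard Kubernetes requests/limits shape, and the numbers are placeholders rather than recommendations.

```yaml
# Illustrative sizing for one component; adjust to the measured workload.
broker:
  javaOpts: '-Xms1g -Xmx2g' # heap: 1 GiB initial, 2 GiB maximum
  resources:
    requests:
      cpu: '1'
      memory: 3Gi # kept above -Xmx to leave room for non-heap memory
    limits:
      memory: 3Gi
```

Keeping the container memory limit above -Xmx reduces the risk of the pod being OOM-killed when off-heap usage grows.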
Component Architecture
| Component | Port | Role | JVM Default |
|---|---|---|---|
| coordinator | 8081 | Segment assignment and load balancing. | -Xmx512m |
| overlord | 8090 | Task management (ingestion scheduling). | -Xmx512m |
| broker | 8082 | Query routing to historical + realtime. | -Xmx1g |
| router | 8888 | Web console and API gateway (ingress target). | -Xmx256m |
| historical | 8083 | Segment storage and query serving. | -Xmx1g |
| middlemanager | 8091 | Task worker (ingestion, compaction). | -Xmx256m |
Key Features
- 6 independent Deployments: each component scales and tunes independently
- Deep storage: local (dev) or S3/MinIO (production)
- Two persistent volumes: historical (segment cache) and middlemanager (task storage)
- ZooKeeper: bundled native StatefulSet or external connection string
- PostgreSQL metadata: bundled subchart or external PostgreSQL/MySQL
- Gateway API: optional native Kubernetes HTTPRoute for the router/web console
- Dual-stack Services: optional ipFamilyPolicy and ipFamilies for IPv4/IPv6 clusters
- External Secrets Operator: optional ExternalSecret resources for metadata and S3 credentials
- NetworkPolicy: optional ingress and egress controls for compatible CNIs
Installation
HTTPS repository:
helm repo add helmforge https://repo.helmforge.dev
helm repo update
helm install druid helmforge/druid -f values.yaml
OCI registry:
helm install druid oci://ghcr.io/helmforgedev/helm/druid -f values.yaml
Deployment Examples
# values.yaml: Druid development mode (local deep storage, not for production)
# All segments stored locally on historical node disk
# NOT suitable for multi-node or production use
historical:
  persistence:
    size: 50Gi
middleManager:
  persistence:
    size: 20Gi
postgresql:
  enabled: true
  auth:
    password: 'db-password'
zookeeper:
  enabled: true
  replicaCount: 1
ingress:
  enabled: true
  ingressClassName: traefik
  hosts:
    - host: druid.example.com
      paths:
        - path: /
          pathType: Prefix

# values.yaml: Druid production mode with S3 deep storage (required for HA)
deepStorage:
  type: s3
  s3:
    bucket: druid-deep-storage
    baseKey: druid/segments
    region: us-east-1
    endpointUrl: '' # set for MinIO: https://minio.example.com
    existingSecret: druid-s3-credentials
    existingSecretAccessKeyKey: access-key
    existingSecretSecretKeyKey: secret-key
coordinator:
  javaOpts: '-Xms512m -Xmx1g'
overlord:
  javaOpts: '-Xms512m -Xmx1g'
broker:
  javaOpts: '-Xms1g -Xmx2g'
historical:
  javaOpts: '-Xms1g -Xmx2g'
  persistence:
    size: 100Gi # local segment cache only; segments also stored in S3
middleManager:
  workerCapacity: 4
  persistence:
    size: 50Gi
postgresql:
  enabled: true
  auth:
    password: 'strong-db-password'
zookeeper:
  enabled: true
  replicaCount: 3
  persistence:
    size: 5Gi
ingress:
  enabled: true
  ingressClassName: traefik
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: druid.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: druid-tls
      hosts:
        - druid.example.com

# values.yaml: Druid with external PostgreSQL and ZooKeeper
postgresql:
  enabled: false
zookeeper:
  enabled: false
metadata:
  mode: external
  external:
    type: postgresql
    host: postgres.database.svc.cluster.local
    port: 5432
    name: druid
    username: druid
    existingSecret: druid-db-credentials
    existingSecretPasswordKey: password
zookeeperConfig:
  mode: external
  external:
    hosts: 'zk-0.zookeeper:2181,zk-1.zookeeper:2181,zk-2.zookeeper:2181'
deepStorage:
  type: s3
  s3:
    bucket: druid-segments
    existingSecret: druid-s3-credentials

# values.yaml: Druid horizontally scaled (multiple brokers and historical nodes)
deepStorage:
  type: s3
  s3:
    bucket: druid-segments
    existingSecret: druid-s3-credentials
# Scale query layer
broker:
  replicaCount: 3
  javaOpts: '-Xms2g -Xmx4g'
# Scale storage layer (shared S3 enables multiple historical nodes)
historical:
  replicaCount: 3
  javaOpts: '-Xms2g -Xmx4g'
  persistence:
    size: 200Gi
# Scale ingestion layer
middleManager:
  replicaCount: 2
  workerCapacity: 8
  persistence:
    size: 100Gi
postgresql:
  enabled: true
  auth:
    password: 'strong-db-password'
zookeeper:
  enabled: true
  replicaCount: 3

Configuration Reference
Image
| Parameter | Type | Default | Description |
|---|---|---|---|
| image.repository | string | docker.io/apache/druid | Druid image. |
| image.tag | string | "37.0.0" | Image tag. |
Apache Druid 37.0.0 removes Hadoop-based ingestion, moves S3 integrations to AWS SDK v2, and enables Broker segment metadata cache by default. Review the upstream release notes and upgrade notes before upgrading production deployments.
Components
All 6 components share this structure:
| Parameter | Type | Default | Description |
|---|---|---|---|
| <component>.enabled | boolean | true | Deploy this component. |
| <component>.replicaCount | integer | 1 | Pod replicas. |
| <component>.javaOpts | string | (see table above) | JVM flags. Tune -Xms/-Xmx for production. |
| <component>.extraProperties | string | "" | Extra runtime.properties lines. |
| <component>.resources | object | {} | CPU and memory requests/limits. |
Historical and middlemanager also have:
| Parameter | Type | Default | Description |
|---|---|---|---|
| historical.persistence.size | string | 10Gi | Segment cache PVC size. |
| middleManager.persistence.size | string | 10Gi | Task storage PVC size. |
| middleManager.workerCapacity | integer | 2 | Concurrent tasks per middlemanager pod. |
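For illustration, the per-component extraProperties parameter might be set as a multi-line string. This sketch assumes the chart appends the lines verbatim to that component's runtime.properties. The property names are standard Druid settings, but the values are placeholders rather than recommendations.

```yaml
# Illustrative per-component tuning for the historical process.
historical:
  extraProperties: |
    druid.processing.numThreads=2
    druid.processing.buffer.sizeBytes=268435456
```

Druid allocates processing buffers off-heap, so raising these values usually means revisiting the direct memory limit in javaOpts as well.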
Deep Storage
| Parameter | Type | Default | Description |
|---|---|---|---|
| deepStorage.type | string | local | local (dev only) or s3 (production). |
| deepStorage.s3.bucket | string | "" | S3 bucket name. |
| deepStorage.s3.baseKey | string | druid/segments | S3 key prefix for segments. |
| deepStorage.s3.region | string | us-east-1 | AWS region. |
| deepStorage.s3.endpointUrl | string | "" | Custom S3 endpoint (MinIO, Ceph, etc.). |
| deepStorage.s3.existingSecret | string | "" | Existing secret with S3 credentials. |
| deepStorage.s3.existingSecretAccessKeyKey | string | access-key | Key for S3 access key. |
| deepStorage.s3.existingSecretSecretKeyKey | string | secret-key | Key for S3 secret key. |
Metadata and ZooKeeper
| Parameter | Type | Default | Description |
|---|---|---|---|
| metadata.mode | string | subchart | Mode: subchart or external. |
| metadata.external.host | string | "" | External PostgreSQL hostname. |
| metadata.external.existingSecret | string | "" | Existing secret with database password. |
| postgresql.enabled | boolean | true | Deploy the bundled PostgreSQL subchart. |
| postgresql.auth.password | string | "" | Password. Auto-generated if empty. |
| zookeeperConfig.mode | string | subchart | Mode: subchart or external. |
| zookeeperConfig.external.hosts | string | "" | External ZooKeeper hosts (host1:port,host2:port). |
| zookeeper.enabled | boolean | true | Deploy the bundled ZooKeeper StatefulSet. |
| zookeeper.replicaCount | integer | 1 | ZooKeeper replicas (3 for production HA). |
| zookeeper.persistence.size | string | 2Gi | ZooKeeper data PVC size. |
Service and Ingress
| Parameter | Type | Default | Description |
|---|---|---|---|
| service.type | string | ClusterIP | Service type (routes to router/web console). |
| service.port | integer | 80 | Service port. |
| service.ipFamilyPolicy | string | omitted | Optional Service IP family policy. |
| service.ipFamilies | array | omitted | Optional ordered Service IP families. |
| ingress.enabled | boolean | false | Enable Ingress (routes to router on port 8888). |
| ingress.ingressClassName | string | traefik | Ingress class name. |
| druid.extraCommonProperties | string | "" | Extra runtime properties applied to all components. |
| druid.extraEnv | array | [] | Extra environment variables for all components. |
| podSecurityContext.fsGroup | integer | 1000 | Shared filesystem group for Druid volumes. |
| securityContext.runAsNonRoot | boolean | true | Run Druid containers as a non-root user. |
| networkPolicy.enabled | boolean | false | Render NetworkPolicy resources. |
| extraManifests | array | [] | Extra Kubernetes manifests. |
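As a hedged sketch, cluster-wide settings could be supplied through druid.extraCommonProperties and druid.extraEnv. It assumes that extraCommonProperties accepts a multi-line string of properties and that extraEnv takes standard Kubernetes name/value entries. The emitter properties are standard Druid settings; EXAMPLE_FEATURE_FLAG is a hypothetical variable used only for illustration.

```yaml
# Illustrative settings applied to every Druid component.
druid:
  extraCommonProperties: |
    druid.emitter=logging
    druid.emitter.logging.logLevel=info
  extraEnv:
    - name: EXAMPLE_FEATURE_FLAG # hypothetical variable name
      value: 'enabled'
```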
Dual-stack fields are omitted by default so Services inherit the cluster defaults. Set them only when your cluster has IPv6 or dual-stack networking enabled:
service:
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
    - IPv4
    - IPv6
Gateway API
The chart can render a native Kubernetes Gateway API HTTPRoute for the Druid router. It intentionally does not
create a Gateway; reference the shared Gateway provided by your cluster platform.
| Parameter | Type | Default | Description |
|---|---|---|---|
| gatewayAPI.enabled | boolean | false | Render an HTTPRoute for the router Service. |
| gatewayAPI.parentRefs | array | [] | Gateway parent references. |
| gatewayAPI.hostnames | array | [] | Route hostnames. |
| gatewayAPI.paths | array | / | HTTP path matches routed to the router Service. |
| gatewayAPI.annotations | object | {} | Extra annotations on the HTTPRoute. |
gatewayAPI:
  enabled: true
  parentRefs:
    - name: shared-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - druid.example.com
  paths:
    - type: PathPrefix
      value: /
External Secrets
When External Secrets Operator is installed, the chart can render
ExternalSecret resources for metadata and S3 credentials. Set the chart existingSecret values to the same Secret
names ESO will create so workloads consume externally managed credentials.
| Parameter | Type | Default | Description |
|---|---|---|---|
| externalSecrets.enabled | boolean | false | Enable ExternalSecret rendering. |
| externalSecrets.apiVersion | string | external-secrets.io/v1 | ExternalSecret API version. |
| externalSecrets.secretStoreRef.name | string | "" | SecretStore or ClusterSecretStore name. |
| externalSecrets.secretStoreRef.kind | string | SecretStore | Store kind. |
| externalSecrets.metadata.enabled | boolean | false | Render metadata database ExternalSecret. |
| externalSecrets.deepStorage.enabled | boolean | false | Render S3 deep storage ExternalSecret. |
metadata:
  mode: external
  external:
    existingSecret: druid-metadata
deepStorage:
  type: s3
  s3:
    bucket: druid-segments
    existingSecret: druid-s3
externalSecrets:
  enabled: true
  secretStoreRef:
    name: platform-secrets
    kind: ClusterSecretStore
  metadata:
    enabled: true
    data:
      - secretKey: password
        remoteRef:
          key: druid/metadata
          property: password
  deepStorage:
    enabled: true
    data:
      - secretKey: access-key
        remoteRef:
          key: druid/s3
          property: access-key
      - secretKey: secret-key
        remoteRef:
          key: druid/s3
          property: secret-key
Security Context
Druid workload containers run as UID/GID 1000 by default with privilege escalation disabled, all Linux capabilities
dropped, and the runtime default seccomp profile. The prepare-dirs init container runs as root only to create and
chown writable Druid directories.
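Expressed as values, the defaults described above might look roughly like the sketch below. Only podSecurityContext.fsGroup and securityContext.runAsNonRoot appear in the configuration reference. The remaining fields are standard Kubernetes securityContext fields, and their placement under these keys is an assumption about how the chart exposes them.

```yaml
# Sketch of the described defaults; field placement beyond fsGroup and
# runAsNonRoot is assumed, not confirmed by the chart reference.
podSecurityContext:
  fsGroup: 1000
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  seccompProfile:
    type: RuntimeDefault
```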
NetworkPolicy
NetworkPolicy is opt-in because enforcement depends on the cluster CNI. When enabled, ingress defaults to same-namespace traffic on Druid component ports. Egress rules are optional and can allow DNS, same-namespace dependencies, HTTP, HTTPS, and extra peers:
networkPolicy:
  enabled: true
  ingress:
    allowSameNamespace: true
  egress:
    enabled: true
    allowDNS: true
    allowSameNamespace: true
    allowHTTPS: true