
CKAN

Deploy CKAN, the world’s leading open-source data management system for publishing, sharing, and discovering datasets, on Kubernetes. CKAN powers government open-data portals and data hubs worldwide.

ckan.siteUrl is required — incorrect value breaks all dataset links and API responses

CKAN uses ckan.siteUrl to generate absolute URLs for datasets, resources, and API responses. Setting it to the wrong value (such as the default http://localhost:5000) causes all links to be incorrect and OAuth callbacks to fail. Always set it to the full public URL before deployment.

CKAN requires two PostgreSQL databases: ckan (main) and datastore (DataStore API)

The DataStore extension (used by DataPusher for CSV/Excel ingestion) requires a separate datastore database with a dedicated read-only user (datastore_ro). When using the bundled PostgreSQL subchart, both databases are created automatically. For external PostgreSQL, create both databases and the read-only user manually before deploying.

Key Features

  • uWSGI application: CKAN web app on port 5000 behind a ClusterIP service
  • DataPusher: automatic CSV/Excel resource loading into the DataStore API
  • CKAN-specific Solr: ckan/ckan-solr StatefulSet with bundled search schema
  • Dual PostgreSQL databases: ckan (metadata) + datastore (DataStore API)
  • Three secrets: sysadmin password, Beaker session secret, JWT secret
  • pg_dump backup: daily PostgreSQL S3 backup CronJob
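
The three secrets can be supplied through one pre-created Kubernetes Secret rather than through chart values. The following is a minimal sketch, assuming the Secret name ckan-secrets used in the examples on this page and the default key names from the configuration reference; the namespace is hypothetical and the values are placeholders:

```yaml
# Sketch of the Secret referenced by ckan.existingSecret.
# Key names are the chart defaults documented on this page; values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: ckan-secrets
  namespace: ckan # hypothetical; use the namespace of the Helm release
type: Opaque
stringData:
  sysadmin-password: 'change-me-sysadmin-password'
  session-secret: 'change-me-long-random-string' # Beaker session secret
  jwt-secret: 'change-me-another-long-random-string' # JWT secret
```

Apply the manifest before running helm install, then set ckan.existingSecret to ckan-secrets. Any secure generator works for the two random strings, for example openssl rand -hex 32.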

Installation

HTTPS repository:

helm repo add helmforge https://repo.helmforge.dev
helm repo update
helm install ckan helmforge/ckan -f values.yaml

OCI registry:

helm install ckan oci://ghcr.io/helmforgedev/helm/ckan -f values.yaml

Deployment Examples

# values.yaml — CKAN with bundled PostgreSQL, Redis, Solr, and DataPusher
ckan:
  siteUrl: 'https://data.example.com' # required; wrong value breaks all links
  siteTitle: 'My Open Data Portal'
  sysadminName: admin
  sysadminEmail: admin@example.com
  existingSecret: ckan-secrets
  existingSecretPasswordKey: sysadmin-password
  existingSecretSessionKey: session-secret
  existingSecretJwtKey: jwt-secret

postgresql:
  enabled: true
  auth:
    database: ckan
    username: ckan
    password: 'strong-db-password'

redis:
  enabled: true
  auth:
    password: 'strong-redis-password'

solr:
  enabled: true
  persistence:
    size: 10Gi

datapusher:
  enabled: true # auto-imports CSV/Excel files into DataStore API

persistence:
  enabled: true
  size: 50Gi # uploaded datasets and resources

ingress:
  enabled: true
  ingressClassName: traefik
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: data.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: ckan-tls
      hosts:
        - data.example.com

# values.yaml — CKAN with external PostgreSQL (both ckan + datastore DBs) and Redis
# Both databases and both users must exist in PostgreSQL before deploying.
# The ckan user writes to both databases, so it owns them; datastore_ro only reads:
#   CREATE USER ckan WITH PASSWORD '...';
#   CREATE USER datastore_ro WITH PASSWORD '...';
#   CREATE DATABASE ckan OWNER ckan;
#   CREATE DATABASE datastore OWNER ckan;
#   GRANT CONNECT ON DATABASE datastore TO datastore_ro;

ckan:
  siteUrl: 'https://data.example.com'
  existingSecret: ckan-secrets

postgresql:
  enabled: false

redis:
  enabled: false

database:
  mode: external
  external:
    host: postgres.database.svc.cluster.local
    port: 5432
    ckanDatabase: ckan
    datastoreDatabase: datastore
    username: ckan
    datastoreReadUsername: datastore_ro
    existingSecret: ckan-db-credentials
    existingSecretPasswordKey: password

redisConfig:
  mode: external
  external:
    url: 'redis://:strong-redis-password@redis.cache.svc.cluster.local:6379/0'

solr:
  enabled: true # always use the bundled CKAN-specific Solr image

ingress:
  enabled: true
  ingressClassName: traefik
  hosts:
    - host: data.example.com
      paths:
        - path: /
          pathType: Prefix

# values.yaml — CKAN with additional plugins
ckan:
  siteUrl: 'https://data.example.com'
  existingSecret: ckan-secrets
  # Default plugins (keep all for full functionality):
  # envvars image_view text_view datatables_view
  plugins: 'envvars image_view text_view datatables_view datastore resource_proxy geo_view'
  extraEnv:
    - name: CKAN__DATAPUSHER__URL
      value: 'http://ckan-datapusher:8800'
    - name: CKAN__DATAPUSHER__CALLBACK_URL_BASE
      value: 'http://ckan-ckan:5000'

postgresql:
  enabled: true
  auth:
    password: 'strong-db-password'

redis:
  enabled: true
  auth:
    password: 'strong-redis-password'

# values.yaml — CKAN with daily pg_dump backup
# Note: backup covers the ckan database only; uploaded files in /var/lib/ckan (PVC)
# are NOT included. Back up the PVC separately using Velero or storage snapshots.
ckan:
  siteUrl: 'https://data.example.com'
  existingSecret: ckan-secrets

postgresql:
  enabled: true
  auth:
    password: 'strong-db-password'

redis:
  enabled: true
  auth:
    password: 'strong-redis-password'

backup:
  enabled: true
  schedule: '0 3 * * *'
  s3:
    endpoint: https://s3.amazonaws.com
    bucket: ckan-backups
    existingSecret: ckan-s3-credentials

ingress:
  enabled: true
  ingressClassName: traefik
  hosts:
    - host: data.example.com
      paths:
        - path: /
          pathType: Prefix

Configuration Reference

Image

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image.repository | string | docker.io/ckan/ckan-base | CKAN image. |
| image.tag | string | "2.11.4" | Image tag. |

CKAN Application

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ckan.siteUrl | string | http://localhost:5000 | Required. Full public URL (breaks all links if wrong). |
| ckan.siteTitle | string | CKAN | Portal display name. |
| ckan.sysadminName | string | admin | Sysadmin username. |
| ckan.sysadminEmail | string | admin@ckan.local | Sysadmin email. |
| ckan.sysadminPassword | string | "" | Sysadmin password. Auto-generated if empty. |
| ckan.existingSecret | string | "" | Existing secret with all three secrets. |
| ckan.existingSecretPasswordKey | string | sysadmin-password | Key for sysadmin password. |
| ckan.existingSecretSessionKey | string | session-secret | Key for Beaker session secret. |
| ckan.existingSecretJwtKey | string | jwt-secret | Key for JWT secret. |
| ckan.plugins | string | envvars image_view text_view ... | Space-separated list of active CKAN plugins. |
| ckan.replicaCount | integer | 1 | CKAN web pod replicas. |
| ckan.extraEnv | array | [] | Extra environment variables. |

DataPusher

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| datapusher.enabled | boolean | true | Deploy the DataPusher service (CSV/Excel → DataStore). |
| datapusher.replicaCount | integer | 1 | DataPusher pod replicas. |
| datapusher.port | integer | 8800 | DataPusher service port. |

Solr

Do not share Solr with other applications — CKAN uses a custom schema

The bundled Solr uses the ckan/ckan-solr image with a CKAN-specific schema. Connecting another application to this Solr instance may overwrite the schema and break CKAN’s search indexing.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| solr.enabled | boolean | true | Deploy the bundled CKAN-specific Solr StatefulSet. |
| solr.persistence.size | string | 5Gi | Solr PVC size. |
| solr.externalUrl | string | "" | External Solr URL (when solr.enabled: false). |
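
When Solr runs outside the chart, solr.enabled and solr.externalUrl are the two parameters this table documents for that case. The following is a sketch only, assuming a hypothetical in-cluster Solr service that already carries CKAN's schema (for example, one built from the same ckan/ckan-solr image); the URL is illustrative:

```yaml
solr:
  enabled: false # skip the bundled StatefulSet
  # Hypothetical URL; it must point at a Solr core that has the CKAN schema loaded.
  externalUrl: 'http://solr.search.svc.cluster.local:8983/solr/ckan'
```

The bundled image remains the simpler choice, since a plain Solr instance without the CKAN schema will not index datasets correctly.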

Database

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| database.mode | string | subchart | Mode: subchart or external. |
| database.external.host | string | "" | External PostgreSQL hostname. |
| database.external.ckanDatabase | string | ckan | CKAN main database name. |
| database.external.datastoreDatabase | string | datastore | DataStore extension database name. |
| database.external.datastoreReadUsername | string | datastore_ro | Read-only user for the DataStore API. |
| database.external.existingSecret | string | "" | Existing secret with database passwords. |
| postgresql.enabled | boolean | true | Deploy the bundled PostgreSQL subchart. |
| postgresql.auth.password | string | "" | Password. Auto-generated if empty. |
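
In external mode, the deployment example on this page points database.external.existingSecret at a Secret named ckan-db-credentials and reads the key password from it. A minimal sketch of that Secret follows; the namespace and value are placeholders. This page does not state which key holds the datastore_ro password, so check the chart's values.yaml for that before relying on this manifest:

```yaml
# Sketch of the Secret referenced by database.external.existingSecret.
apiVersion: v1
kind: Secret
metadata:
  name: ckan-db-credentials
  namespace: ckan # hypothetical; use the namespace of the Helm release
type: Opaque
stringData:
  # Assumed to be the password of the ckan PostgreSQL user (database.external.username).
  password: 'change-me-db-password'
```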

Redis

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| redisConfig.mode | string | subchart | Mode: subchart or external. |
| redisConfig.external.url | string | "" | Full Redis URL: redis://:password@host:6379/0. |
| redis.enabled | boolean | true | Deploy the bundled Redis subchart. |
| redis.auth.password | string | "" | Password. Auto-generated if empty. |

Persistence and Service

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| persistence.enabled | boolean | true | Enable PVC for /var/lib/ckan (uploaded resources). |
| persistence.size | string | 10Gi | PVC size. |
| service.port | integer | 80 | Service port. |
| ingress.enabled | boolean | false | Enable an Ingress resource. |
| ingress.ingressClassName | string | traefik | Ingress class name. |

Backup

Backup covers the ckan database only — uploaded files in /var/lib/ckan are not included

The S3 backup CronJob runs pg_dump on the CKAN PostgreSQL database. Dataset files and resource uploads stored in the /var/lib/ckan PVC are not included in the backup. Use Velero, NFS snapshots, or storage provider snapshots to protect the PVC data.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| backup.enabled | boolean | false | Enable scheduled pg_dump S3 backup. |
| backup.schedule | string | "0 3 * * *" | Cron schedule. |
| backup.s3.endpoint | string | "" | S3-compatible endpoint URL. |
| backup.s3.bucket | string | "" | Target bucket name. |
| backup.s3.existingSecret | string | "" | Existing secret with S3 credentials. |
| extraManifests | array | [] | Extra Kubernetes manifests. |
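
Because backup.s3.endpoint accepts any S3-compatible URL, the CronJob can also target a self-hosted object store. The following is a sketch, assuming a hypothetical in-cluster MinIO service and a weekly rather than daily schedule; the endpoint, bucket, and Secret name are illustrative, and the credential key names expected inside the Secret are not listed on this page:

```yaml
backup:
  enabled: true
  # Cron fields, in order: minute, hour, day of month, month, day of week.
  schedule: '30 2 * * 0' # Sundays at 02:30
  s3:
    endpoint: http://minio.storage.svc.cluster.local:9000 # hypothetical MinIO service
    bucket: ckan-backups
    existingSecret: ckan-s3-credentials
```

As with the daily example, this covers only the ckan PostgreSQL database; the /var/lib/ckan PVC still needs its own snapshot or Velero backup.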

More Information