Best Practices for Creating Production-Ready Helm charts

Three years have passed since the first release of Helm, and it has indeed made a name for itself. Both avowed fans and fervent haters agree that the Kubernetes "apt-get equivalent" is the standard way of deploying to production (at least for now, let's see what Operators end up bringing to the table). During this time, Bitnami has contributed to the project in many ways. You can find us in PRs in Helm's code, in solutions like Kubeapps, and especially in what we are mostly known for: our huge application library.

As maintainers of a collection of more than 45 Helm charts, we know that creating a maintainable, secure and production-ready chart is far from trivial. In this sense, this blog post shows essential features that any chart developer should know.

Tip

Good practices guides are available in our Useful Links section.

Best Practices for Creating Production-Ready Helm charts

Use non-root containers

Ensuring that a container is able to perform only a very limited set of operations is vital for production deployments. This is possible thanks to the use of non-root containers, which are executed by a user different from root. Although creating a non-root container is a bit more complex than a root container (especially regarding filesystem permissions), it is absolutely worth it. Also, in environments like Openshift, using non-root containers is mandatory.

In order to make your Helm chart work with non-root containers, add the securityContext section to your yaml files.

This is what we do, for instance, in the Bitnami Elasticsearch Helm chart. This chart deploys several Elasticsearch StatefulSets and Deployments (data, ingestion, coordinating and master nodes), all of them with non-root containers. If we check the master node StatefulSet, we see the following:

spec:
  {{- if .Values.securityContext.enabled }}
  securityContext:
    fsGroup: {{ .Values.securityContext.fsGroup }}
  {{- end }}

The snippet above changes the permissions of the mounted volumes, so the container user can access them for read/write operations. In addition to this, inside the container definition, we see another securityContext block:

{{- if .Values.securityContext.enabled }}
securityContext:
  runAsUser: {{ .Values.securityContext.runAsUser }}
{{- end }}

In this part we specify the user running the container. In the values.yaml file, we set the default values for these parameters:

## Pod Security Context
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
##
securityContext:
  enabled: true
  fsGroup: 1001
  runAsUser: 1001

With these changes, the chart will work as non-root in platforms like GKE, Minikube or Openshift.

Do not persist the configuration

Adding persistence is an essential part of deploying stateful applications. In our experience, deciding what or what not to persist can be tricky. After several iterations in our charts, we found that persisting the application configuration is not a recommended practice. One advantage of Kubernetes is that you can change the deployment parameters very easily by just doing kubectl edit deployment or helm upgrade. If the configuration is persisted, none of the changes would be applied. So, when developing a production-ready Helm chart, make sure that the configuration can be easily changed with kubectl or helm upgrade. One common practice is to create a ConfigMap with the configuration and have it mounted in the container. Let's use the Bitnami RabbitMQ chart as an example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "rabbitmq.fullname" . }}-config
  labels:
    app: {{ template "rabbitmq.name" . }}
    chart: {{ template "rabbitmq.chart" .  }}
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
data:
  enabled_plugins: |-
{{ template "rabbitmq.plugins" . }}
  rabbitmq.conf: |-
    ##username and password
    default_user={{.Values.rabbitmq.username}}
    default_pass=CHANGEME
{{ .Values.rabbitmq.configuration | indent 4 }}
{{ .Values.rabbitmq.extraConfiguration | indent 4 }}

Note that there is a section in the values.yaml file that allows you to include any custom configuration:

  ## Configuration file content: required cluster configuration
  ## Do not override unless you know what you are doing. To add more configuration, use `extraConfiguration` instead
  configuration: |-
    ## Clustering
    cluster_formation.peer_discovery_backend  = rabbit_peer_discovery_k8s
    cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
    cluster_formation.node_cleanup.interval = 10
    cluster_formation.node_cleanup.only_log_warning = true
    cluster_partition_handling = autoheal
    # queue master locator
    queue_master_locator=min-masters
    # enable guest user
    loopback_users.guest = false
  ## Configuration file content: extra configuration
  ## Use this instead of  `configuration` to add more configuration
  extraConfiguration: |-
    #disk_free_limit.absolute = 50MB
    #management.load_definitions = /app/load_definition.json

This ConfigMap then gets mounted in the container filesystem, as shown in this extract of the StatefulSet spec:

volumes:
- name: config-volume
    configMap:
    name: {{ template "rabbitmq.fullname" . }}-config

If the application needs to write in the configuration file, then you'll need to create a copy inside the container, as ConfigMaps are mounted as read-only. This is done in the same spec:

containers:
- name: rabbitmq
image: {{ template "rabbitmq.image" . }}
imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
command:
      # ...
      #copy the mounted configuration to both places
      cp  /opt/bitnami/rabbitmq/conf/* /opt/bitnami/rabbitmq/etc/rabbitmq
      # ...

This will make your chart not only easy to upgrade, but also more adaptable to user needs, as they can provide their custom configuration file.

Integrate charts with logging and monitoring tools

If we are talking about production environments, we are talking about observability. It is essential having our deployments properly monitored so we can early detect potential issues. It also essential to have application usage, cost and resource consumption metrics. In order to gather this information, you would commonly deploy logging stacks like EFK (ElasticSearch, Fluentd, and Kibana and monitoring tools like Prometheus. Bitnami offers the Bitnami Kubernetes Production Runtime (BKPR) that easily installs these tools (along with others) so your cluster is ready to handle production workloads.

When writing your chart, make sure that your deployment is able to work with the above tools seamlessly. To do so, ensure the following:

  • All the containers log to stdout/stderr (so the EFK stack can easily ingest all the logging information)
  • Prometheus exporters are included (either using sidecar containers or having a separate deployment)

All Bitnami charts work with BKPR (which includes EFK and Prometheus) out of the box. Let's take a look at the Bitnami PostgreSQL chart and Bitnami PostgreSQL container to see how we did it.

To begin with, the process inside the container runs at the foreground, so all the logging information is written to stdout/stderr, as shown below:

info "** Starting PostgreSQL **"
if am_i_root; then
    exec gosu "$POSTGRESQL_DAEMON_USER" "${cmd}" "${flags[@]}"
else
    exec "${cmd}" "${flags[@]}"
fi

With this, we ensured that it works with EFK. Then, in the chart we added a sidecar container for the Prometheus metrics:

{{- if .Values.metrics.enabled }}
        - name: metrics
          image: {{ template "postgresql.metrics.image" . }}
          imagePullPolicy: {{ .Values.metrics.image.pullPolicy | quote }}
         {{- if .Values.metrics.securityContext.enabled }}
          securityContext:
            runAsUser: {{ .Values.metrics.securityContext.runAsUser }}
        {{- end }}
          env:
            {{- $database := required "In order to enable metrics you need to specify a database (.Values.postgresqlDatabase or .Values.global.postgresql.postgresqlDatabase)" (include "postgresql.database" .) }}
            - name: DATA_SOURCE_URI
              value: {{ printf "127.0.0.1:%d/%s?sslmode=disable" (int (include "postgresql.port" .)) $database | quote }}
            {{- if .Values.usePasswordFile }}
            - name: DATA_SOURCE_PASS_FILE
              value: "/opt/bitnami/postgresql/secrets/postgresql-password"
            {{- else }}
            - name: DATA_SOURCE_PASS
              valueFrom:
                secretKeyRef:
                  name: {{ template "postgresql.secretName" . }}
                  key: postgresql-password
            {{- end }}
            - name: DATA_SOURCE_USER
              value: {{ template "postgresql.username" . }}
          {{- if .Values.livenessProbe.enabled }}
          livenessProbe:
            httpGet:
              path: /
              port: http-metrics
            initialDelaySeconds: {{ .Values.metrics.livenessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.metrics.livenessProbe.periodSeconds }}
            timeoutSeconds: {{ .Values.metrics.livenessProbe.timeoutSeconds }}
            successThreshold: {{ .Values.metrics.livenessProbe.successThreshold }}
            failureThreshold: {{ .Values.metrics.livenessProbe.failureThreshold }}
          {{- end }}
          {{- if .Values.readinessProbe.enabled }}
          readinessProbe:
            httpGet:
              path: /
              port: http-metrics
            initialDelaySeconds: {{ .Values.metrics.readinessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.metrics.readinessProbe.periodSeconds }}
            timeoutSeconds: {{ .Values.metrics.readinessProbe.timeoutSeconds }}
            successThreshold: {{ .Values.metrics.readinessProbe.successThreshold }}
            failureThreshold: {{ .Values.metrics.readinessProbe.failureThreshold }}
          {{- end }}
          volumeMounts:
            {{- if .Values.usePasswordFile }}
            - name: postgresql-password
              mountPath: /opt/bitnami/postgresql/secrets/
            {{- end }}
            {{- if .Values.metrics.customMetrics }}
            - name: custom-metrics
              mountPath: /conf
              readOnly: true
          args: ["--extend.query-path", "/conf/custom-metrics.yaml"]
            {{- end }}
          ports:
            - name: http-metrics
              containerPort: 9187
          {{- if .Values.metrics.resources }}
          resources: {{- toYaml .Values.metrics.resources | nindent 12 }}
          {{- end }}
{{- end }}

We also made sure that the pods or services contain the proper annotations that Prometheus uses to detect exporters. In this case, we defined them in the chart's values.yaml file, as shown below:

metrics:
  enabled: false
  service:
    type: ClusterIP
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "9187"
  #...

In the case of the PostgreSQL chart, these annotations go to a metrics service, separate from the PostgreSQL service, which is defined as below:

{{- if .Values.metrics.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ template "postgresql.fullname" . }}-metrics
  labels:
    app: {{ template "postgresql.name" . }}
    chart: {{ template "postgresql.chart" . }}
    release: {{ .Release.Name | quote }}
    heritage: {{ .Release.Service | quote }}
  annotations:
{{ toYaml .Values.metrics.service.annotations | indent 4 }}
spec:
  type: {{ .Values.metrics.service.type }}
  {{- if and (eq .Values.metrics.service.type "LoadBalancer") .Values.metrics.service.loadBalancerIP }}
  loadBalancerIP: {{ .Values.metrics.service.loadBalancerIP }}
  {{- end }}
  ports:
    - name: http-metrics
      port: 9187
      targetPort: http-metrics
  selector:
    app: {{ template "postgresql.name" . }}
    release: {{ .Release.Name }}
    role: master
{{- end }}

With these modifications, your chart will seamlessly integrate with your monitoring platform. All the obtained metrics will be crucial for maintaining the deployment in good shape.

Production workloads in Kubernetes are possible

Now you know some essential guidelines for creating secured (with non-root containers), adaptable (with proper configuration management), and observable (with proper monitoring) charts. With these features, you have covered the basics to ensure that your application can be deployed to production. However, this is just another step in your journey to mastering Helm. You should also take into account other features like upgradability, usability, stability and testing.

Useful Links

To learn more, check the following links:

This tutorial is part of the series

Best Practices for Creating Production-Ready Helm Charts

Learn how to create a custom Helm chart from scratch, the guidelines you need to follow to make production-ready charts, and which are the basic concepts you need to know for running Helm charts in production.