Bitnami's Best Practices for Securing and Hardening Helm Charts

Introduction

When developing a chart, it is important to ensure that the packaged content (chart source code, container images, and subcharts) is created by following best practices in terms of security, efficiency and performance.

This article will go over the key points Bitnami takes into account when publishing Bitnami Helm charts. It covers the best practices applied to the bundled containers, the use of configuration as ConfigMaps, integration with logging and monitoring tools, and the release process, including CVE scanning and tests.

The Bitnami pipeline

A Helm chart is composed of different containers and subcharts. Therefore, when securing and hardening Helm charts, it is important to ensure that the containers used in the main chart and subcharts are also secure and hardened.

In order to have full control over the published charts, an indispensable requirement for all Bitnami Helm charts is that all the bundled images are released through the Bitnami pipeline following Bitnami's best practices for securing and hardening containers.

ConfigMaps for configuration

Tip

In our experience, deciding which data should or should not be persistent can be complicated. After several iterations, our recommended approach is to use ConfigMaps, although this recommendation may change depending on the configuration file or scenario. One advantage of Kubernetes is that users can change deployment parameters easily by running kubectl edit deployment or helm upgrade. If the configuration is stored in a persistent volume, those changes are not applied. So, when developing Bitnami Helm charts, we make sure that the configuration can be easily changed with kubectl or helm upgrade.

One common practice is to create a ConfigMap with the configuration and have it mounted in the container. Let's use the Bitnami RabbitMQ chart as an example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "rabbitmq.fullname" . }}-config
  namespace: {{ .Release.Namespace }}
  labels: {{- include "common.labels.standard" . | nindent 4 }}
data:
  rabbitmq.conf: |-
    {{- include "common.tplvalues.render" (dict "value" .Values.configuration "context" $) | nindent 4 }}
  {{- if .Values.advancedConfiguration}}
  advanced.config: |-
    {{- include "common.tplvalues.render" (dict "value" .Values.advancedConfiguration "context" $) | nindent 4 }}
  {{- end }}

Note that there is a section in the values.yaml file which allows you to include custom configuration:

## Configuration file content: required cluster configuration
## Do not override unless you know what you are doing.
## To add more configuration, use `extraConfiguration` or `advancedConfiguration` instead
##
configuration: |-
  ## Username and password
  default_user = {{ .Values.auth.username }}
  default_pass = CHANGEME
  ## Clustering
  cluster_formation.peer_discovery_backend  = rabbit_peer_discovery_k8s
  cluster_formation.k8s.host = kubernetes.default.svc.{{ .Values.clusterDomain }}
  cluster_formation.node_cleanup.interval = 10
  cluster_formation.node_cleanup.only_log_warning = true
  cluster_partition_handling = autoheal

## Configuration file content: extra configuration
## Use this instead of `configuration` to add more configuration
##
extraConfiguration: |-
  #default_vhost = {{ .Release.Namespace }}-vhost
  #disk_free_limit.absolute = 50MB
  #load_definitions = /app/load_definition.json

This ConfigMap then gets mounted in the container filesystem, as shown in this extract of the StatefulSet spec:

volumes:
  - name: configuration
    configMap:
      name: {{ template "rabbitmq.fullname" . }}-config
      items:
        - key: rabbitmq.conf
          path: rabbitmq.conf
        {{- if .Values.advancedConfiguration}}
        - key: advanced.config
          path: advanced.config
        {{- end }}

This approach makes Bitnami charts easy to upgrade and also more adaptable to user needs, as users can provide their own custom configuration file.
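As a rough illustration of how a user might supply extra settings, the sketch below writes a values override file and passes it to helm upgrade. The function names, the release name my-release, the chart reference bitnami/rabbitmq and the file name are assumptions made for this example; the disk_free_limit.absolute setting is taken from the commented defaults in the values.yaml extract above.

```shell
# Write a values override that adds one RabbitMQ setting through
# extraConfiguration (the setting mirrors the commented default in values.yaml).
# $1 = path of the override file to create (hypothetical name)
write_override() {
  printf '%s\n' \
    'extraConfiguration: |-' \
    '  disk_free_limit.absolute = 50MB' > "$1"
}

# Apply the override to an existing release.
# $1 = release name, $2 = chart reference, $3 = override file (all assumed)
apply_override() {
  helm upgrade "$1" "$2" -f "$3"
}
```

A possible invocation is write_override override.yaml followed by apply_override my-release bitnami/rabbitmq override.yaml. Because the configuration is rendered into a ConfigMap, the upgrade regenerates it from the new values.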

Integration with logging and monitoring tools

One of the key concerns when deploying charts in production environments is observability. It is essential to have deployments properly monitored for early detection of potential issues. It is also important to have application usage, cost, and resource consumption metrics. To gather this information, users commonly deploy logging stacks like EFK (Elasticsearch, Fluentd, and Kibana) and monitoring tools like Prometheus. Bitnami charts are available for each of these solutions.

Bitnami charts are developed ensuring that deployments are able to work with the above tools seamlessly. To achieve this, the Bitnami charts ensure that:

  • All the containers log to stdout/stderr (so that the EFK stack can easily ingest all the logging information)
  • Prometheus exporters are included (either using sidecar containers or having a separate deployment)

Bitnami offers the Bitnami Kubernetes Production Runtime (BKPR) that installs all these tools (along with others) and makes your cluster capable of handling production workloads. All Bitnami charts work with BKPR (which includes EFK and Prometheus) out of the box. Let's take a look at the Bitnami PostgreSQL chart and Bitnami PostgreSQL container to see how this is achieved.

To begin with, the process inside the container runs in the foreground, so all the logging information is written to stdout/stderr, as shown below:

info "** Starting PostgreSQL **"
if am_i_root; then
    exec gosu "$POSTGRESQL_DAEMON_USER" "${cmd}" "${flags[@]}"
else
    exec "${cmd}" "${flags[@]}"
fi

Because the logs are written to stdout/stderr, they can be ingested by the EFK stack.
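The exec calls in the script above are what keep the server in the foreground: exec replaces the shell with the target process rather than starting it as a child, so the process keeps the same PID and inherits the same stdout/stderr. The following sketch is not part of the Bitnami container; it only illustrates the PID behavior using two plain sh invocations, and the function name is hypothetical.

```shell
# Print the PID of a shell, then exec a second shell that prints its own PID.
# Because exec replaces the process image instead of forking, both lines match.
pid_before_and_after_exec() {
  sh -c 'echo $$; exec sh -c "echo \$\$"'
}
```

Calling pid_before_and_after_exec prints the same PID twice, once from the original shell and once from the process that replaced it.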

Although there are different approaches to implement logging capabilities, such as adding a logging agent at the node level or configuring the application to push the info to the backend, the most common approach is to use sidecar containers. Learn more about logging architectures.

In the PostgreSQL chart, a sidecar container is added for Prometheus metrics:

      containers:
{{- if .Values.metrics.enabled }}
        - name: metrics
          image: {{ template "postgresql.metrics.image" . }}
          imagePullPolicy: {{ .Values.metrics.image.pullPolicy | quote }}
          {{- if .Values.metrics.securityContext.enabled }}
          securityContext:
            runAsUser: {{ .Values.metrics.securityContext.runAsUser }}
          {{- end }}
          env:
            {{- $database := required "In order to enable metrics you need to specify a database (.Values.postgresqlDatabase or .Values.global.postgresql.postgresqlDatabase)" (include "postgresql.database" .) }}
            {{- $sslmode := ternary "require" "disable" .Values.tls.enabled }}
            {{- if and .Values.tls.enabled .Values.tls.certCAFilename }}
            - name: DATA_SOURCE_NAME
              value: {{ printf "host=127.0.0.1 port=%d user=%s sslmode=%s sslcert=%s sslkey=%s" (int (include "postgresql.port" .)) (include "postgresql.username" .) $sslmode (include "postgresql.tlsCert" .) (include "postgresql.tlsCertKey" .) }}
            {{- else }}
            - name: DATA_SOURCE_URI
              value: {{ printf "127.0.0.1:%d/%s?sslmode=%s" (int (include "postgresql.port" .)) $database $sslmode }}
            {{- end }}
            {{- if .Values.usePasswordFile }}
            - name: DATA_SOURCE_PASS_FILE
              value: "/opt/bitnami/postgresql/secrets/postgresql-password"
            {{- else }}
            - name: DATA_SOURCE_PASS
              valueFrom:
                secretKeyRef:
                  name: {{ template "postgresql.secretName" . }}
                  key: postgresql-password
            {{- end }}
            - name: DATA_SOURCE_USER
              value: {{ template "postgresql.username" . }}
            {{- if .Values.metrics.extraEnvVars }}
            {{- include "postgresql.tplValue" (dict "value" .Values.metrics.extraEnvVars "context" $) | nindent 12 }}
            {{- end }}
          {{- if .Values.livenessProbe.enabled }}
          livenessProbe:
            httpGet:
              path: /
              port: http-metrics
            initialDelaySeconds: {{ .Values.metrics.livenessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.metrics.livenessProbe.periodSeconds }}
            timeoutSeconds: {{ .Values.metrics.livenessProbe.timeoutSeconds }}
            successThreshold: {{ .Values.metrics.livenessProbe.successThreshold }}
            failureThreshold: {{ .Values.metrics.livenessProbe.failureThreshold }}
          {{- end }}
          {{- if .Values.readinessProbe.enabled }}
          readinessProbe:
            httpGet:
              path: /
              port: http-metrics
            initialDelaySeconds: {{ .Values.metrics.readinessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.metrics.readinessProbe.periodSeconds }}
            timeoutSeconds: {{ .Values.metrics.readinessProbe.timeoutSeconds }}
            successThreshold: {{ .Values.metrics.readinessProbe.successThreshold }}
            failureThreshold: {{ .Values.metrics.readinessProbe.failureThreshold }}
          {{- end }}
          volumeMounts:
            {{- if .Values.usePasswordFile }}
            - name: postgresql-password
              mountPath: /opt/bitnami/postgresql/secrets/
            {{- end }}
            {{- if .Values.tls.enabled }}
            - name: postgresql-certificates
              mountPath: /opt/bitnami/postgresql/certs
              readOnly: true
            {{- end }}
            {{- if .Values.metrics.customMetrics }}
            - name: custom-metrics
              mountPath: /conf
              readOnly: true
          args: ["--extend.query-path", "/conf/custom-metrics.yaml"]
            {{- end }}
          ports:
            - name: http-metrics
              containerPort: 9187
          {{- if .Values.metrics.resources }}
          resources: {{- toYaml .Values.metrics.resources | nindent 12 }}
          {{- end }}
{{- end }}

Bitnami also ensures that the pods or services contain the proper annotations for Prometheus to detect exporters. In this case, they are defined in the chart's values.yaml file, as shown below:

## Configure metrics exporter
##
metrics:
  enabled: false
  # resources: {}
  service:
    type: ClusterIP
    annotations:
      prometheus.io/scrape: 'true'
      prometheus.io/port: '9187'

In the case of the PostgreSQL chart, these annotations go to a metrics service, separate from the PostgreSQL service, which is defined as follows:

{{- if .Values.metrics.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ template "postgresql.fullname" . }}-metrics
  labels:
  {{- include "common.labels.standard" . | nindent 4 }}
  annotations:
  {{- if .Values.commonAnnotations }}
    {{- include "postgresql.tplValue" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }}
  {{- end }}
    {{- toYaml .Values.metrics.service.annotations | nindent 4 }}
spec:
  type: {{ .Values.metrics.service.type }}
  {{- if and (eq .Values.metrics.service.type "LoadBalancer") .Values.metrics.service.loadBalancerIP }}
  loadBalancerIP: {{ .Values.metrics.service.loadBalancerIP }}
  {{- end }}
  ports:
    - name: http-metrics
      port: 9187
      targetPort: http-metrics
  selector:
  {{- include "common.labels.matchLabels" . | nindent 4 }}
    role: master
{{- end }}
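As a hedged sketch of how a user might turn the exporter on and confirm that the annotation reached the metrics service, the functions below wrap one helm call and one kubectl call. The function names, the release name and the service name are assumptions; only the metrics.enabled value and the prometheus.io/scrape annotation come from the chart extracts above.

```shell
# Enable the metrics sidecar and the metrics service for a release.
# $1 = release name, $2 = chart reference (both assumed, e.g. my-release bitnami/postgresql)
enable_metrics() {
  helm upgrade --install "$1" "$2" --set metrics.enabled=true
}

# Read the prometheus.io/scrape annotation from the metrics service.
# $1 = service name, which follows the <fullname>-metrics pattern from the template.
# Dots inside the annotation key are escaped for kubectl's JSONPath parser.
scrape_annotation() {
  kubectl get service "$1" \
    -o jsonpath='{.metadata.annotations.prometheus\.io/scrape}'
}
```

With the default values shown earlier, scrape_annotation would be expected to return true once the metrics service exists.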

In addition, a ConfigMap is created to hold the custom metrics configuration file:

{{- if and .Values.metrics.enabled .Values.metrics.customMetrics }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "postgresql.metricsCM" . }}
  labels:
  {{- include "common.labels.standard" . | nindent 4 }}
  {{- if .Values.commonAnnotations }}
  annotations: {{- include "postgresql.tplValue" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }}
  {{- end }}
data:
  custom-metrics.yaml: {{ toYaml .Values.metrics.customMetrics | quote }}
{{- end }}

Some parameters related to the metrics exporter can be configured in the values.yaml:

  ## Define additional custom metrics
  ## ref: https://github.com/wrouesnel/postgres_exporter#adding-new-metrics-via-a-config-file
  customMetrics:
    pg_database:
      query: "SELECT d.datname AS name, CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT') THEN pg_catalog.pg_database_size(d.datname) ELSE 0 END AS size_bytes FROM pg_catalog.pg_database d where datname not in ('template0', 'template1', 'postgres')"
      metrics:
        - name:
            usage: "LABEL"
            description: "Name of the database"
        - size_bytes:
            usage: "GAUGE"
            description: "Size of the database in bytes"

Apart from all of the above, this container has its own probes, environment variables and security context.

These modifications ensure that Bitnami charts seamlessly integrate with monitoring platforms. The metrics obtained can be used to keep the deployment in good condition throughout its lifetime.

Bitnami release process and tests

The Bitnami release process is another important pillar for keeping Bitnami charts safe, updated and fully functional.

Chart releases are triggered under different conditions:

  • Upstream new version: If there is a new version of the main container bundled in the chart, Bitnami triggers a new release.
  • Maintenance release: If there was no update in the last 30 days (this time period can be customized for charts released as part of the Tanzu Application Catalog), Bitnami triggers a new release.
  • CVE detected: Bitnami triggers a release of a chart when a package that includes a fix for a CVE is detected.
  • User PR: When a pull request or any other change performed by a user or the Bitnami team is merged into the Bitnami GitHub repository, Bitnami triggers a new release.

In all the above cases, the main image is updated to the latest version and the secondary containers (such as metrics exporters) and chart dependencies are also updated to the latest published version at that time.
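The maintenance release condition from the list above can be expressed as a simple age check. This is only a sketch: the function name, the use of Unix epoch seconds and the treatment of exactly 30 days as due are assumptions, not a description of Bitnami's pipeline.

```shell
# Succeed (exit status 0) when the last release is old enough to warrant
# a maintenance release.
# $1 = time of the last release, $2 = current time (both Unix epoch seconds)
# $3 = threshold in days, defaulting to the 30 days mentioned in the text
needs_maintenance_release() {
  mr_threshold_days=${3:-30}
  mr_age_days=$(( ($2 - $1) / 86400 ))
  [ "$mr_age_days" -ge "$mr_threshold_days" ]
}
```

A caller could run needs_maintenance_release "$last_release" "$(date +%s)" and trigger a release when it succeeds; the optional third argument models the customizable period.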

However, before the release is performed, various scanners and tests are executed. These stop the release from proceeding if the result is not successful.

CVE scanning

To ensure that all Bitnami images include the latest security fixes, Bitnami implements the following policies:

  • Bitnami triggers a release of a new Helm chart when a new version of the main server or application is detected. For example, if the system automatically detects a new version of RabbitMQ, Bitnami’s pipeline automatically releases a new container with that version and also releases the corresponding Helm chart if it passes all tests. This way, Bitnami ensures that the application version released is always the latest stable one and has the latest security fixes.
  • The system scans all Bitnami containers and releases new images daily with the latest available system packages. Once the pipeline detects there is a new package that fixes a CVE, our team triggers the release of a new Helm chart to point to the latest container images.
  • The Bitnami team monitors different CVE feeds for critical vulnerabilities, such as Heartbleed or Shellshock, in order to fix the most critical issues as soon as possible. Once a critical issue is detected in any of the charts included in the Bitnami catalog (or any of the assets that Bitnami distributes amongst its different cloud providers), a new solution is released. Usually, Bitnami provides updates in less than 48 business hours.

Tip

Open CVEs are CVEs that depend directly on the Linux distribution maintainers and have not yet been fixed by those maintainers. Bitnami is not able to fix such CVEs directly. Learn more about Bitnami's open CVE policy.

Verification and functional tests

During the release process, Bitnami charts are tested on different Kubernetes platforms such as GKE, AKS, IKS, TKG, and others. Charts are tested using different Kubernetes server versions and Helm versions. Apart from these tests, different scenarios are configured in order to test other functionalities beyond the default parameters, like SMTP or LDAP configuration.

Two types of tests are executed for each Kubernetes platform:

  • Verification tests: This type of testing involves inspecting a deployment to check certain properties. For example, checking if a particular file exists on the system and if it has the correct permissions.
  • Functional tests: This type of testing is used to verify that an application is behaving as expected from the user's perspective. For example, if the application must be accessible using a web browser, functional testing uses a headless browser to interact with the application and perform common actions such as logging in and out and adding users.
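The verification example mentioned above (a file exists and has the correct permissions) could look roughly like the following. The function name and any path passed to it are hypothetical, and the sketch assumes a stat that supports -c '%a', as GNU coreutils and BusyBox do.

```shell
# Verify that a file exists and has the expected octal permissions.
# $1 = path to check, $2 = expected mode such as 640
verify_file_mode() {
  [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
  vf_actual=$(stat -c '%a' "$1")
  [ "$vf_actual" = "$2" ] || {
    echo "mode $vf_actual, expected $2: $1" >&2
    return 1
  }
}
```

In a real verification test the same check would presumably run inside the deployed pod, for example through kubectl exec.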

Upgrade tests

One of the most common use cases for Helm charts is the upgrade process. Helm charts follow the SemVer specification (MAJOR.MINOR.PATCH) to handle chart version numbering. According to that specification, backward compatibility should be guaranteed for MINOR and PATCH versions, while it is not guaranteed in MAJOR versions.

At Bitnami, when a change is implemented in a chart, the team determines the version number change required. Typically, bug fixes are PATCH changes, feature additions are MINOR changes, and MAJOR changes occur when backward compatibility cannot be guaranteed.
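A minimal sketch of how the kind of version change could be derived from two chart versions. The function name bump_type is hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH strings without pre-release or build suffixes.

```shell
# Classify the change between two SemVer versions as major, minor or patch.
# $1 = previous version, $2 = new version (both MAJOR.MINOR.PATCH)
bump_type() {
  bt_old_major=${1%%.*}
  bt_new_major=${2%%.*}
  bt_old_rest=${1#*.}
  bt_new_rest=${2#*.}
  bt_old_minor=${bt_old_rest%%.*}
  bt_new_minor=${bt_new_rest%%.*}
  if [ "$bt_new_major" -ne "$bt_old_major" ]; then
    echo major
  elif [ "$bt_new_minor" -ne "$bt_old_minor" ]; then
    echo minor
  else
    echo patch
  fi
}
```

For instance, bump_type 10.3.5 11.0.0 prints major, while bump_type 10.3.5 10.4.0 prints minor. Only a major result signals that backward compatibility is not guaranteed.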

In the case of MAJOR versions, the changes and, if possible, an upgrade path are documented in the README. Here is an example from the PostgreSQL chart.

When the changes are not MAJOR changes, backward compatibility should be guaranteed. To test this, Bitnami applies a "Chart Upgrade" test as part of the Bitnami release pipeline to ensure that helm upgrade works for non-major changes. In this test:

  • The first available version in the current major version is installed (such as X.0.0).
  • Tests are executed to populate some data or create content.
  • The helm upgrade command is executed to install the most recent version (such as X.3.5).
  • Tests are executed to check that the populated data is still present in the upgraded chart.
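The four steps above can be sketched as a single shell function. This is an illustration under stated assumptions rather than Bitnami's actual test: populate_data and verify_data are hypothetical placeholders for application-specific checks, and the release name, chart reference and versions are supplied by the caller.

```shell
# Hypothetical hooks: replace with application-specific logic.
populate_data() { echo "populate data in $1"; }
verify_data() { echo "verify data in $1"; }

# Install the first version of a major, add data, upgrade, then re-check the data.
# $1 = release name, $2 = chart reference,
# $3 = first version in the major (such as X.0.0), $4 = most recent version
chart_upgrade_test() {
  helm install "$1" "$2" --version "$3" --wait &&
    populate_data "$1" &&
    helm upgrade "$1" "$2" --version "$4" --wait &&
    verify_data "$1"
}
```

The && chain stops at the first failing step, so a failed install or upgrade is reported as a failed test before any data check runs.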

Bitnami has also published some guides about how to back up and restore deployments in Kubernetes for common infrastructure charts like MongoDB and MariaDB Galera.

Conclusions

By implementing the above steps in the Bitnami package and release process, Bitnami ensures that its Helm charts are packaged following best practices in terms of security and performance and can be safely used on most platforms as part of production deployments.

Useful links

To learn more about the topics discussed in this guide, use the links below: