Deploy a Scalable Apache Kafka/Zookeeper Cluster on Kubernetes with Bitnami and Helm

Introduction

Apache Kafka is a well-known open source tool for real-time message streaming, typically used in combination with Apache Zookeeper to create scalable, fault-tolerant clusters for application messaging. Apache Kafka can also be integrated with other open source data-oriented solutions such as Apache Hadoop, Apache Spark or Apache HBase for real-time analysis and rendering of streaming data.

To simplify deploying a scalable Apache Kafka cluster, Bitnami offers both an Apache Kafka Helm chart and an Apache Zookeeper Helm chart. Together, these two charts set up an Apache Kafka environment that is horizontally scalable and fault-tolerant. They also follow current best practices for security and scalability, which gives you a sound starting point for a production deployment.

Assumptions and prerequisites

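This guide assumes that:

  • You have a running Kubernetes cluster.
  • You have the kubectl and Helm v3 command-line tools installed and configured to work with that cluster.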

Step 1: Deploy Apache Zookeeper

The first step is to deploy Apache Zookeeper on your Kubernetes cluster using Bitnami's Helm chart. Your Apache Kafka deployment will use this Apache Zookeeper deployment for coordination and management.

First, add the Bitnami charts repository to Helm:

helm repo add bitnami https://charts.bitnami.com/bitnami

Next, execute the following command to deploy an Apache Zookeeper cluster with three nodes:

helm install zookeeper bitnami/zookeeper \
  --set replicaCount=3 \
  --set auth.enabled=false \
  --set allowAnonymousLogin=true

Tip: Since this Apache Zookeeper cluster will not be exposed publicly, it is deployed with authentication disabled. For production environments, consider using the production configuration instead.

Wait for a few minutes until the chart is deployed and note the service name displayed in the output, as you will need this in subsequent steps.

[Image: Zookeeper deployment]
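
Rather than waiting for an arbitrary amount of time, you can ask Kubernetes to report when the deployment is ready. The following is a minimal sketch, not part of the original procedure. It assumes the release is named zookeeper and runs in the default namespace, as in the command above, and that the chart names its StatefulSet after the release. The helper name zk_wait_and_list is hypothetical.

```shell
# Hypothetical helper: wait for the Zookeeper pods, then list the release's services.
# Assumptions: release "zookeeper", namespace "default", StatefulSet named after the release.
zk_wait_and_list() {
  local release="${1:-zookeeper}"
  local namespace="${2:-default}"

  # Block until all replicas of the StatefulSet are ready, or give up after 5 minutes.
  kubectl rollout status "statefulset/${release}" \
    --namespace "${namespace}" --timeout=300s || return 1

  # Show the services created by this release; the service name is needed in Step 2.
  kubectl get svc --namespace "${namespace}" \
    -l "app.kubernetes.io/instance=${release}"
}
```

Calling zk_wait_and_list with no arguments uses the defaults; pass a release name and a namespace if yours differ.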

Step 2: Deploy Apache Kafka

The next step is to deploy Apache Kafka, again with Bitnami's Helm chart. In this case, you will provide the name of the Apache Zookeeper service as a parameter to the Helm chart.

Execute the command below, replacing the ZOOKEEPER-SERVICE-NAME placeholder with the Apache Zookeeper service name obtained at the end of Step 1:

helm install kafka bitnami/kafka \
  --set zookeeper.enabled=false \
  --set replicaCount=3 \
  --set externalZookeeper.servers=ZOOKEEPER-SERVICE-NAME

This command will deploy a three-node Apache Kafka cluster and configure the nodes to connect to the Apache Zookeeper service. Wait for a few minutes until the chart is deployed and note the service name displayed in the output, as you will need this in the next step.

[Image: Kafka deployment]

Tip: If you previously deployed the Apache Zookeeper service with authentication enabled, you will need to add parameters to the command shown above so that the Apache Kafka pods can authenticate against the Apache Zookeeper service. Similarly, if you deploy the Apache Kafka service using a public load balancer, you will need to configure client authentication to protect the deployment from unauthorized access. Refer to the security section of the Apache Kafka container documentation for more information.
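
As the number of parameters grows, for example when adding authentication settings, it can be easier to keep them in a values file than in a long list of --set options. The sketch below writes the same three settings used in the Step 2 command to a file. The file name kafka-values.yaml and the ZOOKEEPER_SERVICE_NAME variable are assumptions made for illustration, and the fallback value zookeeper is only a placeholder.

```shell
# Assumption: ZOOKEEPER_SERVICE_NAME holds the service name noted at the end of Step 1;
# the fallback "zookeeper" is only a placeholder default.
ZOOKEEPER_SERVICE_NAME="${ZOOKEEPER_SERVICE_NAME:-zookeeper}"

# Write the same three settings passed with --set in Step 2 to a values file.
# The file name kafka-values.yaml is hypothetical.
cat > kafka-values.yaml <<EOF
zookeeper:
  enabled: false
replicaCount: 3
externalZookeeper:
  servers: ${ZOOKEEPER_SERVICE_NAME}
EOF

# Print the file for review before handing it to Helm.
cat kafka-values.yaml
```

The file can then be passed to Helm with the -f option, as in: helm install kafka bitnami/kafka -f kafka-values.yaml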

To confirm that the Apache Kafka and Apache Zookeeper deployments are connected, check the logs for any of the Apache Kafka pods and ensure that you see lines similar to the ones shown below, which confirm the connection:

[Image: Zookeeper-Kafka communication]
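
One way to perform this log check from the command line is sketched below. It assumes the first Kafka pod is named kafka-0 (the usual name for the first pod of a StatefulSet called kafka) and that the release runs in the default namespace. The helper name kafka_zk_log_lines is hypothetical, and the exact log wording depends on the Apache Kafka version, so the filter only looks for lines that mention Zookeeper.

```shell
# Hypothetical helper: print Zookeeper-related lines from one Kafka pod's log.
# Assumptions: pod "kafka-0" (first pod of a StatefulSet named "kafka"), namespace "default".
kafka_zk_log_lines() {
  local pod="${1:-kafka-0}"
  local namespace="${2:-default}"

  # Fetch the pod log and keep only lines that mention Zookeeper (case-insensitive).
  kubectl logs "${pod}" --namespace "${namespace}" | grep -i "zookeeper"
}
```

The function returns a non-zero status when no matching line is found, so it can also be used as a condition in a script.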

Step 3: Test Apache Kafka

At this point, your Apache Kafka cluster is ready for use. Test it as follows:

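  • Create a topic named mytopic using the commands below. Replace the ZOOKEEPER-SERVICE-NAME placeholder with the Apache Zookeeper service name obtained at the end of Step 1. Note that the --zookeeper option was removed from kafka-topics.sh in Apache Kafka 3.0; with those versions, pass --bootstrap-server KAFKA-SERVICE-NAME:9092 instead, using the Apache Kafka service name obtained at the end of Step 2: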

    export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=kafka,app.kubernetes.io/instance=kafka,app.kubernetes.io/component=kafka" -o jsonpath="{.items[0].metadata.name}")
    
    kubectl --namespace default exec -it $POD_NAME -- kafka-topics.sh --create --zookeeper ZOOKEEPER-SERVICE-NAME:2181 --replication-factor 1 --partitions 1 --topic mytopic
    
  • Start a Kafka message consumer. This consumer will connect to the cluster and retrieve and display messages as they are published to the mytopic topic. Replace the KAFKA-SERVICE-NAME placeholder with the Apache Kafka service name obtained at the end of Step 2:

    kubectl --namespace default exec -it $POD_NAME -- kafka-console-consumer.sh --bootstrap-server KAFKA-SERVICE-NAME:9092 --topic mytopic --consumer.config /opt/bitnami/kafka/conf/consumer.properties &
    
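  • Using a different console, start a Kafka message producer by running the command below, then enter some messages, each on a separate line. Replace the KAFKA-SERVICE-NAME placeholder with the Apache Kafka service name obtained at the end of Step 2. If the POD_NAME variable is not set in the new console, export it again first. The --broker-list option is deprecated in recent Apache Kafka releases in favor of --bootstrap-server: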

    kubectl --namespace default exec -it $POD_NAME -- kafka-console-producer.sh --broker-list KAFKA-SERVICE-NAME:9092 --topic mytopic --producer.config /opt/bitnami/kafka/conf/producer.properties
    > message 1
    > message 2
    ...
    

The messages should appear in the Kafka message consumer, as shown below. This indicates that the cluster is operational and applications can now use it for message production and consumption.

[Image: Kafka test]
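
The same check can also be run without two interactive consoles. The sketch below publishes two messages on standard input and then reads two messages back from the beginning of the topic. It assumes that POD_NAME is set as shown earlier, that the first argument is the Apache Kafka service name from Step 2, and that client authentication is disabled. The helper name kafka_smoke_test is hypothetical.

```shell
# Hypothetical helper: publish two messages to a topic, then read two messages back.
# Assumptions: POD_NAME is already exported and client authentication is disabled.
kafka_smoke_test() {
  local service="$1"            # Kafka service name from Step 2
  local topic="${2:-mytopic}"
  local namespace="${3:-default}"

  # Feed two messages to the console producer on standard input (-i, no TTY).
  printf 'message 1\nmessage 2\n' |
    kubectl --namespace "${namespace}" exec -i "${POD_NAME}" -- \
      kafka-console-producer.sh --broker-list "${service}:9092" --topic "${topic}" \
      --producer.config /opt/bitnami/kafka/conf/producer.properties || return 1

  # No --consumer.config here, so the consumer uses its own generated group and
  # --from-beginning takes effect. Exit after two messages, or after 30 seconds
  # without any message.
  kubectl --namespace "${namespace}" exec "${POD_NAME}" -- \
    kafka-console-consumer.sh --bootstrap-server "${service}:9092" --topic "${topic}" \
    --from-beginning --max-messages 2 --timeout-ms 30000
}
```

Because the consumer starts at the beginning of the topic, the two messages it prints are the oldest ones in the topic, which are not necessarily the two just published if the topic already held data.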

Step 4: Scale Apache Kafka

Deploying on Kubernetes makes it straightforward to scale out your Apache Kafka and Apache Zookeeper deployments. To illustrate, scale the Apache Kafka deployment up to seven nodes with the following command, again replacing the ZOOKEEPER-SERVICE-NAME placeholder with the Apache Zookeeper service name obtained at the end of Step 1:

helm upgrade kafka bitnami/kafka \
  --set zookeeper.enabled=false \
  --set replicaCount=7 \
  --set externalZookeeper.servers=ZOOKEEPER-SERVICE-NAME

Similarly, scale the Apache Zookeeper deployment up to five nodes. Keep the node count odd, since a Zookeeper ensemble needs a majority of its nodes to form a quorum:

helm upgrade zookeeper bitnami/zookeeper \
  --set replicaCount=5 \
  --set auth.enabled=false \
  --set allowAnonymousLogin=true

You now have a horizontally scalable Apache Kafka cluster running on Kubernetes.
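
To avoid repeating every parameter on each scale operation, and to confirm that the new pods are ready, the upgrade can be wrapped in a small helper. This is a sketch under stated assumptions: the release runs in the default namespace, the chart names its StatefulSet after the release, and the helper name scale_release is hypothetical. It relies on the --reuse-values option of helm upgrade, which keeps the values from the previous release and merges in only the overrides given on the command line.

```shell
# Hypothetical helper: change a release's replica count and wait for the rollout.
# Assumptions: namespace "default", StatefulSet named after the release.
scale_release() {
  local release="$1"   # for example: kafka
  local chart="$2"     # for example: bitnami/kafka
  local replicas="$3"  # for example: 7

  # --reuse-values keeps the settings from the previous install or upgrade,
  # so only replicaCount has to be given again.
  helm upgrade "${release}" "${chart}" --reuse-values \
    --set "replicaCount=${replicas}" || return 1

  # Wait until the StatefulSet reports all replicas ready (up to 10 minutes).
  kubectl rollout status "statefulset/${release}" --namespace default --timeout=600s
}
```

For example, scale_release kafka bitnami/kafka 7 performs the Apache Kafka scale-out described in this step. Note that Apache Kafka does not move existing partitions onto new brokers automatically; existing topics stay on their current brokers until their partitions are reassigned, for example with the kafka-reassign-partitions.sh tool.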
