Bitnami Kafka for Google Multi-Tier Solutions

Description

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.

What are the differences between a Bitnami Single-Tier Solution and Multi-Tier Solution?

Single-tier architecture implies that all the required components of an application run on a single server. If your environment is growing and becoming more complex, a single-tier architecture will not meet your scalability requirements. Single-Tier Solutions are great for departmental applications, smaller production environments, new users, or those applications that don't support multi-tier architectures.

The typical architecture of a Bitnami Single-Tier Solution looks like this:

Single-tier architecture

Multi-tier architecture involves more than one server and infrastructure resource. For example, the Front End-Database topology separates the application server from the database server. This allows you to extend workloads in the cloud and tailor your application to meet specific scalability and reliability goals. Multi-Tier Solutions provide more sophisticated deployment topologies for improved scalability and reliability for larger production or mission critical environments.

TIP: Not sure if you have chosen the right solution? Check out the Bitnami Multi-Tier solutions features and benefits to learn more about the benefits of Multi-Tier.

This Bitnami Multi-Tier Solution uses several Kafka brokers, with Zookeeper nodes to manage the cluster. This topology is illustrated below:

Multi-tier architecture

First steps with the Bitnami Kafka Stack

Welcome to your new Bitnami application running on Google Cloud Platform! Here are a few questions (and answers!) you might need when first starting with your application.

What credentials do I need?

You need two sets of credentials:

  • The application credentials, consisting of a username and password. These credentials allow you to log in to your new Bitnami application.

  • The server credentials, consisting of an SSH username and key. These credentials allow you to log in to your Google Cloud Platform server using an SSH client and execute commands on the server using the command line.

What is the administrator username set for me to log in to the application for the first time?

Username: user

What is the administrator password?

A random password was assigned when you first launched the application through the Google Cloud Launcher. Refer to the FAQ to learn how to find the application credentials.

What SSH username should I use for secure shell access to my application?

SSH username: bitnami

How do I get my SSH key or password?

You will need to create and associate an SSH key pair with the server(s). Use the same key pair for secure shell access to the server. Click here for more information.
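
As a minimal sketch, assuming you have the Google Cloud SDK installed and are authenticated, you can also let the gcloud client generate and propagate an SSH key pair for you the first time you connect (the instance name and zone below are placeholders):

  $ gcloud compute ssh bitnami@INSTANCE-NAME --zone us-central1-f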

How to start or stop the services?

NOTE: The steps below require you to execute the commands on the remote server. Please check our FAQ for instructions on how to connect to your server through SSH.

Each Bitnami server includes a control script that lets you easily stop, start and restart all the services installed on the current individual server.

Obtain the status of a service with the service bitnami status command:

$ sudo service bitnami status

Use the service bitnami command to start, stop or restart all the services in a similar manner:

  • Start all the services.

    $ sudo service bitnami start
    
  • Stop all the services.

    $ sudo service bitnami stop
    
  • Restart all the services.

    $ sudo service bitnami restart
    
TIP: To start, restart or stop individually each server of the cluster, check the FAQ section about how to start or stop servers in a Multi-Tier Solution.
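
For example, assuming the node includes the Bitnami control script used later in this guide, you can act on a single service on that node (this is a sketch; the script location may differ depending on your image version):

  # Restart only the Kafka service on the current node
  $ sudo /opt/bitnami/ctlscript.sh restart kafka
  # Check the status of the services managed by the script
  $ sudo /opt/bitnami/ctlscript.sh status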

What is the default configuration?

Kafka default configuration

Kafka configuration files

The Kafka configuration files are located at the /opt/bitnami/kafka/config/ directory.

Kafka ports

Each Kafka server has a single broker running on port 9092. Only connections from the local network are allowed.
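
To quickly verify from the node itself that the broker is listening on that port, you can run a check like the following (a sketch, assuming the ss utility is available on the image):

  $ sudo ss -tlnp | grep 9092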

Kafka log files

The Kafka log files are created at the /opt/bitnami/kafka/logs/ directory.

Zookeeper default configuration

Zookeeper configuration files

The Zookeeper configuration files are located at the /opt/bitnami/zookeeper/conf/ directory.

Zookeeper ports

By default, the Zookeeper server runs on port 2181. Only connections from the local network are allowed.
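
Similarly, you can confirm that Zookeeper is responding on port 2181 with the standard ruok four-letter command (a sketch, assuming netcat is installed; depending on the Zookeeper version, four-letter-word commands may need to be whitelisted):

  $ echo ruok | nc localhost 2181
  imok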

How is the cluster configured?

The Bitnami Multi-Tier Solution for Kafka uses multiple VMs, forming a cluster of one or more brokers that provides a horizontally scalable Kafka deployment in which topic data is replicated across the selected nodes. The cluster is configured as follows:

  • ZooKeeper instances: These manage the cluster and run on their own nodes, independently from the Kafka brokers.
  • Disk: Each Kafka node has a 30GB SSD data disk attached (the default size, which is configurable).
  • Authentication: Client, inter-broker, and Zookeeper authentication are enabled.

How to check cluster status?

You can check the cluster status by running the following command:

$ zkCli.sh -server ZOOKEEPER_PRIMARY_NODE:2181 ls /brokers/ids | tail -n1

ZOOKEEPER_PRIMARY_NODE is a placeholder: substitute it with the name of your first ZooKeeper node. For instance, if your deployment is called my-first-deploy, replace the placeholder with my-first-deploy-zk-0 to access your first ZooKeeper node.

You will see an output like this:

[1002, 1001]

The number of elements in the array should match the number of Kafka nodes in your deployment.
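
If you need more detail about a specific broker, you can also query its registration data in ZooKeeper. This is a sketch using one of the broker ids from the example output above:

  $ zkCli.sh -server ZOOKEEPER_PRIMARY_NODE:2181 get /brokers/ids/1001

The reply is a small JSON document with the broker's host, port and listener information.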

To check the status of your topics:

  • Export the authentication configuration:

     $ export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/bitnami/kafka/conf/kafka_jaas.conf"
    
  • And execute:

    $ kafka-topics.sh --describe --zookeeper ZOOKEEPER_PRIMARY_NODE:2181

The output should be similar to this:

Topic:9ckbm157djyhyh2r9lwg0hpvi	PartitionCount:1	ReplicationFactor:1	Configs:
    Topic: 9ckbm157djyhyh2r9lwg0hpvi	Partition: 0	Leader: 1002	Replicas: 1002	Isr: 1002
Topic:test0910551	PartitionCount:2	ReplicationFactor:2	Configs:
   Topic: test0910551	Partition: 0	Leader: 1002	Replicas: 1001,1002	Isr: 1002
   Topic: test0910551	Partition: 1	Leader: 1002	Replicas: 1002,1001	Isr: 1002
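
As a related check, kafka-topics.sh can also list only the partitions that are currently under-replicated; in a healthy cluster this should return no rows (a sketch, using the same authentication export as above):

  $ kafka-topics.sh --describe --zookeeper ZOOKEEPER_PRIMARY_NODE:2181 --under-replicated-partitions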

How to connect to cluster nodes?

Some operations, such as changing the application password, must be repeated on each cluster node for the change to take effect across the whole cluster. Follow the steps below to connect to the cluster nodes in your Google Cloud deployment:

  • Log in to the Google Cloud Console and select your project.
  • Navigate to the "Compute Engine -> VM instances" page.

    Select your instances

  • You will see a list of all the instances launched in your project. Click on the instance that corresponds to the node you want to connect to.

    Select the node you want to connect

  • Once you are in the "VM instance details" page, go to the "Remote access" section and click the "SSH" button.

    Connect through SSH to the selected node

    Now you should be connected to the node you have selected:

    Connect through SSH to the selected node

NOTE: Remember to repeat the same operation to connect to each cluster node.

How to add nodes to the cluster?

IMPORTANT: These steps assume that you have already installed the Google Cloud SDK and you are signed in to the Google Cloud Platform through the gcloud command-line client. If this is not the case, please refer to the Google Cloud SDK documentation for instructions on how to install and use the command-line client.

To add nodes to the Kafka cluster, follow these steps:

  • Log in to the Google Cloud Console.
  • Browse to the Deployment Manager and select the deployment to which you wish to add nodes.
  • In the deployment overview, review the deployment properties and click to view the "Expanded Config" deployment configuration file.

    Expanded configuration

  • Copy or download the contents of the "Expanded Config" file to the server with the Google Cloud SDK as expanded-config.yaml.
  • Edit the file and add configuration for one or more additional nodes by copying the configuration and metadata for an existing node and its corresponding data disk. Then update the copied configuration to use a unique name for the new node(s) and data disk(s).

    To add a new Kafka node to a Kafka cluster, here is an abridged example of the configuration and metadata that you would update to add a new node and data disk. To create a unique name for the new node, you would typically replace the XX placeholder in the node name with a number.

    NOTE: The code block below is an illustrative example and may differ in your specific deployment. You should always copy the code block from your deployment's current configuration file.
      [...]
      - metadata:
          dependsOn:
          - kafka-mt-kafka-0
          - kafka-mt-kafka-1
          - kafka-mt-kafka-XX
          - kafka-mt-config
        name: kafka-multivm-software
       [...]
       - name: kafka-mt-kafka-XX
         properties:
           bootDiskType: pd-standard
           canIpForward: false
           disks:
           - autoDelete: true
             boot: true
             deviceName: kafka-mt-kafka-XX-boot
             initializeParams:
               diskType: https://www.googleapis.com/compute/v1/projects/bitnamigcetest2/zones/us-central1-f/diskTypes/pd-standard
               sourceImage: projects/bitnamigcetest2/global/images/kafka-mt
             type: PERSISTENT
           - autoDelete: true
             boot: false
             deviceName: kafka-mt-kafka-XX-data
             source: $(ref.kafka-mt-kafka-XX-data.selfLink)
             type: PERSISTENT
           [...]
             - key: PROVISIONER_DATA_DISK
               value: kafka-mt-kafka-XX-data
           [...]
           tags:
             items:
             - kafka-mt-kafka-XX
           zone: us-central1-f
         type: compute.v1.instance
         metadata:
           dependsOn:
           - kafka-mt-kafka-XX-data
       - name: kafka-mt-kafka-XX-data
         properties:
           sizeGb: 30
           type: https://www.googleapis.com/compute/v1/projects/bitnamigcetest2/zones/us-central1-f/diskTypes/pd-ssd
           zone: us-central1-f
         type: compute.v1.disk
    
  • Preview the updated deployment with the command below, replacing the DEPLOYMENT-ID placeholder with the name of your deployment.

      $ gcloud deployment-manager deployments update DEPLOYMENT-ID --config expanded-config.yaml --preview
    
  • Once you have verified that the deployment preview is correct, confirm the deployment and initialize the new node(s):

      $ gcloud deployment-manager deployments update DEPLOYMENT-ID
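
Once the update finishes and the new node(s) have been provisioned, you can confirm that the additional broker(s) registered themselves in ZooKeeper by repeating the cluster status check described earlier, for example:

  $ zkCli.sh -server ZOOKEEPER_PRIMARY_NODE:2181 ls /brokers/ids | tail -n1

The array in the output should now contain one broker id per Kafka node, including the ones you just added.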
    

How to connect to Kafka from a different machine?

Connecting to Kafka from the same network

IMPORTANT: We strongly discourage opening ports to allow inbound connections to the server from a different network. Making the application's network ports public is a significant security risk. The recommended way for connecting two instances deployed in different networks is by using VPC network peering. If you must make it accessible over a public IP address, we recommend restricting access to a trusted list of source IP addresses and ports using firewall rules. To do so, follow the instructions below.

To connect to the Kafka cluster from the same network where it is running, you just need to use a Kafka client and access port 9092. You can find an example using the built-in Kafka client in the "How to run a Kafka producer and consumer from the server itself?" section.

NOTE: Remember that you can find the required configuration parameters in the /opt/bitnami/kafka/conf/kafka_jaas.conf file. You may also need some parameters from /opt/bitnami/kafka/conf/producer.properties in order to produce messages and from /opt/bitnami/kafka/conf/consumer.properties in order to consume them.

Connecting to Kafka from a different network

If you must connect to the application from a machine that is not running in the same network as the Kafka cluster, you can follow one of these approaches (listed from the most secure to the least recommended):

  • Option 1: Peer the virtual networks so that both machines can reach each other over private IP addresses. See the section below about connecting instances hosted in separate virtual networks or VPCs.

  • Option 2: Create an SSH tunnel from the remote machine to the server and connect to Kafka through it, as shown in the sketch after this list.

    NOTE: You should only access using an SSH tunnel if you wish to temporarily connect to, or use, the Kafka console. This approach is not recommended to permanently connect your application to the Kafka cluster, as a connectivity failure in the SSH tunnel would affect your application's functionality.

  • Option 3: Make the server publicly accessible and restrict access to a trusted list of source IP addresses using firewall rules. Refer to the FAQ for information on opening ports in the server firewall.
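
As a minimal sketch of Option 2, assuming your SSH private key is stored locally as bitnami-key.pem (a hypothetical file name), you can forward the broker port over SSH and point your local Kafka client at localhost:9092 while the tunnel is running:

  $ ssh -N -L 9092:127.0.0.1:9092 -i bitnami-key.pem bitnami@SERVER-IP

Keep in mind that Kafka clients also use the broker's advertised listeners after the initial connection, so this approach works best for quick tests with the console tools.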

How to connect instances hosted in separate virtual networks or VPCs?

The Google Cloud Platform makes it possible to connect instances hosted in separate Virtual Private Clouds (VPCs), even if those instances belong to different projects or are hosted in different regions. This feature, known as VPC Network Peering, can result in better security (as services do not need to be exposed on public IP addresses) and performance (due to use of private, rather than public, networks and IP addresses).

Learn more about VPC Network Peering.
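
For reference, a peering can also be created from the command line with the Google Cloud SDK. The following is a sketch with placeholder network and project names; the exact flags may vary with your gcloud version, and the equivalent command must be run from the other network (or project) so that both sides of the peering are established:

  $ gcloud compute networks peerings create my-peering \
      --network my-network \
      --peer-network their-network \
      --peer-project their-project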

How to change Kafka password?

You can change the Kafka password at any time by editing the /opt/bitnami/kafka/conf/kafka_jaas.conf file. To do so, follow these instructions:

NOTE: This procedure must be done in each Kafka node you have configured in your cluster. Check the how to connect to cluster nodes section for further information on this.
  • Connect to the server through SSH.
  • Stop all the services:

    $ sudo service bitnami stop
    
  • Edit the /opt/bitnami/kafka/conf/kafka_jaas.conf file and change the password for both the Kafka Client and the Kafka Server sections, as shown below (see also the sketch after these steps):

    Change Kafka password

  • Restart the services for the changes to take effect:

    $ sudo service bitnami restart
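
If you prefer to make the change non-interactively on each node, a minimal sketch (assuming the current password appears literally in the file and using placeholder values) is to replace it with sed before restarting the services:

  # Replace every occurrence of the old password with the new one (placeholders)
  $ sudo sed -i 's/OLD_PASSWORD/NEW_PASSWORD/g' /opt/bitnami/kafka/conf/kafka_jaas.conf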
    

How to create a Kafka multi-broker cluster?

This section describes the creation of a multi-broker Kafka cluster with brokers located on different hosts. In this scenario:

  • One server hosts the Zookeeper server and a Kafka broker
  • The second server hosts a second Kafka broker
  • The third server hosts a producer and a consumer

Kafka cluster

NOTE: Before beginning, ensure that ports 2181 (Zookeeper) and 9092 (Kafka) are open on the first server and port 9092 (Kafka) is open on the second server. Also ensure that remote connections are possible between the three servers (instructions).

Configuring the first server (Zookeeper manager and Kafka broker)

The default configuration may be used as is. However, you must perform the steps below:

  • Delete the contents of the Zookeeper and Kafka temporary directories

     $ sudo rm -rf /opt/bitnami/kafka/tmp/kafka-logs
     $ sudo rm -rf /opt/bitnami/zookeeper/tmp/zookeeper
    
  • Restart the Kafka and Zookeeper services.

     $ sudo /opt/bitnami/ctlscript.sh restart kafka
     $ sudo /opt/bitnami/ctlscript.sh restart zookeeper
    

Configuring the second server (Kafka broker)

  • Edit the /opt/bitnami/kafka/config/server.properties configuration file and update the broker.id parameter.

     broker.id = 1
    

    This broker id must be unique within the Kafka cluster.

  • In the same file, update the zookeeper.connect parameter to reflect the public IP address of the first server.

     zookeeper.connect=PUBLIC_IP_ADDRESS_OF_ZOOKEEPER_MANAGER:2181
    
  • Delete the contents of the Zookeeper and Kafka temporary directories

     $ sudo rm -rf /opt/bitnami/kafka/tmp/kafka-logs
     $ sudo rm -rf /opt/bitnami/zookeeper/tmp/zookeeper
    
  • Stop the Zookeeper service.

     $ sudo /opt/bitnami/ctlscript.sh stop zookeeper
    
  • Restart the Kafka service.

     $ sudo /opt/bitnami/ctlscript.sh restart kafka
    

Configuring the third server (Kafka message producer/consumer)

  • Edit the /opt/bitnami/kafka/config/producer.properties file and update the metadata.broker.list parameter with the public IP addresses of the two brokers:

     metadata.broker.list=PUBLIC_IP_ADDRESS_OF_FIRST_KAFKA_BROKER:9092, PUBLIC_IP_ADDRESS_OF_SECOND_KAFKA_BROKER:9092
    
  • Edit the /opt/bitnami/kafka/config/consumer.properties file and update the zookeeper.connect parameter to reflect the public IP address of the first server.

     zookeeper.connect=PUBLIC_IP_ADDRESS_OF_ZOOKEEPER_MANAGER:2181
    
  • Since this host only serves as a producer and a consumer, stop the Kafka and Zookeeper services:

     $ sudo /opt/bitnami/ctlscript.sh stop kafka
     $ sudo /opt/bitnami/ctlscript.sh stop zookeeper
    

Testing the cluster

NOTE: The following commands should be executed on the third server (Kafka message producer/consumer).
  • Create a new topic.

     $ /opt/bitnami/kafka/bin/kafka-topics.sh --create --zookeeper PUBLIC_IP_ADDRESS_OF_FIRST_KAFKA_BROKER:2181 --replication-factor 2 --partitions 1 --topic multiBroker
    
  • Produce some messages by running the command below and then entering some messages, each on a separate line. Enter Ctrl-C to end.

     $ /opt/bitnami/kafka/bin/kafka-console-producer.sh --broker-list PUBLIC_IP_ADDRESS_OF_FIRST_KAFKA_BROKER:9092 --topic multiBroker
     this is a message
     this is another message
     ^C
    
  • Consume the messages. The consumer will connect to the cluster and retrieve and display the messages you entered in the previous step.

     $ /opt/bitnami/kafka/bin/kafka-console-consumer.sh --zookeeper PUBLIC_IP_ADDRESS_OF_FIRST_KAFKA_BROKER:2181 --topic multiBroker --from-beginning
     this is a message
     this is another message
     ^C
    

How to run a Kafka producer and consumer from the server itself?

To publish and collect your first message you can follow these instructions:

  • Export the authentication configuration:

     $ export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/bitnami/kafka/conf/kafka_jaas.conf"
    
  • Declare a new topic:

     $ /opt/bitnami/kafka/bin/kafka-topics.sh --create --zookeeper ZOOKEEPER_PRIMARY_NODE:2181 --replication-factor 1 --partitions 1 --topic test
    

    Where ZOOKEEPER_PRIMARY_NODE is a placeholder that must be substituted with the name of your first ZooKeeper node. For instance, if your deployment is called my-first-deploy, then you must use my-first-deploy-zk-0 to access your first ZooKeeper node. --replication-factor indicates how many servers will keep a copy of the log, and --partitions sets the number of partitions created for the topic.

  • Start a new producer on the same Kafka server and generate a message in the topic. Remember to replace SERVER-IP with your server's public IP address.

     $ /opt/bitnami/kafka/bin/kafka-console-producer.sh --broker-list SERVER-IP:9092 --producer.config /opt/bitnami/kafka/conf/producer.properties --topic test
    
     this is my first message
     and this one my second
    
  • Press Ctrl-D when you finish entering your messages.

  • Collect and display the first message in the consumer:

     $ /opt/bitnami/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --consumer.config /opt/bitnami/kafka/conf/consumer.properties --from-beginning
    

Troubleshooting

If you get a message like this:

  org.apache.kafka.common.KafkaException: Failed to construct kafka producer
   at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:433)
   at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:291)
   at kafka.producer.NewShinyProducer.<init>(BaseProducer.scala:40)
   at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:49)
   at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)
  Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
   at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
   at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
   at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:78)
   at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:103)
   at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:61)
   at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:86)
   at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:390)
   [...]

or like this one:

  [ERROR Unknown error when running consumer:  (kafka.tools.ConsoleConsumer$)
  org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
   at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:781)
   at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:635)
   at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:617)
   at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:61)
   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:78)
   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:54)
   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
  Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
   at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
   at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
   at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:78)
   at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:103)
   at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:61)
   at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:86)
   at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:702)

it is likely that you have not configured authentication. Run the following command to set the KAFKA_OPTS environment variable so that the client loads the kafka_jaas.conf file with the required credentials:

  $ export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/bitnami/kafka/conf/kafka_jaas.conf"

Alternatively, if you are using your own client, make sure you are passing the authentication parameters correctly.
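
Since the export only applies to the current shell session, you may want to persist it. A minimal sketch, assuming a bash login shell for the bitnami user, is:

  $ echo 'export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/bitnami/kafka/conf/kafka_jaas.conf"' >> ~/.bashrc
  $ source ~/.bashrc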

How to debug Kafka and Zookeeper errors?

The main Kafka log file is created at /opt/bitnami/kafka/logs/server.log.

The main Zookeeper log file is created at /opt/bitnami/zookeeper/logs/zookeeper.out.
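
To follow either log in real time while reproducing an error, you can use tail, for example:

  $ sudo tail -f /opt/bitnami/kafka/logs/server.log
  $ sudo tail -f /opt/bitnami/zookeeper/logs/zookeeper.out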
