
Kafka And Zookeeper Multi Node Cluster Setup

Kafka is a distributed streaming platform used to build real-time data pipelines and streaming apps. It is a good replacement for a traditional message broker, and it is a common choice for applications that process messages at large scale.

It is used by very large applications such as Twitter, LinkedIn, and Uber. ZooKeeper is a centralized service that maintains configuration information and naming, and provides distributed synchronization and group services. This blog explains how to set up a multi-node Kafka and ZooKeeper cluster for message streaming.

If you want high availability in a production environment, the Apache Kafka cluster must consist of multiple servers.

A ZooKeeper cluster stays available only while a majority of its nodes are up, so it is recommended to run ZooKeeper on an odd number of servers. A fourth node, for example, adds no fault tolerance over three, because both cluster sizes survive only one node failure.

In this blog, I’ll set up a Kafka and ZooKeeper cluster with 3 nodes.
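The majority rule can be checked with a little arithmetic. The sketch below is only an illustration; the function names `quorum` and `tolerated_failures` are made up for this example and are not part of ZooKeeper.

```shell
#!/bin/sh
# Majority quorum for an ensemble of n nodes: floor(n/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can fail while a majority is still up.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes.
for n in 3 4 5; do
    echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three and four nodes both tolerate a single failure, which is why the even-sized ensemble buys nothing extra.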

What is Kafka?

Kafka is used for building real-time data pipelines and streaming apps.

What is Zookeeper?

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.

 

Prerequisites

  1. Java installed (minimum version 1.7).
  2. The Kafka and ZooKeeper binary files.

Install Java

Install Java on all instances.

On Ubuntu:

Add the PPA using the command below:

 

Update the system package index and install the Java installer script:

 

On CentOS:

 

Check the version:

 

Install Zookeeper

Download the ZooKeeper binaries on all your instances and extract them.

 

Install Kafka

Download the Kafka binaries on all your instances and extract them.

 

Update zookeeper properties

Create the zookeeper.properties file on all instances using the command below.

 

Then update the zookeeper.properties file on all instances with the content below.

 

 

  • dataDir is the directory where ZooKeeper saves its data and logs.
  • clientPort, as the name suggests, is the port on which clients connect to the ZooKeeper service.
  • The x in server.x denotes the id of the node. Each server.x row must have a unique id. Each server is assigned its id through a file named myid, one per server, which resides in that server’s data directory, as specified by the dataDir parameter in the configuration file.
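As a rough illustration of the properties described above, the sketch below writes a three-node zookeeper.properties into a scratch directory. The IP addresses 10.0.0.1 to 10.0.0.3, the `CONF_DIR` variable, and the tickTime, initLimit, and syncLimit values are assumptions for this example; on a real node the file would go into the Kafka config directory with your own addresses.

```shell
#!/bin/sh
# Hypothetical scratch location used for illustration only.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
ZK_PROPS="$CONF_DIR/zookeeper.properties"

cat > "$ZK_PROPS" <<'EOF'
# Directory where ZooKeeper keeps its data and the myid file
dataDir=/tmp/zookeeper
# Port on which clients (including Kafka brokers) connect
clientPort=2181
# Basic time unit in milliseconds; initLimit and syncLimit are counted in ticks
tickTime=2000
initLimit=10
syncLimit=5
# One entry per node: server.<id>=<address>:<follower port>:<election port>
server.1=10.0.0.1:2888:3888
server.2=10.0.0.2:2888:3888
server.3=10.0.0.3:2888:3888
EOF

echo "wrote $ZK_PROPS"
```

The same server.x lines appear on every node; only the myid file differs from node to node.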

Create myid file:
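A minimal sketch of this step, assuming the node id is passed in through a hypothetical `NODE_ID` variable. `DATA_DIR` defaults to a scratch directory here; on a real node it has to be the same path as dataDir in zookeeper.properties.

```shell
#!/bin/sh
# NODE_ID is a hypothetical variable: set it to 1, 2 or 3 to match this node's server.x line.
NODE_ID="${NODE_ID:-1}"
# DATA_DIR must equal dataDir in zookeeper.properties; a scratch directory is the default here.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"

mkdir -p "$DATA_DIR"
# The file holds only the id, nothing else.
echo "$NODE_ID" > "$DATA_DIR/myid"
cat "$DATA_DIR/myid"
```

For example, `NODE_ID=2 DATA_DIR=/tmp/zookeeper sh create-myid.sh` would be run on the second node, where `create-myid.sh` is whatever name you save the script under.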

 

 

  • The ports :2888:3888 at the end of each server.x entry should be left unchanged. Follower nodes use the first port (2888) to connect to the leader, and the second port (3888) is used for leader election.
  • x.x.x.x is the IP address of each node; it is better to use the private IP here. If you use public IPs, replace the current node’s own IP with 0.0.0.0 on each node.

 

Update Kafka server properties

Update the Kafka server.properties file on all instances with the content below. The file is located at /opt/kafka/config/server.properties.

 

 

  • broker.id is the unique and permanent name of each node in the cluster.
  • advertised.host.name is the hostname published to ZooKeeper for clients to use.
  • zookeeper.connect specifies the ZooKeeper connection string in the form hostname:port, where hostname (preferably the private IP of each node) and port are the host and port of a ZooKeeper server.
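To make the three properties concrete, here is a hedged sketch that writes a minimal server.properties for one broker into a scratch directory. The variables `BROKER_ID`, `NODE_IP`, `ZK_CONNECT`, and `CONF_DIR`, and the 10.0.0.x addresses, are placeholders invented for this example. Only the lines discussed above plus log.dirs are shown; a real file carries more settings.

```shell
#!/bin/sh
# All values below are illustrative placeholders; change them per node.
BROKER_ID="${BROKER_ID:-1}"
NODE_IP="${NODE_IP:-10.0.0.1}"
ZK_CONNECT="${ZK_CONNECT:-10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181}"
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
KAFKA_PROPS="$CONF_DIR/server.properties"

cat > "$KAFKA_PROPS" <<EOF
# Must be different on every broker
broker.id=$BROKER_ID
# Address that clients are told to connect to
advertised.host.name=$NODE_IP
# Directory where the broker stores its log segments
log.dirs=/tmp/kafka-logs
# Comma-separated list of ZooKeeper nodes (host:port)
zookeeper.connect=$ZK_CONNECT
EOF

echo "wrote $KAFKA_PROPS"
```

On the second and third nodes only `BROKER_ID` and `NODE_IP` would change; the zookeeper.connect list stays the same. Note that newer Kafka releases use advertised.listeners instead of advertised.host.name.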

Start the services

Before starting the Kafka service, start the ZooKeeper service on all instances using the command below:

 


Check the ZooKeeper status using the command below. One of the ZooKeeper nodes will act as the leader, and the remaining nodes will be followers.

 

Start the Kafka service on all instances using the command below:
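One possible way to wrap the start and status steps is as small shell functions. This is a sketch, not necessarily the exact commands used for this post: the install paths are the ones that appear in a reader's log in the comments, and the function names are invented for the example.

```shell
#!/bin/sh
# Assumed install paths; adjust them to where the archives were extracted.
ZK_HOME="${ZK_HOME:-/opt/zookeeper/zookeeper-3.4.10}"
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka/kafka_2.11-0.10.1.1}"
ZK_CONFIG="${ZK_CONFIG:-$KAFKA_HOME/config/zookeeper.properties}"

# Start ZooKeeper with the shared properties file.
zk_start() {
    "$ZK_HOME/bin/zkServer.sh" start "$ZK_CONFIG"
}

# Pass the same file to status; without it zkServer.sh reads conf/zoo.cfg instead.
zk_status() {
    "$ZK_HOME/bin/zkServer.sh" status "$ZK_CONFIG"
}

# Start the Kafka broker in the background.
kafka_start() {
    "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon "$KAFKA_HOME/config/server.properties"
}
```

After sourcing the file, run `zk_start` on every node, check `zk_status` on each, and only then run `kafka_start`.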

 

Test the services from terminal

Create a topic using the command below:

 

Get the list of topics using the command below:

 

Consume messages using the command below:

 

The consumer keeps listening. At this point there are no messages in the topic yet.

Note: use the --zookeeper option with any one of, or a list of, the cluster nodes’ IP addresses or domain names when creating topics, listing topics, and consuming messages.

Produce messages using the command below (open a new terminal and try it).

 

The producer waits for input. Type a message, press Enter, and then check the consumer terminal opened earlier.

These operations can also be tested from systems outside the cluster if the advertised.host.name property in the Kafka server properties is set to a public IP. For more details about Kafka and ZooKeeper, feel free to connect with us.
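The four test steps can be sketched as shell functions around the scripts that ship with Kafka. This assumes a Kafka 0.10.x install (the version seen in the comments), where these tools still accept --zookeeper; newer releases use --bootstrap-server instead. The addresses, the topic name `test-topic`, and the function names are placeholders for this example.

```shell
#!/bin/sh
# Assumed install path and placeholder addresses; adjust to your cluster.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka/kafka_2.11-0.10.1.1}"
ZK_CONNECT="${ZK_CONNECT:-10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181}"
BROKER_LIST="${BROKER_LIST:-10.0.0.1:9092,10.0.0.2:9092,10.0.0.3:9092}"
TOPIC="${TOPIC:-test-topic}"

# Create a topic replicated across all three brokers.
create_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --create --zookeeper "$ZK_CONNECT" \
        --replication-factor 3 --partitions 1 --topic "$TOPIC"
}

# List the topics known to the cluster.
list_topics() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --list --zookeeper "$ZK_CONNECT"
}

# Read the topic from the start and keep listening for new messages.
consume() {
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" --zookeeper "$ZK_CONNECT" \
        --topic "$TOPIC" --from-beginning
}

# Send each line typed on stdin as one message.
produce() {
    "$KAFKA_HOME/bin/kafka-console-producer.sh" --broker-list "$BROKER_LIST" --topic "$TOPIC"
}
```

Source the file, run `create_topic` and `list_topics` once, then run `consume` in one terminal and `produce` in another.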

For regular updates on technology, follow our blog.


Reddy Sai

Young Senior Software Developer having 3+years of experience in Web development, skilled in Golang, Ruby, AngularJS, Ruby on Rails & NodeJS. With a handful of skills, he keeps thriving through all the Edge of Web development to become a perfect back end developer. Besides, this young techie will always buzz up & surprise us with his beautiful Movie collections.

7 thoughts on “Kafka And Zookeeper Multi Node Cluster Setup”

  1. Hi,
    these instructions are great. I was trying to install on AWS centos instance. When I tried to get status of zookeeper, it was giving me it is probably not running.
    Please check below and please please help me to resolve this to make zookeeper and kafka up and running.
    [root@ip-172-31-81-231 bin]# ./zkServer.sh start
    ZooKeeper JMX enabled by default
    Using config: /opt/zookeeper/zookeeper-3.4.10/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [root@ip-172-31-81-231 bin]# ./zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /opt/zookeeper/zookeeper-3.4.10/bin/../conf/zoo.cfg
    Error contacting service. It is probably not running.
    [root@ip-172-31-81-231 bin]# cd ../../..
    [root@ip-172-31-81-231 opt]# cd ..
    [root@ip-172-31-81-231 /]# cd /tmp
    [root@ip-172-31-81-231 tmp]# cd zookeeperlogs
    [root@ip-172-31-81-231 zookeeperlogs]# ll
    total 8
    -rw-r--r-- 1 root root 33 Apr 2 17:20 myid
    -rw-r--r-- 1 root root 4 Apr 2 18:14 zookeeper_server.pid
    [root@ip-172-31-81-231 zookeeperlogs]# vi zookeeper_server.pid
    [root@ip-172-31-81-231 zookeeperlogs]# cd ../..
    [root@ip-172-31-81-231 /]# cd /opt/zookeeper/zookeeper-3.4.10/bin
    [root@ip-172-31-81-231 bin]# nohup ./zkServer.sh start /opt/kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties &
    [1] 2554
    [root@ip-172-31-81-231 bin]# nohup: ignoring input and appending output to 'nohup.out'
    [1]+ Done nohup ./zkServer.sh start /opt/kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties
    [root@ip-172-31-81-231 bin]# ps -ef | grep -i zookeeper
    root 2572 2388 0 18:20 pts/0 00:00:00 grep -i zookeeper
    [root@ip-172-31-81-231 bin]# ps -ef | grep -i zookeeper
    root 2576 2388 0 18:24 pts/0 00:00:00 grep -i zookeeper
    [root@ip-172-31-81-231 bin]# cd ../../..
    [root@ip-172-31-81-231 opt]# cd ..
    [root@ip-172-31-81-231 /]# cd /opt
    [root@ip-172-31-81-231 opt]# ./zookeeper/zookeeper-3.4.10/bin/zkServer.sh start kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties
    ZooKeeper JMX enabled by default
    Using config: kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties
    Starting zookeeper ... STARTED
    [root@ip-172-31-81-231 opt]# ./zookeeper/zookeeper-3.4.10/bin/zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /opt/zookeeper/zookeeper-3.4.10/bin/../conf/zoo.cfg
    Error contacting service. It is probably not running.
    [root@ip-172-31-81-231 opt]# nohup ./zookeeper/zookeeper-3.4.10/b
    bin/ build.xml
    [root@ip-172-31-81-231 opt]# nohup ./zookeeper/zookeeper-3.4.10/bin/zkServer.sh start kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties&
    [1] 2612
    [root@ip-172-31-81-231 opt]# nohup: ignoring input and appending output to 'nohup.out'
    [1]+ Done nohup ./zookeeper/zookeeper-3.4.10/bin/zkServer.sh start kafka/kafka_2.11-0.10.1.1/config/zookeeper.properties
    [root@ip-172-31-81-231 opt]# ^C
    [root@ip-172-31-81-231 opt]# ps -ef | grep -i zookeeper
    root 2626 2388 0 18:33 pts/0 00:00:00 grep -i zookeeper

    1. Hi Roopa,
      Please set an appropriate directory path, dataDir=/tmp/zookeeper, in the zookeeper.properties file, and check that the myid file contains only the unique id you declared. You can also check the instructions below:
      The value of dataDir with the directory where you would like ZooKeeper to save its data and log respectively.
      clientPort property, as the name suggests, is for the clients to connect to ZooKeeper Service.
      x in server.x denotes the id of node. Each server.x row must have unique id. Each server is assigned an id by creating a file named myid, one for each server, which resides in that server’s data directory, as specified by the configuration file parameter dataDir.
      Hope this will work, Thank You!

  2. How can I set up a multi-server Kafka cluster between machines with different public IP addresses? I can only configure server.properties for it, but I don’t know how to configure zookeeper.properties, create a topic, or start a producer and a consumer.

    1. Hi Akshay,
      In the blog, I clearly explained how to set up a multiserver Kafka cluster between machines of various private IP addresses. If you are trying to use public IP, then the current node IP should be replaced with 0.0.0.0 in each node.
      To configure zookeeper.properties, create a topic, start a producer and a consumer, I have already given examples.
      If you are struggling anywhere, let us know we are always ready to help. Thank you!

  3. Hi,
    Just a doubt..
    For production I need to setup 3 servers – server 1, server 2 , server 3 ( very similar to your example )
    Should i install zookeeper and kafka in all these 3 servers also or only the zookeeper is enough ?
    If yes will the config file ( zookeeper.properties and server.properties) be the same in all 3 servers ?
    Please help me in this regard.
    Thanks in advance !

    1. Yes! You have to install ZooKeeper and Kafka on all 3 servers. The configuration will be the same on all servers, apart from a few values that must be unique per node, such as the broker id in server.properties and the ZooKeeper node id declared for each server.x entry in zookeeper.properties. The configuration is explained in the post; please follow it.
