This tutorial provides step-by-step instructions to configure and start an Apache ZooKeeper 3.4.6 multi-node cluster (also known as an ensemble).
At its core, Apache ZooKeeper provides an API for managing application state in a concurrent, distributed, read-dominant environment. It is optimized for, and performs well in, scenarios where read operations greatly outnumber write operations.
This article assumes that you have a basic understanding of the technical architecture and components of Apache ZooKeeper. If you are completely new to Apache ZooKeeper, you are strongly recommended to read the article Introduction to Apache ZooKeeper first.
The first thing we need in order to install an Apache ZooKeeper cluster is multiple machines. In this tutorial, we will use the following virtual machines:
|Parameter Name|Virtual Machine 1|Virtual Machine 2|
|---|---|---|
|No. of CPU Cores|4|4|
|RAM|6 GB|6 GB|
Apart from the machines above, please make sure that the following prerequisites are fulfilled so that you can follow this article without any issues:
- JDK 6 or higher installed on all the virtual machines
- JAVA_HOME variable set to the path where JDK is installed
- Root access on all the virtual machines as all the steps should ideally be performed by root user
- Updated /etc/hosts file on both virtual machines, containing the IP address and hostname of the other virtual machine. E.g. /etc/hosts on VM1 needs the IP address of VM2 along with its hostname (VM2). In my case, the additional line in the VM1 hosts file is 192.168.111.132 VM2.
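The hosts entry from the last prerequisite can also be added from the shell. The following is a minimal sketch, not part of ZooKeeper itself: `add_host_entry` is a hypothetical helper name, and the hosts file path is an optional parameter so that the function can be tried on a scratch copy before it touches /etc/hosts.

```shell
#!/bin/sh
# add_host_entry is a hypothetical helper, not a ZooKeeper or system command.
# Usage: add_host_entry <ip> <hostname> [hosts-file]
add_host_entry() {
  ip="$1"
  name="$2"
  file="${3:-/etc/hosts}"
  # Do nothing if a non-comment line already maps this IP to this hostname.
  if awk -v ip="$ip" -v name="$name" \
      '$1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
       END { exit !found }' "$file" 2>/dev/null; then
    return 0
  fi
  # Otherwise append the mapping as a new line.
  printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

On VM1, running `add_host_entry 192.168.111.132 VM2` as root would add the line shown above. The matching call on VM2 would use VM1's own IP address and hostname.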
The first step in installing Apache ZooKeeper is to download its binaries on both virtual machines. In this article, we will set up the cluster with Apache ZooKeeper 3.4.6, which can be downloaded from here.
Once the archive has been downloaded on the virtual machines, extract it to the directory where you would like ZooKeeper to be installed. We will refer to this directory as $ZooKeeper_Base_Dir throughout this tutorial.
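The extraction step can be sketched as follows. `extract_zookeeper` is a hypothetical helper name, and both the archive file name and the target directory in the usage note are assumptions, so substitute the name of the file you downloaded and the location you prefer.

```shell
#!/bin/sh
# extract_zookeeper is a hypothetical helper, not part of ZooKeeper.
# Usage: extract_zookeeper <tarball> <parent-dir>
extract_zookeeper() {
  tarball="$1"
  parent="$2"
  # Create the parent directory if needed, then unpack the gzipped tarball into it.
  mkdir -p "$parent" && tar -xzf "$tarball" -C "$parent"
}
```

For example, `extract_zookeeper zookeeper-3.4.6.tar.gz /opt` unpacks the archive under /opt. Whatever top-level directory the archive creates there is your $ZooKeeper_Base_Dir.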
Once Apache ZooKeeper has been extracted on all the virtual machines, the next step is to configure them. The diagram below depicts the deployment architecture that we will be setting up -
We don't need to mark any node as the leader during configuration, because the ZooKeeper service elects the leader automatically. So, the zoo.cfg configuration is the same on all the nodes. The first part of the configuration involves creating or updating a configuration file called zoo.cfg in the $ZooKeeper_Base_Dir/conf directory with the following contents:
    tickTime=2000
    # Replace the value of dataDir with the directory where you would like ZooKeeper to save its data
    dataDir=<$ZooKeeper_Base_Dir/data>
    # Replace the value of dataLogDir with the directory where you would like ZooKeeper to log
    dataLogDir=<$ZooKeeper_Base_Dir/logs>
    clientPort=2181
    initLimit=10
    syncLimit=5
    server.1=192.168.111.130:2888:3888
    server.2=192.168.111.132:2888:3888
The first thing you need to do in the zoo.cfg file above is to replace the values of dataDir and dataLogDir with the directories where you would like ZooKeeper to save its data and its logs respectively. Now, let's talk about some of the important parts of this configuration.
The clientPort property, as the name suggests, is the port on which clients connect to the ZooKeeper service.
Next, let's talk about the last two entries, which are in the server.x=hostname:nnnnn:mmmmm format. Firstly, there are two port numbers, nnnnn (2888) and mmmmm (3888). Followers use the first port to connect to the leader, and the second port is used for leader election. Secondly, x in server.x denotes the id of the node. Each server.x row must have a unique id. Each server is assigned its id by creating a file named myid, one for each server, which resides in that server's data directory, as specified by the configuration file parameter dataDir.
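Since every server.x row must carry a unique id, a quick sanity check on zoo.cfg can catch copy-paste mistakes before the cluster is started. The following is a minimal sketch. `check_server_ids` is a hypothetical helper, not a ZooKeeper tool, and it only looks at lines that start with `server.<number>=`.

```shell
#!/bin/sh
# check_server_ids is a hypothetical helper, not part of ZooKeeper.
# Usage: check_server_ids <path-to-zoo.cfg>
check_server_ids() {
  cfg="$1"
  # Extract the numeric id of each server.<id>= line and keep only repeated ids.
  dups=$(sed -n 's/^server\.\([0-9][0-9]*\)=.*/\1/p' "$cfg" | sort | uniq -d)
  if [ -n "$dups" ]; then
    echo "duplicate server ids: $dups" >&2
    return 1
  fi
}
```

Running `check_server_ids "$ZooKeeper_Base_Dir/conf/zoo.cfg"` returns a non-zero status and prints the offending ids if any id is used more than once.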
The myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text 1 and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.
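Creating the myid file can be sketched as below. `write_myid` is a hypothetical helper. It takes the id and the data directory as arguments, rejects ids outside the 1 to 255 range described above, and creates the data directory if it does not exist yet.

```shell
#!/bin/sh
# write_myid is a hypothetical helper, not part of ZooKeeper.
# Usage: write_myid <id> <dataDir>
write_myid() {
  id="$1"
  data_dir="$2"
  # Reject empty or non-numeric ids.
  case "$id" in
    ''|*[!0-9]*)
      echo "id must be a number between 1 and 255" >&2
      return 1
      ;;
  esac
  # Reject ids outside the allowed range.
  if [ "$id" -lt 1 ] || [ "$id" -gt 255 ]; then
    echo "id must be a number between 1 and 255" >&2
    return 1
  fi
  # The file holds a single line containing only the id.
  mkdir -p "$data_dir" && printf '%s\n' "$id" > "$data_dir/myid"
}
```

With the zoo.cfg shown earlier, the machine listed as server.1 (192.168.111.130) would run `write_myid 1 <dataDir>` and the machine listed as server.2 (192.168.111.132) would run `write_myid 2 <dataDir>`, where `<dataDir>` is the directory configured as dataDir.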
Once you are all set up, the next step is to start the cluster. On all the virtual machines, go to the bin directory of Apache ZooKeeper and execute the following command -
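A minimal sketch of the start step, using the zkServer.sh control script that ships in ZooKeeper's bin directory. The `ZK_BASE_DIR` variable, its default path, and the `start_zookeeper` wrapper are assumptions made for illustration. Point `ZK_BASE_DIR` at your own $ZooKeeper_Base_Dir.

```shell
#!/bin/sh
# Hypothetical install path; replace with your own $ZooKeeper_Base_Dir.
ZK_BASE_DIR="${ZK_BASE_DIR:-/opt/zookeeper-3.4.6}"

# start_zookeeper is a hypothetical wrapper around the bundled control script.
start_zookeeper() {
  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "$ZK_BASE_DIR/bin" && ./zkServer.sh start)
}
```

Calling `start_zookeeper` is equivalent to running `./zkServer.sh start` from the bin directory. Run it on every virtual machine in the ensemble.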
You can execute the following command to check the status of Apache ZooKeeper -
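The status check can be sketched the same way, again through zkServer.sh. `ZK_BASE_DIR`, its default path, and the `zookeeper_status` wrapper are hypothetical names used only for this example.

```shell
#!/bin/sh
# Hypothetical install path; replace with your own $ZooKeeper_Base_Dir.
ZK_BASE_DIR="${ZK_BASE_DIR:-/opt/zookeeper-3.4.6}"

# zookeeper_status is a hypothetical wrapper around the bundled control script.
zookeeper_status() {
  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "$ZK_BASE_DIR/bin" && ./zkServer.sh status)
}
```

Calling `zookeeper_status` is equivalent to running `./zkServer.sh status` from the bin directory. On a healthy ensemble, the output typically reports the mode of the local node, with one node acting as leader and the other as follower.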
In order to stop Apache ZooKeeper, execute the following command on all the virtual machines -
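A minimal sketch of the stop step, once more through zkServer.sh. `ZK_BASE_DIR`, its default path, and the `stop_zookeeper` wrapper are hypothetical names used only for this example.

```shell
#!/bin/sh
# Hypothetical install path; replace with your own $ZooKeeper_Base_Dir.
ZK_BASE_DIR="${ZK_BASE_DIR:-/opt/zookeeper-3.4.6}"

# stop_zookeeper is a hypothetical wrapper around the bundled control script.
stop_zookeeper() {
  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "$ZK_BASE_DIR/bin" && ./zkServer.sh stop)
}
```

Calling `stop_zookeeper` is equivalent to running `./zkServer.sh stop` from the bin directory.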
Thank you for reading through the tutorial. If you have any feedback, questions, or concerns, please share them in the comments and we will get back to you as soon as possible.