Friday, March 15, 2019

ZooKeeper | A Reliable, Scalable Distributed Coordination System

In previous posts we learnt about various big data projects/systems. All of these systems are distributed and clustered in nature, and for distribution and cluster management each of them needs some low-level API. ZooKeeper can be seen as one of those low-level APIs, which can be used to build a distributed coordination system.

ZooKeeper is a highly reliable, scalable, distributed coordination system. As per the ZooKeeper wiki:
"ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers".
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization and group services. It provides a very simple interface to a centralized coordination service. The service itself is distributed and highly reliable.

Distributed applications use ZooKeeper to store and mediate updates to important configuration information. Many top-level big data projects such as Hadoop, Kafka, HBase, Accumulo and Solr use ZooKeeper as their distributed coordination system. An extensive list of projects powered by ZooKeeper is available on the ZooKeeper wiki.

As the wiki quote above says, ZooKeeper coordinates through a shared hierarchical name space of data registers. In ZooKeeper terms these registers are known as ZNodes.

ZooKeeper comes with a bunch of "out of the box" benefits:
  • Fast
    • ZooKeeper is especially fast in read-dominant workloads, where reads of the data are more common than writes. The ideal read/write ratio is about 10:1.
  • Reliable
    • ZooKeeper is replicated over a set of servers known as an ensemble. All the servers know about each other. As long as a majority of the servers are available, the ZK service is available, so there is no single point of failure.
  • Simple
    • ZooKeeper follows a simple data model and maintains a standard hierarchical name space, similar to files and directories on a file system.
ZooKeeper Ensemble
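
To make the reliability point concrete, here is a minimal sketch of the majority arithmetic behind an ensemble. It does not touch ZooKeeper itself; the class and method names (QuorumSketch, majority, toleratedFailures) are made up for illustration, and the only assumption is that the service stays available while a strict majority of servers is up.

```java
// Majority arithmetic for a replicated ensemble (illustrative only, no ZooKeeper dependency).
class QuorumSketch {
    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still reachable. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.printf("servers=%d majority=%d toleratedFailures=%d%n",
                    n, majority(n), toleratedFailures(n));
        }
    }
}
```

Under this arithmetic a 4-server ensemble tolerates one failure, the same as a 3-server ensemble, which is why ensembles are usually sized with an odd number of servers.
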

ZooKeeper Use Cases

ZooKeeper typically addresses the following use cases:
  • Configuration Management
    • Can be used by cluster members for loading configuration data from a centralized source.
    • Provides easier, simpler deployment/provisioning.
  • Distributed Cluster Management
    • Can be used for notifying node join/leave events.
    • Can be used to query node status in real time.
  • Naming Services
    • Can be used for writing a naming service that provides a central naming registry.
  • Distributed Synchronization
    • Can be used for distributed, cluster-level locks, barriers, queues etc., which ordinary single-process programming constructs cannot provide.
  • Leader Election
    • Can be used for implementing leader election semantics in a distributed system in which one node acts as master over others.
  • Centralized Registry
    • Can be used to develop a highly reliable and simple data registry system.
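
As a rough illustration of the leader election use case, one common recipe has every candidate create an ephemeral sequence node under a shared parent; the candidate with the lowest sequence number acts as leader and every other candidate watches the node just ahead of it. The sketch below only models the selection step on plain strings. The node names (n_0000000003 and so on) and the class LeaderElectionSketch are hypothetical, and it assumes all candidates share the same name prefix so that a plain string sort orders them by sequence number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Selection step of a lowest-sequence-number leader election (no ZooKeeper dependency).
class LeaderElectionSketch {
    /** The candidate with the lowest sequence suffix is treated as the leader. */
    static String leader(List<String> candidates) {
        return sorted(candidates).get(0);
    }

    /** The node a candidate would watch: the one just ahead of it, or null for the leader. */
    static String predecessor(List<String> candidates, String self) {
        List<String> ordered = sorted(candidates);
        int index = ordered.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("unknown candidate: " + self);
        }
        return index == 0 ? null : ordered.get(index - 1);
    }

    private static List<String> sorted(List<String> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        List<String> copy = new ArrayList<>(candidates);
        // Same prefix plus fixed-width, zero-padded suffixes sort correctly as plain strings.
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("n_0000000007", "n_0000000003", "n_0000000005");
        System.out.println("leader = " + leader(nodes));
        System.out.println("n_0000000007 watches " + predecessor(nodes, "n_0000000007"));
    }
}
```

Watching only the predecessor, rather than the leader node itself, means that a leader failure wakes up a single candidate instead of all of them.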

Key Features

  • Simple & Replicated Data Model
    • Provides a very simple data model in which nodes are stored in ZK much like files and directories in a normal hierarchical file system.
    • Node data is replicated to all ZK servers in an ensemble, so a client can connect to any ZK server and get the same data.
  • Simple API
    • A very simple API that is easy to understand and use. API operations include create, delete, exists, get-data, set-data and get-children.
    • Supports Java, Scala, C#, Node.js, Python, Erlang, Ruby and many more. Note that this list includes community-developed bindings as well (the full list is available on the ZooKeeper wiki).
  • Node Type Support
    • Supports various types of nodes for different functionality out of the box: persistent, ephemeral and sequence nodes.
  • Node Watch Support
    • Provides support for listening to node events such as a node being created, deleted or altered.
  • ZooKeeper Guarantees
    • Although ZooKeeper is fast and simple, it has to serve as the basis for more complicated services such as synchronization. To that end, ZK provides a set of guarantees (as per the ZooKeeper wiki):
      • Sequential Consistency - Updates from a client will be applied in the order that they were sent.
      • Atomicity - Updates either succeed or fail. No partial results.
      • Single System Image - A client will see the same view of the service regardless of the server that it connects to. 
      • Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
      • Timeliness - The client's view of the system is guaranteed to be up-to-date within a certain time bound.

Key Concepts

  • ZooKeeper Service
    • The ZooKeeper service is a daemon that serves ZK node data.
    • It is replicated over a set of machines, and every machine stores a copy of the data (in memory).
    • A leader is elected on service startup.
    • A client connects to a single ZooKeeper server at a time. The client maintains a TCP connection through which it sends requests, gets responses, gets watch events, and sends heartbeats (note that ZooKeeper is a TCP-oriented system).
    • A client can read from any ZooKeeper server; writes go through the leader and need majority consensus.
  • ZooKeeper Shell
    • The ZooKeeper binaries come with a command-line shell, which can be used to access and manipulate ZooKeeper nodes.
    • The shell can be connected and used as follows (assuming ZOOKEEPER_HOME points to a valid ZooKeeper installation and a ZooKeeper server is running on localhost:2181):
      [anishsneh@localhost ~]$ cd $ZOOKEEPER_HOME
      [anishsneh@localhost zookeeper-3.4.6]$ ./bin/ 
      [zk: localhost:2181(CONNECTED) 0] help
      ZooKeeper -server host:port cmd args
       connect host:port
       get path [watch]
       ls path [watch]
       set path data [version]
       rmr path
       delquota [-n|-b] path
       printwatches on|off
       create [-s] [-e] path data acl
       stat path [watch]
       ls2 path [watch]
       listquota path
       setAcl path acl
       getAcl path
       sync path
       redo cmdno
       addauth scheme auth
       delete path [version]
       setquota -n|-b val path
      [zk: localhost:2181(CONNECTED) 0] create /root "data001"
      Created /root
      [zk: localhost:2181(CONNECTED) 1]
      [zk: localhost:2181(CONNECTED) 1] ls /root
      [zk: localhost:2181(CONNECTED) 2]
      [zk: localhost:2181(CONNECTED) 2] create /root/child01 "data002"
      Created /root/child01
      [zk: localhost:2181(CONNECTED) 3]
      [zk: localhost:2181(CONNECTED) 3] create /root/child02 "data003"
      Created /root/child02
      [zk: localhost:2181(CONNECTED) 4]
      [zk: localhost:2181(CONNECTED) 4] create /root/child01/grandchild01 "data004"
      Created /root/child01/grandchild01
      [zk: localhost:2181(CONNECTED) 5]
      [zk: localhost:2181(CONNECTED) 5] create /root/child02/grandchild02 "data005"
      Created /root/child02/grandchild02
      [zk: localhost:2181(CONNECTED) 6] 
      [zk: localhost:2181(CONNECTED) 6] ls2 /root
      [child01, child02]
      cZxid = 0xf
      ctime = Tue Oct 13 01:06:04 IST 2015
      mZxid = 0xf
      mtime = Tue Oct 13 01:06:04 IST 2015
      pZxid = 0x12
      cversion = 2
      dataVersion = 0
      aclVersion = 0
      ephemeralOwner = 0x0
      dataLength = 9
      numChildren = 2
      [zk: localhost:2181(CONNECTED) 7]
      [zk: localhost:2181(CONNECTED) 7] ls /root 
      [child01, child02]
      [zk: localhost:2181(CONNECTED) 8]
      [zk: localhost:2181(CONNECTED) 8] ls /root/child01
      [zk: localhost:2181(CONNECTED) 9]
      [zk: localhost:2181(CONNECTED) 9] ls /root/child01/grandchild01
      [zk: localhost:2181(CONNECTED) 10]
      [zk: localhost:2181(CONNECTED) 10] get /root
      cZxid = 0xf
      ctime = Tue Oct 13 01:06:04 IST 2015
      mZxid = 0xf
      mtime = Tue Oct 13 01:06:04 IST 2015
      pZxid = 0x12
      cversion = 2
      dataVersion = 0
      aclVersion = 0
      ephemeralOwner = 0x0
      dataLength = 9
      numChildren = 2
  • ZooKeeper Data Model
    • ZooKeeper follows a very simple data model in which nodes are organized just like files and directories in a normal file system.
    • Data is kept in a hierarchical name space. Each node in the name space is called a ZNode.
    • Every ZNode has data (given as byte[]) and can optionally have children.
    • ZNode paths are canonical, absolute and slash-separated. Note that there are no relative path references.
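
The path rules above can be sketched as a small validator. This is a simplified illustration, not ZooKeeper's own validation code: the class name ZNodePathSketch is made up, and it only checks the rules stated here (absolute, slash-separated, no empty or relative segments), ignoring details such as disallowed characters.

```java
// Simplified check of the ZNode path rules (illustrative only, no ZooKeeper dependency).
class ZNodePathSketch {
    /** Returns true if the path is absolute, slash-separated and free of relative segments. */
    static boolean isValid(String path) {
        if (path == null || path.isEmpty() || path.charAt(0) != '/') {
            return false;                              // must be absolute
        }
        if (path.equals("/")) {
            return true;                               // the root itself
        }
        if (path.endsWith("/")) {
            return false;                              // no trailing slash
        }
        for (String segment : path.substring(1).split("/", -1)) {
            if (segment.isEmpty()) {
                return false;                          // "//" leaves an empty segment
            }
            if (segment.equals(".") || segment.equals("..")) {
                return false;                          // no relative references
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("/root/child01")); // absolute path
        System.out.println(isValid("root/child01"));  // relative path
    }
}
```
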
  • ZNode
    • Unlike in standard file systems, each node in a ZooKeeper name space can have data associated with it as well as children. It is like having a file system that allows a file to also be a directory.
    • ZNodes maintain version numbers for data changes and ACL changes, along with timestamps, to support cache validation and coordinated updates. Each time a ZNode's data changes, the version number increases.
    • Whenever a client retrieves a node's data, it also receives the version of that data.
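
The data version is what makes conditional updates possible: the shell's set command takes an optional version, and an update is rejected if the node has changed since the client read it. Below is a minimal in-memory sketch of that check. The class VersionedNodeSketch and its methods are hypothetical stand-ins, not the ZooKeeper API; the convention that -1 matches any version follows ZooKeeper's behaviour.

```java
// In-memory model of a versioned node with conditional updates (no ZooKeeper dependency).
class VersionedNodeSketch {
    private byte[] data;
    private int version = 0;

    VersionedNodeSketch(byte[] initial) {
        this.data = initial.clone();
    }

    synchronized int version() {
        return version;
    }

    synchronized byte[] data() {
        return data.clone();
    }

    /**
     * Replaces the data only if expectedVersion matches the current version
     * (-1 matches any version). Returns the new version, or throws on a mismatch.
     */
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new IllegalStateException(
                    "bad version: expected " + expectedVersion + ", actual " + version);
        }
        data = newData.clone();
        return ++version;
    }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch("data001".getBytes());
        int read = node.version();                       // a client reads the data and its version
        node.setData("data002".getBytes(), read);        // first writer succeeds
        try {
            node.setData("data003".getBytes(), read);    // second writer used a stale version
        } catch (IllegalStateException e) {
            System.out.println("update rejected: " + e.getMessage());
        }
    }
}
```
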
  • ZNode Types
    • Persistent Nodes - Once created, they remain forever unless explicitly deleted.
    • Ephemeral Nodes - Exist as long as the session that created them is active, in other words as long as the client that created the node stays connected. Nodes of this type cannot have children.
    • Sequence Nodes - ZooKeeper appends a monotonically increasing counter to the end of the path to make the name unique. This applies to both persistent and ephemeral nodes.
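
To show what sequence node names look like, here is a small sketch that mimics the naming only. ZooKeeper formats the counter as ten zero-padded digits; everything else here (the class SequenceNameSketch, the /locks/lock- path, a single counter standing in for one parent node) is an assumption made for illustration.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Mimics how a sequence node name is derived from the requested path (no ZooKeeper dependency).
class SequenceNameSketch {
    // Stands in for the counter that a single parent node keeps for its children.
    private final AtomicInteger counter = new AtomicInteger(0);

    /** Appends the next counter value, zero-padded to 10 digits, to the requested path. */
    String nextName(String requestedPath) {
        return requestedPath + String.format(Locale.ROOT, "%010d", counter.getAndIncrement());
    }

    public static void main(String[] args) {
        SequenceNameSketch parent = new SequenceNameSketch();
        System.out.println(parent.nextName("/locks/lock-"));
        System.out.println(parent.nextName("/locks/lock-"));
    }
}
```
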
  • ZNode Operations
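    • Conceptually, the following operations can be performed on a ZNode: create, delete, exists, get data, set data and get children (the same operations listed under Simple API above).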
  • ZNode Watches
    • A watch is a listener that can be set to receive node change events.
    • Clients may listen for the following events on ZNodes (in other words, a client can set watches for the following events):
      • NodeChildrenChanged 
      • NodeCreated
      • NodeDataChanged
      • NodeDeleted
    • Changes to a ZNode trigger the watch, and ZooKeeper sends the client a notification.
    • Note that watches are one-time triggers and that watch events are always delivered in order.
    • There is latency between receiving a watch event and sending a new request to set the next watch, so a client should be able to handle changes that happen in between.
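
The one-time nature of watches is easy to get wrong, so here is a minimal in-memory model of it. It is not the ZooKeeper client API: OneTimeWatchSketch, its getData/setData methods and the /conf path are hypothetical, and the model is single-threaded for brevity. It shows that a watch fires once and that changes made before the watch is set again are not reported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory model of one-time watches on node data (no ZooKeeper dependency, not thread-safe).
class OneTimeWatchSketch {
    private final Map<String, byte[]> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Reads a node's data and optionally leaves a one-time watch on it. */
    byte[] getData(String path, Consumer<String> watcher) {
        if (watcher != null) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
        }
        return data.get(path);
    }

    /** Writes data, then fires and discards every watch registered on the path. */
    void setData(String path, byte[] value) {
        data.put(path, value);
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> watcher : fired) {
                watcher.accept("NodeDataChanged " + path);
            }
        }
    }

    public static void main(String[] args) {
        OneTimeWatchSketch zk = new OneTimeWatchSketch();
        zk.getData("/conf", event -> System.out.println("got " + event));
        zk.setData("/conf", "v1".getBytes()); // fires the watch
        zk.setData("/conf", "v2".getBytes()); // no watch left, nothing is reported
    }
}
```

In this model a client that wants continuous notifications has to call getData again with a watcher after each event, and it reads the latest data at that point rather than relying on having seen every intermediate change.
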
  • ZNode Reads & Writes
    • Read requests are processed locally at the ZooKeeper server to which the client is currently connected.
    • Write requests are forwarded to the leader and go through majority consensus before a response is generated.
  • API Synchronicity
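    • API method calls can be synchronous or asynchronous (in both examples zk is assumed to be a connected org.apache.zookeeper.ZooKeeper handle):
      • Synchronous:
        // Blocks until the server answers; returns null if the node does not exist.
        Stat stat = zk.exists("/demo-cluster/conf", null);
      • Asynchronous:
        // Returns immediately; the callback runs once the result arrives.
        zk.exists("/demo-cluster/conf", null, new StatCallback() {
          @Override
          public void processResult(int rc, String path, Object ctx, Stat stat) {
            // process the result when called back later
          }
        }, null);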

In the next post we will learn to set up a ZooKeeper ensemble and implement a basic leader election algorithm using the Java API.

