In a cluster environment, services on nodes may have to be coordinated for various reasons. For example, when a configuration change is made to a distributed computing component like HDFS, the HDFS service should not be stopped on all nodes at the same time in order to restart it and have the configuration take effect. Stopping the service on every node at once makes it unavailable, which is undesirable, to put it lightly.
There are many options, of varying maturity, for orchestration and coordination when you manage a cluster using Chef. Here we look at how Chef and ZooKeeper can work together to coordinate services on cluster nodes. As the running example, we use the need to control and coordinate service restarts so that the service is not stopped on all nodes at the same time.