Under the Covers of a Distributed Virtual Computing Platform – Part 2: ZZ Top

In case you missed Part 1 – Part 1: Built For Scale and Agility
No, it’s not Billy Gibbons, Dusty Hill, or drummer Frank Beard. It’s Zeus and Zookeeper providing the strong blues that allow the Nutanix Distributed File System to maintain its configuration across the entire cluster.

Zeus is the Nutanix library that all other components use to access the cluster configuration. As mentioned before, Zeus handles the interaction with the other components in the file system, which means the component underneath it, Zookeeper, can be replaced if need be. This is very important, as building on the open source community is like having 200,000+ engineers in your back pocket. There is an interesting article about Netflix using Zookeeper as well. Sure, you still need bright minds, but we have those. I think our hardware-to-software engineering split was 1 to 9. At the end of the day we are a software company that delivers medicine to enterprises in a hardware form factor. Zeus keeps track of the IP addresses of ESXi hosts and virtualized storage controllers, health information through IPMI (iLO/DRAC), capacities, data replication rules, and all of the cluster configuration. Zeus helps provide the glue between storage and compute to form a single active identity. Even without IPMI plugged in, the Nutanix Command Center UI can get all the health stats it needs.
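
To make the "replaceable component" idea concrete, here is a minimal Python sketch of a configuration library sitting in front of a swappable backend. It is not Nutanix code: the names ConfigBackend, InMemoryBackend, and ConfigLibrary, and the replication_factor key, are all made up for illustration, and the in-memory backend only stands in for a real store such as Zookeeper.

```python
from abc import ABC, abstractmethod


class ConfigBackend(ABC):
    """Hypothetical interface for whatever stores the cluster configuration."""

    @abstractmethod
    def read(self, key):
        ...

    @abstractmethod
    def write(self, key, value):
        ...


class InMemoryBackend(ConfigBackend):
    """Stand-in backend for illustration; a real one would talk to Zookeeper."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value


class ConfigLibrary:
    """Plays the role the text gives Zeus: the only API other components call."""

    def __init__(self, backend):
        self._backend = backend

    def get(self, key):
        return self._backend.read(key)

    def set(self, key, value):
        self._backend.write(key, value)


# Components only ever see ConfigLibrary, so the backend can be swapped
# without touching them.
config = ConfigLibrary(InMemoryBackend())
config.set("replication_factor", 2)
print(config.get("replication_factor"))
```

Because callers depend only on the library's get and set, replacing the backend means writing one new ConfigBackend subclass and nothing else.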

Zookeeper runs on only three nodes in the cluster, no matter how big or small the cluster gets. Since it’s tracking configuration data that doesn’t change that often, there is no impact on performance. Using multiple nodes prevents stale data from being returned to other components, while having an odd number provides a method for breaking ties if two nodes have different information. One Zookeeper node is elected as the leader. The leader receives all requests for information and confers with the two follower nodes. If the leader stops responding, a new leader is elected automatically. You can easily tell who the cluster leader is by doing a cluster status. Nutanix uses a Paxos-like algorithm for consistency.
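
The tie-breaking and re-election ideas above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Zookeeper's or Nutanix's actual protocol: the function names and node names are hypothetical, and "lowest name wins" is a simplification standing in for a real election algorithm.

```python
from collections import Counter


def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def majority_value(responses, ensemble_size):
    """Return the value a strict majority of the ensemble agrees on, else None."""
    if not responses:
        return None
    value, count = Counter(responses).most_common(1)[0]
    return value if count >= quorum_size(ensemble_size) else None


def elect_leader(live_nodes, ensemble_size):
    """Toy election: needs a quorum of live nodes, then picks the lowest name.

    Assumption: real systems use a proper consensus protocol here.
    """
    if len(live_nodes) < quorum_size(ensemble_size):
        return None
    return min(live_nodes)


nodes = {"node-a", "node-b", "node-c"}

# Two of three nodes agree, so the odd-sized ensemble breaks the tie.
print(majority_value(["config-v2", "config-v2", "config-v1"], len(nodes)))

# The current leader stops responding; the two survivors still form a quorum.
print(elect_leader(nodes - {"node-a"}, len(nodes)))
```

With three nodes the quorum is two, so the ensemble keeps working through the loss of any single node but not of two.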

The cluster leader will also be the node that acts as the point man for the support team. The support team, comprised of Rock Stars from VMware, Oracle, Cisco and on and on, sits right beside engineering for fast response. The cluster leader’s entry point into support can be shut off for cold sites, but otherwise it provides an available link to support.

Zookeeper has no dependencies, meaning that it can start without any other cluster components running. More info on Zookeeper

Part 3 will be our Distributed Metadata layer with Medusa & Cassandra
