Mar
13

Data Locality: Congestion Not Latency

I love Nutanix data locality because it helps with noisy neighbour syndrome and because it helps spread data evenly across the cluster. Spreading the data across the cluster has an impact on rebuild times and bottlenecks.

Our Director of Engineering, Vishal Sinha, brought up another good point yesterday around data locality. He mentioned that what kills network performance is not latency but congestion. Congestion can come in many forms: microbursts (for example, a 200 Mb burst arriving within 10 ms, which is equivalent to 20 Gbps of traffic on a 10G port for those 10 ms, so a lot of traffic gets dropped if the switch does not have enough buffer), or a misbehaving NIC sending PAUSE frames and slowing down the network.
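The microburst arithmetic can be checked with a few lines of Python. This is only a rough sketch: the function name `microburst` and its parameters are made up for illustration, and it assumes the whole burst arrives inside the window while the port drains at full line rate.

```python
def microburst(burst_bits, window_ms, port_bps):
    """Return (equivalent rate in bit/s, bits beyond what the port can drain)."""
    # Rate the burst would represent if it were sustained over the window.
    equivalent_bps = burst_bits * 1000 / window_ms
    # Bits the port can forward during the same window at line rate.
    drained_bits = port_bps * window_ms / 1000
    # Anything beyond that has to sit in switch buffer or be dropped.
    excess_bits = max(0.0, burst_bits - drained_bits)
    return equivalent_bps, excess_bits


# 200 Mb arriving within 10 ms on a 10 Gbps port
rate, excess = microburst(burst_bits=200e6, window_ms=10, port_bps=10e9)
print(f"equivalent rate: {rate / 1e9:.0f} Gbps")
print(f"excess to buffer or drop: {excess / 8e6:.1f} MB")
```

Under these assumptions the port can forward only half of the burst in those 10 ms, so the other half (about 12.5 MB) has to fit in the switch buffer or it is lost.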
Our data locality feature drastically reduces the chances of running into network-related storage issues, since we don’t rely on the network for reads and we need to write to only one remote node. It’s all about reducing coupling and dependency between components, and limiting resource consumption. Do more with less, make components independent == scalability. Data locality is a core component of distributed systems such as Hadoop.
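To put a rough number on "reads stay local, writes go to one remote node", here is a simplified back-of-the-envelope model. It is not how Nutanix actually accounts for traffic: the function `network_bytes`, the 70/30 workload mix, and the "no locality" worst case (no replica on the VM's own node) are all assumptions made up for comparison.

```python
def network_bytes(read_bytes, write_bytes, rf=2, local=True):
    """Bytes crossing the network for one VM's I/O under a simplified model."""
    if local:
        # One replica lives on the VM's own node: reads stay local and
        # only the remaining rf - 1 copies of each write go over the wire.
        return write_bytes * (rf - 1)
    # Assumed worst case: no replica on the VM's node, so every read
    # and all rf copies of each write cross the network.
    return read_bytes + write_bytes * rf


reads, writes = 70, 30  # hypothetical 70/30 read/write mix, in GB
print(network_bytes(reads, writes, local=True))   # 30
print(network_bytes(reads, writes, local=False))  # 130
```

Even in this crude model, less traffic on the wire means fewer opportunities to hit a congested link.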

If you want to read more about microbursts, here is a link: http://blog.endace.com/2012/01/what-is-a-microburst-really/

Comments

  1. Thanks for your article. I want to understand what you mean by “we need to write to only one remote node”.
    In your other article, nutanix-drop-it-like-its-hot, the following is said:
    Write IO – Data is always written locally. Data is replicated on other nodes for high availability. Replicas are spread across the cluster for high performance.

    Now I am confused. To one node or to several?

    http://itbloodpressure.com/2012/06/12/nutanix-drop-it-like-its-hot/

    • dlessner says:

      When data is written to the oplog with RF=2, one copy is written locally and one remotely. Multiple machines are doing this to all the different nodes in the cluster. Reads are localized. When data is auto-tiered from flash to HDD, Nutanix will localize copies to where the workload is located.

      If you didn’t write at least one copy locally you would have a high wire amplification on the network.

