Data locality is the ability to keep compute and storage close together. The amount of data the big players ingest into their storage environments is growing at a ridiculous rate. Facebook has over 350 million photos a day uploaded into clusters across the world. Once the photos are loaded, they need to be indexed, and multiple copies are created for different areas of the site. The daily storage requirements alone are well north of 100 TB. Data locality gives Facebook the ability to plop data down where it makes the most sense and then, when needed, process that huge amount of data without having to move the whole data set across the wire. Efficiency equals profits.
“Ok Dwayne, I am not Facebook and don’t even have 100 TB total. Why do I care?” Josh Odgers has a great blog post on data locality and why it is important for vSphere DRS clusters. I wanted to highlight the ability to run mixed workloads on the same cluster. Nutanix, with its ability to deliver data locality, can prevent the noisy neighbour problem from happening.
The first grouping of ESXi servers on the left is using disk shares to control performance. The problem with disk shares is that they only work at the host level. Virtual machines on another host have no concept of what their neighbour is doing. Since each host is using traditional shared storage, it’s first come, first served on the datastore in question. There really isn’t a great way to deliver QoS for the workload in this situation.
The grouping of ESXi servers using Storage IO control has the ability to use disk shares at the cluster level. The total number of disk shares is now applied across the cluster for the particular datastore that has Storage IO control enabled. Since VM C has 500 shares out of a possible 2,500 shares (VM A + VM B + VM C), it gets 20% of the overall performance if there is contention. By default, Storage IO control is activated when the datastore is experiencing over 30 ms of latency.
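The share arithmetic above can be sketched in a few lines of Python. This is only an illustration of proportional shares, not VMware's implementation: the function names are made up, and the individual share values for VM A and VM B are assumptions chosen so the total comes to 2,500 (the post only states VM C's 500 shares and the total).

```python
# Illustrative model of cluster-wide proportional shares.
# VM A and VM B values are ASSUMED; only VM C (500) and the
# 2,500 total come from the article.
DEFAULT_LATENCY_THRESHOLD_MS = 30  # default trigger mentioned in the article


def shares_enforced(latency_ms, threshold_ms=DEFAULT_LATENCY_THRESHOLD_MS):
    """Shares only matter once datastore latency exceeds the threshold."""
    return latency_ms > threshold_ms


def share_fraction(shares, vm):
    """Fraction of datastore performance a VM is entitled to under contention."""
    total = sum(shares.values())
    return shares[vm] / total


shares = {"VM A": 1500, "VM B": 500, "VM C": 500}

if shares_enforced(latency_ms=35):
    print(f"VM C is entitled to {share_fraction(shares, 'VM C'):.0%}")
```

Below the latency threshold the shares are not enforced at all, which is why the check comes before the division.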
The last grouping of servers on the right are nodes that are part of a Nutanix cluster. Each node adds a predetermined amount of performance. For example, let’s say each node delivers 20,000 IOPS. VM A would get 15,000 IOPS, VM B gets 5,000 IOPS and VM C gets 20,000 IOPS. The flash resources that make up the cluster are available to all nodes, but delivering performance locally will always happen first. This isolation prevents the noisy neighbour and can give you the ability to use certain nodes for Test/Dev without buying like-for-like gear and without impacting production.
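To make the numbers in that example concrete, here is a rough Python sketch of per-node allocation. It is a simplified model and not how Nutanix actually schedules IO. It assumes VM A and VM B sit on one node with a 3:1 share ratio and VM C has a node to itself; the placement, the share values, and the function name are all assumptions for illustration.

```python
# Simplified model: each node's IOPS are split only among the VMs
# running on that node. Placement and share values are ASSUMED.
NODE_IOPS = 20_000  # per-node figure used in the article's example


def local_allocation(placement, shares, node_iops=NODE_IOPS):
    """Split each node's IOPS among its local VMs in proportion to shares."""
    allocation = {}
    for node, vms in placement.items():
        node_total = sum(shares[vm] for vm in vms)
        for vm in vms:
            allocation[vm] = node_iops * shares[vm] / node_total
    return allocation


placement = {"node1": ["VM A", "VM B"], "node2": ["VM C"]}
shares = {"VM A": 1500, "VM B": 500, "VM C": 500}

alloc = local_allocation(placement, shares)
for vm, iops in alloc.items():
    print(f"{vm}: {iops:,.0f} IOPS")
```

Under these assumptions VM C keeps its whole node's 20,000 IOPS no matter how busy VM A and VM B get, which is the isolation the example is describing.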
If you look at the features that come with Enterprise Plus, Nutanix can provide them without the extra licensing. Storage DRS is usually not needed because you only have one volume to manage with Nutanix, and we provide the benefits of Flash Cache inherently. Another plus is that the benefits Nutanix provides carry over to other hypervisors that may not have these advanced features available.
That being said, I do think Storage IO control and Storage DRS have great benefit in cloud environments, where you might be utilizing many different vendors to provide storage with no tiering capabilities. Plus, Storage DRS does provide smart placement of VMs with regard to storage performance in a mixed storage environment.
While running a larger-scale VDI deployment north of 5,000 users plus a large-scale SQL deployment in the same cluster wouldn’t make a lot of sense from a failure domain perspective, it can be done. I think there is still great value for companies that want to spin up VDI, maybe as a way to give remote access or for a small project, without having to worry about bringing down their SQL environment.
And if you’re against mixed workloads, at the very least you can ensure consistent performance of your workloads. Take a look at the graph below. The data was collected using View Planner running on our 2000 model (the first model provided by Nutanix). In the land of VDI, user experience and consistency go hand in hand.
When using traditional storage that has two controllers combined with flash, Storage IO control still can’t help with the inevitable, which is a bottleneck. Maybe a handful of virtual machines have their shares set high enough to make an impact across the cluster, but most will grind to a halt. Scale out and data locality make for happy users.