Intel 3D XPoint and NVM Express – Data Locality Matters For Hyper-Converged Systems

Today, data locality within Nutanix provides many benefits: it makes adding new hardware to the cluster easier, improves performance, reduces network congestion and keeps performance consistent. Some people will argue that 10 Gb networking is fast enough and latency isn’t a problem. The impact may not be as noticeable on light workloads, but network congestion can already be measured and noticed on high-performing systems today. It’s only going to get worse with new technologies like 3D XPoint and NVM Express coming to market.

Nutanix only ever sends 1 write over the network, where other vendors without data locality could potentially be sending 2 writes over the network and serving remote reads across the network as well. As you look at the density and performance of 3D XPoint, it becomes evident that data locality is going to be a must-have check box feature.

3D XPoint Improvements over NAND

You can also see below that NVM Express can drive 6 GB/s! That is gigabytes, not gigabits. A 10 Gb network will become a bottleneck with even 1 write going across the network, let alone 2 writes and reads going across the network.
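To put rough numbers on that claim, here is a back-of-the-envelope sketch. It only uses the two figures quoted above (6 GB/s for NVM Express, 10 Gb for the network); the function names are made up for illustration, and protocol overhead is ignored, so real links would look even worse.

```python
# Back-of-the-envelope comparison using the two figures from the text.
NVME_GBYTES_PER_S = 6.0   # NVM Express throughput, in gigabytes per second
LINK_GBITS_PER_S = 10.0   # 10 Gb Ethernet line rate, in gigabits per second


def link_gbytes_per_s(gbits_per_s: float) -> float:
    """Convert a line rate in gigabits/s to gigabytes/s (8 bits per byte)."""
    return gbits_per_s / 8.0


def links_needed(device_gbytes_per_s: float, copies_over_network: int,
                 link_gbits_per_s: float) -> float:
    """How many links it takes to carry `copies_over_network` copies of
    everything one device can write, ignoring protocol overhead."""
    traffic = device_gbytes_per_s * copies_over_network
    return traffic / link_gbytes_per_s(link_gbits_per_s)


if __name__ == "__main__":
    print("One 10 Gb link carries", link_gbytes_per_s(LINK_GBITS_PER_S), "GB/s")
    for copies in (1, 2):
        n = links_needed(NVME_GBYTES_PER_S, copies, LINK_GBITS_PER_S)
        print(f"{copies} write(s) over the network: {n:.1f} links' worth of traffic")
```

A single 10 Gb link tops out at 1.25 GB/s, so one device writing at full speed would need almost five links for 1 remote copy and almost ten for 2, before any remote reads are counted.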

To get full insight into 3D XPoint and NVM Express and the impact of data locality, watch the video below from Intel.


Betting Against Intel and Moore’s Law for Storage

When I hear about companies essentially betting against Moore’s Law, it makes me think of crazy thoughts like me giving up peanut butter or beer. Too crazy, right?! Take a look at what Intel is currently doing for storage vendors below:


Some of the more interesting ones in the hyper-converged space are detailed below.

Increased number of execution buffers and execution units:
More parallel instructions and more threads, all leading to more throughput.

CRC instructions:
For data protection, every enterprise vendor should be checking their data for consistency and guarding against bit rot.
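As a rough illustration of the idea, here is a minimal sketch of checksum-on-write, verify-on-read. It is not how any vendor implements it: it uses Python's standard `zlib.crc32` in software rather than the CPU's CRC instructions, and `store_block` / `verify_block` are hypothetical helper names.

```python
import zlib
from typing import Tuple


def store_block(data: bytes) -> Tuple[bytes, int]:
    """Return the block together with the CRC-32 computed at write time."""
    return data, zlib.crc32(data)


def verify_block(data: bytes, stored_crc: int) -> bool:
    """Recompute the CRC-32 at read time and compare it to the stored one."""
    return zlib.crc32(data) == stored_crc


if __name__ == "__main__":
    block, crc = store_block(b"customer data " * 64)
    print("clean read ok:", verify_block(block, crc))

    # Simulate bit rot: flip a single bit somewhere in the block.
    rotted = bytearray(block)
    rotted[10] ^= 0x01
    print("rotted read ok:", verify_block(bytes(rotted), crc))
```

The point of doing this in hardware is that the check runs on every read and write, so the cheaper it is, the less reason there is to skip it.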

Intel Advanced Vector Extensions for wide integer vector operations (XOR, P+Q)
Nutanix uses this ability to calculate XOR for EC-X (erasure coding). It uses subcycles of a CPU on a bit of data, which really helps as Nutanix parallelizes the operation across all of the CPUs in the cluster. Other vendors could use this for RAID 6.
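For anyone who hasn't seen XOR parity in action, here is a small sketch of the principle. It is plain Python working byte by byte, not AVX and not the EC-X implementation; `xor_parity` and `rebuild_missing` are hypothetical names. It covers single parity only, and the Q half of P+Q (needed for RAID 6) uses different math that is not shown.

```python
from functools import reduce
from typing import List


def xor_parity(blocks: List[bytes]) -> bytes:
    """XOR equal-sized blocks together, byte by byte, to get a parity block."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """Recover the one missing block by XOR-ing the survivors with the parity."""
    return xor_parity(surviving + [parity])


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(data)

    # Pretend the middle block was lost and rebuild it from the rest.
    recovered = rebuild_missing([data[0], data[2]], parity)
    print("recovered matches original:", recovered == data[1])
```

Wide vector instructions help because the same XOR is applied to many bytes at once instead of one at a time as in this loop.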

Intel Secure Hash Algorithm Extensions
Nutanix uses these extensions to accelerate SHA-1 fingerprinting for inline and post-process dedupe. The Intel SHA Extensions are designed to provide a performance increase over current single-buffer software implementations that use general-purpose instructions.
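Here is a toy sketch of what fingerprint-based dedupe means in practice. `DedupeStore` is a hypothetical in-memory class for illustration, not the Nutanix design, and whether Python's `hashlib` ends up using the SHA Extensions depends on the underlying crypto library and CPU.

```python
import hashlib
from typing import Dict


class DedupeStore:
    """Toy in-memory store that keeps one copy of each unique chunk."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}  # SHA-1 fingerprint -> chunk data

    def put(self, chunk: bytes) -> str:
        """Fingerprint the chunk and store it only if it has not been seen."""
        fingerprint = hashlib.sha1(chunk).hexdigest()
        self.chunks.setdefault(fingerprint, chunk)
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        """Look a chunk up by its fingerprint."""
        return self.chunks[fingerprint]


if __name__ == "__main__":
    store = DedupeStore()
    first = store.put(b"golden image block")
    second = store.put(b"golden image block")   # duplicate write
    store.put(b"something else entirely")

    print("same fingerprint:", first == second)
    print("unique chunks stored:", len(store.chunks))
```

Every write pays for a hash, which is why making the hash cheaper matters so much for inline dedupe in particular.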

Below is a video that talks about how Nutanix does dedupe.

Transactional Synchronization Extensions (TSX)
Adds hardware transactional memory support, speeding up execution of multi-threaded software through lock elision. Multi-threaded apps like the Nutanix Controller virtual machine can enjoy a 30-40% boost when multiple threads are used. This provides a big boost for in-RAM caching.

Reduced VM Entry / Exit Latency
This helps the virtual machine, in this case the virtual storage controller: it never really has to exit into the virtual machine manager, thanks to a shadowing process. The end result is low latency, and the penalty for user space vs kernel is taken off the table. This also happens to be one of the reasons why you can virtualize giant Oracle databases.

Intel VT-d
Improves reliability and security through device isolation using hardware-assisted remapping, and improves I/O performance and availability through direct assignment of devices. Nutanix directly maps SSDs and HDDs and removes the yo-yo effect of going through the hypervisor.

Intel Quick Assist


Intel has a Quick Assist card that will do full offload for compression and SHA-1 hashes. Guess what? The features on this card will be going onto the CPU in the future. Nutanix could use this card today but chooses not to for serviceability reasons, but you can bet your bottom dollar that we’ll use it once the feature is baked into the CPUs.

To top everything else above, the next Intel CPUs will deliver 44 cores per 2-socket server and 88 threads with hyper-threading!

If you want to watch a full breakdown of all the features, you can watch this video with Intel at Tech Field Day.


Scale-Out Storage – In the Hypervisor Kernel or in a VM?

A new tech note from Nutanix discusses architectural considerations for implementing a converged, scale-out storage fabric that runs across a cluster of nodes. This paper focuses on high availability and resiliency for virtualizing business-critical applications. The paper covers running storage services embedded in the hypervisor kernel and as a virtual machine in user space.

Scale-Out Storage – In the Hypervisor Kernel or in a VM?