Intel 3D XPoint and NVM Express – Data Locality Matters For Hyper-Converged Systems

Today data locality within Nutanix provides many benefits: easier addition of new hardware into the cluster, better performance, reduced network congestion, and consistent performance. Some people will argue that 10 Gb networking is fast enough and latency isn't a problem. Maybe the impact today is not as noticeable on low workloads, but network congestion can be measured and noticed on high-performing systems today. It's only going to get worse with new technologies like 3D XPoint and NVM Express coming to market.

Nutanix only ever sends one write over the network, where other vendors without data locality could potentially be sending two writes over the network and serving remote reads across the network as well. As you look at the density and performance of 3D XPoint, it becomes evident that data locality is going to be a must-have checkbox feature.

3D XPoint Improvements over NAND

You can also see below that NVM Express can drive 6 GB/s! That's gigabytes, not gigabits. A 10 Gb network will become a bottleneck with even one write across the network, let alone two writes plus reads going across the network.

To get full insight into 3D XPoint and NVM Express and the impact of data locality, watch the video below from Intel.


Betting Against Intel and Moore’s Law for Storage

When I hear about companies essentially betting against Moore's Law, it makes me think crazy thoughts, like me giving up peanut butter or beer. Too crazy, right?! Take a gaze at what Intel is currently doing for storage vendors below:


Some of the more interesting ones in the hyper-converged space are detailed below.

Increased number of execution buffers and execution units:
More parallel instructions and more threads, all leading to more throughput.

CRC instructions:
For data protection: every enterprise vendor should be checking their data for consistency and guarding against bit rot.
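To illustrate the idea (a generic checksum-on-read sketch, not Nutanix's actual implementation), the pattern looks roughly like this; Python's `zlib.crc32` stands in here for the hardware-accelerated CRC32C instruction that SSE4.2 provides:

```python
import zlib

def checksum_block(data: bytes) -> int:
    # CRC over a data block; hardware CRC instructions make this
    # nearly free on modern Intel CPUs. zlib.crc32 is plain CRC-32,
    # used here only to illustrate the idea.
    return zlib.crc32(data) & 0xFFFFFFFF

def verify_block(data: bytes, stored_crc: int) -> bool:
    # On every read, recompute and compare: a mismatch means bit rot
    # (or a torn write) and the block should be repaired from a replica.
    return checksum_block(data) == stored_crc

block = b"some 4 KB of user data..."
crc = checksum_block(block)
assert verify_block(block, crc)             # clean read passes
corrupted = b"s0me 4 KB of user data..."    # one flipped character
assert not verify_block(corrupted, crc)     # bit rot detected
```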

Intel Advanced Vector Extensions for wide integer vector operations (XOR, P+Q)
Nutanix uses this capability to calculate XOR for EC-X (erasure coding). It spends only sub-cycles of a CPU on each bit of data, which really helps as Nutanix parallelizes the operation across all of the CPUs in the cluster. Other vendors could use this for RAID 6.
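As a rough illustration of the XOR math involved (a single-parity sketch of my own, not the actual EC-X code, which also computes a Q parity for double protection): XOR all the data blocks to get parity, and XOR the survivors with the parity to rebuild a lost block. AVX lets the CPU do this over wide vectors per instruction; here it's a byte at a time purely for clarity.

```python
def xor_parity(blocks):
    # XOR all data blocks together byte-by-byte to form the parity block.
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover(surviving_blocks, parity):
    # XOR the parity with the surviving blocks to rebuild the lost one.
    return xor_parity(list(surviving_blocks) + [parity])

a, b, c = b"AAAA", b"BBBB", b"CCCC"
p = xor_parity([a, b, c])
assert recover([a, c], p) == b   # lost block b comes back
```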

Intel Secure Hash Algorithm Extensions
Nutanix uses these extensions to accelerate SHA-1 fingerprinting for inline and post-process dedupe. The Intel SHA Extensions are designed to provide a performance increase over single-buffer software implementations that use general-purpose instructions.
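A toy sketch of fingerprint-based dedupe (a generic illustration, not the NDFS implementation): hash each block, and only store data whose fingerprint hasn't been seen before. The SHA-1 call is exactly the hot loop the SHA Extensions accelerate.

```python
import hashlib

def fingerprint(block: bytes) -> str:
    # SHA-1 fingerprint of a data block.
    return hashlib.sha1(block).hexdigest()

# Toy dedupe map: fingerprint -> single stored copy.
store = {}

def write_block(block: bytes) -> str:
    fp = fingerprint(block)
    if fp not in store:          # new data: store it once
        store[fp] = block
    return fp                    # duplicates just return the reference

refs = [write_block(b"hello" * 100),
        write_block(b"world" * 100),
        write_block(b"hello" * 100)]   # third write is a duplicate
assert len(store) == 2                 # only two unique blocks stored
```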

Below is a video that talks about how Nutanix does dedupe.

Transactional Synchronization Extensions (TSX)
Adds hardware transactional memory support, speeding up execution of multi-threaded software through lock elision. Multi-threaded apps like the Nutanix Controller virtual machine can enjoy a 30-40% boost when multiple threads are used. This provides a big boost for in-RAM caching.

Reduced VM Entry / Exit Latency
This helps the virtual machine (in this case the virtual storage controller): thanks to a shadowing process, it never really has to exit into the virtual machine manager. The end result is low latency, and the user-space-versus-kernel penalty is taken off the table. This also happens to be one of the reasons why you can virtualize giant Oracle databases.

Intel VT-d
Improves reliability and security through device isolation using hardware-assisted remapping, and improves I/O performance and availability through direct assignment of devices. Nutanix directly maps SSDs and HDDs, removing the yo-yo effect of going through the hypervisor.

Intel Quick Assist


Intel has a Quick Assist card that will do full offload for compression and SHA-1 hashes. Guess what? The features on this card will be moving onto the CPU in the future. Nutanix could use this card today but chooses not to for serviceability reasons, but you can bet your bottom dollar that we'll use the feature once it's baked onto the CPUs.

To top everything else above, the next Intel CPUs will deliver 44 cores per 2-socket server, and 88 threads with hyper-threading!

If you want a full breakdown of all the features, you can watch this video with Intel at Tech Field Day.


Nutanix Volume API and Containers #dockercon

When I first heard that Nutanix was creating a Volumes API to give virtual machines access via iSCSI, I thought it was just about giving our customers another option for running MS Exchange. Then I was sitting in Acropolis training last week and it dawned on me how great this will be for containers and OpenStack. So what is the Nutanix Volumes API all about?

The Volumes API exposes back-end NDFS storage to guest operating systems, physical hosts, and containers through iSCSI. iSCSI support allows any operating system to use the storage capabilities of NDFS. In this deployment scenario, the operating system works directly with Nutanix storage, bypassing any hypervisor.

The Volumes API consists of the following entities:

• Volume group – the iSCSI target and a group of disk devices.
• Disks – storage devices in the volume group (displayed as LUNs for the iSCSI target).
• Access control – a specified initiator IQN is allowed access to the volume group.
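A hypothetical data-model sketch of those entities (the class and field names here are my own illustration, not the actual API): a volume group bundles disks, exposes them as LUNs, and whitelists initiator IQNs.

```python
from dataclasses import dataclass, field

@dataclass
class VolumeGroup:
    # A volume group is the iSCSI target plus its disk devices.
    name: str
    disks: list = field(default_factory=list)       # exposed as LUNs
    allowed_iqns: set = field(default_factory=set)  # initiator whitelist

    def add_disk(self, size_gb: int) -> int:
        self.disks.append(size_gb)
        return len(self.disks) - 1   # LUN index for the new device

    def allow(self, iqn: str):
        # Only whitelisted initiator IQNs may log in to the target.
        self.allowed_iqns.add(iqn)

vg = VolumeGroup("exchange-logs")
lun = vg.add_disk(200)                          # first LUN is index 0
vg.allow("iqn.1991-05.com.microsoft:mbx01")     # hypothetical IQN
assert lun == 0
assert "iqn.1991-05.com.microsoft:mbx01" in vg.allowed_iqns
```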

The following image shows an example of a VM running on Nutanix with its operating system hosted on the Nutanix storage, mounting the volumes directly.


Now your OpenStack and container instances can be blown away and your data will persist! I think this is a big plus for Nutanix and for containers running on the Acropolis hypervisor. Future integration with Docker plugins should now be easier.


VVOLS – Improving Per VM Management

Lots of hype around VVOLs (Virtual Volumes) these days as we approach the next vSphere release. When VVOLs were first talked about in 2011 I wasn't really impressed. The idea of VVOLs didn't seem new to me because Nutanix at the time had already broken free of LUNs and was well down the path of per-VM management. Fast forward to today: I still think it's a bit of a get-out-of-jail card for traditional storage vendors, but my attitude has definitely changed. There is a wealth of information available on the web, like the VMworld website, and you can see the direction VMware is going with it. Not only do VVOLs give per-VM management, they separate capacity from performance and simplify connecting storage. I personally think VMware is making a very smart tactical move: VMware is creating its own control plane. This will be a significant advantage that their competitors will have to overcome.

To me the crown jewel of the whole implementation is VASA (vSphere Storage APIs for Storage Awareness), which was first introduced in vSphere 5.0. VASA was limited in its initial release: vendors could only assign one capability to each datastore, and users could also supply only one capability. VASA 2.0 is probably what a lot of people thought VASA 1.0 was initially going to be. VASA 2.0 is the foundation for Storage Policy Based Management (SPBM) in my mind. VASA with SPBM will allow for placement of virtual disks, gives insight into the newly introduced storage containers, and will also allow better support for Storage DRS. I am also hoping VVOLs eliminate the limit of 2,048 VMs per datastore when High Availability is enabled. We preach giant datastores at Nutanix, so it will be nice to have everything on par.

Nutanix is supporting VVOLs, though with no commitment to a public timeline. I don't lead engineering, but my educated guess is that VASA 2.0 will be implemented with our ZooKeeper layer, which is responsible for maintaining configuration. Like VASA 2.0, ZooKeeper is highly available and already keeps track of all of the containers you create on Nutanix. VASA 2.0 and ZooKeeper are also similar in that they're not in the I/O path. vCenter controls the activation of VASA providers, but after that it's host-to-provider, so vCenter can be shut down without impacting availability.

The Protocol Endpoint (PE) is another component of the VVOLs family. The PE helps abstract connecting to all of your VVOLs, whether you're using iSCSI, NFS, or Fibre Channel. With Nutanix you don't really have to worry about connecting your underlying storage or setting up multi-pathing; this is all taken care of for you under the covers. PEs may or may not cause a lot of existing storage vendors a lot of grief. Additional overhead will need to be taken into account, because instead of flopping a LUN over to another controller you're now flopping over possibly hundreds of VVOLs.
If you look at the breakup of a VVOL, you see that many VVOLs actually make up one virtual machine.

5 Types of VVOLs:

• Config-VVOL – metadata
• Data-VVOL – VMDKs
• Mem-VVOL – snapshots
• Swap-VVOL – swap files
• Other-VVOL – vendor-specific

Simply put, there will be a lot more connections to manage. This could put additional stress on the "hardware-added" resellers. If you're relying on NVRAM and you're already full or tight on space, this is going to make matters worse. Nutanix has always had to do this, so I wouldn't think things would change much here. Currently any file over 512 bytes is mapped to a vDisk on the Nutanix side, so any overhead should stay the same.

VVOLs also introduce the concept of storage containers. When I give a demo of the Nutanix Prism UI, I have been saying for at least the last year that our containers are just a way of grouping VMs that you want to have the same policy or capabilities. VVOLs and Nutanix are pretty similar this way. Both VVOLs storage containers and Nutanix containers share these common traits:

• A storage container is only limited by available hardware.
• You must have at least one storage container
• You can have multiple storage containers per array/file system
• You assign the capabilities to the storage container

VVOL storage containers will not allow you to span storage controllers unless that capability is already built into the underlying storage system.

The exciting bit is that VVOLs will let you get a lot of the same functionality you previously had to go into the storage system UI for, like snapshots and compression. While I think this is great for management and user experience, I think a lot of people are going to have to re-examine their security on vCenter. Nothing like giving everyone access to create unlimited snapshots; let's see what happens! It is probably more of an enterprise issue, though. At the last VMUG I was at in Edmonton, when I asked whether the storage person and the virtualization person were the same person, the vast majority of people put up their hands. I guess more checks and balances are never a bad thing.

Personally it's great to see the similarities between the architecture being used for VVOLs and Nutanix. While there will be some heavy lifting in regards to API integration, at least the meat and potatoes of the product won't have to change to accommodate it. In general VVOLs will make life easier for Nutanix and our end users, so thumbs up here.

Please comment below if you have thoughts on the topic.


Why Did The #Dell + #Nutanix Deal happen? Bob Wallace on theCube Explains

After spending several years trying to carve itself a slice of the converged infrastructure market to little avail, Dell Inc. finally changed strategies in June, teaming up with Nutanix Inc. to supply the hardware for its wildly popular hyperconverged appliances. Bob Wallace, head of OEM sales for the fast-growing startup, appeared on SiliconANGLE's theCUBE in a recent interview with hosts Dave Vellante and Stu Miniman to share how the alliance is helping both companies pursue their respective goals.

Check out the SiliconANGLE YouTube Channel


Nutanix and EVO:RAIL\VSAN – Data Placement

Nutanix and VSAN\EVO:RAIL are different in many ways. One such way is how data is spread out through the cluster.

• VSAN is a distributed object file system
• VSAN metadata lives with the VM; each VM has its own witness
• Nutanix is a distributed file system
• Nutanix metadata is global

VSAN\EVO:RAIL will break up its objects (VMDKs) into components. Those components get placed evenly among the cluster. I am not sure of the algorithm, but it appears to be capacity-based. Once the components are placed on a node they stay there until:

• They are deleted
• The 255 GB component (default size) fills up and another one is created
• The Node goes offline and a rebuild happens
• Maintenance mode is issued and someone selects the evacuate data option.

So in a fresh, brand-new cluster things are pretty evenly distributed.


VSAN distributes data with the use of components


Nutanix uses data locality as the main principle for placement of all initial data. One copy is written locally, one copy remotely. As more writes occur, the secondary copies keep getting spread evenly across the cluster. Reads stay local to the node. Nutanix uses extents and extent groups (4 MB) as the mechanism to coalesce data.
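A simplified sketch of that placement policy (my own illustration, not Nutanix code): the first replica is pinned to the node running the VM, and the second is steered toward the least-full remote node so replicas stay evenly spread.

```python
def place_write(local_node, nodes, used_pct):
    # First copy always lands on the node running the VM (data locality);
    # the second copy goes to a remote node, biased toward the emptiest.
    remote = min((n for n in nodes if n != local_node),
                 key=lambda n: used_pct[n])
    return [local_node, remote]

nodes = ["A", "B", "C", "D"]
used = {"A": 40, "B": 55, "C": 20, "D": 35}   # % capacity used per node
copies = place_write("A", nodes, used)
assert copies[0] == "A"   # read path stays on-node
assert copies[1] == "C"   # remote copy goes to the least-full node
```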

Whether a Nutanix cluster is brand new or has been running for a long time, things are kept level and balanced based on a percentage of overall capacity. This method accounts for clusters with mixed nodes\needs. More here.



Nutanix places copies of data with the use of extent groups.

So you go to expand your cluster…

With VSAN, after you add a node (compute, SSD, HDD) to a cluster and vMotion workloads over to the new node, what happens? Essentially nothing. The additional capacity gets added to the cluster, but there is no additional performance benefit. The VMs moved to the new node continue to hit the same resources across the cluster. The additional flash and HDD sit there idle.


Impact of adding a new node with VSAN and moving virtual machines over.


When you add a node to Nutanix and vMotion workloads over, they start writing locally and benefit from the additional flash resources right away. Not only is this important from a performance perspective, it also keeps available capacity level in the event of a failure.



Impact of adding a new node with Nutanix and moving virtual machines over.

Since data is spread evenly across the cluster, in the event of a hard drive failing all of the nodes in a Nutanix cluster can help rebuild the data. With VSAN, only the nodes containing the components can help with the rebuild.
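Back-of-the-envelope math shows why more rebuild participants matter. This sketch assumes each node can contribute a fixed rebuild bandwidth (the 100 MB/s figure is an assumption for illustration, not a measured number):

```python
def rebuild_time(data_gb, per_node_mb_s, participants):
    # Every participating node rebuilds a slice in parallel, so rebuild
    # time shrinks roughly linearly with the number of helpers.
    total_mb_s = per_node_mb_s * participants
    return data_gb * 1024 / total_mb_s   # seconds

# A 2 TB drive fails; assume ~100 MB/s of rebuild I/O per node.
whole_cluster = rebuild_time(2048, 100, 31)  # every other node helps
few_nodes = rebuild_time(2048, 100, 4)       # only component holders help
assert whole_cluster < few_nodes             # many hands, faster rebuild
```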

Note: Nutanix rebuilds cold data to cold data (HDD to HDD); VSAN rebuilds data into the SSD cache. If you lose an SSD with VSAN, all backing HDDs need to be rebuilt. The data from HDDs on VSAN will flood into the cluster's SSD tier and will affect performance. This is one of the reasons I believe 13 RAID controllers were pulled from the HCL. I find it very interesting because one of the RAID controllers pulled is one that Nutanix uses today.

Nutanix will always write the minimum two copies of data regardless of the state of the cluster. If it can't, the guest won't get the acknowledgment. When VSAN has a host that is absent, it will write only one copy if the other half of the components are on the absent host. At some point VSAN will know it has written too much with only one copy and will start the component rebuild before the 60-minute timer. I don't know the exact algorithm here either; it's just what I have observed after shutting a host down. I think this is one of the reasons that VSAN recommends writing three copies of data.

[Update: VMware changed the KB article after this post. It was 3 copies of data and has been adjusted to 2 copies (FT > 0). I'm not sure what changed on their side; there is no explanation for the change in the KB.]

Data locality has an important role to play in performance, network congestion, and availability.

More on Nutanix – EVO

Learn more about Nutanix


Choice: vSphere Licensing on Nutanix

Nutanix provides choice in which vSphere license you apply to your environment. If you're at a remote site you can run vSphere Essentials; if you have a high-density deployment you can run vSphere Enterprise Plus. In short, the choice of what makes sense is left up to the customer.

It's important to have flexibility around licensing because VMware can add\remove packages at any time. For example, VMware recently announced a great ROBO license edition: VMware vSphere Remote Office Branch Office Standard and Advanced Editions. Now you can purchase VMware licensing in packs of VMs instead of paying per socket. Enterprises that have lots of remote sites but few resources running in them can take a look at the NX-1020, which has native built-in replication, plus the appropriate licensing pack.

Nutanix Perfect for Remote Sites

Licensing can change at any time; Nutanix provides the flexibility to meet your needs. Check out this great video on why vSphere for remote sites.

What happens if you have an existing Enterprise License Agreement? No problem! Go ahead and apply your existing license. If your needs change, rest assured Nutanix will keep running.

This flexibility in licensing comes from the fact that Nutanix runs in user space and hasn't created any dependencies on vCenter. Nutanix will continue to run without vCenter, and management of your cluster is not affected by vCenter going down. The Nutanix UI, known as Prism, is highly available and designed to scale right along with your cluster, from 5 VMs to 50,000+ VMs.

Pick what works for you.




Do you have a vCenter plugin?

vCenter plugins are a bad proposition and fit into the "too good to be true" bucket when talking about storage. Having your storage dependent on vCenter ties you to something that is a single point of failure, has security implications, can limit your design/solution due to vCenter's lack of scale, and restricts your control over future choices. In most cases storage companies have plugins because the initial setup with ESXi is complex and they are trying to mask some of the work needed to get it up and running.

Single Point of Failure

Even VMware technical marketing staff admit that vCenter is limited in options to keep it protected. Now that Heartbeat is end-of-life, there really isn't a good solution in place.
What happens when vCenter goes down? Do you rely on the plugin to provide the UI for administration? How easy is it to run commands from the CLI when the phones light up with support calls?

Nutanix's ability to scale and remain available is not dependent on a plugin.


If I were to place money, I would bet no one can write a plugin for vCenter, with security in mind, better than VMware. Yet VMware decided not to create a plugin for EVO:RAIL and instead stood up a separate web server. I might be reading between the lines, but punching more holes into vCenter is not a good thing. How hardened can a 3rd-party plugin be? Chances are the plugin will end up talking to ESXi hosts through vpxuser, which is essentially a root account. It's not the root account that is scary, it's how people get access to it. Does the plugin use vCenter security rights? To me there are just more questions than answers.

From the vendor side, as VMware goes from vSphere 5.5 -> 6.0 -> .Next, the plugin will have to stay in lock-step, which means more work and time spent making sure the plugin works versus pouring that time and effort into new development.


Scale and the number of nodes are affected by vCenter's ability to manage the environment. Both cluster size and linked mode play a part in overall management. If the plugin is needed to manage the environment, can you scale past the limits of vCenter? How many additional management points does this cause? In a multi-site deployment, can you use two vCenters? Experience tells me vCenter at the data center managing remote sites hasn't been fun in the past.

If you're a hyper-converged vendor, does the plugin force you to license all of your nodes even when you just need storage? With Nutanix, if you just need a storage node you have the option of adding it to the cluster but not to vCenter.

One of the scariest things from an operations point of view is patching. Does patching vCenter cause issues with the plugin? Do you have to follow an HCL matrix when patching the plugin? Today, patching Horizon View means worrying about storage version, VMware Tools, View Agent, View Composer, vCenter, database version, and anti-virus version; adding a plugin to the mix will not help.

I think vCenter plugins are over-hyped for what they cause in return. Maybe Nutanix will get one, but this kid doesn't see the need. If the future direction is a web-based client, having another tab open in Chrome is not a big deal.


VMware EUC & Nutanix Relationship via @theCUBE

Courtney Burry & Sachin Chheda on theCUBE

* Talks about VMware and Nutanix partnership.

* Partnership on the Horizon 6 RA

* Workspace is about applications


EVO RAIL: Status Quo for Nutanix

Some will make a big splash about the launch of EVO RAIL, but the reality is that things remain status quo. While I do work for Nutanix and am admittedly biased, the fact is that Nutanix was formed as a company in 2009 and has been selling since 2011. VSAN, and now EVO RAIL, is a validation of what Nutanix has been doing over the last 5 years. In this case, a high tide lifts all boats.

Nutanix will continue to partner with VMware on all solutions: VDI, RDS, SRM, server virtualization, big data applications like Splunk, and private cloud. Yes, we will compete with VSAN, but I think the products are worlds apart, mostly due to architectural decisions. Nutanix helps sell vSphere and enables all the solutions VMware provides today. Nutanix has various models that serve everything from Tier 1 SQL\Oracle all the way down to the remote branch where you might want only a handful of VMs. Today EVO RAIL is positioned to serve only Tier 2, test/dev, and VDI. The presentation I sat in on as a vExpert confirmed Tier 1 is not a current use case. I feel this is a mistake for EVO RAIL. By not being able to address Tier 1, in which I would include VDI, you end up creating silos in the data center, which is everything SDDC should be trying to eliminate.

Nutanix Use Cases

Some of the available Nutanix use cases


Nutanix is still king of scale, but I am interested to hear more about EVO RACK, which is still in tech preview. EVO RAIL in version 1.0 will only scale to 16 nodes\servers, or 4 appliances. Nutanix doesn't really have a limit but tends to follow hypervisor limits; most Nutanix RAs are around 48 nodes from a failure-domain perspective.

Some important differences between Nutanix and EVO RAIL:

* Nutanix starts at 3 nodes, EVO RAIL starts at 4 nodes.

* Nutanix uses hot optimized tiering based on data analysis, plus a cache in RAM which can be deduped; EVO RAIL uses caching from SSD (70% of all SSD is used for cache).

* You can buy 1 Nutanix node at a time; EVO RAIL is only sold 4 nodes at a time, though I think this has to do with trying to keep a single SKU. The SMB part of the market will find it hard to make this jump. On the enterprise side you need to be able to have different node types if your compute\capacity doesn't match up.

* Nutanix can scale with different node types offering different levels of storage and compute; EVO RAIL today is a hard-locked configuration. You are unable to change even the amount of RAM from the OEM vendor. CPUs are only 6-core, which leads to needing more nodes = more licenses.

* EVO RAIL is only spec'd for 250 desktops\100 general server VMs per appliance. Nutanix can deliver 440 desktops per 2U appliance with a medium Login VSI workload, and 200 general server VMs when enabling features like inline dedupe on the 3460 series. In short, we have no limits as long as you don't have CPU\RAM contention.


* Nutanix has 1 storage controller (VM) per host that takes care of VM-level snapshots, inline compression, inline dedupe, MapReduce dedupe, MapReduce compression, analytics, cluster health, replication, and hardware support. EVO RAIL will have the EVO management software (web server), a vCenter VM, a Log Insight VM, a VM from the OEM vendor for hardware support, and a vSphere Replication VM if needed.

* Nutanix is able to separate compute and storage clusters. EVO RAIL is one large compute cluster with only one storage container. With separation you can have smaller compute clusters and still enjoy one giant volume. This is really just a matter of flexibility in design.

* Nutanix can run with any license of vSphere; the EVO RAIL license is Enterprise Plus. I am not sure how that will affect pricing. I suspect the OEMs will be made to keep it at normal prices because it would affect the rest of their business.

* Nutanix can manage multiple large\small clusters with Prism Central. EVO RAIL has no multi-cluster management.

* With Nutanix you get to use all of the available hard drives for all of the data out of the box. With EVO RAIL you have to increase the stripe width to take advantage of all the available disks when data is moved from cache to hard disk.

* Nutanix offers both analysis and built-in troubleshooting tools in the virtual storage controller. You don't have to add another VM to provide these services.

Chad Sakac mentioned in one of his articles, "my application stack has these rich protection/replication and availability SLAs – because it's how it was built before we started thinking about CI models", that you might not pick EVO RAIL and would instead go to a Vblock. I disagree on the CI part. Nutanix has the highest levels of data protection today: synchronous writes, bit-rot prevention, all data checksummed, data continuously scrubbed during quiet periods, and Nutanix-based snapshots for backup and DR.

It's a shame that EVO RAIL went with the form factor it did. VSAN can lose up to 3 nodes at any one time, which is good, but with the current design it will need 5 copies of data to ensure that a block going down will not cause data loss when you scale the solution. I think they should have stayed with a 1-node, 2U solution. Nutanix has a feature called Availability Domains that allows us to survive a whole block going down while the cluster still functions. This feature doesn't require any additional storage capacity, just the minimum two copies of data.

More information on Availability Domains can be found on the Nutanix Bible
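A minimal sketch of block-aware replica placement (my own illustration, not the Availability Domains implementation): the second copy is simply constrained to a node in a different physical block than the first, so losing a whole block never takes out both copies.

```python
def block_aware_placement(local_node, nodes, block_of):
    # Place the second copy on a node in a *different* physical block
    # (chassis) than the first, so a whole-block failure can never
    # destroy both replicas.
    candidates = [n for n in nodes
                  if block_of[n] != block_of[local_node]]
    return [local_node, candidates[0]]

nodes = ["A1", "A2", "B1", "B2"]   # two hypothetical 2-node blocks
block_of = {"A1": "blk-A", "A2": "blk-A",
            "B1": "blk-B", "B2": "blk-B"}
copies = block_aware_placement("A1", nodes, block_of)
assert block_of[copies[0]] != block_of[copies[1]]   # copies span blocks
```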


* Nutanix can scale past 32 nodes. VSAN is supported for 32 nodes, yet EVO RAIL is only supported for 16 nodes. I don't know why they made this decision.

* Prism Element has no limits on the number of objects it can manage. EVO RAIL is still limited by the number of components. I believe the limited hardware specs are being used to cap the number of components so this does not become an issue in the field.

* When you add a node with Nutanix, you enjoy the performance benefits right away. With EVO RAIL you have to wait until new VMs are created to make use of the new flash and hard drives (or perform a maintenance operation). A lot of this comes down to how Nutanix controls the placement of data; data locality helps here.

I think the launch of EVO RAIL shows how important hardware still is in achieving five 9s of availability. Look out, dual-headed storage architectures: your lunch just got a little smaller again.