Today Nutanix, with the help of our brilliant engineering team, released dedupe for RAM and flash; a future release will bring dedupe to the capacity tier as well. This new feature is part of the 3.5 code release and is available to all Nutanix customers. The release has over 75 features, including a new REST API, a new UI, and a Site Recovery Adapter for VMware’s SRM.
Dedupe will allow for even lower latencies and lower TCO for flash, and will keep Nutanix in the lead on scale-out technologies. I want to talk about the importance of this feature in terms of the history of Nutanix and its impact on the future of the data center.
From my early days at Nutanix, you could find hours of footage on how the Nutanix Distributed File System (NDFS) was structured. If you were brand new to the company, you would have assumed that we already had dedupe. Dedupe was thought out well in advance when the distributed file system was being formed: how it would integrate with the distributed metadata service and with Curator, our distributed self-healing service, and what its impact on CPU utilization would be.
Why now? The Nutanix engineering bench is second to none, but they are real people and only work so many hours a day. (The snack room is full of caffeinated goodies!) NDFS already had two features that were saving tons of space:
Quick Clones: Upon cloning a virtual machine, new metadata is created with pointers to where the existing data resides, and only new writes are tracked. This saves space and keeps I/O available for real workloads rather than maintenance tasks.
Compression: Both inline (flash tier) and post-process (capacity tier) compression can shrink the amount of capacity needed by 2X to 5X.
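To make the quick-clone idea concrete, here is a minimal sketch of metadata-pointer cloning with copy-on-write semantics. This is a hypothetical illustration, not NDFS code; the `VDisk` class and block names are my own invention, used only to show how a clone shares existing blocks through pointers and diverges only where it is written.

```python
# Hypothetical sketch of metadata-pointer ("quick clone") cloning.
# Not NDFS internals: it only illustrates that a clone is new metadata
# pointing at the same underlying blocks, with new writes tracked
# separately (copy-on-write at the metadata level).

class VDisk:
    def __init__(self, blocks=None):
        # block index -> block id in shared storage
        self.block_map = dict(blocks or {})

    def clone(self):
        # New metadata only: pointers to the same underlying blocks.
        # No data is copied, so the clone costs no data I/O.
        return VDisk(self.block_map)

    def write(self, index, block_id):
        # New writes update only this disk's metadata.
        self.block_map[index] = block_id

base = VDisk({0: "blk-A", 1: "blk-B"})
clone = base.clone()
clone.write(1, "blk-C")      # clone diverges only where it is written

print(base.block_map)        # {0: 'blk-A', 1: 'blk-B'}
print(clone.block_map)       # {0: 'blk-A', 1: 'blk-C'}
```

The key point is that the clone operation itself touches no data, only metadata, which is why the I/O budget stays free for real workloads.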
Dedupe was committed for 3.5 after a couple of engineers came over to Nutanix from companies that invented some of the core technologies we use, like Apache Cassandra. Their background allowed for easier impact analysis on the overall performance of the file system.
QA stepped up its game to go along with this. The test system was always automated, but time has allowed more and more corner cases to be added, which is very important when adding dedupe to a file system without compromising reliability. Time has also allowed QA to use more and more nodes in our “Dog Food” cluster: a mix of 50+ nodes, from 2000- to 6070-series, that tries to cover even the most unique customer deployments.
This future of dedupe across RAM, flash, and hard drives gives added performance and additional capacity on all tiers to drive down costs, but there is more: you now have a full-blown, future-proof investment. When ReRAM or the next wave of technology comes along, Nutanix is positioned to use the whole stack cost-effectively. Since the feature is applied in the software stack, it can easily be transported up the stack with minimal engineering effort. Two companies that are particularly interesting are Crossbar and Diablo. I would encourage you to read up on them, take a look, and then look at your current infrastructure. If adopting such technology would force a wholesale rip and replace, I think you should question your philosophy on how you build out your infrastructure.
How does it work?
I will be giving a vBrown Bag session at VMworld on three features that are X-Men-like with superhero powers. One of these features is dedupe; I will post the recording here when it’s available. The session is at 2:30 pm on Tuesday in the hang space. I will be highlighting the dynamic nature of our dedupe process and how it avoids AV storms with VDI. I will also talk about how we fingerprint data and how we decide what does and does not get deduped.
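Ahead of the session, here is the general shape of fingerprint-based dedupe. This is a hedged, generic sketch and not Nutanix code; the `DedupeStore` class, chunk sizes, and the choice of SHA-1 are my own illustrative assumptions. The idea is simply that incoming data is fingerprinted with a hash, each unique chunk is stored once, and repeated chunks collapse into references to the existing copy.

```python
# Generic sketch of fingerprint-based dedupe (illustrative only, not
# the Nutanix implementation). Each chunk of data is hashed to produce
# a fingerprint; identical chunks share one stored copy and a refcount.

import hashlib

class DedupeStore:
    def __init__(self):
        self.chunks = {}      # fingerprint -> chunk data (stored once)
        self.refcount = {}    # fingerprint -> number of references

    def write(self, data: bytes) -> str:
        fp = hashlib.sha1(data).hexdigest()   # fingerprint the chunk
        if fp not in self.chunks:
            self.chunks[fp] = data            # store unique data once
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        return fp                             # caller keeps the pointer

    def read(self, fp: str) -> bytes:
        return self.chunks[fp]

store = DedupeStore()
a = store.write(b"OS image block")
b = store.write(b"OS image block")   # duplicate: no new storage used
c = store.write(b"user data block")

print(len(store.chunks))             # 2 unique chunks stored
print(store.refcount[a])             # 2 references to the shared chunk
```

This is also why dedupe pays off so well for VDI: hundreds of desktops cloned from the same OS image produce mostly identical chunks, so the hot tier only has to hold one copy.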
Related Blog Posts: