After spending several years trying to carve itself a slice of the convergence infrastructure market to little avail, Dell Inc. finally changed strategies in June, teaming up with Nutanix Inc. to supply the hardware for its wildly popular hyperconverged appliances. Bob Wallace, head of OEM sales for the fast-growing startup, appeared in a recent interview on SiliconANGLE’s theCUBE with hosts Dave Vellante and Stu Miniman to share how the alliance is helping both companies pursue their respective goals.
Prior to 4.0.2, only the Changed Bytes field existed to help tell the tale of the difference between snapshots, and it could be misleading depending on how you interpreted the results. Today both fields exist, so you can see both how much space the snapshots are taking up and how much I/O is going through the system.
Exclusive Usage Bytes:
Total space that can be reclaimed when this snapshot is deleted and garbage collected.
Changed Bytes:
Amount of user I/O to the entities in this protection domain between the previous snapshot and this snapshot.
The Exclusive Usage Bytes field is calculated with the power of MapReduce, and the stats are fed back to the Prism UI. This particular stat is not real time and takes approximately an hour to show up.
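To illustrate the difference between the two fields, here is a minimal sketch of the idea (hypothetical block sizes and snapshot layout, not Nutanix internals): a snapshot's exclusive usage is the data referenced by that snapshot alone, which is exactly what deleting it would reclaim.

```python
# Toy model: snapshots reference 1 MB blocks by id. A block is "exclusive"
# to a snapshot when no other snapshot references it, so deleting that
# snapshot (plus garbage collection) reclaims exactly those blocks.
BLOCK_MB = 1

snapshots = {
    "snap-1": {1, 2, 3, 4},  # hypothetical block ids
    "snap-2": {3, 4, 5, 6},
    "snap-3": {5, 6, 7},
}

def exclusive_usage_mb(name):
    # Union of every other snapshot's blocks, then subtract.
    others = set().union(*(blocks for n, blocks in snapshots.items() if n != name))
    return len(snapshots[name] - others) * BLOCK_MB

for name in snapshots:
    print(name, exclusive_usage_mb(name), "MB reclaimable")
```

Notice that the exclusive numbers across snapshots don't add up to total snapshot usage, which is why the old single field was easy to misread.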
Today Nutanix provides Async Replication as of NOS 4.0.2.
Since many small remote sites have limited bandwidth, it doesn’t make sense to impact running workloads and wear out the flash on the destination side. If the aggregate incoming bandwidth required to maintain the current change rate is <= 500 Mb/s, it is recommended to skip the performance tier (SSDs). This is a general guideline for a 4-node cluster; as your cluster grows, you can add 100 Mb/s per node to that number. For example, a 12-node cluster should be able to safely handle a change rate of <= 1,300 Mb/s.
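The guideline above boils down to a simple formula, sketched here for convenience (the 500 Mb/s baseline and 100 Mb/s per additional node are the numbers from the text, not an official sizing tool):

```python
# Rule of thumb from above: a 4-node cluster can skip the performance tier
# at change rates up to 500 Mb/s, plus 100 Mb/s for each node beyond four.
def skip_ssd_threshold_mbps(nodes):
    if nodes < 4:
        raise ValueError("guideline assumes at least a 4-node cluster")
    return 500 + 100 * (nodes - 4)

print(skip_ssd_threshold_mbps(12))  # 1300 Mb/s for a 12-node cluster
```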
I would also recommend that the destination container be brand new when you set up the remote site, if possible. That way you can apply the appropriate policies without impacting other workloads.
If licensing permits, I would also use MapReduce Compression on the destination container to save space.
To skip the performance tier, run the following command from the NCLI, passing the destination container to the name= parameter:
ncli ctr edit sequential-io-priority-order=DAS-SATA,SSD-SATA,SSD-PCIe name=
You can always reverse the above change back to the defaults if you perform a failover.
NOTE: SSDs are fully covered under support regardless of usage.
FYI: I’ve been with Nutanix for over two years; for my first eight months I was a corporate SE, before Anton, the first salesperson hired in Canada, came on board.
With NOS 4.1 GA around the corner, I thought I should get started playing with Cloud Connect in the tech preview on NOS 4.0.1. Cloud Connect provides a hassle-free way to back up your virtual environments without standing up additional infrastructure.
More information can be found on: Nutanix – High Availability and Data Protection
Common Amazon Terms
AWS = Amazon Web Services
EC2 = Elastic Compute Cloud
EBS = Elastic Block Store
S3 = Simple Storage Service
AMI = Amazon Machine Image
Today Cloud Connect only works with AWS, but Azure support is also slated. You can deploy to any AWS region, which is great as Amazon is launching a new data center in Germany. In NOS 4.1 everything will be UI driven, but in the tech preview you do have to run one command to set up your Nutanix AMI. Until 4.1 the AMI is private, so you have to engage support to give you access to it; you will have to give support your customer ID.
NOTE: Make sure your cluster has the external cluster IP set up.
On my cluster I ran:
The AWS access key ID & secret key can be found under Security Credentials in the AWS Management Console.
Almost everything is automatically selected for the user except the AWS region & subnet (which determines the connectivity type – SSH tunnel vs. pre-existing VPN tunnel). The current workflow takes AWS API keys as input & expects that those keys can then be used for the AWS operations below:
1. Query regions in EC2
2. Query subnets, VPCs & VPN gateways
3. Query AMIs in EC2
4. Query/create/modify security groups in EC2/VPC
5. Create/run/list AWS instances
6. Create/list/delete snapshots of EBS volumes
7. Create S3 buckets & write data into those buckets
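To make the permission list concrete, here is a hypothetical minimal IAM-style policy covering the seven operations above. The specific action names are my assumption of the matching AWS IAM actions; verify them against your own environment before use.

```python
import json

# Hypothetical minimal IAM policy for the Cloud Connect API key.
# Action names are assumptions mapped to the numbered operations above.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "ec2:DescribeRegions",                        # 1. query regions
            "ec2:DescribeSubnets", "ec2:DescribeVpcs",
            "ec2:DescribeVpnGateways",                    # 2. subnets/VPCs/VPN gateways
            "ec2:DescribeImages",                         # 3. query AMIs
            "ec2:CreateSecurityGroup",
            "ec2:AuthorizeSecurityGroupIngress",          # 4. security groups
            "ec2:RunInstances", "ec2:DescribeInstances",  # 5. instances
            "ec2:CreateSnapshot", "ec2:DescribeSnapshots",
            "ec2:DeleteSnapshot",                         # 6. EBS snapshots
            "s3:CreateBucket", "s3:PutObject",            # 7. S3 buckets
        ],
        "Resource": "*",
    }],
}

print(json.dumps(policy, indent=2))
```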
Output from the command:
AWS subnet id not specified, launching instance in EC2
2014-10-23 08:48:14 INFO create_aws_instance:368 Started instance i-83b27e89 is in state pending
2014-10-23 08:48:57 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:49:42 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:50:27 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:51:00 INFO create_aws_instance:638 Change of nutanix password Succeeded.
2014-10-23 08:51:00 INFO create_aws_instance:634 Creating single node cluster for cloud_data_gateway...
2014-10-23 08:52:09 INFO create_aws_instance:638 Creation of cluster (ip : XX.XXXX.XXXX.XXX) on AWS Succeeded.
2014-10-23 08:52:16 INFO create_aws_instance:634 Adding DNS serverv to cluster...
2014-10-23 08:53:19 INFO create_aws_instance:512 Successfully configured passworless ssh access.
2014-10-23 08:53:23 INFO create_aws_instance:638 Configuration of the cloud disk Succeeded.
2014-10-23 08:53:31 INFO create_aws_instance:638 Creation of storage pool backup-sp Succeeded.
2014-10-23 08:53:35 INFO create_aws_instance:638 Creation of container backup-ctr Succeeded.
2014-10-23 08:53:39 INFO create_aws_instance:638 Configuration of remote site 'local' on AWS instance Succeeded.
2014-10-23 08:53:41 INFO create_aws_instance:614 Configuration of remote site 'aws_54-244-175-211' on local cluster Succeeded.
2014-10-23 08:53:41 INFO create_aws_instance:391 AWS instance was successfully configured.
2014-10-23 08:53:41 INFO create_aws_instance:402 Instance i-83b27e89 private_ip XX.XXX.XXX.XXX. public_ip XXX.XXX.XXX.XXX VPC None Name Nutanix_EC2_
The remote site does get set up in Prism. The only thing that you might want to check is the option to compress the data on the wire. I was using an SSH tunnel, but for production you should run a VPC. Using a VPC you can achieve up to 25% better throughput.
This holiday season Nutanix is making web-scale wishes come true for non-profit organizations! I am excited about Web-Scale Wish, a non-profit datacenter makeover that will gift $500,000 worth of Nutanix technology to causes nominated by the community. I worked at a non-profit in Edmonton for four years, so this is great to see. Non-profit funding should go directly towards research or helping people in need, so technology is usually the first thing to get pulled from the budget.
The Grand Prize
A 48-hour Web-Scale Datacenter Makeover from Nutanix! Includes NX-3350, 3 Pro software licenses, 3 years of support, and 2-day installation service.
Who Can Win?
Web-Scale Wish is a multi-region campaign spanning countries in three theaters. To qualify, an organization must:
* Be a non-partisan non-profit
* Be located in the United States, Canada, the United Kingdom, Australia, or New Zealand
* Have a working datacenter that could benefit from web-scale converged infrastructure
* Be able to complete and meet environment assessment criteria
Nominations can be submitted online starting October 15th.
Anyone can nominate a non-profit organization or the organization can enter directly. Nominations will be collected from October 15 – November 21, 2014. Participating organizations must complete an environment assessment to qualify.
Get all the information at http://www.webscalewish.com/
Nutanix and VSAN\EVO:RAIL are different in many ways. One such way is how data is spread out through the cluster.
• VSAN is a distributed object file system
• VSAN metadata lives with the VM; each VM has its own witness
• Nutanix is a distributed file system
• Nutanix metadata is global
VSAN\EVO:RAIL will break up its objects (VMDKs) into components, which get placed evenly across the cluster. I am not sure of the algorithm, but it appears to be capacity based. Once the components are placed on a node, they stay there until:
• They are deleted
• The 255 GB component (default size) fills up and another one is created
• The Node goes offline and a rebuild happens
• Maintenance mode is issued and someone selects the evacuate data option.
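To get a feel for the component layout, here is a quick sketch of how many components a VMDK of a given size splits into. The 255 GB default is from the text above; the mirroring factor is my assumption for a simple failures-to-tolerate setting, and witnesses are ignored.

```python
import math

COMPONENT_GB = 255  # default component size mentioned above

def components_for_vmdk(size_gb, ftt=1):
    # Each mirror copy is split into ceil(size / 255 GB) components;
    # ftt=1 means two mirror copies. Witness components are not counted.
    per_copy = math.ceil(size_gb / COMPONENT_GB)
    return per_copy * (ftt + 1)

print(components_for_vmdk(1024))  # a 1 TB VMDK at ftt=1
```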
So in a brand new cluster, things are pretty evenly distributed.
Nutanix uses data locality as the main principle for placing all initial data: one copy is written locally, one copy remotely. As more writes occur, the secondary copies of the data keep getting spread evenly across the cluster. Reads stay local to the node. Nutanix uses extents and extent groups (4 MB) as the mechanism to coalesce the data.
Whether a Nutanix cluster is brand new or has been running for a long time, things are kept level and balanced based on a percentage of overall capacity. This method accounts for clusters with mixed nodes\needs. More here.
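The placement principle above can be sketched with a toy simulation (this is an illustration of the idea, not the actual Nutanix algorithm): every extent group gets one local copy, and the second copy lands on whichever remote node currently holds the least data, which keeps the cluster level over time.

```python
# Toy sketch of locality-aware placement: local copy on the writing node,
# second copy on the least-loaded remote node. 4 MB extent groups as above.
EXTENT_GROUP_MB = 4

def place_writes(local_node, nodes, num_extent_groups, usage):
    for _ in range(num_extent_groups):
        usage[local_node] += EXTENT_GROUP_MB  # local copy
        remote = min((n for n in nodes if n != local_node), key=usage.get)
        usage[remote] += EXTENT_GROUP_MB      # second copy, spread evenly
    return usage

nodes = ["A", "B", "C", "D"]
usage = {n: 0 for n in nodes}
place_writes("A", nodes, 300, usage)
print(usage)  # node A holds all local copies; B, C, D share the replicas
```

Running this shows the replicas spreading evenly across the remote nodes while reads for node A's VMs stay local.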
So you go to expand your cluster…
With VSAN, after you add a node (compute, SSD, HDD) to a cluster and vMotion workloads over to the new node, what happens? Essentially nothing. The additional capacity gets added to the cluster, but there is no additional performance benefit. The VMs that are moved to the new node continue to hit the same resources across the cluster while the additional flash and HDD sit idle.
When you add a node to Nutanix and vMotion workloads over, they start writing locally and benefit from the additional flash resources right away. Not only is this important from a performance perspective, it also keeps available data capacity level in the event of a failure.
Since data is spread evenly across the cluster, in the event of a hard drive failing, all of the nodes in a Nutanix cluster can help rebuild the data. With VSAN, only the nodes containing the components can help with the rebuild.
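A rough back-of-the-envelope shows why wider rebuild participation matters. The per-node rebuild rate here is a hypothetical number for illustration; real rates depend on hardware and load.

```python
# Rebuild time drops roughly linearly with the number of nodes that can
# contribute. Hypothetical per-node rebuild rate of 50 MB/s.
PER_NODE_MBPS = 50

def rebuild_hours(data_gb, participating_nodes):
    seconds = (data_gb * 1024) / (PER_NODE_MBPS * participating_nodes)
    return seconds / 3600

# Rebuilding 4 TB of lost data: all 11 surviving nodes of a 12-node
# cluster vs. only 3 component-holding nodes.
print(round(rebuild_hours(4096, 11), 1), "hours with 11 nodes")
print(round(rebuild_hours(4096, 3), 1), "hours with 3 nodes")
```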
Note: Nutanix rebuilds cold data to cold data (HDD to HDD); VSAN rebuilds data into the SSD cache. If you lose an SSD with VSAN, all backing HDDs need to be rebuilt. The data from the HDDs on VSAN will flood into the cluster’s SSD tier and affect performance. I believe this is one of the reasons why 13 RAID controllers were pulled from the HCL. I find it very interesting because one of the RAID controllers pulled is one that Nutanix uses today.
Nutanix will always write the minimum two copies of data in the cluster regardless of the state of the cluster; if it can’t, the guest won’t get the acknowledgment. When VSAN has an absent host, it will write only one copy if the other half of the components are on the absent host. At some point VSAN will decide it has written too much with only one copy and start the component rebuild before the 60-minute timer expires. I don’t know the exact algorithm here either; it’s just what I have observed after shutting a host down. I think this is one of the reasons that VSAN recommends writing 3 copies of data.
[Update: VMware changed the KB article after this post. It was 3 copies of data and has been adjusted to 2 copies (FTT > 0). I’m not sure what changed on their side; there is no explanation for the change in the KB.]
Data locality has an important role to play in performance, network congestion and in availability.
More on Nutanix – EVO
Learn more about Nutanix
Splunk Enterprise scales to collect and index tens of terabytes of data per day. And because the insights from your data are mission-critical, Splunk software’s index replication technology provides the availability you need, even as you scale out your low-cost, distributed computing environment. Automatic load balancing optimizes workloads and response times and provides built-in failover support. Out-of-the-box reporting and analytics capabilities deliver rapid insights from your data.
Splunk DB Connect delivers reliable, scalable, real-time integration between Splunk and traditional relational databases. Splunk Hadoop Connect provides bi-directional integration to easily and reliably move data between Splunk Enterprise and Hadoop.
Learn why you should virtualize Splunk and how Nutanix and Splunk combine web-scale approaches such as MapReduce to deliver insights and value from your infrastructure.
Check out the full Splunk RA on Nutanix
Check out the Nutanix speaking sessions at MISA BC