FYI: I’ve been with Nutanix for over two years; for my first eight months I was a corporate SE, before Anton, the first salesperson hired in Canada, came on board.
With NOS 4.1 GA around the corner, I thought I should get started playing with Cloud Connect using the Tech Preview in NOS 4.0.1. Cloud Connect provides a hassle-free way to back up your virtual environments without standing up additional infrastructure.
More information can be found on: Nutanix – High Availability and Data Protection
Common Amazon Terms
AWS = Amazon Web Services
EC2 = Elastic Compute Cloud
EBS = Elastic Block Store
S3 = Simple Storage Service
AMI = Amazon Machine Image
Today Cloud Connect only works with AWS, but Azure support is also slated. You can deploy to any AWS region, which is great as Amazon is launching a new data center in Germany. In NOS 4.1 everything will be UI driven, but in the Tech Preview you do have to run one command to set up your Nutanix AMI. Until 4.1 the AMI is private, so you have to engage support to give you access to it. You will have to give support your customer ID.
***** Make sure your cluster has the external cluster IP setup **************
On my cluster I ran:
The AWS ID and Key can be found in the AWS support portal.
Almost everything is automatically selected for the user except the AWS region and subnet (which decides the connectivity type: SSH tunnel vs. pre-existing VPN tunnel). The current workflow takes AWS API keys as input and expects that those keys can then be used for the AWS operations below:
1. Query regions in EC2
2. Query subnets, VPCs & VPN gateways
3. Query AMIs in EC2
4. Query/create/modify security groups in EC2/VPC
5. Create/run/list AWS instances
6. Create/list/delete snapshots of EBS volumes
7. Create S3 buckets & write data into those buckets
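As a rough sketch, the operations above could be scoped into an IAM policy along these lines. The action names below are my assumptions based on the list, not an official Nutanix-published policy; check the permissions documented for your NOS version before using anything like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CloudConnectEc2",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeRegions",
        "ec2:DescribeSubnets",
        "ec2:DescribeVpcs",
        "ec2:DescribeVpnGateways",
        "ec2:DescribeImages",
        "ec2:DescribeSecurityGroups",
        "ec2:CreateSecurityGroup",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:RunInstances",
        "ec2:DescribeInstances",
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudConnectS3",
      "Effect": "Allow",
      "Action": ["s3:CreateBucket", "s3:ListBucket", "s3:PutObject"],
      "Resource": "*"
    }
  ]
}
```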
Output from the Command
AWS subnet id not specified, launching instance in EC2
2014-10-23 08:48:14 INFO create_aws_instance:368 Started instance i-83b27e89 is in state pending
2014-10-23 08:48:57 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:49:42 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:50:27 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
2014-10-23 08:51:00 INFO create_aws_instance:638 Change of nutanix password Succeeded.
2014-10-23 08:51:00 INFO create_aws_instance:634 Creating single node cluster for cloud_data_gateway...
2014-10-23 08:52:09 INFO create_aws_instance:638 Creation of cluster (ip : XX.XXXX.XXXX.XXX) on AWS Succeeded.
2014-10-23 08:52:16 INFO create_aws_instance:634 Adding DNS server to cluster...
2014-10-23 08:53:19 INFO create_aws_instance:512 Successfully configured passwordless ssh access.
2014-10-23 08:53:23 INFO create_aws_instance:638 Configuration of the cloud disk Succeeded.
2014-10-23 08:53:31 INFO create_aws_instance:638 Creation of storage pool backup-sp Succeeded.
2014-10-23 08:53:35 INFO create_aws_instance:638 Creation of container backup-ctr Succeeded.
2014-10-23 08:53:39 INFO create_aws_instance:638 Configuration of remote site 'local' on AWS instance Succeeded.
2014-10-23 08:53:41 INFO create_aws_instance:614 Configuration of remote site 'aws_54-244-175-211' on local cluster Succeeded.
2014-10-23 08:53:41 INFO create_aws_instance:391 AWS instance was successfully configured.
2014-10-23 08:53:41 INFO create_aws_instance:402 Instance i-83b27e89 private_ip XX.XXX.XXX.XXX. public_ip XXX.XXX.XXX.XXX VPC None Name Nutanix_EC2_
The remote site does get set up in PRISM. The only thing you might want to check is the option to compress the data on the wire. I was using an SSH tunnel, but for production you should run a VPC; using a VPC you can achieve up to 25% better throughput.
This holiday season Nutanix is making web-scale wishes come true for non-profit organizations! I am excited for Web-Scale Wish, a Non-Profit Datacenter Makeover that will gift $500,000 worth of Nutanix technology to causes nominated by the community. I worked at a Non-Profit in Edmonton for 4 years so this is great to see. Non-profit funding should go directly towards research or helping people in need so technology is usually the first thing to get pulled from the budget.
The Grand Prize
A 48-hour Web-Scale Datacenter Makeover from Nutanix! Includes NX-3350, 3 Pro software licenses, 3 years of support, and 2-day installation service.
Who Can Win?
Web-Scale Wish is a multi-region campaign spanning countries in three theaters. To qualify, an organization must:
* Be a non-partisan non-profit
* Be located in the United States, Canada, the United Kingdom, Australia, or New Zealand
* Have a working datacenter that could benefit from web-scale converged infrastructure
* Be able to complete and meet environment assessment criteria
Nominations can be submitted online starting October 15th.
Anyone can nominate a non-profit organization or the organization can enter directly. Nominations will be collected from October 15 – November 21, 2014. Participating organizations must complete an environment assessment to qualify.
Get all the information at http://www.webscalewish.com/
Nutanix and VSAN\EVO:RAIL are different in many ways. One such way is how data is spread out through the cluster.
• VSAN is a distributed object file system
• VSAN metadata lives with the VM; each VM has its own witness
• Nutanix is a distributed file system
• Nutanix metadata is global
VSAN\EVO:RAIL breaks up its objects (VMDKs) into components. Those components get placed evenly among the cluster. I am not sure of the algorithm, but it appears to be capacity based. Once the components are placed on a node they stay there until:
• They are deleted
• The 255 GB component (default size) fills up and another one is created
• The Node goes offline and a rebuild happens
• Maintenance mode is issued and someone selects the evacuate data option.
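As a rough illustration of the splitting described above (my own simplification, not VMware's exact algorithm, and ignoring striping and FTT settings beyond a single mirror pair), the component count for one VMDK can be estimated like this:

```python
import math

def vsan_component_count(vmdk_size_gb, max_component_gb=255, mirror_copies=2):
    """Estimate VSAN components for one VMDK: each mirror copy is split
    into ceil(size / max_component_size) components, plus one witness
    for the object. Simplified sketch, not VMware's actual placement."""
    per_copy = math.ceil(vmdk_size_gb / max_component_gb)
    witness = 1
    return per_copy * mirror_copies + witness

print(vsan_component_count(600))  # 600 GB VMDK -> 3 components per copy, 7 total
```

So even a modest VMDK ends up as a handful of components pinned to specific nodes until one of the events above moves them.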
So in a brand-new cluster, things are pretty evenly distributed.
Nutanix uses data locality as the main principle for placement of all initial data. One copy is written locally, one copy remotely. As more writes occur, the secondary copies keep getting spread evenly across the cluster. Reads stay local to the node. Nutanix uses extents and extent groups (4 MB) as the mechanism to coalesce data.
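A toy sketch of that placement idea (my own illustration, not Nutanix's actual code): the first copy lands on the node running the VM, and the second copy goes to whichever remote node is least utilized, so secondaries spread out over time:

```python
def place_extent_group(local_node, nodes, used):
    """Toy replica placement: primary copy on the node running the VM,
    secondary on the remote node with the fewest extent groups, so
    secondary copies spread evenly across the cluster over time."""
    remote = min((n for n in nodes if n != local_node), key=lambda n: used[n])
    used[local_node] += 1
    used[remote] += 1
    return local_node, remote

nodes = ["A", "B", "C", "D"]
used = {n: 0 for n in nodes}
for _ in range(9):  # nine 4 MB extent groups written by a VM on node A
    place_extent_group("A", nodes, used)
print(used)  # A holds all nine primaries; B, C, D each hold three secondaries
```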
Whether a Nutanix cluster is new or has been running for a long time, data is kept level and balanced based on a percentage of overall capacity. This method accounts for clusters with mixed nodes and needs. More here.
So you go to expand your cluster…
With VSAN, after you add a node (compute, SSD, HDD) to a cluster and vMotion workloads over to it, what happens? Essentially nothing. The additional capacity gets added to the cluster, but there is no additional performance benefit. The VMs moved to the new node continue to hit the same resources across the cluster. The additional flash and HDD sit there idle.
When you add a node to Nutanix and vMotion workloads over, they start writing locally and benefit from the additional flash resources right away. Not only is this important from a performance perspective, it also keeps available data capacity level in the event of a failure.
Since data is spread evenly across the cluster, in the event of a hard drive failing all of the nodes in a Nutanix cluster can help rebuild the data. With VSAN, only the nodes containing the affected components can help with the rebuild.
Note: Nutanix rebuilds cold data to cold data (HDD to HDD); VSAN rebuilds data into the SSD cache. If you lose an SSD with VSAN, all backing HDDs need to be rebuilt, and that data will flood into the cluster's SSD tier and affect performance. I believe this is one of the reasons 13 RAID controllers were pulled from the HCL. I find it very interesting because one of the RAID controllers pulled is one that Nutanix uses today.
Nutanix will always write the minimum two copies of data regardless of the cluster's state. If it can't, the guest won't get the acknowledgment. When VSAN has a host that is absent, it will write only one copy if the other half of the components are on the absent host. At some point VSAN decides it has written too much with only one copy and starts the component rebuild before the 60-minute timer. I don't know the exact algorithm here either; it's just what I have observed after shutting a host down. I think this is one of the reasons VSAN recommends writing three copies of data.
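The write path described above can be sketched in a few lines. This is my own simplification to show the acknowledgment rule, not actual Nutanix code:

```python
def replicated_write(data, replicas, rf=2):
    """Toy RF2 write path: the write is acknowledged to the guest only
    after `rf` copies are durably written; if fewer than `rf` replicas
    are healthy, no acknowledgment is returned at all."""
    healthy = [r for r in replicas if r["up"]]
    if len(healthy) < rf:
        return False  # cannot place two copies -> guest gets no ack
    for r in healthy[:rf]:
        r["blocks"].append(data)  # both copies land before we ack
    return True

nodes = [{"up": True, "blocks": []}, {"up": True, "blocks": []}]
print(replicated_write("extent-1", nodes))  # True: both copies written
```

The key design point is that durability is enforced at write time, rather than writing one copy now and trusting a later rebuild to restore redundancy.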
[Update: VMware changed the KB article after this post. It was 3 copies of data and has been adjusted to 2 copies (FT > 0) Not sure what changed on their side. There is no explanation for the change in the KB.]
Data locality has an important role to play in performance, network congestion and in availability.
More on Nutanix – EVO
Learn more about Nutanix
Splunk Enterprise scales to collect and index tens of terabytes of data per day. And because the insights from your data are mission-critical, Splunk software’s index replication technology provides the availability you need, even as you scale out your low-cost, distributed computing environment. Automatic load balancing optimizes workloads and response times and provides built-in failover support. Out-of-the-box reporting and analytics capabilities deliver rapid insights from your data.
Splunk DB Connect delivers reliable, scalable, real-time integration between Splunk and traditional relational databases. Splunk Hadoop Connect provides bi-directional integration to easily and reliably move data between Splunk Enterprise and Hadoop.
Learn why you should virtualize Splunk and how Nutanix and Splunk combine web-scale approaches with the likes of map-reduce to deliver insights and value from your infrastructure.
Check out the full Splunk RA on Nutanix
Check out the Nutanix speaking sessions at MISA BC
One of the added benefits of scale-out storage is having multiple storage controllers. When you have more than two storage controllers, you can lose one to a failure or to maintenance (like a rolling upgrade) with minimal impact.
Below are the results from an 8-node cluster with 700 desktops running a Login VSI medium workload. One of the eight storage controllers is shut down to see the impact on the cluster. No desktops were rebooted or shut down. IOPS dropped from 2,000 to 1,496 and latency had a brief spike from 4 ms to 22.37 ms.
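To put those numbers in perspective, here is the simple arithmetic on the figures quoted above:

```python
iops_before, iops_after = 2000, 1496
controllers, offline = 8, 1

iops_drop_pct = (iops_before - iops_after) / iops_before * 100
capacity_lost_pct = offline / controllers * 100

print(f"IOPS drop: {iops_drop_pct:.1f}%")          # 25.2%
print(f"Controllers offline: {capacity_lost_pct:.1f}%")  # 12.5%
```

So losing one of eight controllers cost roughly a quarter of the IOPS for the duration, and the latency spike was brief; the desktops kept running throughout.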
Things to think about for hyper-convergence:
* Is data spread out evenly enough that additional controllers will help?
* Do you have to vMotion VMs to perform an upgrade? If so, can you meet your maintenance window?
* Will dependencies on the hypervisor management stack cause you to patch both your control and data plane?
See more proven results on scaling with the VMware Horizon 6.0 with the View RA
Nutanix provides choice in which vSphere license you apply to your environment. If you're at a remote site, you can run vSphere Essentials; if you have a high-density deployment, you can run vSphere Enterprise Plus. In short, the choice of what makes sense is left up to the customer.
It’s important to have flexibility around licensing because VMware can add or remove packages at any time. For example, VMware recently announced a great ROBO license edition, VMware vSphere Remote Office Branch Office Standard and Advanced Editions. Now you can purchase VMware licensing in packs of VMs rather than per socket. Enterprises that have lots of remote sites but few resources running in them can take a look at the NX-1020, which has native built-in replication plus the appropriate licensing pack.
What happens if you have an existing Enterprise Licensing Agreement? No problem! Go ahead and apply your existing license. If your needs change, rest assured Nutanix will keep running.
This flexibility in licensing comes from the fact that Nutanix runs in user space and hasn't created any dependencies on vCenter. Nutanix will continue to run without vCenter, and management of your cluster is not affected by vCenter going down. The Nutanix UI, known as PRISM, is highly available and designed to scale right along with your cluster, from 5 VMs to 50,000+ VMs.
Pick what works for you.
First posted on Nutanix.com
vCenter plugins are a bad proposition and fit into the bucket of “too good to be true” when talking about storage. Making your storage dependent on vCenter ties you to a single point of failure, has security implications, can limit your design due to vCenter's lack of scale, and restricts your control over future choices. In most cases, storage companies have plugins because the initial setup with ESXi is complex and they are trying to mask some of the work needed to get it up and running.
Single Point of Failure
Even VMware Technical marketing staff admit that vCenter is limited in options to keep it protected. Now that Heartbeat is end of life there really isn’t a good solution in place.
What happens when vCenter goes down? Do you rely on the plugin to provide a UI for administration? How easy is it to run commands from the CLI when the phones light up with support calls?
Nutanix's ability to scale and remain available is not dependent on a plugin.
If I were to place money, I would bet no one can write a vCenter plugin with security in mind better than VMware can. Yet VMware decided not to create a plugin for EVO:RAIL and instead stood up a separate web server. I might be reading between the lines, but punching more holes into vCenter is not a good thing. How hardened can a third-party plugin be? Chances are the plugin will end up talking to ESXi hosts through vpxuser, which is essentially a root account. It's not the root account that is scary; it's how people get access to it. Does the plugin use vCenter security rights? To me there are just more questions than answers.
From the vendor side, as VMware goes from vSphere 5.5 -> 6.0 -> .Next, the plugin will have to stay in lock step, which means more work and time spent making sure the plugin still works rather than pouring that effort into new development.
Scale and the number of nodes are affected by vCenter's ability to manage the environment. Both cluster size and Linked Mode play a part in the overall management. If the plugin is needed to manage the environment, can you scale past the limits of vCenter? How many additional management points does this cause? In a multi-site deployment, can you use two vCenters? Experience tells me that vCenter at the data center managing remote sites hasn't been fun in the past.
If you're a hyper-converged vendor, does the plugin force you to license all of your nodes even when you just need storage? With Nutanix, if you just need a storage node, you have the option of adding it to the cluster without adding it to vCenter.
One of the scariest things from an operations point of view is patching. Does patching vCenter cause issues for the plugin? Do you have to follow an HCL matrix when patching the plugin? Today, when patching Horizon View, you have to worry about the storage version, VMware Tools, the View Agent, View Composer, vCenter, the database version, and the anti-virus version; adding a plugin to the mix will not help.
I think vCenter plugins are over-hyped for what they can cost in return. Maybe Nutanix will get one, but this kid doesn't see the need. If the future direction is the web-based client, having another tab open in Chrome is not a big deal.