Oct
    23

    Cloud Connect – Automated AWS Setup

    With NOS 4.1 GA around the corner, I thought I should get started playing with Cloud Connect using the tech preview on NOS 4.0.1. Cloud Connect provides a hassle-free, infrastructure-free way to back up your virtual environments.

    More information can be found on: Nutanix – High Availability and Data Protection

    Common Amazon Terms

    AWS = Amazon Web Services
    EC2 = Elastic Compute Cloud
    EBS = Elastic Block Storage
    S3 = Simple Storage Service
    AMI = Amazon Machine Images

    Today Cloud Connect only works with AWS, but Azure support is also slated. You can deploy to any AWS region, which is great as Amazon is launching a new data center in Germany. In NOS 4.1 everything will be UI driven, but in the tech preview you have to run one command to set up your Nutanix AMI. Until 4.1 the AMI is private, so you have to engage support and give them your customer ID to get access to it.

    ***** Make sure your cluster has the external cluster IP set up **************

    On my cluster I ran:

    create_aws_instance --aws_key_id= --aws_secret_key= --aws_region=us-west-2 --nutanix_password=nutanix/4u --local_ctr=servers

    The AWS key ID and secret key can be found in the AWS support portal.

    Required AWS information.

    Almost everything is selected automatically for the user except the AWS region and subnet (the subnet decides the connectivity type: SSH tunnel vs. pre-existing VPN tunnel). The current workflow takes AWS API keys as input and expects that those keys can be used for the following AWS operations:

    1. Query regions in EC2
    2. Query subnets, VPCs & VPN gateways
    3. Query AMIs in EC2
    4. Query/create/modify security groups in EC2/VPC
    5. Create/run/list AWS instances
    6. Create/list/delete snapshots of EBS volumes
    7. Create S3 buckets & write data into those buckets
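
    If you would rather hand the script a locked-down set of API keys, the seven operations above can be turned into an IAM policy. This is only a rough sketch: the action names are standard EC2 and S3 IAM actions, but which ones the tech-preview script really calls is my assumption, and `OPERATION_ACTIONS` and `build_policy` are made-up names for illustration.

```python
import json

# Assumed mapping from the seven operations listed above to IAM action names.
# The action names exist in IAM; the mapping itself is a guess, not something
# taken from Nutanix documentation.
OPERATION_ACTIONS = {
    "Query regions in EC2": ["ec2:DescribeRegions"],
    "Query subnets, VPCs & VPN gateways": [
        "ec2:DescribeSubnets",
        "ec2:DescribeVpcs",
        "ec2:DescribeVpnGateways",
    ],
    "Query AMIs in EC2": ["ec2:DescribeImages"],
    "Query/create/modify security groups": [
        "ec2:DescribeSecurityGroups",
        "ec2:CreateSecurityGroup",
        "ec2:AuthorizeSecurityGroupIngress",
    ],
    "Create/run/list AWS instances": ["ec2:RunInstances", "ec2:DescribeInstances"],
    "Create/list/delete EBS snapshots": [
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:DeleteSnapshot",
    ],
    "Create S3 buckets & write data": ["s3:CreateBucket", "s3:PutObject"],
}


def build_policy(operation_actions):
    """Return an IAM policy document (as a dict) allowing every listed action."""
    actions = sorted({a for group in operation_actions.values() for a in group})
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": actions, "Resource": "*"}],
    }


policy = build_policy(OPERATION_ACTIONS)
print(json.dumps(policy, indent=2))
```

    The printed JSON can be pasted into the IAM console as a custom policy. Expect to adjust the list if the script complains about a missing permission.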

    Output from the Command

    AWS subnet id not specified, launching instance in EC2
    2014-10-23 08:48:14 INFO create_aws_instance:368 Started instance i-83b27e89 is in state pending
    .........................
    2014-10-23 08:48:57 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
    2014-10-23 08:49:42 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
    2014-10-23 08:50:27 INFO create_aws_instance:420 Waiting for AWS instance to be accessible...
    2014-10-23 08:51:00 INFO create_aws_instance:638 Change of nutanix password Succeeded.
    2014-10-23 08:51:00 INFO create_aws_instance:634 Creating single node cluster for cloud_data_gateway...
    2014-10-23 08:52:09 INFO create_aws_instance:638 Creation of cluster (ip : XX.XXXX.XXXX.XXX) on AWS Succeeded.
    2014-10-23 08:52:16 INFO create_aws_instance:634 Adding DNS serverv to cluster...
    2014-10-23 08:53:19 INFO create_aws_instance:512 Successfully configured passworless ssh access.
    2014-10-23 08:53:23 INFO create_aws_instance:638 Configuration of the cloud disk Succeeded.
    2014-10-23 08:53:31 INFO create_aws_instance:638 Creation of storage pool backup-sp Succeeded.
    2014-10-23 08:53:35 INFO create_aws_instance:638 Creation of container backup-ctr Succeeded.
    2014-10-23 08:53:39 INFO create_aws_instance:638 Configuration of remote site 'local' on AWS instance Succeeded.
    2014-10-23 08:53:41 INFO create_aws_instance:614 Configuration of remote site 'aws_54-244-175-211' on local cluster Succeeded.
    2014-10-23 08:53:41 INFO create_aws_instance:391 AWS instance was successfully configured.
    2014-10-23 08:53:41 INFO create_aws_instance:402 Instance i-83b27e89 private_ip XX.XXX.XXX.XXX. public_ip XXX.XXX.XXX.XXX VPC None Name Nutanix_EC2_
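
    To put a number on how long the whole setup took, the timestamps in the output can be subtracted. A small sketch using two lines copied from the log above (`parse_timestamp` is just a helper name I picked):

```python
from datetime import datetime

# First and last relevant lines, copied verbatim from the command output above.
FIRST_LINE = "2014-10-23 08:48:14 INFO create_aws_instance:368 Started instance i-83b27e89 is in state pending"
LAST_LINE = "2014-10-23 08:53:41 INFO create_aws_instance:391 AWS instance was successfully configured."


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


elapsed = parse_timestamp(LAST_LINE) - parse_timestamp(FIRST_LINE)
elapsed_seconds = int(elapsed.total_seconds())
print(f"{elapsed_seconds} seconds")  # 327 seconds, about five and a half minutes
```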

    The remote site does get set up in PRISM. The only option you might want to check is compression of the data on the wire. I was using an SSH tunnel, but for production you should run a VPC. Using a VPC you can achieve up to 25% better throughput.

    Remote Site is automatically set up for you. It picks a port from 3000-4000.

    All you have to do next is create your Protection Domain and pick which VMs you want to back up.

    easy

    Oct
    15

    Web-scale Wish for the Holiday Season

    This holiday season Nutanix is making web-scale wishes come true for non-profit organizations! I am excited for Web-Scale Wish, a Non-Profit Datacenter Makeover that will gift $500,000 worth of Nutanix technology to causes nominated by the community. I worked at a Non-Profit in Edmonton for 4 years so this is great to see. Non-profit funding should go directly towards research or helping people in need so technology is usually the first thing to get pulled from the budget.

    The Grand Prize
    A 48-hour Web-Scale Datacenter Makeover from Nutanix! Includes NX-3350, 3 Pro software licenses, 3 years of support, and 2-day installation service.

    Who Can Win?
    Web-Scale Wish is a multi-region campaign spanning countries in three theaters. To qualify, an organization must:

    * Be a non-partisan non-profit
    * Be located in the United States, Canada, the United Kingdom, Australia, or New Zealand
    * Have a working datacenter that could benefit from web-scale converged infrastructure
    * Be able to complete and meet environment assessment criteria

    Nominations can be submitted online starting October 15th.

    Anyone can nominate a non-profit organization or the organization can enter directly. Nominations will be collected from October 15 – November 21, 2014. Participating organizations must complete an environment assessment to qualify.

    Get all the information at http://www.webscalewish.com/

    Sep
    23

    Fast Access with PreLaunch and Session Linger in Citrix XenApp 7.6

    Session lingering seems like a great fit for shared environments like schools and hospitals.

    More info on Citrix Validated Solution for Nutanix – 1000 users in 6U of space.

    Sep
    22

    Prism Element and Prism Central Walkthrough

    A brief overview of the Nutanix UI and multi-cluster management. Take a look at one of the Nutanix internal engineering clusters used for quality assurance and load testing. Over 28 nodes of different sizes and product families acting as one giant cluster.

    Also posted on Nutanix.com

    Sep
    17

    Nutanix and EVO:RAIL\VSAN – Data Placement

    Nutanix and VSAN\EVO:RAIL are different in many ways. One such way is how data is spread out through the cluster.

    • VSAN is a distributed object file system
    • VSAN metadata lives with the VM, each VM has its own witness
    • Nutanix is a distributed file system
    • Nutanix metadata is global

    VSAN\EVO:RAIL will break up its objects (VMDKs) into components. Those components get placed evenly across the cluster. I am not sure of the algorithm, but it appears to be capacity based. Once the components are placed on a node they stay there until:

    • They are deleted
    • The 255 GB component (default size) fills up and another one is created
    • The Node goes offline and a rebuild happens
    • Maintenance mode is issued and someone selects the evacuate data option.

    So in a fresh brand new cluster things are pretty evenly distributed.

    VSAN distributes data with the use of components

    Nutanix uses data locality as the main principle in placement of all initial data. One copy is written locally, one copy remotely. As more writes occur, the secondary copy of the data keeps getting spread evenly across the cluster. Reads stay local to the node. Nutanix uses extents and extent groups as the mechanism to coalesce the data (4 MB).

    Whether a Nutanix cluster is new or has been running for a long time, things are kept level and balanced based on a percentage of overall capacity. This method accounts for clusters with mixed nodes\needs. More here.

    Nutanix places copies of data with the use of extent groups.
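
    The "one copy local, one copy remote and spread evenly" idea can be shown with a toy model. This is not the actual Nutanix placement algorithm (which balances on a percentage of capacity). It is a sketch that assumes the remote copy simply goes to whichever other node currently holds the least data, and `place_write` is a hypothetical name.

```python
def place_write(local_node, copies_per_node):
    """Place one write: a local copy plus a remote copy on the emptiest other node."""
    remote_candidates = [n for n in copies_per_node if n != local_node]
    remote_node = min(remote_candidates, key=lambda n: copies_per_node[n])
    copies_per_node[local_node] += 1
    copies_per_node[remote_node] += 1
    return local_node, remote_node


# A 4-node cluster where a single VM on node A writes 300 extent groups.
copies_per_node = {"A": 0, "B": 0, "C": 0, "D": 0}
for _ in range(300):
    place_write("A", copies_per_node)

print(copies_per_node)  # {'A': 300, 'B': 100, 'C': 100, 'D': 100}
```

    All primary copies stay with the VM on node A, so reads stay local, while the replicas land evenly on B, C and D.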

    So you go to expand your cluster…

    With VSAN, after you add a node (compute, SSD, HDD) to a cluster and vMotion workloads over to the new node, what happens? Essentially nothing. The additional capacity gets added to the cluster but there is no additional performance benefit. The VMs that are moved to the new node continue to hit the same resources across the cluster. The additional flash and HDD sit there idle.

    Impact of adding a new node with VSAN and moving virtual machines over.

    When you add a node to Nutanix and vMotion workloads over they start writing locally and get to benefit from the additional flash resources right away. Not only is this important from a performance perspective, it also keeps available data capacity level in the event of a failure.

    Impact of adding a new node with Nutanix and moving virtual machines over.

    Since data is spread evenly across the cluster, in the event of a hard drive failing all of the nodes in Nutanix can help with rebuilding the data. With VSAN, only the nodes containing the components can help with the rebuild.

    Note: Nutanix rebuilds cold data to cold data (HDD to HDD), while VSAN rebuilds data into the SSD cache. If you lose an SSD with VSAN, all backing HDDs need to be rebuilt. The data from HDD on VSAN will flood into the cluster SSD tier and will affect performance. I believe this is one of the reasons why 13 RAID controllers were pulled from the HCL. I do find it very interesting because one of the RAID controllers pulled is one that Nutanix uses today.

    Nutanix will always write the minimum two copies of data in the cluster regardless of the state of the cluster. If it can’t, the guest won’t get the acknowledgment. When VSAN has a host that is absent, it will write only 1 copy if the other half of the components are on the absent host. At some point VSAN will know it has written too much with only 1 copy and start the component rebuild before the 60 minute timer. I don’t know the exact algorithm here either, it’s just what I have observed after shutting a host down. I think this is one of the reasons that VSAN recommends writing 3 copies of data.
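
    The "no second copy, no acknowledgment" rule on the Nutanix side boils down to a simple check. A toy sketch only, with made-up names (`REQUIRED_COPIES`, `try_write`) and none of the real failure handling:

```python
REQUIRED_COPIES = 2  # the "minimum two copies" described above


def try_write(healthy_nodes, local_node):
    """Return the nodes holding the copies, or None when no ack can be sent."""
    remotes = [n for n in healthy_nodes if n != local_node]
    copies = [local_node] + remotes[: REQUIRED_COPIES - 1]
    if len(copies) < REQUIRED_COPIES:
        return None  # second copy cannot be placed, so the guest gets no ack
    return copies


print(try_write(["A", "B", "C"], "A"))  # two copies placed, write acknowledged
print(try_write(["A"], "A"))            # only one node left, no acknowledgment
```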

    [Update: VMware changed the KB article after this post. It was 3 copies of data and has been adjusted to 2 copies (FT > 0) Not sure what changed on their side. There is no explanation for the change in the KB.]

    Data locality has an important role to play in performance, network congestion and in availability.

    More on Nutanix – EVO

    Learn more about Nutanix

    Sep
    15

    Machine Data Merged with Web-Scale. #Splunk #Nutanix #MISABC2014

    Splunk Enterprise scales to collect and index tens of terabytes of data per day. And because the insights from your data are mission-critical, Splunk software’s index replication technology provides the availability you need, even as you scale out your low-cost, distributed computing environment. Automatic load balancing optimizes workloads and response times and provides built-in failover support. Out-of-the-box reporting and analytics capabilities deliver rapid insights from your data.

    Splunk DB Connect delivers reliable, scalable, real-time integration between Splunk and traditional relational databases. Splunk Hadoop Connect provides bi-directional integration to easily and reliably move data between Splunk Enterprise and Hadoop.

    Learn why you should virtualize Splunk and how Nutanix and Splunk combine web-scale approaches with the likes of map-reduce to deliver insights and value from your infrastructure.

    Check out the full Splunk RA on Nutanix

    Check out the Nutanix speaking sessions at MISA BC

    Sep
    12

    Nutanix High Availability and Continuity – Impact on Ops

    One of the added benefits of scale-out storage is the addition of multiple storage controllers. When you have more than 2 storage controllers, you can lose one to failure or to maintenance, like a rolling upgrade, with minimal impact.

    Below are the results of an 8-node cluster with 700 desktops running a Login VSI medium workload. One of the 8 storage controllers was shut down to see the impact on the cluster. No desktops were rebooted or shut down. IOPS dropped from 2,000 to 1,496 and latency had a brief spike from 4 ms to 22.37 ms.

    Impact of shutting off a Nutanix Storage Controller. No vMotion for upgrades or maintenance.
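
    For a sense of scale, here is the quick arithmetic behind those numbers, using only the figures quoted above:

```python
# Figures quoted in the test results above.
iops_before, iops_during = 2000, 1496
latency_before_ms, latency_peak_ms = 4.0, 22.37
controllers_total, controllers_down = 8, 1

iops_drop_pct = (iops_before - iops_during) / iops_before * 100
latency_factor = latency_peak_ms / latency_before_ms
controllers_lost_pct = controllers_down / controllers_total * 100

print(f"IOPS drop: {iops_drop_pct:.1f}%")                 # IOPS drop: 25.2%
print(f"Latency spike: {latency_factor:.1f}x")            # Latency spike: 5.6x
print(f"Controllers lost: {controllers_lost_pct:.1f}%")   # Controllers lost: 12.5%
```

    So taking 12.5% of the storage controllers offline cost roughly a quarter of the IOPS during the event, with a brief latency spike of about 5.6 times the baseline.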

    Things to think about for hyper-convergence?

    * Is data spread out evenly enough that additional controllers will help?
    * Do you have to vMotion VMs to perform an upgrade? If so, can you meet your maintenance window?
    * Will dependencies on the hypervisor management stack cause you to patch both your control and data plane?

    See more proven results on scaling in the VMware Horizon 6.0 with View RA

    Sep
    11

    Choice: vSphere Licensing on Nutanix

    Nutanix provides choice on what vSphere license you can apply to your environment. If you’re at a remote site, you can run vSphere Essentials; if you have a high-density deployment, you can run vSphere Enterprise Plus. In short, the choice is left up to the customer on what makes sense.

    It’s important to have flexibility around licensing because VMware can add\remove packages at any time. For example, VMware recently announced a great ROBO license edition, VMware vSphere Remote Office Branch Office Standard and Advanced Editions. Now you can purchase VMware licensing in packs of VMs versus paying per socket. Enterprises that have lots of remote sites but few resources running in them can take a look at the NX-1020, which has native built-in replication, plus the appropriate licensing pack.

    Nutanix Perfect for Remote Sites

    Licensing can change at any time, and Nutanix provides the flexibility to meet your needs. Check out this great video on why vSphere for Remote Sites.

    What happens if you have an existing Enterprise Licensing Agreement? No problem! Go ahead and apply your existing license. If your needs change, rest assured Nutanix will keep running.

    This flexibility in licensing comes from the fact that Nutanix runs in the user space and hasn’t created any dependencies on vCenter. Nutanix will continue to run without vCenter, and management of your cluster is not affected by vCenter going down. The Nutanix UI, known as PRISM, is highly available and designed to scale right along with your cluster, from 5 VMs to 50,000+ VMs.

    Pick what works for you.

    @dlink7

    First posted on Nutanix.com

    Sep
    08

    Do you have a vCenter plugin?

    vCenter plugins are a bad proposition and fit into the bucket of “too good to be true” when talking about storage. Having your storage dependent on vCenter ties you to something that is a single point of failure, has security implications, can limit your design/solution due to vCenter’s lack of scale, and restricts your control over future choices. In most cases storage companies have plugins because the initial setup with ESXi is complex and they are trying to mask some of the work needed to get it up and running.

    Single Point of Failure

    Even VMware technical marketing staff admit that vCenter is limited in options to keep it protected. Now that Heartbeat is end of life, there really isn’t a good solution in place.

    What happens when vCenter goes down? Do you rely on the plugin to provide the UI for administration? How easy is it to run commands from the CLI when the phones light up with support calls?

    Nutanix’s ability to scale and remain available is not dependent on a plugin.

    Security

    If I were to place money, I would bet no one can write a plugin for vCenter with security in mind better than VMware. VMware has decided not to create a plugin for EVO:RAIL and to stand up a separate web server instead. I might be reading between the lines, but punching more holes into vCenter is not a good thing. How hardened can a 3rd party plugin be? Chances are the plugin will end up talking to ESXi hosts through vpxuser, which essentially is a root account. It’s not the root account that is scary, it’s how people get access to it. Does the plugin use vCenter security rights? To me there are just more questions than answers.

    From the vendor side, as VMware goes from vSphere 5.5 -> 6.0 -> .Next, the plugin will have to be in lock step and will cause more work and time in making sure the plugin works versus pouring that time and effort into new development.

    Scale\Limits

    Scale and the number of nodes are affected by vCenter’s ability to manage the environment. Both cluster size and linked mode play a part in the overall management. If the plugin is needed to manage the environment, can you scale past the limits of vCenter? How many additional management points does this cause? In a multi-site deployment can you use two vCenters? Experience tells me vCenter at the data center managing remote sites hasn’t been fun in the past.

    If you’re a hyper-converged vendor do you have to license all of your nodes if you just need storage due to the plugin? If you just need a storage node you do have the option of just adding it to the cluster and not vCenter with Nutanix.

    One of the scariest things from an operations point of view is patching. Does patching vCenter cause issues for the plugin? Do you have to follow the HCL matrix when patching the plugin? Today when patching Horizon View you have to worry about storage version, VMware Tools, View Agent, View Composer, vCenter, database version and anti-virus version, and adding a plugin to the mix will not help.

    I think vCenter plugins are over-hyped for what they can cause in return. Maybe Nutanix will get one but this kid doesn’t see the need. If the future direction is for web-based client, having another tab open in Chrome is not a big deal.

    Sep
    08

    Hypervisor Agnostic Backup -> Hyper-V -> ESXi -> AWS

    Backup Everything

    Cloud Connect is going GA with the release of NOS 4.1 and I did notice the new Backup checkbox when setting up Remote Sites. I just assumed it was for setting up the remote site in AWS. While it is needed for AWS, it also allows you to take one Nutanix cluster running any hypervisor, like Hyper-V, and back it up to another Nutanix cluster running any hypervisor, like vSphere ESXi.

    Hypervisor Agnostic

    If I had read the documentation I would have found this:

    Backup: Check the box to enable backup (only) to this site. Backup allows the remote site to be used as a backup (replication) target. This means data can be backed up to this site and snapshots can be retrieved from the site to restore locally, but failover protection (that is, running failover VMs directly from the remote site) is not enabled.

    Note: Remote sites can be in the same data center or in another facility, remote sites are just different physical clusters.

    I can see this being used a lot for ROBOs where Hyper-V is being used to cut licensing costs. Customers can then use Prism Central to view all of their remote sites and data center clusters. All of the feel-good features still work, including compression and deduplication, and you can mix and match models to get the best price\performance\capacity that works for you.

    Nutanix is the only hyper-converged vendor that offers this flexibility and choice today.