Commvault Best Practices on Nutanix

    I first remember seeing Commvault in 2007 in the pages of Network World and thought it looked pretty interesting then. At the time I was a CA ARCserve junkie and prayed every day that I didn’t have to restore anything. Almost 10 years later, tape is still around, virtualization has spawned countless backup vendors, and the cloud now makes an easy backup target. Today Commvault is still relevant and plays in all of the aforementioned spaces, and like most tech companies we have our own overlap with them to some degree. For me, Commvault has so many options that it’s almost a problem of what to use where and when.

    The newly released Best Practice Guide with Commvault talks about some of the many options that should be used with Nutanix. If I were new to Nutanix and then read the guide, the big things that would probably stand out would be the use of a proxy on every host and some of the caveats around IntelliSnap.

    Proxy On Every Host

    What weighs more? A pound of feathers or a pound of bricks? The point here is that you need a proxy regardless, and the proxy is sized on how much data you will be backing up. So instead of having one giant proxy, you now have smaller proxies that are distributed across the cluster. Smaller proxies can read from the local hot SSD tier and limit network traffic, so they can help reduce bottlenecks in your infrastructure.
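    To make the "same pound, split differently" point concrete, here is a minimal sketch that groups VMs by host and adds up how much data each host-local proxy would read. The inventory, host names, and sizes are made up for illustration and are not taken from the guide.

```python
from collections import defaultdict


def data_per_proxy(vms):
    """Sum the backup data (in GB) that each host-local proxy would read."""
    totals = defaultdict(int)
    for vm in vms:
        totals[vm["host"]] += vm["data_gb"]
    return dict(totals)


# Hypothetical inventory: VM name, the host it runs on, and data to back up.
inventory = [
    {"name": "sql01", "host": "node-a", "data_gb": 800},
    {"name": "web01", "host": "node-a", "data_gb": 100},
    {"name": "exch01", "host": "node-b", "data_gb": 600},
    {"name": "file01", "host": "node-c", "data_gb": 900},
]

per_proxy = data_per_proxy(inventory)
total_gb = sum(per_proxy.values())

for host, gb in sorted(per_proxy.items()):
    print(f"{host}: {gb} GB read locally")
print(f"total: {total_gb} GB (what one giant proxy would pull over the network)")
```

    The total is the same either way; the difference is that each smaller proxy only has to handle its own host's share.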

    IntelliSnap is probably one of the most talked about Commvault features. IntelliSnap allows you to create a point-in-time application-consistent snapshot of backup data on the DSF. The backup administrator doesn’t need to log on to Prism to provide this functionality. A Nutanix-based snapshot is created on the storage array as soon as the VMware snapshot is completed; the system then immediately removes the VMware snapshot. This approach minimizes the size of the redo log and shortens the reconciliation process to reduce the impact on the virtual machine being backed up and minimize the storage requirement for the temporary file. It also allows near-instantaneous snapshot mounts for data access.

    With IntelliSnap it’s important to realize that it was invented at a time when LUNs ruled the storage world. IntelliSnap in some sense turns the giant volumes/containers that the hypervisor sees on Nutanix into a giant LUN. Behind the scenes, when IntelliSnap is used it snaps the whole container, regardless of whether all the VMs in it are being backed up or not. So you should do a little planning when using IntelliSnap. This is OK, since IntelliSnap should be used for highly transactional VMs and not every VM in the data center. I’d just like to point out that streaming backups with CBT are still a great choice.

    With that being said, you can check out the full guide on the Nutanix website: Commvault Best Practices


    Impact of Nutanix VSS Hardware Support

    When 4.6 was released I wrote about how the newly added VSS support with Nutanix Guest Tools (NGT) was the gem of the release. It was a fairly big compliment considering some of the important updates that were in the release, like cross-hypervisor DR and another giant leap in performance.

    I finally set some time aside to test the impact of taking an application-consistent snapshot with VMware Tools vs. the Nutanix VSS Hardware Support.

    When an application-consistent snapshot workflow runs without NGT on ESXi, we take an ESXi snapshot so VMware Tools can be used to quiesce the file system. Every time we take an ESXi snapshot, it results in the creation of delta disks. During this process ESXi “stuns” the VM to remap virtual disks to these delta files. The length of the stun depends on the number of virtual disks that are attached to the VM and the speed at which the delta disks can be created (the capability of the underlying storage to process NFS metadata update operations plus releasing/creating/acquiring lock files for all the virtual disks). During this time, the VM is totally unresponsive. No application will run inside the VM, and pings to the VM will fail.

    We then delete the snapshot (after backing up the files via hardware snap on the Nutanix side) which results in another set of stuns (deleting a snapshot causes two stuns, one fixed time stun + another stun based on the number of virtual disks). This essentially means that we are causing two or three stuns in rapid succession. These stuns cause meta-data updates in addition to the flushing of data during the VSS snapshot operations.

    Customers have reported that in a set of VMs running Microsoft clustering, these VMs can be voted out due to heartbeat failure. VMware gives customers guidance on increasing timers if you’re using Microsoft clustering to get around this situation.

    To test this out I used HammerDB with SQL Server 2014 running on Windows Server 2012 R2. The tests were run on ESXi 6.0 with hardware version 11.


    VMware Tools with VSS based Snapshot
    I was going to try to stitch the images together because of the time it took but decided to leave as is.


    The total process took ~4 minutes.

    NGT with VSS Hardware Support based Snapshot
    NGT based VSS snapshots don’t cause VM stuns. The application will be stunned temporarily within Windows to flush the data, but pings and other things should work.


    The total process took ~1 minute.


    NGT with VSS hardware support is the belle of the ball! There is no fixed number for the maximum stun time, since it depends on how heavy the workload is, but what we can see is the effect of not using NGT for application-consistent snapshots, and it’s pretty big. The collapsing of ESXi snapshots causes additional load and should be avoided if possible. NGT offers a hypervisor-agnostic approach and currently works with AHV as well.

    Note: Hypervisor snapshot consolidation is better in ESXi 6 than ESXi 5.5.

    Thanks to Karthik Chandrasekaran and Manan Shah for all their hard work and contribution to this blog post.


    SAP Best Practices and Sizing on Nutanix

    At the heart of SAP Business Suite is the SAP ERP application, which is supplemented by SAP CRM, SAP SRM, SAP PLM, and SAP SCM. From financial accounting through manufacturing, logistics, sales, marketing, and human resources, SAP Business Suite manages all the key mission-critical business processes that occur each day in companies around the world. SAP NetWeaver is the technical foundation for many SAP applications; it is a solution stack of SAP’s technology products.

    Deploying and operating SAP Business Suite applications in your environment is not a trivial task. Nutanix enterprise cloud platforms provide the reliability, predictability, and performance that the SAP Business Suite demands, all with an efficient and elegant management interface.

    The Nutanix platform offers SAP customers a range of benefits, including:

    • Lower risk and cost on the first hyperconverged platform SAP-certified for NetWeaver applications.
    • A turnkey validated framework that dramatically reduces the time to deploy your SAP
    • Mission-critical availability with a self-healing foundation and VM-centric data protection, including support for the top enterprise backup solutions.
    • Flexibility to choose among industry-leading SAP-supported hypervisors.
    • Simplified operations, including application- and VM-level metrics alongside single-click provisioning and upgrades.
    • Reduced TCO from infrastructure right-sized for your SAP workload.
    • A best-in-class worldwide support system whose knowledge and commitment to customer service has earned the Omega NorthFace Scoreboard Award for three consecutive years.

    Read the Solution Note for best practices with both Hyper-V and VMware and sizing guidelines => SAP Solution Note


    Save Your Time With Nutanix Automatic Support

    Best Industry Support

    The feature known as Pulse is enabled by default and sends cluster status information automatically to Nutanix customer support. After you have completed initial setup, created a cluster, and opened ports 80 or 8443 in your firewall, AOS sends a Pulse message from each cluster once every 24 hours. Each message includes cluster configuration and health status that can be used by Nutanix Support to address any cluster operation issues.

    AOS can also send automatic alert email notifications to Nutanix Support by default through ports 80 or 8443. Like Pulse, any configured firewall must have these ports open. The following are some examples of conditions that automatically generate a proactive case with Nutanix Support at priority level P4:

    The Stargate process is down for more than 3 hours
    Curator scan fails
    Hardware Clock Failure
    Faulty RAM module
    Power Supply failure
    Unable to fetch IPMI SDR repository (IPMI Error)
    HyperV networking
    System operations
    Disk Capacity > 90%
    Bad Drive

    You can optionally use your own SMTP server to send Pulse and alert notifications. If you do not or cannot configure an SMTP server, another option is to implement an HTTP proxy as part of your overall support scheme.
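    Since both Pulse and the alert emails depend on ports 80 or 8443 being open through the firewall, a quick outbound reachability check can save some head scratching. The sketch below only uses the Python standard library; the destination host name in the usage comment is a placeholder, not a real Nutanix endpoint, so substitute whatever address your support setup actually talks to.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_pulse_ports(host, ports=(80, 8443)):
    """Map each port to whether it is reachable from this machine."""
    return {port: port_open(host, port) for port in ports}


# Usage (hypothetical destination, replace with your own):
# print(check_pulse_ports("support-endpoint.example.com"))
```

    A successful TCP connect only proves that the firewall lets the traffic out; it says nothing about whether the messages themselves are accepted on the other end.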

    While the best thing is never to get a call, second best is not waiting in line to open a ticket. Have a great week!


    3rd Generation Erasure Coding (EC-X) – What’s Next?

    Take time for all things: great haste makes great waste. Benjamin Franklin

    I don’t profess to be an erasure coding genius, but I know enough to say that it would be a very poor choice for workloads that have lots of overwrites or that cycle through lots of snapshots. Running erasure coding inline would really only be suited to a WORM application, which is not typical for a lot of virtual environments. Nutanix first released erasure coding as EC-X in AOS 4.1.3 as a tech preview and has learned lots along the way with its agile software development method.

    With AOS 4.6.1 being released on April 18th more improvements were added for EC-X.

      Faster reclamation – Simply put, if your EC strip is changed, holes start appearing in the strip. You need an efficient way of plugging the holes so they can be encoded again.
      Advanced EC-X selection heuristics – Nutanix engineering has come up with an algorithm to determine whether to use blocks from the same virtual hard drive or blocks from throughout the container. Better selection reduces the need to fix strips and reduces CPU load on the cluster. This also helps to fix the problem of cycling through lots of snapshots.
      Strip compaction – If an EC-X strip has too many holes, the system won’t even try to fill the gaps. It will instead move the data out of the strip.
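      The trade-off between plugging holes and compacting a strip can be pictured with a toy decision function. This is only an illustration of the idea described above, not Nutanix's actual algorithm: the boolean block model and the 50% threshold are assumptions made up for the example.

```python
def plan_strip(blocks, compaction_threshold=0.5):
    """Decide what to do with an erasure-coded strip.

    blocks: list of booleans, True for a live block, False for a hole.
    compaction_threshold: hypothetical fraction of holes above which
    filling gaps is not worth it and the live data is moved out instead.
    """
    holes = blocks.count(False)
    if holes == 0:
        return "keep"      # strip is intact, nothing to do
    if holes / len(blocks) > compaction_threshold:
        return "compact"   # too many holes: move live data out of the strip
    return "plug"          # a few holes: fill them and re-encode


print(plan_strip([True, True, True, True]))
print(plan_strip([True, True, True, False]))
print(plan_strip([True, False, False, False]))
```

      The interesting part in a real system is choosing the threshold and the blocks that go into a strip in the first place, which is what the selection heuristics are about.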

      With the mission to enable enterprise cloud, more and more of the features are becoming self-adjusting to truly allow for set and forget. The end goal is to have all the features turned on and let the system decide. I am looking forward to watching the announcements at .Next in June.



    Operations Getting Down With DJ RunC & ContainerD


    runC and containerd do sound like some rappers from the ’80s. While in the land of hip hop Run–D.M.C. was legendary in creating new school rap, Docker has thrown its weight behind runC and containerd to pave the way for future success. runC is an implementation of the Open Container Initiative (OCI) spec, a project to which Docker has donated a huge chunk of its own work. runC is a standalone binary that allows you to run a single OCI container. This is big because now everyone has a standard way to run a container, which creates better portability and good code hygiene.

    containerd is a new piece of infrastructure plumbing that allows you to run multiple containers using runC. It’s kind of like a simple init system. containerd takes care of the simple CRUD operations against containers, but image management still lives with the Docker Engine. containerd is also event driven, so you can build on top of it.


    With the release of Docker 1.11, runC and containerd are fully integrated. I think this is important because if you’re going to pick a horse in the container race, you have a company in Docker that is leading with committers for OCI, which is essentially helping to set the direction for containers. On the operations side of the house, if I have to upgrade the Docker Engine, there is now a road map to upgrading without affecting running containers. It’s great that containers can run and die, but it’s even better if they never fail :-)

    Docker 1.11 also added DNS round-robin load balancing. While it may seem crude to the likes of an F5 or NetScaler engineer, I always find that simple wins, and I see it used in lots of places. If you give multiple containers the same alias, Docker’s service discovery will return the addresses of all of the containers for round-robin DNS.
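    For anyone who hasn't bumped into round-robin DNS before, here is a small in-memory sketch of the general idea: several addresses sit behind one alias, every answer contains all of them, and the order rotates so that clients taking the first address get spread across the containers. This is a generic illustration, not Docker's implementation; the class name, alias, and IP addresses are invented for the example.

```python
from collections import deque


class RoundRobinResolver:
    """Toy resolver that rotates the address list on every lookup."""

    def __init__(self):
        self._records = {}

    def register(self, alias, address):
        """Add another address (e.g. a container IP) behind an alias."""
        self._records.setdefault(alias, deque()).append(address)

    def resolve(self, alias):
        """Return all addresses for the alias, then rotate for the next caller."""
        addresses = self._records[alias]
        answer = list(addresses)
        addresses.rotate(-1)  # move the first address to the back
        return answer


resolver = RoundRobinResolver()
for ip in ("10.0.0.2", "10.0.0.3", "10.0.0.4"):  # hypothetical container IPs
    resolver.register("web", ip)

# A naive client that always uses the first address in the answer.
picks = [resolver.resolve("web")[0] for _ in range(4)]
print(picks)
```

    There is no health checking or weighting here, which is exactly why it looks crude next to a hardware load balancer and also why it is so easy to reason about.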

    I think the 1.11 release of Docker will continue to build great things. Let’s just hope it doesn’t lead to overplayed Run–D.M.C. spoof shirts.


    Quickly Pin Your Virtual Hard Drive To Flash #vExpert #NTC

    If you need to ensure performance with Flash Mode here is a quick way to get your job done.

    Find the disk UUID
    ncli virtual-disk ls | grep <vdisk-name> -B 3 -A 6


    ncli virtual-disk ls | grep m1_8 -B 3 -A 6

    Virtual Disk Id : 00052faf-34c2-58fc-64dd-0cc47a673b8c::313a49:6000C29b-93c9-bfe1-58d9-e718993e5a06
    Virtual Disk Uuid : 1dc11a7f-63ac-422a-ac27-442d5fcfc91a
    Virtual Disk Path : /hdfs/cdh-m1/cdh-m1_8.vmdk
    Attached VM Name : cdh-m1
    Cluster Uuid : 00052faf-34c2-58fc-64dd-0cc47a673b8c
    Virtual Disk Capacity : 268435456000
    Pinning Enabled : False

    Pin 25 GB of the vdisk to flash
    ncli virtual-disk update-pinning id=00052faf-34c2-58fc-64dd-0cc47a673b8c::313a49:6000C29b-93c9-bfe1-58d9-e718993e5a06 pinned-space=25 tier-name=SSD-SATA

    Pinned Space is in GB.

    In this case I was pinning the Hadoop NameNode directories to flash because I wanted to include their physical node in the cluster to help with replication traffic.
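    If you have more than a couple of disks to pin, the copy-and-paste step between the two commands gets old quickly. Here is a rough sketch that parses one block of the `ncli virtual-disk ls` output shown above and builds the matching `update-pinning` command string. It only formats text and does not call ncli. The helper names are mine, and treating Virtual Disk Capacity as bytes is an assumption based on the sample value.

```python
SAMPLE_OUTPUT = """\
Virtual Disk Id : 00052faf-34c2-58fc-64dd-0cc47a673b8c::313a49:6000C29b-93c9-bfe1-58d9-e718993e5a06
Virtual Disk Uuid : 1dc11a7f-63ac-422a-ac27-442d5fcfc91a
Virtual Disk Path : /hdfs/cdh-m1/cdh-m1_8.vmdk
Attached VM Name : cdh-m1
Cluster Uuid : 00052faf-34c2-58fc-64dd-0cc47a673b8c
Virtual Disk Capacity : 268435456000
"""


def parse_ncli_block(text):
    """Turn 'Key : Value' lines into a dict, splitting on the first ' : '."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def pin_command(fields, pinned_gb, tier="SSD-SATA"):
    """Build the update-pinning command for a parsed virtual disk."""
    return (
        "ncli virtual-disk update-pinning "
        f"id={fields['Virtual Disk Id']} "
        f"pinned-space={pinned_gb} tier-name={tier}"
    )


disk = parse_ncli_block(SAMPLE_OUTPUT)
# Assumption: capacity is reported in bytes.
capacity_gib = int(disk["Virtual Disk Capacity"]) / 1024**3
command = pin_command(disk, pinned_gb=25)

print(f"{disk['Virtual Disk Path']}: {capacity_gib:.0f} GiB total")
print(command)
```

    Under that bytes assumption, the sample disk works out to 250 GiB, so pinning 25 GB keeps roughly a tenth of it on flash.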


    Horizon 7: Notes & important cliff notes from the docs

    I was travelling last week, and while I was sitting on the plane reviewing some Horizon 7 docs I thought I would capture the bits that tend to make or break your installation. The bits below are good reminders of what to do and what not to do.

    NOTE When installing replicated View Connection Server instances, you must usually configure the instances in the same physical location and connect them over a high-performance LAN. Otherwise, latency issues could cause the View LDAP configurations on View Connection Server instances to become inconsistent. A user could be denied access when connecting to a View Connection Server instance with an out-of-date configuration.

    IMPORTANT The physical or virtual machine that hosts View Connection Server must have an IP address
    that does not change. In an IPv4 environment, configure a static IP address. In an IPv6 environment, machines automatically get IP addresses that do not change.

    IMPORTANT To use a group of replicated View Connection Server instances across a WAN, MAN (metropolitan area network), or other non-LAN, in scenarios where a View deployment needs to span datacenters, you must use the Cloud Pod Architecture feature. You can link together 25 View pods to provide a single large desktop brokering and management environment for five geographically distant sites and provide desktops and applications for up to 50,000 users.

    Cloud Pod Architecture

    NOTE Windows Server 2008 R2 with no service pack is no longer supported.

    To use View Administrator with your Web browser, you must install Adobe Flash Player 10.1 or later.

    IMPORTANT If you create the View Composer database on the same SQL Server instance as vCenter Server,
    do not overwrite the vCenter Server database.

    IMPORTANT To run View in an IPv6 environment, you must specify IPv6 when you install all View components. (You can’t change it after the fact.)

    NOTE View does not require you to enter an IPv6 address in any administrative tasks. In cases where you can specify either a fully qualified domain name (FQDN) or an IPv6 address, it is highly recommended that you specify an FQDN to avoid potential errors.

    NOTE To ensure that View runs in FIPS (Federal Information Processing Standard) mode, you must enable FIPS when you install all View components.

    NOTE You might need to set the UPN for built-in Active Directory accounts, even if the certificate is issued
    from the same domain. Built-in accounts, including Administrator, do not have a UPN set by default.

    Enrollment Server Installation

    NOTE Because this feature requires that a certificate authority also be set up, and specific configuration performed, the installation procedure for the enrollment server is provided in the View Administration document.

    NOTE View Connection Server does not make, nor does it require, any schema or configuration updates to Active Directory.

    IMPORTANT You will need the data recovery password to keep View operating and avoid downtime in
    a Business Continuity and Disaster Recovery (BCDR) scenario. You can provide a password reminder
    with the password when you install View Connection Server.

    IMPORTANT When you perform a silent installation, the full command line, including the data recovery
    password, is logged in the installer’s vminst.log file. After the installation is complete, either delete this
    log file or change the data recovery password by using View Administrator.

    NOTE Replication functionality is provided by View LDAP, which uses the same replication technology as
    Active Directory.

    NOTE You cannot pair an older version of security server with the current version of View Connection Server. If you configure a pairing password on the current version of View Connection Server and try to install an older version of security server, the pairing password will be invalid.

    IMPORTANT If you do not provide the security server pairing password to the View Connection Server installation program within the password timeout period, the password becomes invalid and you must configure a new password.

    IMPORTANT If you use a load balancer, it must have an IP address that does not change. In an IPv4 environment, configure a static IP address. In an IPv6 environment, machines automatically get IP addresses that do not change.

    NOTE If the installation is cancelled or aborted, you might have to remove IPsec rules for the security server
    before you can begin the installation again. Take this step even if you already removed IPsec rules prior to
    reinstalling or upgrading security server.

    CAUTION If you remove the IPsec rules for an active security server, all communication with the security
    server is lost until you upgrade or reinstall the security server. Therefore, if you use a load balancer to manage a group of security servers, perform this procedure on one server and then upgrade that server before removing IPsec rules for the next server. You can remove servers from production and add them back one-by-one in this manner to avoid requiring any downtime for your end users.

    IMPORTANT Replace the default certificate as soon as possible. The default certificate is not signed by a
    Certificate Authority (CA). Use of certificates that are not signed by a CA can allow untrusted parties to intercept traffic by masquerading as your server.

    IMPORTANT To configure View Connection Server or security server to use a certificate, you must change the
    certificate Friendly name to vdm. Also, the certificate must have an accompanying private key.

    IMPORTANT If you plan to use this feature and you are using multiple View pods that share some ESXi hosts,
    you must enable the View Storage Accelerator feature for all pools that are on the shared ESXi hosts. Having
    inconsistent settings in multiple pods can cause instability of the virtual machines on the shared ESXi hosts.

    View Storage Accelerator is now qualified to work in configurations that use View replica tiering, in which
    replicas are stored on a separate datastore than linked clones. Although the performance benefits of using
    View Storage Accelerator with View replica tiering are not materially significant, certain capacity-related
    benefits might be realized by storing the replicas on a separate datastore. Hence, this combination is tested
    and supported.

    NOTE You can also use Access Point appliances, rather than security servers, for secure external access to Horizon 7 servers and desktops. If you use Access Point appliances, you must disable the secure gateways on View Connection Server instances and enable these gateways on the Access Point appliances.

    IMPORTANT Do not change the JVM heap size on 64-bit Windows Server computers. Changing this value
    might make View Connection Server behavior unstable. On 64-bit computers, the View Connection Server
    service sets the JVM heap size to accord with the physical memory.

    IMPORTANT Syslog data is sent across the network without software-based encryption, and might contain
    sensitive data, such as user names. VMware recommends using link-layer security, such as IPSEC, to avoid
    the possibility of this data being monitored on the network.

    IMPORTANT View Composer is an optional component. If you plan to provision instant clones, you do not need to install View Composer.

    NOTE Virtual Volumes is compatible with the View storage accelerator feature but not with the space efficient
    disk format feature, which reclaims disk space by wiping and shrinking disks.

    NOTE Instant clones do not support Virtual Volumes.


    Docker Machine for Windows By Pictures

    A native Mac and Windows app, Docker can now be installed, launched, and utilized from a system toolbar like any other packaged app. As with Docker for Linux, Docker for Windows brings a deeper integration with each of these platforms, leveraging the native virtualization features of the respective platforms.

    You need to turn on Hyper-V for Windows. The Docker bits will run in the MobyLinux VM.

    Every install needs the whale. It makes the EULA fun.

    Starting up the VM.

    I think this might move out after the beta

    Making the magic happen from the system tray.

    After the installation you’ll see DockerNAT. It’s used to talk between your desktop/laptop and the MobyLinux VM.

    Now you can run all your docker commands right from your desktop. A named pipe is used on Windows to talk to the MobyLinux VM.


    Ease of use and performance
    Leverage native hypervisor support on both platforms
    Fewer steps by leveraging native capabilities (virtualization, networking, filesystems) increases performance and reliability

    Resolves Dependency Issues
    No need to install app framework or runtime
    Integrated products which include Docker Compose and offer a streamlined installation process that no longer requires non-system third-party software like VirtualBox
    Use any version control manager

    In-container development accelerates development
    Devs can simply use a single text editor or IDE to code their application.
    Faster Docker-driven iteration cycles because code changes can be tested instantaneously on the laptop without the need to build the Docker application image first.

    Advanced Networking capabilities
    Docker for Mac and Windows includes a DNS server for containers, and is integrated with the Mac OS X and Windows networking system
    Use Docker more easily over a VPN.


    Microsoft Captaining the Open Source Ship with Docker

    A short blog post explaining why Microsoft and Docker are making a big push in the data center is up on the Nutanix Community site. Looking for feedback and your thoughts -> Microsoft Captaining the Open Source Ship with Docker