Aug
16

Move Your DBs From Cloud or 3-Tier Clunker to Nutanix with Xtract

Xtract for DBs enables you to migrate your Microsoft SQL Server instances from non-Nutanix infrastructures (source) to Nutanix Cloud Platform (target) with a 1-click operation. You can migrate both virtual and physical SQL Servers to Nutanix. Xtract captures the state of your source SQL Server environments, applies any recommended changes, recreates the state on Nutanix, and then migrates the underlying data.

Xtract is a virtual appliance that runs as a web application. It migrates your source SQL Server instances to Nutanix in the following four phases:

Scanning. Scans and discovers your existing SQL Server environments through application-level inspection.
Design. Creates an automated best practice design for the target SQL Servers.
Deployment. Automates the full-stack deployment of the target SQL Servers with best practices.
Migration. Migrates the underlying SQL Server databases and security settings from your source SQL Servers to the target SQL Servers.
Note: Xtract supports SQL Server 2008 R2 through SQL Server 2016 running on Windows 2008 through Windows 2012 R2.

Xtract first scans your source SQL Server instances, so that it can generate a best-practice design template for your target SQL Server environment. To scan the source SQL Server instances, Xtract requires the access credentials of the source SQL Server instances to connect to the listening ports.

You can group one or more SQL Server instances for migration. Xtract performs migrations at the instance level, which means that all databases registered to a SQL Server instance are migrated and managed as part of a single migration plan. Xtract allows you to create multiple migration plans to assist with a phased migration of different SQL Server instances over time.


Once the full restore is complete and transaction logs are being replayed, you can perform the following actions on your SQL Server instances:

In the Migration Plans screen, you can perform one of the following:

Start Cutover
The cutover operation quiesces the source SQL Server databases by placing them in single-user mode, takes a final backup, restores the backup to the target server, and then brings all the databases on the target server online and ready for use. This action completes all migration activities for a migration plan.

Test

The test operation takes a point-in-time copy of the databases in the source instance and brings them online for testing in the target SQL Server instance. This action does not provide a rollback. Once a Test action has been initiated, you can perform tests on the copy. However, if you want to perform a cutover after the Test operation, you should begin again from the scanning phase.

Come to the Nutanix Booth at VMworld in Vegas to see it in action. One-click yourself out of your AWS bill.

Jun
30

The Down Low on Near-Sync On Nutanix

Nutanix refers to its current implementation of redirect-on-write snapshots as vDisk-based snapshots. Nutanix has continued to improve its snapshot implementation by adding Light-Weight Snapshots (LWS) to provide near-sync replication. LWS uses markers instead of creating full snapshots for RPOs of 15 minutes and under. LWS further reduces the overhead of managing metadata and removes the overhead of the long snapshot chains that frequent snapshots would otherwise cause. The administrator doesn’t have to set a policy choosing between vDisk snapshots and LWS: the Acropolis Operating System (AOS) transitions between the two forms of replication based on the RPO and available bandwidth. If the network can’t handle the low RPO, replication transitions out of near-sync. When the network can meet the near-sync requirements again, AOS starts using LWS again. In over-subscribed networks, near-sync can provide almost the same level of protection as synchronous replication without impacting the running workload.

The administrator only needs to set the RPO; no knowledge of near-sync is required.

The tradeoff is that all changes are handled in SSD when near-sync is enabled. Because of this tradeoff, Nutanix reserves a percentage of SSD space for LWS when it’s enabled.

[Diagram: transitions between vDisk-based snapshots and LWS near-sync replication]

In the above diagram, a vDisk-based snapshot is taken first and replicated to the remote site. Once the full replication is complete, LWS begins on the set schedule. If there is no remote site set up, LWS happens locally right away. If you have the bandwidth available, life is good, but that’s not always the case in the real world. If you repeatedly miss your RPO target, replication automatically transitions back to vDisk-based snapshots. Once vDisk-based snapshots complete fast enough, it automatically transitions back to near-sync. Both the transition out of and into near-sync is controlled by advanced settings called gflags.
On the destination side, AOS creates hydration points. A hydration point is the way an LWS transitions into a vDisk-based snapshot. The process for inline hydration is to:

1. Create a staging area for each VM (CG) that’s protected by the protection domain.
2. The staging area is essentially a directory with a set of vdisks for the VM.
3. Any new incoming LWSs are applied to the same set of vdisks.
4. The staging area can be snapshotted from time to time, yielding individual vdisk-backed snapshots.

The source side doesn’t need to hydrate, as a vDisk-based snapshot is taken every hour.
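The staging flow above can be sketched with ordinary files. This is a purely conceptual illustration; the directory layout, file names, and "delta" format are hypothetical, not actual AOS internals.

```shell
# Conceptual model of inline hydration (illustrative only).
base="$(mktemp -d)"
staging="$base/vm-cg1"             # 1. staging area per protected VM (CG)
mkdir -p "$staging"
: > "$staging/vdisk-0"             # 2. the staging area is a set of vdisks
: > "$staging/vdisk-1"
for lws in lws-001 lws-002; do     # 3. incoming LWSs apply to the same vdisks
  echo "applied $lws" >> "$staging/vdisk-0"
done
snap="$base/snapshot-1"            # 4. a periodic snapshot of the staging area
cp -r "$staging" "$snap"           #    yields a vdisk-backed snapshot
ls "$snap"
```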

Have questions? Please leave a comment.

Jun
29

ROBO Deployments & Operations Best Practices on Nutanix

The Nutanix platform’s self-healing design reduces operational and support costs, such as unnecessary site visits and overtime. With Nutanix, you can proactively schedule projects and site visits on a regular cadence, rather than working around emergencies. Prism, our end-to-end infrastructure management tool, streamlines remote cluster operations via one-click upgrades, while also providing simple orchestration for multiple cluster upgrades. Following the best practices in this new document ensures that your business services are quickly restored in the event of a disaster. The Nutanix Enterprise Cloud Platform makes deploying and operating remote and branch offices as easy as deploying to the public cloud, but with control and security on your own terms.

One section I would like to call out in the doc is how to seed your data when you're dealing with poor WAN links.

Seed Procedure

The following procedure lets you use seed cluster (SC) storage capacity to bypass the network replication step. In the course of this procedure, the administrator stores a snapshot of the VMs on the SC while it’s installed in the ROBO site, then physically ships it to the main datacenter.

1. Install and configure application VMs on a ROBO cluster.
2. Create a protection domain (PD) called PD1 on the ROBO cluster for the VMs and volume groups.
3. Create an out-of-band snapshot S1 for the PD on ROBO with no expiration.
4. Create an empty PD called PD1 (same name used in step 2) on the SC.
5. Deactivate PD1 on the SC.
6. Create remote sites on the ROBO cluster and the SC.
7. Retrieve snapshot S1 from the ROBO cluster to the SC (via Prism on the SC).
8. Ship the SC to the datacenter.
9. Re-IP the SC.
10. Create remote sites on the SC cluster and on the datacenter main cluster (DC1).
11. Create PD1 (same name used in steps 2 and 4) on DC1.
12. Deactivate PD1 on DC1.
13. Retrieve S1 from the SC to DC1 (via Prism on DC1). Prism generates an alert here; although it appears to be a full data replication, the SC transfers metadata only.
14. Create remote sites on DC1 and the ROBO cluster.
15. Set up a replication schedule for PD1 on the ROBO cluster in Prism.
16. Once the first scheduled replication is successful, you can delete snapshot S1 to reclaim space.

To get all of the best practices, download the full document here: https://portal.nutanix.com/#/page/solutions/details?targetId=BP-2083-ROBO-Deployment:BP-2083-ROBO-Deployment

Jun
28

Rubrik and AHV: Say No to Proxies

For the last couple of years I have been a huge fan of backup software that removes the need for proxies. Rubrik provides a proxy-less backup solution by using the Nutanix Data Services Virtual IP address to talk directly to each individual virtual disk that it needs to back up.
Rubrik and Nutanix have some key advantages with this solution:
• AOS 5.1+ with the version 3 APIs provides changed-region tracking for quick, efficient backups with no hypervisor-based snapshot.
• With AHV and data locality, Rubrik can grab the most recently changed data without flooding the network, which can happen when the copy and the VM don’t live on the same host. With Nutanix, the reads happen locally.
• Rubrik has access to every virtual disk by making an iSCSI connection, bypassing the need for proxies.
• If the backup load becomes too great during a backup window, AOS can redirect the second RF copy away from a node with its advanced data placement, protecting your mission-critical apps that run 24-7.
• Did I mention no proxies? 🙂

Stop by the Rubrik booth and catch their session if you're at .Next this week.

May
01

Acropolis Container Services on #Nutanix

This is the first release of a turnkey solution for deploying Docker containers in a Nutanix cluster. Instead of swiping your credit card for AWS EC2, you can deploy your containers through the built-in Self-Service Portal. It's not all totally new, because Nutanix previously released a volume plug-in for Docker. What is new:

* Acropolis Container Services (ACS) provisions multiple VMs as container machines to run Docker containers on them.
* Containers are deployed as part of projects. In projects, users can deploy VMs or containers, and you can assign quotas for storage, CPU, and memory.
* ACS uses the public Docker registry by default, but if you have a separate Docker registry you want to use, you can configure access to that registry as well.
* One-click upgrades for the container machines.
* Basic monitoring: a containers view in the Self-Service Portal allows you to view summary information about containers connected to the portal and access detailed information about each container.

    Feb
    16

    Nutanix AFS DR Failing over from vSphere to AHV (video 3:30)

    A quick video showing the failover for Acropolis File Services. The deployment sets up a lot of the needed pieces, but you will still have to set a schedule and map the new container (vStore) being used by AFS to the remote site.

    Remember, you want the number of FSVMs making up the file server to be the same as or less than the number of nodes at the remote site.

    Feb
    13

    Docker Datacenter: Usability For Better Security.

    With the new release of Docker Datacenter 2.1, it's clear that Docker is very serious about the enterprise and about providing tooling that is easy to use. Docker has made the leap to supporting enterprise applications with its embedded security and ease of use. DDC 2.1 and Docker Engine CS 1.13 give operations and development teams the additional control needed to shape their own experience.

    Docker Datacenter continues to build on containers as a service. The 1.12 release of DDC enabled agility and portability for continuous integration and started on the journey of protecting the development supply chain throughout the whole lifecycle. The new release of DDC focuses on security, specifically secrets management.
    The previous version of DDC already had a wealth of security features:
    • LDAP/AD integration
    • Role-based access control for teams
    • SSO and push/pull images with Docker Trusted Registry
    • Image signing – prevent running a container unless the image is signed by a member of a designated team
    • Out-of-the-box TLS with easy setup, including certificate rotation

    With DDC 2.1, the march on security succeeds by giving both operations and developers a usable system without having to lean on security teams for support and help. The native integration with the management plane allows for end-to-end container lifecycle management. You also inherit a model that is infrastructure-independent: no matter what you're running on, it will work, and it can be as dynamic and ephemeral as the containers it's managing. This is why I feel PaaS is dead. With so much choice and security, you don't have to limit where you deploy, a design decision very similar to Nutanix's in enabling choice. Choice gives you access to more developers and the freedom to color outside the lines of the guardrails that a PaaS solution may impose.

    Docker Datacenter Secrets Architecture

    [Diagram: Docker Datacenter secrets architecture]
    1) Everything in the store is encrypted, notably including all of the data stored in the orchestration layer. Following least privilege, secrets are distributed only to the nodes running containers that need them. Since the management layer is scalable, you get that scalability for your key management as well. And because the management layer is so easy to set up, you don't end up with developers embedding secrets in GitHub as a quick workaround.
    2) Containers and the filesystem make each secret available only to the designated app. Docker exposes secrets to the application via a filesystem that is stored in memory. The same certificate rotation used for the management layer also happens with the certificates for the application. In the diagram above, the red service only talks to the red service, and the blue service is isolated by itself even though it's running on the same node as the red service/application.
    3) If you decide you want to integrate with a third-party application like Twitter, that can be easily done. Your Twitter credentials can be stored in the Raft cluster formed by your manager nodes. When you create the Twitter app, you give it access to the credentials, and you can even do a "service update" to swap them out without touching every node in your environment.

    With a simple interface for both developers and IT operations, both have a pain-free way to do their jobs while providing a secure environment. By not creating roadblocks and slowing down development or operations, teams will get automatic buy-in.

    Feb
    07

    Nutanix AFS – Maximums

    Nutanix AFS Maximums – tested limits (ver 2.0.2)

    Configurable Item                           Maximum Value
    ------------------------------------------  -----------------------------------------
    Number of connections per FSVM              250 (12 GB RAM), 500 (16 GB),
                                                1000 (24 GB), 1500 (32 GB),
                                                2000 (40 GB), 2500 (60 GB),
                                                4000 (96 GB)
    Number of FSVMs                             16, or the number of CVMs,
                                                whichever is lower
    Max RAM per FSVM                            96 GB (tested)
    Max vCPUs per FSVM                          12
    Data size for home share                    200 TB per FSVM
    Data size for general-purpose share         40 TB
    Share name                                  80 characters
    File server name                            15 characters
    Share description                           80 characters
    Windows Previous Versions                   24 (1 per hour), adjustable with support
    Throttle bandwidth limit                    2048 MBps
    Data protection bandwidth limit             2048 MBps
    Max recovery time objective for Async DR    60 minutes
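    If you script FSVM sizing checks, the tested connection limits above can be captured in a small lookup. This is a convenience sketch: the numbers come straight from the table, while the function name is my own.

```shell
# Lookup of tested max connections per FSVM by memory size in GB
# (values from the AFS 2.0.2 tested-limits table; function name is hypothetical).
max_connections() {
  case "$1" in
    12)  echo 250  ;;
    16)  echo 500  ;;
    24)  echo 1000 ;;
    32)  echo 1500 ;;
    40)  echo 2000 ;;
    60)  echo 2500 ;;
    96)  echo 4000 ;;
    *)   echo "no tested limit for ${1} GB" >&2; return 1 ;;
  esac
}
max_connections 32   # prints 1500
```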

    Jan
    08

    Client Tuning Recommendations for ABS (Acropolis Block Services)

    o For large-block sequential workloads, with I/O sizes of 1 MB or larger, it's beneficial to increase the iSCSI MaxTransferLength from 256 KB to 1 MB.

    * Windows: Details on the MaxTransferLength setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

    * Linux: Settings in the /etc/iscsi/iscsid.conf file; node.conn[0].iscsi.MaxRecvDataSegmentLength

    o For workloads with large storage queue depth requirements, it can be beneficial to increase the initiator and device iSCSI client queue depths.

    * Windows: Details on the MaxPendingRequests setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

    * Linux: Settings in the /etc/iscsi/iscsid.conf file; Initiator limit: node.session.cmds_max (Default: 128); Device limit: node.session.queue_depth (Default: 32)
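    On the Linux side, the sketch below applies those iscsid.conf settings with sed to a scratch copy of the file. The tuned queue-depth values (256 and 64) are illustrative examples, not Nutanix-mandated numbers; size them for your workload, and edit the real /etc/iscsi/iscsid.conf with a backup.

```shell
# Sketch: tune a scratch copy of iscsid.conf. Defaults below match the
# values described above (262144-byte MaxRecvDataSegmentLength, cmds_max 128,
# queue_depth 32); the tuned values are examples.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.session.cmds_max = 128
node.session.queue_depth = 32
EOF

# Raise MaxRecvDataSegmentLength to 1 MB and bump both queue depths.
sed -i \
  -e 's/^node\.conn\[0\]\.iscsi\.MaxRecvDataSegmentLength = .*/node.conn[0].iscsi.MaxRecvDataSegmentLength = 1048576/' \
  -e 's/^node\.session\.cmds_max = .*/node.session.cmds_max = 256/' \
  -e 's/^node\.session\.queue_depth = .*/node.session.queue_depth = 64/' \
  "$CONF"

cat "$CONF"
# To apply for real: edit /etc/iscsi/iscsid.conf (back it up first), then
# log the iSCSI sessions out and back in so the new values take effect.
```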

    For more best practices, download the ABS best practices guide.

    Jan
    05

    Nutanix AFS – Domain Activation

    Well, if it's not DNS stealing hours of your life, the next thing to make your partner angry as you miss family supper is Active Directory (AD). In more complex AD setups, you may find yourself going to the command line to attach your AFS instance to AD.

    Some important requirements to remember:

      While a deployment could fail due to AD, the FSVMs (file server VMs) still get deployed. You can do the join-domain process from the UI or NCLI afterwards.

      The user attaching to the domain must be a domain admin or have similar rights. Why? The join-domain process creates one computer account in the default Computers OU and creates a service principal name (SPN) for DNS. If you don't use the default Computers OU, you will have to use the organizational-unit option from NCLI to change it to the appropriate OU. The computer account can be created in a specified container by using a forward slash to denote hierarchies (for example, organizational_unit/inner_organizational_unit).

      For example, the command was:

      ncli> fs join-domain uuid=d9c78493-d0f6-4645-848e-234a6ef31acc organizational-unit="stayout/afs" windows-ad-domain-name=tenanta.com preferred-domain-controller=tenanta-dc01.tenanta.com windows-ad-username=bob windows-ad-password=dfld#ld(3&jkflJJddu

      AFS needs at least one writable DC to complete the domain join. After the domain join, it can authenticate using a local read-only DC. Timing (latency) may cause problems here. To pick an individual DC, you can use preferred-domain-controller from the NCLI.

    NCLI Join-Domain Options

    Entity:
    file-server | fs : Minerva file server

    Action:
    join-domain : Join the File Server to the Windows AD domain specified.

    Required Argument(s):
    uuid : UUID of the FileServer
    windows-ad-domain-name : The windows AD domain the file server is
    associated with.
    windows-ad-username : The name of a user account with administrative
    privileges in the AD domain the file server is associated with.
    windows-ad-password : The password for the above Windows AD account

    Optional Argument(s):
    organizational-unit : An Organizational unit container is where the AFS
    machine account will be created as part of domain join
    operation. Default container OU is "computers". Examples:
    Engineering, Department/Engineering.
    overwrite : Overwrite the AD user account.
    preferred-domain-controller : Preferred domain controller to use for
    all join-domain operations.

    NOTE: preferred-domain-controller needs to be FQDN

    If you need to do further troubleshooting you can ssh into one of the FSVMs and run

    afs get_leader

    Then navigate to the /data/logs directory and look at the minerva logs.

    This shouldn't be an issue in most environments, but I've included the ports used just in case.


    Required AD Permissions

    Delegating permissions in Active Directory (AD) enables the administrator to assign permissions in the directory to unprivileged domain users, for example enabling a regular user to join machines to the domain without knowing the domain administrator credentials.

    Adding the Delegation
    ---------------------
    To enable a user to join and remove machines to and from the domain:
    - Open the Active Directory Users and Computers (ADUC) console as domain administrator.
    - Right-click the CN=Computers container (or desired alternate OU) and select "Delegate control".
    - Click "Next".
    - Click "Add" and select the required user and click "Next".
    - Select "Create a custom task to delegate".
    - Select "Only the following objects in the folder" and check "Computer objects" from the list.
    - Additionally select the options "Create selected objects in the folder" and "Delete selected objects in this folder". Click "Next".
    - Select "General" and "Property-specific", select the following permissions from the list:
    - Reset password
    - Read and write account restrictions
    - Read and write DNS host name attributes
    - Validated write to DNS host name
    - Validated write to service principal name
    - Write servicePrincipalName
    - Write Operating System
    - Write Operating System Version
    - Write OperatingSystemServicePack
    - Click "Next".
    - Click "Finish".
    After that, wait for AD replication to finish and then the delegated user can use its credentials to join AFS to a domain.


    Domain Port Requirements

    The following services and ports are used by AFS file server for Active Directory communication.

    UDP and TCP Port 88
    Forest level trust authentication for Kerberos
    UDP and TCP Port 53
    DNS from client to domain controller and domain controller to domain controller
    UDP and TCP Port 389
    LDAP to handle normal queries from client computers to the domain controllers
    UDP and TCP Port 123
    NTP traffic for the Windows Time Service
    UDP and TCP Port 464
    Kerberos Password Change for replication, user and computer authentication, and trusts
    UDP and TCP Port 3268 and 3269
    Global Catalog from client to domain controllers
    UDP and TCP Port 445
    SMB protocol for file replication
    UDP and TCP Port 135
    Port-mapper for RPC communication
    UDP and TCP High Ports
    Randomly allocated TCP high ports for RPC, from port 49152 to port 65535
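    A quick way to sanity-check those ports from an FSVM or admin host is a TCP sweep. This is a rough sketch: DC is a placeholder for your domain controller's address, and UDP (plus the dynamic RPC high-port range) needs a separate check, for example with nmap -sU.

```shell
# TCP reachability sweep of the fixed AD ports listed above.
DC="${DC:-dc01.example.com}"   # hypothetical domain controller hostname
AD_PORTS="53 88 123 135 389 445 464 3268 3269"
for port in $AD_PORTS; do
  if nc -z -w 2 "$DC" "$port" 2>/dev/null; then
    echo "tcp/$port reachable"
  else
    echo "tcp/$port unreachable or filtered"
  fi
done
```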