Jan
08

Client Tuning Recommendations for ABS (Acropolis Block Services)

o For large-block sequential workloads with I/O sizes of 1 MB or larger, it’s beneficial to increase the iSCSI MaxTransferLength from 256 KB to 1 MB.

* Windows: Details on the MaxTransferLength setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

* Linux: Set node.conn[0].iscsi.MaxRecvDataSegmentLength in the /etc/iscsi/iscsid.conf file (a combined iscsid.conf example follows this list).

o For workloads with large storage queue depth requirements, it can be beneficial to increase the initiator and device iSCSI client queue depths.

* Windows: Details on the MaxPendingRequests setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

* Linux: Settings in the /etc/iscsi/iscsid.conf file; Initiator limit: node.session.cmds_max (Default: 128); Device limit: node.session.queue_depth (Default: 32)
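
Putting the two Linux settings together, here is a minimal sketch of what the relevant /etc/iscsi/iscsid.conf entries might look like. The values shown are illustrative only, and existing iSCSI sessions generally need to be logged out and back in before changes take effect.

node.conn[0].iscsi.MaxRecvDataSegmentLength = 1048576   # 1 MB, up from the 256 KB default
node.session.cmds_max = 256                             # initiator queue depth (default 128)
node.session.queue_depth = 64                           # per-device queue depth (default 32)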

For more best practices, download the ABS best practices guide.

Jan
05

Nutanix AFS – Domain Activation

Well, if it’s not DNS stealing hours of your life, the next thing to make your partner angry as you miss family supper is Active Directory (AD). In more complex AD setups you may find yourself going to the command line to attach your AFS instance to AD.

Some important requirements to remember:

    While a deployment could fail due to AD, the FSVMs (file server VMs) still get deployed. You can do the join-domain process from the UI or NCLI afterwards.

    The user attaching to the domain must be a domain admin or have similar rights. Why? The join-domain process will create one computer account in the default Computers OU and create a service principal name (SPN) for DNS. If you don’t use the default Computers OU you will have to use the organizational-unit option from NCLI to change it to the appropriate OU. The computer account can be created in a specified container by using a forward slash mark to denote hierarchies (for example, organizational_unit/inner_organizational_unit).

    For example, the command used was:

    ncli> fs join-domain uuid=d9c78493-d0f6-4645-848e-234a6ef31acc organizational-unit="stayout/afs" windows-ad-domain-name=tenanta.com preferred-domain-controller=tenanta-dc01.tenanta.com windows-ad-username=bob windows-ad-password=dfld#ld(3&jkflJJddu

    AFS needs at least one writable DC to complete the domain join. After the domain join it can authenticate using a local read-only DC. Timing (latency) may cause problems here. To pick an individual DC you can use preferred-domain-controller from the NCLI.

NCLI Join-Domain Options

Entity:
file-server | fs : Minerva file server

Action:
join-domain : Join the File Server to the Windows AD domain specified.

Required Argument(s):
uuid : UUID of the FileServer
windows-ad-domain-name : The windows AD domain the file server is
associated with.
windows-ad-username : The name of a user account with administrative
privileges in the AD domain the file server is associated with.
windows-ad-password : The password for the above Windows AD account

Optional Argument(s):
organizational-unit : An Organizational unit container is where the AFS
machine account will be created as part of domain join
operation. Default container OU is "computers". Examples:
Engineering, Department/Engineering.
overwrite : Overwrite the AD user account.
preferred-domain-controller : Preferred domain controller to use for
all join-domain operations.

NOTE: preferred-domain-controller needs to be an FQDN

If you need to do further troubleshooting, you can SSH into one of the FSVMs and run

afs get_leader

Then navigate to /data/logs and look at the Minerva logs.
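
As a rough sketch, the flow looks something like this; the exact log file names vary by release, so treat them as placeholders:

nutanix@FSVM:~$ afs get_leader          # find the FSVM currently holding the leader role
nutanix@FSVM:~$ cd /data/logs
nutanix@FSVM:~$ ls minerva*             # then tail the relevant minerva log while retrying the join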

Ports shouldn’t be an issue in most environments, but I’ve included the ports used below just in case.


Required AD Permissions

Delegating permissions in Active Directory (AD) enables the administrator to assign permissions in the directory to unprivileged domain users, for example, enabling a regular user to join machines to the domain without knowing the domain administrator credentials.

Adding the Delegation
---------------------
To enable a user to join and remove machines to and from the domain:
- Open the Active Directory Users and Computers (ADUC) console as domain administrator.
- Right-click the CN=Computers container (or desired alternate OU) and select "Delegate control".
- Click "Next".
- Click "Add" and select the required user and click "Next".
- Select "Create a custom task to delegate".
- Select "Only the following objects in the folder" and check "Computer objects" from the list.
- Additionally select the options "Create selected objects in the folder" and "Delete selected objects in this folder". Click "Next".
- Select "General" and "Property-specific", then select the following permissions from the list:
- Reset password
- Read and write account restrictions
- Read and write DNS host name attributes
- Validated write to DNS host name
- Validated write to service principal name
- Write servicePrincipalName
- Write Operating System
- Write Operating System Version
- Write OperatingSystemServicePack
- Click "Next".
- Click "Finish".
After that, wait for AD replication to finish; the delegated user can then use their credentials to join AFS to a domain.


Domain Port Requirements

The following services and ports are used by the AFS file server for Active Directory communication.

UDP and TCP Port 88
Forest level trust authentication for Kerberos
UDP and TCP Port 53
DNS from client to domain controller and domain controller to domain controller
UDP and TCP Port 389
LDAP to handle normal queries from client computers to the domain controllers
UDP and TCP Port 123
NTP traffic for the Windows Time Service
UDP and TCP Port 464
Kerberos Password Change for replication, user and computer authentication, and trusts
UDP and TCP Port 3268 and 3269
Global Catalog from client to domain controllers
UDP and TCP Port 445
SMB protocol for file replication
UDP and TCP Port 135
Port-mapper for RPC communication
UDP and TCP High Ports
Randomly allocated TCP high ports for RPC, from port 49152 to port 65535
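
If you suspect one of these ports is being blocked, a quick spot check from a machine on the same network as the FSVMs can save time. A simple example of TCP port checks using netcat (the domain controller name here is just the one from the earlier join-domain example):

nc -zv tenanta-dc01.tenanta.com 389
nc -zv tenanta-dc01.tenanta.com 445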

    Dec
    20

    Why losing a disk on Nutanix is no big deal (*No Humans Required)

    When the Acropolis Distributed Storage Fabric (DSF) detects an accumulation of errors for a particular disk (e.g., I/O errors or bad sectors), it is the Hades service running in the Controller VM that takes action. The purpose of Hades is to simplify the break-fix procedures for disks and to automate several tasks that previously required manual user actions. Hades aids in fixing failing devices before they become unrecoverable.

    Nutanix has a unified component called Stargate that manages the responsibility of receiving and processing data. All read and write requests are sent to the Stargate process running on that node. Once Stargate sees delays in responses to I/O requests to a disk, it marks the disk offline. Hades then automatically removes the disk from the data path and runs smartctl checks against it (see the smartctl example just after the list below). If the checks pass, Hades automatically marks the disk online and returns it to service. If Hades’ smartctl checks fail, or if Stargate marks a disk offline three times within one hour (regardless of the smartctl check results), Hades automatically removes the disk from the cluster, and the following occurs:

    • The disk is marked for removal within the cluster Zeus configuration.
    • The disk is unmounted.
    • The Red LED of the disk is turned on to provide a visual indication of the failure.
    • The cluster automatically begins to create new replicas of any data that is stored on the disk.
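
    For a rough idea of the kind of health check Hades automates, this is what a manual smartctl query looks like on a generic Linux host. The device name is a placeholder, and on a Nutanix cluster none of this needs to be run by hand:

    sudo smartctl -H /dev/sdX     # overall SMART health verdict
    sudo smartctl -a /dev/sdX     # full SMART attributes and error log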

    The failed disk is marked as a tombstoned disk to prevent it from being used again without manual intervention.

    When a disk is marked offline, an alert is triggered and the disk is immediately removed from the storage pool by the system. Curator then identifies all extents stored on the failed disk, and Acropolis DSF is prompted to re-replicate the associated replicas to restore the desired replication factor. By the time the Nutanix administrators become aware of the disk failure via Prism, SNMP trap, or email notification, Acropolis DSF will be well on its way to healing the cluster.

    Acropolis DSF data rebuild architecture provides faster rebuild times and no performance impact to workloads supported by the Nutanix cluster when compared to traditional RAID data protection schemes. RAID groups or sets typically comprise a small number of drives. When a RAID set performs a rebuild operation, typically one disk is selected to be the rebuild target. The other disks that comprise the RAID set must divert enough resources to quickly rebuild the data on the failed disk. This can lead to performance penalties for workloads served by the degraded RAID set. Acropolis DSF can distribute remote copies found on any individual disk among the remaining disks in the Nutanix cluster. Therefore Acropolis DSF replication operations can happen as background processes with no impact to cluster operations or performance. Acropolis DSF can access all disks in the cluster at any given time as a single, unified pool of storage resources. This architecture provides a very advantageous consequence. As the cluster size grows, the length of time needed to recover from a disk failure decreases as every node in the cluster participates in the replication. Since the data needed to rebuild a disk is distributed throughout the cluster, more disks are involved in the rebuild process. This increases the speed at which the affected extents are re-replicated.

    It’s important to note that Nutanix also keeps performance consistent during rebuild operations. For hybrid systems Nutanix rebuilds cold data to cold data so large hard drives do not flood the SSD cache. For all-flash systems Nutanix has quality of service implemented for back-end I/O to prevent user I/O from being impacted.

    In addition to a many-to-many rebuild approach to data availability, the Acropolis DSF data rebuild architecture ensures that all healthy disks are available for use all of the time. Unlike most traditional storage arrays, there’s no need for “hot-spare” or standby drives in a Nutanix cluster. Since data can be rebuilt to any of the remaining healthy disks, reserving physical resources for failures is unnecessary. Once healed, you can lose the next drive or node.

    Nov
    14

    Docker Datacenter 2.0 for Virtual Admins

    Just a short video walking through how easy it is to get an environment up and running with Docker Datacenter 2.0 on top of AHV.

    High level points:

    * If you can deploy a VM you can set up Docker Datacenter
    * Management of new Docker hosts is easily done with pre-generated code to paste into new hosts (see the example just after this list)
    * Docker Datacenter has the ability to run both services and compose apps side by side in the same Docker Datacenter environment
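
    For context, the pre-generated snippet you paste into a new host is essentially a swarm join command along these lines (assuming the swarm-mode based UCP in Docker Datacenter 2.0). The token and manager address below are placeholders, not real values:

    docker swarm join --token SWMTKN-1-<generated-token> ucp-manager.example.com:2377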

    Later this week I hope to have a post talking about the integration with Docker Datacenter and the Docker trusted registry.

      Oct
      31

      Eliminate Standalone NAS & What’s new with Horizon 7

      Thought I would post the links to two new on-demand webinars. The Horizon 7 webinar has some Nutanix content but is mostly focused on Instant Clones, App Volumes, and user impact.

      Horizon 7: New Features and How it Impacts User Experience

      The AFS webinar has some great questions and there is a demo at the end as well.

      Eliminate Standalone NAS for your file server needs with Nutanix Acropolis File Services

      Sep
      16

      Build Large File Services Repositories on Nutanix’s Largest Capacity Nodes, the NX-6035C-G5

      Nutanix continues on its Enterprise Cloud journey at the .NEXT On-Tour event in Bangkok, Thailand. Today, we are proud to announce that we are planning to support Acropolis File Services (AFS) on our storage only nodes, the NX-6035C-G5. Acropolis File Services provides a simple and scalable solution for hosting user and shared department files across a centralized location with a single namespace. With Acropolis File Services, administrators no longer waste time with manual configuration or need Active Directory and load balancing expertise. If and when released, this will make 6035C-G5 nodes even more versatile, adding to the current capabilities of serving as a backup or replication target and running Acropolis Block Services.

      [read more here]

      Aug
      01

      The Tale Of Two Lines: Instant-Clones on Nutanix

      There was a part of me that wanted to hate on Instant Clones, which are new in Horizon 7, but the fact is they’re worth the price of admission. Instant Clones have very low overhead and provide true on-demand desktops, or as VMware is tagging it, Just-In-Time desktops.

      On-demand desktops with View Composer….. not happening

      In my health care days the non-persistent desktops and shift changes always resulted in some blunt force trauma around 7 am and 7 pm when staff would start their day. The only real way to counterbalance the added load of login storms was to make sure the desktops were pre-built. This of course means you need to have some desktops sitting around doing nothing waiting for these two time periods in the day, or use generic logins and then the user never disconnects, which was another bag of problems.

      Instant Clones’ ability to clone a live running VM by simply quiescing the VM is really amazing. Have you ever changed the name of a desktop and then Windows tells you to reboot? If you’re like me you try to do 5 or 6 other things before you have to reboot, which usually ends up in a mess. Instant Clones uses a feature called ClonePrep to add the VM to AD and change its name, all while not having to reboot the VM. When you see a power-on operation inside of vCenter it’s actually just quiescing the desktop, so there is very low overhead.

      The steps during Clone Prep. MS does not support Clone Prep but they didn’t for View Composer so I don’t see it being any different.

      When I went to test Instant Clones I wanted to see if on-demand desktops were actually possible without destroying node densities. I had two test runs with Login VSI: one run with 400 knowledge users with all the desktops pre-deployed, and one run with 400 knowledge users where I started with only 50 desktops. I had set the desktop pool to always have at least 30 free desktops until the pool got to 400 desktops.

      Instant-clones delivers on-demand desktops with very low overhead.

      The darker blue line represents the on-demand test, and you can see that the impact over 400 users is pretty small. This is pretty remarkable given that the CPU and memory consumption from booting desktops is almost eliminated.

      It’s not all unicorns and rainbows, however; Instant Clones do have some limitations in the first release:

      No dedicated Desktop Pools
      No RDS Desktop or Application Pools
      Limited SVGA Support – Fixed max resolution & number of monitors
      No 3D Rendering / GPU Support
      No Sysprep support – Single SID across pool
      No VVOL or VAAI NFS Hardware Clones support (smaller desktop pools may take longer to provision)
      No PowerShell
      No Multi-VLAN Support in a single Pool
      No Reusable Computer Accounts
      No Persistent Disks – Use Writable Volumes \ Flex App \ Unidesk \ RES …….

      vMotion is supported

      Like anything, use case will dictate when this gets used, but it’s a powerful tool inside of Horizon. I plan to show some of the differences between View Composer and Instant Clones in my next posts. Also keep in mind that you still need high I/O to service your desktops. Size for the peaks or face the wrath of your end users.

      Jul
      28

      The Impact Of App Layering On Your VDI Environment

      I was testing instant clones in Horizon 7 and it was pretty much a requirement to use some form of application virtualization and get your user data stored off the desktops. My decision on what to select for testing was based on the fact that I already had ProfileUnity from Liquidware Labs, and App Volumes is bundled with View at the higher editions. I wanted to see the impact of layering on CPU and login times. I also used UberAgent to collect some of the results. While testing I would run one test run with UberAgent to collect login times and then one with the UberAgent agent turned off to collect CPU metrics.

      I used three separate applications, each in their own layer.

      * Gimp 2.8
      * iTunes 10
      * VLC

      I used App Volumes 2.11 since 3.0 is kind of dead in the water and not recommended for existing customers, so I can’t see a lot of people using it till the next release. ProfileUnity was version 6.5.

      I first did a base run with no App Stacks or Flex Apps but with a roaming profile being stored on Acropolis File Services. The desktops were instant clones running the Horizon 7 agent and Office 2013. The desktops were Windows 10 with 2 vCPU and 2 GB of RAM. The % listed is a factor of both CPUs.

      Base Run

      So not too bad: a 14-second login. There is probably some cleanup I could do to make it faster, but that also wouldn’t be that realistic if you’re thinking about an enterprise desktop, so I was happy with this.

      I tested with one layer at a time until all three applications were in use. There was a gradual increase in CPU and login time for each layer. The CPU cost comes from the agent and attaching the VMDK to the desktop.

      App Volumes with 3 AppStacks

      So with 3 layers the CPU jumped by ~20% and the login time went up ~9 secs with App Volumes.

      3 Flex Apps

      With 3 Flex Apps CPU jumped a bit and login times went up ~4 sec.


      Overall Review

      What does this all mean?

      Well, if you have users that only disconnect and reconnect and rarely log out, then this means absolutely nothing for the most part. If you have a user base that gets fresh new desktops all the time, plus things like large shift changes, then your densities will go down. I like to say, “Looking is for free, and touching is going to cost you.” Overall I still feel this is a small price to pay to have a successful VDI deployment, and layering will help out the process.

      Jul
      09

      Making A Better Distributed System – Nutanix Degraded Node Detection

      Distributed systems are hard, there’s no doubt about that. One of the major problems is what to do when a node is unhealthy and could be affecting the performance of the overall cluster. Fail hard, fail fast is a distributed-systems principle, but how do you go about detecting an issue before a failure even occurs? AOS 4.5.3, 4.6.2 and 4.7 will include the Nutanix implementation of degraded node detection and isolation. A badly performing hardware component or network issue can be a death by a thousand cuts, versus a failure, which is pretty cut and dried. If a remote CVM is not performing well it can affect the acknowledgement of writes coming from other hosts, and other factors may affect performance, like:

      * Significant network bandwidth reduction
      * Network packet drops
      * CPU Soft lockups
      * Partially bad disks
      * Hardware issues

      The list of issues can even be unknown, so Nutanix Engineering has come up with a scoring system that uses votes so that everything can be compared. Services running on each node of the cluster will publish scores/votes for services running on other nodes. Peer health scores will be computed based on various metrics like RPC latency, RPC failures/timeouts, network latency, etc. If services running on one node are consistently receiving bad scores for a long period (~10 minutes), then the other peers will convict that node as a degraded node.

      Walk, Crawl, Run – Degraded Node Expectations:

      A node will not be marked as degraded if the current cluster Fault Tolerance (FT) level is less than the desired value. Upgrades and break-fix actions will not be allowed while a node is in the degraded state. A node will only be marked as degraded if we get bad peer health scores for 10 minutes. In AOS 4.5.3, the first shipping AOS release to include this feature, the default settings are that degraded node logging will be enabled but the degraded node action will be disabled. While the peer scoring is always on, the action side is disabled for the first release as an ultra-conservative approach.

      In AOS 4.5.3, if the degraded node action setting is enabled, leadership of critical services will not be hosted on the degraded node. A degraded node will be put into maintenance mode and its CVM will be rebooted. Services will not start on this CVM upon reboot. An alert will be generated for the degraded node.

      In AOS 4.7 and AOS 4.6.2, additional user controls will be provided to select an “action policy” for when a degraded node is detected. Options should include No Action, Reboot CVM, or Shutdown Node.

      To enable the degraded node action setting, use the NCLI command:

      nutanix@cvm:~$ ncli cluster edit-params disable-degraded-node-monitoring=false

      The feature will further increase the availability and resilience for Nutanix customers. While top performance numbers grab the headlines, remember the first step is to have a running cluster.

      AI for the control plane………… Maybe we’ll get outvoted for our jobs!

      Jun
      29

      Nutanix Security Configuration Management Automation at Work #DOD #PCI

      A short video of someone changing the security settings for an Apache Tomcat directory and files. It really could be anything: dropping a firewall, opening a port, and the list goes on. The video shows how often the settings are being checked, and then we manually run the automation framework to check over 600 DOD/PCI level requirements in minutes.