Acropolis Container Services on #Nutanix

    This is the first release of a turnkey solution for deploying Docker containers on a Nutanix cluster. Instead of swiping your credit card for AWS EC2, you can deploy your containers through the built-in Self-Service Portal. It's not all totally new, because Nutanix previously released a volume plug-in for Docker. What is new:

    * Acropolis Container Services (ACS) provisions multiple VMs as container machines to run Docker
    containers on them.
    * Containers are deployed as part of projects. In a project, users can deploy VMs or containers, and you can assign quotas to the project for storage, CPU and memory.
    * ACS uses the public Docker registry by default, but if you have a separate Docker registry you
    want to use, you can configure access to that registry as well.
    * One-Click upgrades for the Container machines.
    * Basic monitoring: a containers view in the Self-Service Portal lets you view summary information about containers connected to the portal and access detailed information about each container.


      Moby Project Summit Notes

      The Moby Project was born out of the containerd / Docker Internals Summit.

      For components to be successful they need to be successful everywhere, which led into SwarmKit being mentioned as not successful because no other ecosystem was using it. There seems to be a strong commitment to build everything as components, out in the open.

      Docker wants to be seen as an open-source leader through doing the hard work to support components.

      All open-source development will be under the Moby project.

      Upstream = components
      Moby = staging area for components to move on from, as containerd did to the CNCF.
      – Heart of open-source activities, a place to integrate components
      – Docker remains docker
      – Docker is built with Moby
      – You use Moby to build things like Docker
      – Solomon mentioned “1,000 smart people could disagree on what to do”; Docker represents its opinion. It's a lot easier to agree on low-level functions because there are few ways to do them.
      – Moby will initially end up as Go libraries in Docker, but that will go away.

      Moby is connected to Docker, but it's not Docker. The name was inspired by the Fedora project.

      Moby is a trade-off: getting it out in the open early versus completeness.

      GitHub should be used as a support forum.

      InfraKit is a toolkit for creating and managing declarative, self-healing infrastructure. It breaks infrastructure automation down into simple, pluggable components that work together to actively ensure the infrastructure state matches the user's specifications. Although InfraKit emphasizes primitives for building self-healing infrastructure, it can also be used passively, like conventional tools.

      LinuxKit, a toolkit for building custom minimal, immutable Linux distributions.

      – Secure defaults without compromising usability
      – Everything is replaceable and customisable
      – Immutable infrastructure applied to building Linux distributions
      – Completely stateless, but persistent storage can be attached
      – Easy tooling, with easy iteration
      – Built with containers, for running containers
      – Designed for building and running clustered applications, including but not limited to container orchestration such as Docker or Kubernetes
      – Designed from the experience of building Docker Editions, but redesigned as a general-purpose toolkit

      No plans to move away from Go.

      Breaking out the monolithic engine API will most likely be done with gRPC. gRPC is a modern, open-source, high-performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers, with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in the last mile of distributed computing, connecting devices, mobile applications and browsers to backend services.

      SwarmKit Update
      SwarmKit is a toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, Raft-based consensus, task scheduling and more.

      New Features

      – Topology-Aware Scheduling
      – Secrets
      – Service Rollbacks
      – Service Logs
      – HA scheduling
      – Encrypted Raft Store
      – Health-Aware Orchestration
      – Synchronous CLI
      What is Next?
      – Direct integration of containerd into SwarmKit bypasses the need for the Docker Engine
      – Config Management to attach configuration to services
      – Swarm Events to watch for state changes and gRPC Watch API
      – Create a generic runtime to support new runtimes without changing SwarmKit
      – Instrumentation

      LibNetwork Update
      – Quality: more visibility, monitoring and troubleshooting
      – Local-scoped network plugins in Swarm-mode
      – Integration with containerd


      IP Fail-Over with AFS

      A short video showing the client IP address moving around the cluster to quickly restore connectivity for your users running on Acropolis File Services.


      Nutanix AFS DR Failing over from vSphere to AHV (video 3:30)

      A quick video showing the fail-over for Acropolis File Services. The deployment sets up a lot of the needed pieces, but you will still have to set a schedule and map the new container (vStore) being used by AFS to the remote site.

      Remember, you want the number of FSVMs making up the file server to be the same as or less than the number of nodes at the remote site.


      Docker Datacenter: Usability For Better Security.

      With the new release of Docker Datacenter 2.1, it's clear that Docker is very serious about the enterprise and about providing tooling that is very easy to use. Docker has made the leap to supporting enterprise applications with its embedded security and ease of use. DDC 2.1 and CS Docker Engine 1.13 give operations and development teams the additional control needed to manage their own experience.

      Docker Datacenter continues to build on Containers as a Service. The 1.12 release of DDC enabled agility and portability for continuous integration and started the journey of protecting the development supply chain throughout the whole lifecycle. The new release of DDC focuses on security, specifically secrets management.
      The previous version of DDC already had a wealth of security features:
      • LDAP/AD integration
      • Role based access control for teams
      • SSO and push/pull images with Docker Trusted Registry
      • Image signing – prevent running a container unless the image is signed by a member of a designated team
      • Out of the box TLS with easy setup, including cert rotation.

      With DDC 2.1, the march on security is being made successful by giving both operations and developers a usable system without having to lean on a security team for support and help. The native integration with the management plane allows for end-to-end container lifecycle management. You also inherit a model that is infrastructure-independent: no matter what you're running on, it will work. It can be made as dynamic and ephemeral as the containers it's managing. This is why I feel PaaS is dead. With so much choice and security you don't have to limit where you deploy, a design decision very similar to Nutanix enabling choice. Choice gives you access to more developers and the freedom to color outside the lines of the guardrails that a PaaS solution may impose.

      Docker Datacenter Secrets Architecture

      1) Everything in the store is encrypted, notably including all of the data stored by the orchestrator. Following least privilege, a secret is distributed only to the nodes running containers that need it. Since the management layer is scalable, you get that for your key management as well. And because the management layer is so easy to set up, you don't have developers embedding secrets in GitHub as a quick workaround.
      2) Containers and the filesystem make secrets available only to the designated app. Docker exposes secrets to the application via a filesystem that is stored in memory. The same certificate rotation that happens for the management layer also happens for the application's certificates. In the diagram above, the red service only talks to the red service, and the blue service is isolated by itself even though it's running on the same node as the red service/application.
      3) If you decide you want to integrate with a third-party application like Twitter, it can be done easily. Your Twitter credentials can be stored in the Raft cluster on your manager nodes. When you create the Twitter app you give it access to the credentials, and you can even do a “service-update” if you need to swap them out, without the need to touch every node in your environment.
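      The Twitter example in (3) maps onto a handful of CLI calls. A sketch run from a Swarm manager; the names (twitter_token, myapp, myorg/twitter-app) are hypothetical:

```shell
# Store the credential in the managers' encrypted Raft store.
printf '%s' 'OLD-API-TOKEN' | docker secret create twitter_token -

# The secret is delivered only to nodes running this service and is surfaced
# inside the container as the in-memory file /run/secrets/twitter_token.
docker service create --name myapp --secret twitter_token myorg/twitter-app

# Rotate the credential with a service update; no need to touch every node.
printf '%s' 'NEW-API-TOKEN' | docker secret create twitter_token_v2 -
docker service update \
  --secret-rm twitter_token \
  --secret-add source=twitter_token_v2,target=twitter_token \
  myapp
```

      The rotation step keeps the in-container path stable (target=twitter_token) while swapping the underlying secret version.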

      With a simple interface for both developers and IT operations, both have a pain-free way to do their jobs and provide a secure environment. By not creating roadblocks or slowing down development, you get automatic buy-in from operations and development teams.


      Nutanix AFS – Maximums

      Nutanix AFS Maximums – Tested limits. (ver 2.0.2)
      Number of connections per FSVM: 250 with 12 GB of memory; 500 with 16 GB; 1000 with 24 GB; 1500 with 32 GB; 2000 with 40 GB; 2500 with 60 GB; 4000 with 96 GB
      Number of FSVMs: 16, or the number of CVMs, whichever is lower
      Max RAM per FSVM: 96 GB (tested)
      Max vCPUs per FSVM: 12
      Data size for home share: 200 TB per FSVM
      Data size for general-purpose share: 40 TB
      Share name: 80 characters
      File server name: 15 characters
      Share description: 80 characters
      Windows Previous Versions: 24 (1 per hour), adjustable with support
      Throttle bandwidth limit: 2048 MBps
      Data protection bandwidth limit: 2048 MBps
      Max recovery time objective for Async DR: 60 minutes



      App Volumes: Reprovisioning fails with AppStacks set to computer-based assignments

      Linked-clone virtual machine provisioning tasks fail.
      Recompose fails because customization fails to join the desktops to the domain.
      This issue occurs because AppStacks are attached during the domain-join process.

      On reboot after the domain join, the c:\svroot cache is cleared, losing changes to the VM.

      To resolve this issue, disable the App Volumes Service on the parent virtual machine.
      Open a command prompt as administrator and run the following commands
      sc config "svservice" start= disabled
      net stop "App Volumes Service"
      ipconfig /release
      Shutdown the virtual machine and take a snapshot.

      Create a script or batch file as below to set the service to automatic and start the service.
      sc config "svservice" start= auto
      net start "App Volumes Service"

      Copy the script to the parent virtual machine to a directory you can reference later.
      In View Administration portal you will have to reference your post-synchronization script:

      Open up View Administration Portal
      Go to Catalog – Desktop Pools – Select your pool
      Click Edit
      Select Guest Customization Tab
      Enter the file path of the script in “Post-synchronization script name”


      Recompose the pool
      VMware KB 2147910


      Demo Time – Nutanix CE and VSA’s

      In order to successfully complete your home lab, you're going to need to configure compute (the servers), networking (routers, switches, etc.) and storage. If you're solely interested in studying or testing an individual application, operating system, or the network infrastructure, you should be able to get by with no more storage than the local hard drive in your PC.

      For those who are looking to learn how cloud and data center technologies work as a whole however, you’re going to require some form of dedicated storage. A storage simulator or a Virtual Storage Appliance (VSA) or Nutanix CE is likely to be the best option for this task.

      If you're studying hypervisor technologies you're going to have to spend on compute hardware, as well as on any network infrastructure devices that can't be virtualized. Unless you have a free-flowing money source, you're most likely going to want to contain storage costs by using virtualized storage rather than SAN or NAS hardware.

      The Flackbox blog has compiled a lengthy and comprehensive list of all the available simulators and VSAs. All of the software is free but may require a customer or partner account through the vendor to be able to download. The login and system requirements for every option are included in the list as well. Thanks to Neil for putting those together.

      Nutanix CE can be seen as having high requirements for a home lab, but once you factor in that management is included, it's not that bad. You can also use a free instance with Ravello.

      If you don't meet the requirements you can always use OpenFiler or StarWind if you have gear at home.

      For those looking to mimic their organization’s production environment as closely as possible, choose the VSA or simulator from your vendor.

      GUI demos are also included at the bottom of the list. These are not designed or suitable for a lab but are great for those looking to get a feel of a particular vendor’s Storage GUI.


      Client Tuning Recommendations for ABS (Acropolis Block Services)


      o For large block sequential workloads, with I/O sizes of 1 MB or larger, it’s beneficial to increase the iSCSI MaxTransferLength from 256 KB to 1 MB.

      * Windows: Details on the MaxTransferLength setting are available at the following link:

      * Linux: Settings in the /etc/iscsi/iscsid.conf file; node.conn[0].iscsi.MaxRecvDataSegmentLength

      o For workloads with large storage queue depth requirements, it can be beneficial to increase the initiator and device iSCSI client queue depths.

      * Windows: Details on the MaxPendingRequests setting are available at the following link:

      * Linux: Settings in the /etc/iscsi/iscsid.conf file; Initiator limit: node.session.cmds_max (Default: 128); Device limit: node.session.queue_depth (Default: 32)
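      The Linux settings above are plain key = value lines in /etc/iscsi/iscsid.conf, so they can be changed with sed. A minimal sketch against a scratch copy of the file; the 1 MB transfer length matches the recommendation above, while the 256/64 queue depths are example values, not Nutanix-published numbers:

```shell
# Work on a scratch copy first; point at /etc/iscsi/iscsid.conf when ready.
conf=/tmp/iscsid.conf.example
cat > "$conf" <<'EOF'
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.session.cmds_max = 128
node.session.queue_depth = 32
EOF

# Raise the max transfer length from 256 KB (262144) to 1 MB (1048576).
sed -i 's/^node\.conn\[0\]\.iscsi\.MaxRecvDataSegmentLength = .*/node.conn[0].iscsi.MaxRecvDataSegmentLength = 1048576/' "$conf"

# Deepen the initiator-wide and per-device queues (example values).
sed -i 's/^node\.session\.cmds_max = .*/node.session.cmds_max = 256/' "$conf"
sed -i 's/^node\.session\.queue_depth = .*/node.session.queue_depth = 64/' "$conf"

cat "$conf"
```

      Existing sessions pick up the new values after the iSCSI service is restarted and the targets are logged in again.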

      For more best practices, download the ABS best practices guide.


      Nutanix AFS – Domain Activation

      Well, if it's not DNS stealing hours of your life, the next thing to make your partner angry as you miss family supper is Active Directory (AD). In more complex AD setups you may find yourself going to the command line to attach your AFS instance to AD.

      Some important requirements to remember:

        While a deployment could fail due to AD, the FSVMs (file server VMs) still get deployed. You can do the domain join from the UI or NCLI afterwards.


        The user attaching to the domain must be a domain admin or have similar rights. Why? The join-domain process will create one computer account in the default Computers OU and a service principal name (SPN) for DNS. If you don't use the default Computers OU, you will have to use the organizational-unit option from NCLI to change it to the appropriate OU. The computer account can be created in a specified container by using a forward slash to denote hierarchies (for example, organizational_unit/inner_organizational_unit).



        The command was:

        ncli> fs join-domain uuid=d9c78493-d0f6-4645-848e-234a6ef31acc organizational-unit="stayout/afs" windows-ad-username=bob windows-ad-password=dfld#ld(3&jkflJJddu

        AFS needs at least one writable DC to complete the domain join. After the domain join it can authenticate using a local read-only DC. Timing (latency) may cause problems here. To pick an individual DC you can use the preferred-domain-controller option from NCLI.

      NCLI Join-Domain Options

      file-server | fs : Minerva file server

      join-domain : Join the File Server to the Windows AD domain specified.

      Required Argument(s):
      uuid : UUID of the FileServer
      windows-ad-domain-name : The windows AD domain the file server is
      associated with.
      windows-ad-username : The name of a user account with administrative
      privileges in the AD domain the file server is associated with.
      windows-ad-password : The password for the above Windows AD account

      Optional Argument(s):
      organizational-unit : An Organizational unit container is where the AFS
      machine account will be created as part of domain join
      operation. Default container OU is "computers". Examples:
      Engineering, Department/Engineering.
      overwrite : Overwrite the AD user account.
      preferred-domain-controller : Preferred domain controller to use for
      all join-domain operations.

      NOTE: preferred-domain-controller needs to be FQDN
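      Putting the options together, a join that pins a specific (FQDN) domain controller and a custom OU might look like the following; the UUID, domain, account, password and DC names are placeholders:

```shell
ncli> fs join-domain uuid=d9c78493-d0f6-4645-848e-234a6ef31acc windows-ad-domain-name=corp.example.com windows-ad-username=svc-afs-join windows-ad-password=******** organizational-unit="Department/Engineering" preferred-domain-controller=dc01.corp.example.com
```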

      If you need to do further troubleshooting you can ssh into one of the FSVMs and run

      afs get_leader

      Then navigate to /data/logs and look at the minerva logs.

      This shouldn't be an issue in most environments, but I've included the ports used just in case.

      Required AD Permissions

      Delegating permissions in Active Directory (AD) enables the administrator to assign directory permissions to unprivileged domain users; for example, to let a regular user join machines to the domain without knowing the domain administrator credentials.

      Adding the Delegation
      To enable a user to join and remove machines to and from the domain:
      - Open the Active Directory Users and Computers (ADUC) console as domain administrator.
      - Right-click the CN=Computers container (or desired alternate OU) and select "Delegate Control".
      - Click "Next".
      - Click "Add" and select the required user and click "Next".
      - Select "Create a custom task to delegate".
      - Select "Only the following objects in the folder" and check "Computer objects" from the list.
      - Additionally select the options "Create selected objects in the folder" and "Delete selected objects in this folder". Click "Next".
      - Select "General" and "Property-specific", then select the following permissions from the list:
      - Reset password
      - Read and write account restrictions
      - Read and write DNS host name attributes
      - Validated write to DNS host name
      - Validated write to service principal name
      - Write servicePrincipalName
      - Write Operating System
      - Write Operating System Version
      - Write OperatingSystemServicePack
      - Click "Next".
      - Click "Finish".
      After that, wait for AD replication to finish; the delegated user can then use their credentials to join AFS to a domain.

      Domain Port Requirements

      The following services and ports are used by the AFS file server for Active Directory communication.

      UDP and TCP port 88: forest-level trust authentication for Kerberos
      UDP and TCP port 53: DNS from client to domain controller and domain controller to domain controller
      UDP and TCP port 389: LDAP to handle normal queries from client computers to the domain controllers
      UDP and TCP port 123: NTP traffic for the Windows Time Service
      UDP and TCP port 464: Kerberos password change for replication, user and computer authentication, and trusts
      UDP and TCP ports 3268 and 3269: Global Catalog from client to domain controllers
      UDP and TCP port 445: SMB protocol for file replication
      UDP and TCP port 135: port mapper for RPC communication
      TCP high ports 49152 to 65535: randomly allocated for RPC