Apr
    20

    Moby Project Summit Notes

    The Moby Project was born out of the containerd / Docker Internals Summit

    For components to be successful they need to be successful everywhere. This led to SwarmKit being mentioned as not being successful, because no other ecosystem was using it. There seems to be a strong commitment to make everything into a component out in the open.

    Docker wants to be seen as an open-source leader by doing the hard work to support components.

    All open-source development will be under the Moby project.

    Upstream = components
    Moby = Staging area for projects to move on from, like containerd moving to the CNCF.
    – Heart of open-source activities, a place to integrate components
    – Docker remains docker
    – Docker is built with Moby
    – You use Moby to build things like Docker
    – Solomon mentions "1000s of smart people could disagree on what to do"; Docker represents its opinion. It's a lot easier to agree on low-level functions because there are few ways to do them.
    – Moby will end up as Go libraries in Docker, but that will go away.

    Moby is connected to Docker but it's not Docker. The name was inspired by the Fedora project.

    Moby is a trade-off: getting it out in the open early versus completeness.

    GitHub should be used as a support forum.

    InfraKit is a toolkit for creating and managing declarative, self-healing infrastructure. It breaks infrastructure automation down into simple, pluggable components. These components work together to actively ensure the infrastructure state matches the user's specifications. Although InfraKit emphasizes primitives for building self-healing infrastructure, it can also be used passively, like conventional tools.

    LinuxKit, a toolkit for building custom minimal, immutable Linux distributions.

    – Secure defaults without compromising usability
    – Everything is replaceable and customisable
    – Immutable infrastructure applied to building Linux distributions
    – Completely stateless, but persistent storage can be attached
    – Easy tooling, with easy iteration
    – Built with containers, for running containers
    – Designed for building and running clustered applications, including but not limited to container orchestration such as Docker or Kubernetes
    – Designed from the experience of building Docker Editions, but redesigned as a general-purpose toolkit
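
    As a rough sketch of the LinuxKit workflow (hedged: the YAML file and image names are illustrative, and early releases shipped the build step as part of the moby tool before it moved into the linuxkit CLI):

    # Build kernel + initrd (and other output formats) from a declarative YAML spec
    linuxkit build minimal.yml

    # Boot the resulting image locally under a hypervisor for a quick smoke test
    linuxkit run minimal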

    No plans to move away from Go.

    Breaking out the monolithic engine API will most likely be done with gRPC. gRPC is a modern, open-source, high-performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in the last mile of distributed computing to connect devices, mobile applications and browsers to backend services.

    SwarmKit Update
    SwarmKit is a toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.

    New Features

    – Topology-Aware Scheduling
    – Secrets
    – Service Rollbacks
    – Service Logs
    Improvements
    – HA scheduling
    – Encrypted Raft Store
    – Health-Aware Orchestration
    – Synchronous CLI
    What is Next?
    – Direct integration of containerd into SwarmKit bypasses the need for Docker Engine
    – Config Management to attach configuration to services
    – Swarm Events to watch for state changes and gRPC Watch API
    – Create a generic runtime to support new runtimes without changing SwarmKit
    – Instrumentation
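
    The new features above surface through the Docker CLI when the engine is in swarm mode. A hedged sketch (flag syntax as of roughly Docker 1.13 to 17.04; service, image, and label names are made up):

    # Secrets: store a value in the encrypted raft store and attach it to a service
    echo "s3cret" | docker secret create db_password -
    docker service create --name api --secret db_password myorg/api:1.0

    # Topology-aware scheduling: spread tasks across a node label
    docker service create --name web --placement-pref spread=node.labels.datacenter nginx

    # Service rollbacks and service logs
    docker service update --rollback web
    docker service logs web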

    LibNetwork Update
    – Quality: more visibility, monitoring, and troubleshooting
    – Local-scoped network plugins in Swarm-mode
    – Integration with containerd

    Feb
    17

    IP Fail-Over with AFS

    A short video showing the client IP address moving around the cluster to quickly restore connectivity for your users running on Acropolis File Services.

    Feb
    16

    Nutanix AFS DR Failing over from vSphere to AHV (video 3:30)

    A quick video showing the fail-over for Acropolis File Services. The deployment sets up a lot of the needed pieces, but you will still have to set a schedule and map the new container (vStore) being used by AFS to the remote site.

    Remember, you want the number of FSVMs making up the file server to be the same as or less than the number of nodes at the remote site.

    Feb
    13

    Docker Datacenter: Usability For Better Security.

    With the new release of Docker Datacenter 2.1, it's clear that Docker is very serious about the enterprise and about providing tooling that is easy to use. Docker has made the leap to supporting enterprise applications with its embedded security and ease of use. DDC 2.1 and CS Docker Engine 1.13 give the additional control needed for operations and development teams to control their own experience.

    Docker Datacenter continues to build on containers as a service. The 1.12 release of DDC enabled agility and portability for continuous integration and started on the journey of protecting the development supply chain throughout the whole lifecycle. The new release of DDC focuses on security, specifically secrets management.
    The previous version of DDC already had a wealth of security features:
    • LDAP/AD integration
    • Role based access control for teams
    • SSO and push/pull images with Docker Trusted Registry
    • Image signing – prevent running a container unless the image is signed by a member of a designated team
    • Out of the box TLS with easy setup, including cert rotation.
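
    As one illustration of the image-signing piece, Docker Content Trust is switched on client-side with an environment variable; with it set, pushes are signed and unsigned images are refused on pull or run (the registry and repository below are made up, and DDC/UCP adds its own policy enforcement on top):

    # Sign images on push and refuse unsigned images on pull/run
    export DOCKER_CONTENT_TRUST=1
    docker push dtr.example.com/payments/api:1.0
    docker run dtr.example.com/payments/api:1.0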

    With DDC 2.1 the march on security is being made successful by allowing both operations and developers to have a usable system without having to lean on security for support and help. The native integration with the management plane allows for end-to-end container lifecycle management. You also inherit a model that is infrastructure-independent: no matter what you're running on, it will work. It can be made to be as dynamic and ephemeral as the containers it's managing. This is why I feel PaaS is dead. With so much choice and security you don't have to limit where you deploy to, a design decision very similar to Nutanix's: enabling choice. Choice gives you access to more developers and the freedom to color outside the lines of the guardrails that a PaaS solution may impose.

    Docker Datacenter Secrets Architecture

    [Diagram: Docker Datacenter secrets architecture]
    1) Everything in the store is encrypted; notably, that includes all of the data stored in the orchestration layer. With least privilege, secrets are distributed only to the nodes running containers that need them. Since the management layer is scalable, you get that scalability for your key management as well. Because the management layer is so easy to set up, you don't have developers embedding secrets in GitHub as a quick workaround.
    2) Containers and the filesystem make secrets available only to the designated app. Docker exposes secrets to the application via a filesystem that is stored in memory. The same certificate rotation that happens for the management layer also happens for the application's certificates. In the diagram above, the red secret is only available to the red service, and the blue service is isolated by itself even though it's running on the same node as the red service/application.
    3) If you decide that you want to integrate with a third-party application like Twitter, it can be done easily. Your Twitter credentials can be stored in the raft cluster, which is made up of your manager nodes. When you create the Twitter app you give it access to the credentials, and you can even do a "service update" if you need to swap them out, without the need to touch every node in your environment.
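
    A minimal sketch of the Twitter example above, using the Docker 1.13+ CLI on a manager node (secret, image, and service names are made up):

    # Store the credential in the encrypted raft store on the managers
    echo "old-twitter-api-token" | docker secret create twitter_creds -

    # The secret is delivered only to nodes running this service, surfaced as an
    # in-memory file at /run/secrets/twitter_creds inside the container
    docker service create --name twitter-app --secret twitter_creds myorg/twitter-app:1.0

    # Rotate the credential later without touching individual nodes
    echo "new-twitter-api-token" | docker secret create twitter_creds_v2 -
    docker service update --secret-rm twitter_creds --secret-add twitter_creds_v2 twitter-app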

    With a simple interface for both developers and IT operations, both have a pain-free way to do their jobs and provide a secure environment. By not creating roadblocks and slowing down development or operations, teams will get automatic buy-in.

    Feb
    07

    Nutanix AFS – Maximums

    Nutanix AFS Maximums – Tested limits. (ver 2.0.2)
    Configurable item: tested maximum value
    – Number of connections per FSVM: 250 (12 GB of FSVM memory), 500 (16 GB), 1000 (24 GB), 1500 (32 GB), 2000 (40 GB), 2500 (60 GB), 4000 (96 GB)
    – Number of FSVMs: 16, or equal to the number of CVMs (whichever is lower)
    – Max RAM per FSVM: 96 GB (tested)
    – Max vCPUs per FSVM: 12
    – Data size for home share: 200 TB per FSVM
    – Data size for general purpose share: 40 TB
    – Share name: 80 characters
    – File server name: 15 characters
    – Share description: 80 characters
    – Windows Previous Versions: 24 (1 per hour), adjustable with support
    – Throttle bandwidth limit: 2048 MBps
    – Data protection bandwidth limit: 2048 MBps
    – Max recovery time objective for Async DR: 60 minutes


    Jan
    24

    App Volumes: Reprovisioning fails with AppStacks set to computer based assignments

    Symptoms
    Linked-clone virtual machine provisioning tasks fail.
    Recompose fails due to customization failing to join the desktops to the domain.
    Cause
    This issue occurs due to AppStacks being attached during the domain join process.

    On reboot after the domain join, the c:\svroot cache is cleared, losing changes to the VM.

    Resolution
    To resolve this issue, disable the App Volumes Service on the parent virtual machine.
    Open a command prompt as administrator and run the following commands
    sc config "svservice" start= disabled
    net stop "App Volumes Service"
    ipconfig /release
    Shutdown the virtual machine and take a snapshot.

    Create a script or batch file as below to set the service to automatic and start the service.
    sc config "svservice" start= auto
    net start "App Volumes Service"

    Copy the script to a directory on the parent virtual machine that you can reference later.
    In View Administration portal you will have to reference your post-synchronization script:

    Open up View Administration Portal
    Go to Catalog – Desktop Pools – Select your pool
    Click Edit
    Select Guest Customization Tab
    Enter the file path for the script in the post-synchronization script name field:

    C:\scripts\script.bat

    Recompose the pool
    VMware KB 2147910

    Jan
    11

    Demo Time – Nutanix CE and VSA’s

    In order to successfully complete your home lab, you're going to need to configure compute (the servers), networking (routers, switches, etc.) and storage. For those solely interested in studying or testing an individual application, operating system, or the network infrastructure, you should be able to complete this with no more storage than the local hard drive in your PC.

    For those who are looking to learn how cloud and data center technologies work as a whole however, you’re going to require some form of dedicated storage. A storage simulator or a Virtual Storage Appliance (VSA) or Nutanix CE is likely to be the best option for this task.

    If you’re studying hypervisor technologies you’re going to have to spend on compute hardware as well as any of the network infrastructure devices that are incapable of being virtualized. Unless you have a free flowing money source, you’re most likely going to want to contain the storage costs by using virtualized storage rather than SAN or NAS hardware.

    The Flackbox blog has compiled a lengthy and comprehensive list of all the available simulators and VSAs. All of the software is free but may require a customer or partner account through the vendor to be able to download. The login and system requirements for every option are included in the list as well. Thanks to Neil for putting those together.

    Nutanix CE can be seen as having high requirements for a home lab, but once you factor in that management is included, it's not that bad. You can also use a free instance with Ravello.

    If you don’t meet the requirement you can always use OpenFiler or StarWind if you have gear at home.

    For those looking to mimic their organization’s production environment as closely as possible, choose the VSA or simulator from your vendor.

    GUI demos are also included at the bottom of the list. These are not designed or suitable for a lab but are great for those looking to get a feel of a particular vendor’s Storage GUI.

    Jan
    08

    Client Tuning Recommendations for ABS (Acropolis Block Services)


    o For large block sequential workloads, with I/O sizes of 1 MB or larger, it’s beneficial to increase the iSCSI MaxTransferLength from 256 KB to 1 MB.

    * Windows: Details on the MaxTransferLength setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

    * Linux: Settings in the /etc/iscsi/iscsid.conf file; node.conn[0].iscsi.MaxRecvDataSegmentLength

    o For workloads with large storage queue depth requirements, it can be beneficial to increase the initiator and device iSCSI client queue depths.

    * Windows: Details on the MaxPendingRequests setting are available at the following link: https://blogs.msdn.microsoft.com/san/2008/07/27/microsoft-iscsi-software-initiator-and-isns-server-timers-quick-reference/.

    * Linux: Settings in the /etc/iscsi/iscsid.conf file; Initiator limit: node.session.cmds_max (Default: 128); Device limit: node.session.queue_depth (Default: 32)
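
    On the Linux side, the two tweaks above come down to edits in /etc/iscsi/iscsid.conf followed by re-logging the iSCSI sessions. A hedged sketch (the queue-depth values are illustrative, the 1 MB value matches the recommendation above, and the target IQN is made up):

    # /etc/iscsi/iscsid.conf
    # Raise MaxRecvDataSegmentLength from 256 KB to 1 MB for large sequential I/O
    node.conn[0].iscsi.MaxRecvDataSegmentLength = 1048576
    # Raise initiator and per-device queue depths for high queue depth workloads
    node.session.cmds_max = 1024
    node.session.queue_depth = 128

    # Apply by logging the sessions out and back in (or restart iscsid)
    iscsiadm -m node -T iqn.2010-06.com.nutanix:example-target -u
    iscsiadm -m node -T iqn.2010-06.com.nutanix:example-target -l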

    For more best practices, download the ABS best practices guide.

    Jan
    05

    Nutanix AFS – Domain Activation

    Well if it’s not DNS stealing hours of your life, the next thing to make your partner angry as you miss family supper is Active Directory(AD). In more complex AD setups you may find your self going to the command line to attach your AFS instance to AD.

    Some important requirements to remember:

      While a deployment could fail due to AD, the FSVMs (file server VMs) still get deployed. You can do the join-domain process from the UI or NCLI afterwards.


      The user attaching to the domain must be a domain admin or have similar rights. Why? The join-domain process will create one computer account in the default Computers OU and create a service principal name (SPN) for DNS. If you don't use the default Computers OU, you will have to use the organizational-unit option from NCLI to change it to the appropriate OU. The computer account can be created in a specified container by using a forward slash to denote hierarchies (for example, organizational_unit/inner_organizational_unit).

      Example:


      The command used was:

      ncli> fs join-domain uuid=d9c78493-d0f6-4645-848e-234a6ef31acc organizational-unit="stayout/afs" windows-ad-domain-name=tenanta.com preferred-domain-controller=tenanta-dc01.tenanta.com windows-ad-username=bob windows-ad-password=dfld#ld(3&jkflJJddu

      AFS needs at least one writable DC to complete the domain join. After the domain join, it can authenticate using a local read-only DC. Timing (latency) may cause problems here. To pick an individual DC you can use preferred-domain-controller from the NCLI.

    NCLI Join-Domain Options

    Entity:
    file-server | fs : Minerva file server

    Action:
    join-domain : Join the File Server to the Windows AD domain specified.

    Required Argument(s):
    uuid : UUID of the FileServer
    windows-ad-domain-name : The windows AD domain the file server is
    associated with.
    windows-ad-username : The name of a user account with administrative
    privileges in the AD domain the file server is associated with.
    windows-ad-password : The password for the above Windows AD account

    Optional Argument(s):
    organizational-unit : An Organizational unit container is where the AFS
    machine account will be created as part of domain join
    operation. Default container OU is "computers". Examples:
    Engineering, Department/Engineering.
    overwrite : Overwrite the AD user account.
    preferred-domain-controller : Preferred domain controller to use for
    all join-domain operations.

    NOTE: preferred-domain-controller needs to be an FQDN

    If you need to do further troubleshooting you can ssh into one of the FSVMs and run

    afs get_leader

    Then navigate to /data/logs and look at the Minerva logs.

    This shouldn't be an issue in most environments, but I've included the ports used just in case.


    Required AD Permissions

    Delegating permissions in Active Directory (AD) enables the administrator to assign permissions in the directory to unprivileged domain users, for example, enabling a regular user to join machines to the domain without knowing the domain administrator credentials.

    Adding the Delegation
    ---------------------
    To enable a user to join and remove machines to and from the domain:
    - Open the Active Directory Users and Computers (ADUC) console as domain administrator.
    - Right-click the CN=Computers container (or the desired alternate OU) and select "Delegate control".
    - Click "Next".
    - Click "Add" and select the required user and click "Next".
    - Select "Create a custom task to delegate".
    - Select "Only the following objects in the folder" and check "Computer objects" from the list.
    - Additionally select the options "Create selected objects in the folder" and "Delete selected objects in this folder". Click "Next".
    - Select "General" and "Property-specific", select the following permissions from the list:
    - Reset password
    - Read and write account restrictions
    - Read and write DNS host name attributes
    - Validated write to DNS host name
    - Validated write to service principal name
    - Write servicePrincipalName
    - Write Operating System
    - Write Operating System Version
    - Write OperatingSystemServicePack
    - Click "Next".
    - Click "Finish".
    After that, wait for AD replication to finish; the delegated user can then use their credentials to join AFS to a domain.


    Domain Port Requirements

    The following services and ports are used by the AFS file server for Active Directory communication.

    – UDP and TCP port 88: Kerberos, forest-level trust authentication
    – UDP and TCP port 53: DNS, from client to domain controller and domain controller to domain controller
    – UDP and TCP port 389: LDAP, to handle normal queries from client computers to the domain controllers
    – UDP and TCP port 123: NTP traffic for the Windows Time Service
    – UDP and TCP port 464: Kerberos password change, for replication, user and computer authentication, and trusts
    – UDP and TCP ports 3268 and 3269: Global Catalog, from client to domain controllers
    – UDP and TCP port 445: SMB protocol for file replication
    – UDP and TCP port 135: port-mapper for RPC communication
    – UDP and TCP high ports: randomly allocated high ports for RPC, from 49152 to 65535
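
    If you want to sanity-check reachability of a domain controller from an FSVM before the join, a quick TCP probe against the ports above works (the hostname is made up; this uses bash's /dev/tcp, so it only covers the TCP side, not UDP):

    for p in 53 88 123 135 389 445 464 3268; do
      timeout 2 bash -c "</dev/tcp/tenanta-dc01.tenanta.com/$p" 2>/dev/null \
        && echo "TCP $p open" || echo "TCP $p closed/filtered"
    done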

      Dec
      20

      Why losing a disk on Nutanix is no big deal (*No Humans Required)

      When Acropolis DSF detects an accumulation of errors for a particular disk (e.g., I/O errors or bad sectors), it is the Hades service running in the Controller VM that handles it. The purpose of Hades is to simplify the break-fix procedures for disks and to automate several tasks that previously required manual user actions. Hades aids in fixing failing devices before the device becomes unrecoverable.

      Nutanix has a unified component called Stargate that manages the responsibility of receiving and processing data. All read and write requests are sent to the Stargate process running on that node. Once Stargate sees delays in responses to I/O requests to a disk, it marks the disk offline. Hades then automatically removes the disk from the data path and runs smartctl checks against it. If the checks pass, Hades automatically marks the disk online and returns it to service. If Hades' smartctl checks fail, or if Stargate marks a disk offline three times within one hour (regardless of the smartctl check results), Hades automatically removes the disk from the cluster, and the following occurs:

      • The disk is marked for removal within the cluster Zeus configuration.
      • This disk is unmounted.
      • The Red LED of the disk is turned on to provide a visual indication of the failure.
      • The cluster automatically begins to create new replicas of any data that is stored on the disk.

      The failed disk is marked as a tombstoned disk to prevent it from being used again without manual intervention.
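
      For reference, the kind of health check Hades automates looks roughly like this when run by hand from a CVM (the device name is illustrative, and Hades' actual invocation may differ):

      # Quick SMART health verdict for a suspect drive
      sudo smartctl -H /dev/sdb

      # Full SMART attributes and error log for deeper inspection
      sudo smartctl -a /dev/sdb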

      When a disk is marked offline, an alert is triggered, and the disk is immediately removed from the storage pool by the system. Curator then identifies all extents stored on the failed disk, and Acropolis DSF is then prompted to re-replicate copies of the associated replicas to restore the desired replication factor. By the time the Nutanix administrators become aware of the disk failure via Prism, SNMP trap, or email notification, Acropolis DSF will be well on its way to healing the cluster.

      Acropolis DSF data rebuild architecture provides faster rebuild times and no performance impact to workloads supported by the Nutanix cluster when compared to traditional RAID data protection schemes. RAID groups or sets typically comprise a small number of drives. When a RAID set performs a rebuild operation, typically one disk is selected to be the rebuild target. The other disks that comprise the RAID set must divert enough resources to quickly rebuild the data on the failed disk. This can lead to performance penalties for workloads served by the degraded RAID set. Acropolis DSF can distribute remote copies found on any individual disk among the remaining disks in the Nutanix cluster. Therefore Acropolis DSF replication operations can happen as background processes with no impact to cluster operations or performance. Acropolis DSF can access all disks in the cluster at any given time as a single, unified pool of storage resources. This architecture provides a very advantageous consequence. As the cluster size grows, the length of time needed to recover from a disk failure decreases as every node in the cluster participates in the replication. Since the data needed to rebuild a disk is distributed throughout the cluster, more disks are involved in the rebuild process. This increases the speed at which the affected extents are re-replicated.

      It’s important to note that Nutanix also keeps performance consistent during rebuild operations. For hybrid systems, Nutanix rebuilds cold data to cold data so large hard drives do not flood the cache of the SSDs. For all-flash systems, Nutanix has quality of service implemented for backend I/O to prevent user I/O from being impacted.

      In addition to a many-to-many rebuild approach to data availability, the Acropolis DSF data rebuild architecture ensures that all healthy disks are available for use all of the time. Unlike most traditional storage arrays, there’s no need for “hot-spare” or standby drives in a Nutanix cluster. Since data can be rebuilt to any of the remaining healthy disks, reserving physical resources for failures is unnecessary. Once healed, you can lose the next drive/node.