Dec 13

Enabling AHV Turbo on AOS 5.5

Nutanix KB 4987

From AOS 5.5, AHV Turbo replaces the QEMU SCSI data path in the AHV architecture for improved storage performance.

For maximum performance, ensure the following on your Linux guest VMs:

Enable the SCSI MQ feature by using the kernel command line:
scsi_mod.use_blk_mq=y (I put this in a rule under /etc/udev/rules.d/; see the sketch after these notes)

Kernels older than 3.17 do not support SCSI MQ.
Kernels 4.14 or later have SCSI MQ enabled by default.
For Windows VMs, AHV VirtIO drivers will support SCSI MQ in an upcoming release.
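
A minimal sketch of setting this on a GRUB2-based guest such as CentOS 7 (an assumption; as noted above I used a udev rule, and paths/tools vary by distro):

# Append the parameter to the kernel command line for all installed kernels
sudo grubby --update-kernel=ALL --args="scsi_mod.use_blk_mq=y"
# Reboot the guest, then verify that SCSI MQ is active
cat /sys/module/scsi_mod/parameters/use_blk_mq   # should print Y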

AHV Turbo improves the storage data path performance even without guest SCSI MQ support.

Solution

Perform the following to enable AHV Turbo on AOS 5.5.

Upgrade to AOS 5.5.
Upgrade to the AHV version bundled with AOS 5.5.
Ensure your VMs have SCSI MQ enabled for maximum performance.
Power cycle your VMs to enable AHV Turbo.

Note that you do not have to perform this procedure if you are upgrading from AOS 5.5 to a later release. AHV Turbo will be enabled by default on your VMs in that case.

Dec 05

Handling Network Partition with Near-Sync

Near-Sync is GA!!!

Part 1: Near-Sync Primer on Nutanix
Part 2: Recovery Points and Schedules with Near-Sync

Perform the following procedure if a network partition (network isolation) occurs between the primary and remote sites.

The following scenarios may occur after a network partition.

1. The network between the primary site (site A) and the remote site (site B) is restored and both sites are working.
The primary site automatically tries to transition back into NearSync between site A and site B. No manual intervention is required.

2. Site B is not working or has been destroyed (for whatever reason), and you create a new site (site C) and want to establish a sub-hourly schedule from A to C.
Configure the sub-hourly schedule from A to C. The configuration between A and C should succeed. No other manual intervention is required.

3. Site A is not working or has been destroyed (for whatever reason), and you create a new site (site C) and want to configure a sub-hourly schedule from B to C.
Activate the protection domain on site B and set up the schedule between site B and site C (a sketch follows below).
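
For scenario 3, a minimal sketch of the manual step, assuming a protection domain named pd01 (the name is a placeholder, and you should verify the ncli syntax against your AOS version):

# On site B, activate the protection domain so site B becomes the active site
ncli pd activate name=pd01
# Then configure the sub-hourly schedule from site B to the new remote (site C),
# through Prism or ncli, just as for a normal near-sync setup.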

Nov 19

Nutanix Additional Cluster Health Tooling: Panacea

There are over 450 health checks in the Cluster Health UI inside Prism Element. To provide additional help, a new script called “panacea” has been added. Panacea is bundled with NCC 3.5 and later to provide a user-friendly interface for very advanced troubleshooting. The Nutanix Support team can take these logs and correlate results so you don’t have to wait for the problem to recur before fixing the issue.

The ability to quickly track retransmissions at very fine granularity in a distributed system is very important. I am hoping that in the future this new tooling will play into Nutanix’s ability to detect degraded nodes. Panacea can be run for a specific time interval during which logs will be analyzed; the possible options are:
--last_no_of_hours
--last_no_of_days
--start_time
--end_time

Log in to any CVM within the cluster; the command can be run from /home/nutanix/ncc/panacea/.
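
A sketch of what an invocation might look like, assuming the launcher in that directory is simply named panacea (check the actual script name bundled with your NCC build; the time format is also an assumption):

nutanix@cvm$ cd /home/nutanix/ncc/panacea
# Analyze the last 4 hours of logs
nutanix@cvm$ ./panacea --last_no_of_hours=4
# Or bound the analysis window explicitly
nutanix@cvm$ ./panacea --start_time="2017-11-18 08:00" --end_time="2017-11-18 12:00"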

The output below comes from using the tool to dig into network information.

A network outage can cause degraded performance. Cluster network outage detection is based on the following schemes:
1) Cassandra Paxos Request Timeout Exceptions/Message Drops
2) CVM Degraded node scoring
3) Ping latency

In some cases, an intermittent network issue might NOT be reflected in ping latency, but it does have an impact on TCP throughput and packet retransmission, leading to more request timeout exceptions.

TCP Retransmission:
-------------------
By default, Panacea tracks the TCP connections (destination port 7000) used by Cassandra between peer CVMs. This table displays stats on packet retransmissions per minute per TCP socket. Frequent retransmission can delay the application and may reflect congestion on the host or in the network.
1) Local: local CVM IP address
2) Remote: remote CVM IP address
3) Max/Mean/Min/STD: number of retransmissions/min, calculated from samples where retransmission happened.
4) 25%/50%/75%: value distribution; the stated percentage of samples falls below the value.
5) Ratio: N/M, where N = number of samples in which retransmission happened and M = total samples in the entire data set.

+--------------+--------------+-------+------+------+------+------+------+------+---------+
| Local        | Remote       | Max   | Mean | Min  | STD  | 25%  | 50%  | 75%  | Ratio   |
+--------------+--------------+-------+------+------+------+------+------+------+---------+
| XX.X.XXX.110 | XX.X.XXX.109 | 19.00 | 1.61 | 1.00 | 1.90 | 1.00 | 1.00 | 2.00 | 133/279 |
| XX.X.XXX.111 | XX.X.XXX.109 | 11.00 | 2.41 | 1.00 | 1.54 | 1.00 | 2.00 | 3.00 | 236/280 |
| XX.X.XXX.112 | XX.X.XXX.109 | 12.00 | 2.40 | 1.00 | 1.59 | 1.00 | 2.00 | 3.00 | 235/279 |
| XX.X.XXX.109 | XX.X.XXX.110 | 32.00 | 3.04 | 1.00 | 2.70 | 1.00 | 2.00 | 4.00 | 252/279 |
| XX.X.XXX.111 | XX.X.XXX.110 | 9.00  | 1.51 | 1.00 | 1.02 | 1.00 | 1.00 | 2.00 | 152/280 |
| XX.X.XXX.112 | XX.X.XXX.110 | 11.00 | 2.21 | 1.00 | 1.31 | 1.00 | 2.00 | 3.00 | 231/279 |
| XX.X.XXX.109 | XX.X.XXX.111 | 9.00  | 2.01 | 1.00 | 1.20 | 1.00 | 2.00 | 2.00 | 202/279 |
| XX.X.XXX.110 | XX.X.XXX.111 | 10.00 | 2.70 | 1.00 | 1.68 | 1.00 | 2.00 | 3.00 | 244/279 |
| XX.X.XXX.112 | XX.X.XXX.111 | 4.00  | 1.46 | 1.00 | 0.76 | 1.00 | 1.00 | 2.00 | 135/279 |
| XX.X.XXX.109 | XX.X.XXX.112 | 5.00  | 1.56 | 1.00 | 0.85 | 1.00 | 1.00 | 2.00 | 150/279 |
| XX.X.XXX.110 | XX.X.XXX.112 | 6.00  | 2.05 | 1.00 | 1.18 | 1.00 | 2.00 | 3.00 | 234/279 |
| XX.X.XXX.111 | XX.X.XXX.112 | 16.00 | 3.26 | 1.00 | 2.24 | 2.00 | 3.00 | 4.00 | 261/280 |
+--------------+--------------+-------+------+------+------+------+------+------+---------+
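
For a quick manual spot-check of the same signal outside Panacea, standard Linux tools on a CVM can approximate it (a rough sketch; output formats vary by kernel and iproute2 version):

# Aggregate retransmission counters for the whole CVM
netstat -s | grep -i retrans
# Per-socket view of the Cassandra connections (destination port 7000)
ss -ti '( dport = :7000 )' | grep -o 'retrans:[^ ]*'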

Below are most of the 450+ Cluster Health checks inside Prism, each with automatic alerting:

CVM | CPU
CPU Utilization

Load Level

Node Avg Load – Critical

CVM | Disk
Boot RAID Health

Disk Configuration

Disk Diagnostic Status

Disk Metadata Usage

Disk Offline Status

HDD Disk Usage

HDD I/O Latency

HDD S.M.A.R.T Health Status

Metadata Disk Mounted Check

Metro Vstore Mount Status

Non SED Disk Inserted Check

Nutanix System Partitions Usage High

Password Protected Disk Status

Physical Disk Remove Check

Physical Disk Status

SED Operation Status

SSD I/O Latency

CVM | Hardware
Agent VM Restoration

FT2 Configuration

Host Evacuation Status

Node Status

VM HA Healing Status

VM HA Status

VMs Restart Status

CVM | Memory
CVM Memory Pinned Check

CVM Memory Usage

Kernel Memory Usage

CVM | Network
CVM IP Address Configuration

CVM NTP Time Synchronization

Duplicate Remote Cluster ID Check

Host IP Pingable

IP Configuration

SMTP Configuration

Subnet Configuration

Virtual IP Configuration

vCenter Connection Check

CVM | Protection Domain
Entities Restored Check

Restored Entities Protected

CVM | Services
Admin User API Authentication Check

CVM Rebooted Check

CVM Services Status

Cassandra Waiting For Disk Replacement

Certificate Creation Status

Cluster In Override Mode

Cluster In Read-Only Mode

Curator Job Status

Curator Scan Status

Kerberos Clock Skew Status

Metadata Drive AutoAdd Disabled Check

Metadata Drive Detached Check

Metadata Drive Failed Check

Metadata Drive Ring Check

Metadata DynRingChangeOp Slow Check

Metadata DynRingChangeOp Status

Metadata Imbalance Check

Metadata Size

Node Degradation Status

RemoteSiteHighLatency

Stargate Responsive

Stargate Status

Upgrade Bundle Available

CVM | Storage Capacity
Compression Status

Finger Printing Status

Metadata Usage

NFS Metadata Size Overshoot

On-Disk Dedup Status

Space Reservation Status

vDisk Block Map Usage

vDisk Block Map Usage Warning

Cluster | CPU
CPU type on chassis check

Cluster | Disk
CVM startup dependency check

Disk online check

Duplicate disk id check

Flash Mode Configuration

Flash Mode Enabled VM Power Status

Flash Mode Usage

Incomplete disk removal

Storage Pool Flash Mode Configuration

System Defined Flash Mode Usage Limit

Cluster | Hardware
Power Supply Status

Cluster | Network
CVM Passwordless Connectivity Check

CVM to CVM Connectivity

Duplicate CVM IP check

NIC driver and firmware version check

Time Drift

Cluster | Protection Domain
Duplicate VM names

Internal Consistency Groups Check

Linked Clones in high frequency snapshot schedule

SSD Snapshot reserve space check

Snapshot file location check

Cluster | Remote Site
Cloud Remote Alert

Remote Site virtual external IP(VIP)

Cluster | Services
AWS Instance Check

AWS Instance Type Check

Acropolis Dynamic Scheduler Status

Alert Manager Service Check

Automatic Dedup disabled check

Automatic disabling of Deduplication

Backup snapshots on metro secondary check

CPS Deployment Evaluation Mode

CVM same timezone check

CVM virtual hardware version check

Cassandra Similar Token check

Cassandra metadata balanced across CVMs

Cassandra nodes up

Cassandra service status check

Cassandra tokens consistent

Check that cluster virtual IP address is part of cluster external subnet

Checkpoint snapshot on Metro configured Protection Domain

Cloud Gflags Check

Cloud Remote Version Check

Cloud remote check

Cluster NCC version check

Cluster version check

Compression disabled check

Curator scan time elapsed check

Datastore VM Count Check

E-mail alerts check

E-mail alerts contacts configuration

HTTP proxy check

Hardware configuration validation

High disk space usage

Hypervisor version check

LDAP configuration

Linked clones on Dedup check

Multiple vCenter Servers Discovered

NGT CA Setup Check

Oplog episodes check

Pulse configuration

RPO script validation on storage heavy cluster

Remote Support Status

Report Generation Failure

Report Quota Scan Failure

Send Report Through E-mail Failure

Snapshot chain height check

Snapshots space utilization status

Storage Pool SSD tier usage

Stretch Connectivity Lost

VM group Snapshot and Current Mismatch

Zookeeper active on all CVMs

Zookeeper fault tolerance check

Zookeeper nodes distributed in multi-block cluster

vDisk Count Check

Cluster | Storage Capacity
Erasure Code Configuration

Erasure Code Garbage

Erasure coding pending check

Erasure-Code-Delay Configuration

High Space Usage on Storage Container

Storage Container RF Status

Storage Container Space Usage

StoragePool Space Usage

Volume Group Space Usage

Data Protection | Protection Domain
Aged Third-party Backup Snapshot Check

Check VHDX Disks

Clone Age Check

Clone Count Check

Consistency Group Configuration

Cross Hypervisor NGT Installation Check

EntityRestoreAbort

External iSCSI Attachments Not Snapshotted

Failed To Mount NGT ISO On Recovery of VM

Failed To Recover NGT Information

Failed To Recover NGT Information for VM

Failed To Snapshot Entities

Incorrect Cluster Information in Remote Site

Metadata Volume Snapshot Persistent

Metadata Volume Snapshot Status

Metro Availability

Metro Availability Prechecks Failed

Metro Availability Secondary PD sync check

Metro Old Primary Site Hosting VMs

Metro Protection domain VMs running at Sub-optimal performance

Metro Vstore Symlinks Check

Metro/Vstore Consistency Group File Count Check

Metro/Vstore Protection Domain File Count Check

NGT Configuration

PD Active

PD Change Mode Status

PD Full Replication Status

PD Replication Expiry Status

PD Replication Skipped Status

PD Snapshot Retrieval

PD Snapshot Status

PD VM Action Status

PD VM Registration Status

Protected VM CBR Capability

Protected VM Not Found

Protected VMs Not Found

Protected VMs Storage Configuration

Protected Volume Group Not Found

Protected Volume Groups Not Found

Protection Domain Decoupled Status

Protection Domain Initial Replication Pending to Remote Site

Protection Domain Replication Stuck

Protection Domain Snapshots Delayed

Protection Domain Snapshots Queued for Replication to Remote Site

Protection Domain VM Count Check

Protection Domain fallback to lower frequency replications to remote

Protection Domain transitioning to higher frequency snapshot schedule

Protection Domain transitioning to lower frequency snapshot schedule

Protection Domains sharing VMs

Related Entity Protection Status

Remote Site NGT Support

Remote Site Snapshot Replication Status

Remote Stargate Version Check

Replication Of Deduped Entity

Self service restore operation Failed

Snapshot Crash Consistent

Snapshot Symlink Check

Storage Container Mount

Updating Metro Failure Handling Failed

Updating Metro Failure Handling Remote Failed

VM Registration Failure

VM Registration Warning

VSS Scripts Not Installed

VSS Snapshot Status

VSS VM Reachable

VStore Snapshot Status

Volume Group Action Status

Volume Group Attachments Not Restored

Vstore Replication To Backup Only Remote

Data Protection | Remote Site
Automatic Promote Metro Availability

Cloud Remote Operation Failure

Cloud Remote Site failed to start

LWS store allocation in remote too long

Manual Break Metro Availability

Manual Promote Metro Availability

Metro Connectivity

Remote Site Health

Remote Site Network Configuration

Remote Site Network Mapping Configuration

Remote Site Operation Mode ReadOnly

Remote Site Tunnel Status

Data Protection | Witness
Authentication Failed in Witness

Witness Not Configured

Witness Not Reachable

File server | Host
File Server Upgrade Task Stuck Check

File Server VM Status

Multiple File Server Versions Check

File server | Network
File Server Entities Not Protected

File Server Invalid Snapshot Warning

File Server Network Reachable

File Server PD Active On Multiple Sites

File Server Reachable

File Server Status

Remote Site Not File Server Capable

File server | Services
Failed to add one or more file server admin users or groups

File Server AntiVirus – All ICAP Servers Down

File Server AntiVirus – Excessive Quarantined / Unquarantined Files

File Server AntiVirus – ICAP Server Down

File Server AntiVirus – Quarantined / Unquarantined Files Limit Reached

File Server AntiVirus – Scan Queue Full on FSVM

File Server AntiVirus – Scan Queue Piling Up on FSVM

File Server Clone – Snapshot invalid

File Server Clone Failed

File Server Rename Failed

Maximum connections limit reached on a file server VM

Skipped File Server Compatibility Check

File server | Storage Capacity
FSVM Time Drift Status

Failed To Run File Server Metadata Fixer Successfully

Failed To Set VM-to-VM Anti Affinity Rule

File Server AD Connectivity Failure

File Server Activation Failed

File Server CVM IP update failed

File Server DNS Updates Pending

File Server Home Share Creation Failed

File Server In Heterogeneous State

File Server Iscsi Discovery Failure

File Server Join Domain Status

File Server Network Change Failed

File Server Node Join Domain Status

File Server Performance Optimization Recommended

File Server Quota allocation failed for user

File Server Scale-out Status

File Server Share Deletion Failed

File Server Site Not Found

File Server Space Usage

File Server Space Usage Critical

File Server Storage Cleanup Failure

File Server Storage Status

File Server Unavailable Check

File Server Upgrade Failed

Incompatible File Server Activation

Share Utilization Reached Configured Limit

Host | CPU
CPU Utilization

Host | Disk
All-flash Node Intermixed Check

Host disk usage high

NVMe Status Check

SATA DOM 3ME Date and Firmware Status

SATA DOM Guest VM Check

SATADOM Connection Status

SATADOM Status

SATADOM Wearout Status

SATADOM-SL 3IE3 Wearout Status

Samsung PM1633 FW Version

Samsung PM1633 Version Compatibility

Samsung PM1633 Wearout Status

Samsung PM863a config check

Toshiba PM3 Status

Toshiba PM4 Config

Toshiba PM4 FW Version

Toshiba PM4 Status

Toshiba PM4 Version Compatibility

Host | Hardware
CPU Temperature Fetch

CPU Temperature High

CPU Voltage

CPU-VRM Temperature

Correctable ECC Errors 10 Days

Correctable ECC Errors One Day

DIMM Voltage

DIMM temperature high

DIMM-VRM Temperature

Fan Speed High

Fan Speed Low

GPU Status

GPU Temperature High

Hardware Clock Status

IPMI SDR Status

SAS Connectivity

System temperature high

Host | Memory
Memory Swap Rate

Ram Fault Status

Host | Network
10 GbE Compliance

Hypervisor IP Address Configuration

IPMI IP Address Configuration

Mellanox NIC Mixed Family check

Mellanox NIC Status check

NIC Flapping Check

NIC Link Down

Node NIC Error Rate High

Receive Packet Loss

Transmit Packet Loss

Host | Services
Datastore Remount Status

Node | Disk
Boot device connection check

Boot device status check

Descriptors to deleted files check

FusionIO PCIE-SSD: ECC errors check

Intel Drive: ECC errors

Intel SSD Configuration

LSI Disk controller firmware status

M.2 Boot Disk change check

M.2 Intel S3520 host boot drive status check

M.2 Micron5100 host boot drive status check

SATA controller

SSD Firmware Check

Samsung PM863a FW version check

Samsung PM863a status check

Samsung PM863a version compatibility check

Samsung SM863 SSD status check

Samsung SM863a version compatibility check

Node | Hardware
IPMI connectivity check

IPMI sel assertions check

IPMI sel log fetch check

IPMI sel power failure check

IPMI sensor values check

M10 GPU check

M10 and M60 GPU Mixed check

M60 GPU check

Node | Network
CVM 10 GB uplink check

Inter-CVM connectivity check

NTP configuration check

Storage routed to alternate CVM check

Node | Protection Domain
ESX VM Virtual Hardware Version Compatible

Node | Services
.dvsData directory in local datastore

Advanced Encryption Standard (AES) enabled

Autobackup check

BMC BIOS version check

CVM memory check

CVM port group renamed

Cassandra Keyspace/Column family check

Cassandra memory usage

Cassandra service restarts check

Cluster Services Down Check

DIMM Config Check

DIMMs Interoperability Check

Deduplication efficiency check

Degraded Node check

Detected VMs with non local data

EOF check

ESXi AHCI Driver version check

ESXi APD handling check

ESXi CPU model and UVM EVC mode check

ESXi Driver compatibility check

ESXi NFS heartbeat timeout check

ESXi RAM disk full check

ESXi RAM disk root usage

ESXi Scratch Configuration

ESXi TCP delayed ACK check

ESXi VAAI plugin enabled

ESXi VAAI plugin installed

ESXi configured VMK check

ESXi services check

ESXi version compatibility

File permissions check

Files in a stretched VM should be in the same Storage Container

GPU drivers installed

Garbage egroups check

Host passwordless SSH

Ivy Bridge performance check

Mellanox NIC Driver version check

NFS file count check

NSC(Nutanix Service Center) server FQDN resolution

NTP server FQDN resolution

Network adapter setting check

Non default gflags check

Notifications dropped check

PYNFS dependency check

RC local script exit statement present

Remote syslog server check

SMTP server FQDN resolution

Sanity check on local.sh

VM IDE bus check

VMKNICs subnets check

VMware hostd service check

Virtual IP check

Zookeeper Alias Check

localcli check

vim command check

Nutanix Guest Tools | VM
PostThaw Script Execution Failed

Other Checks
LWS Store Full

LWS store allocation too long

Recovery Point Objective Cannot Be Met

VM | CPU
CPU Utilization

VM | Disk
I/O Latency

Orphan VM Snapshot Check

VM | Memory
Memory Pressure

Memory Swap Rate

VM | Network
Memory Usage

Receive Packet Loss

Transmit Packet Loss

VM | Nutanix Guest Tools
Disk Configuration Update Failed

VM Guest Power Op Failed

iSCSI Configuration Failed

VM | Remote Site
VM Virtual Hardware Version Compatible

VM | Services
VM Action Status

VM | Virtual Machine
Application Consistent Snapshot Skipped

NGT Mount Failure

NGT Version Incompatible

Temporary Hypervisor Snapshot Cleanup Failed

VSS Snapshot Aborted

VSS Snapshot Not Supported

Host | Network
Hypervisor time synchronized

Nov 13

NetBackup Got Upgraded: Modern Workloads Use Parallel Streaming

A new capability in NetBackup 8.1 is its ability to protect modern web-scale and big data workloads like Hadoop and NoSQL that generate massive amounts of data. With the new Veritas Parallel Streaming technology, these modern scale-out workloads can be backed up and protected with extreme efficiency by leveraging the power of multiple nodes simultaneously. The result is that organizations can now adopt these modern workloads with confidence, knowing that their data, even in massive volumes, will be protected. And since new workloads can be added via a plug-in rather than a software agent, organizations can add new workloads without having to wait for the next NetBackup software release. NetBackup Parallel Streaming also supports workloads running on hyper-converged infrastructure (HCI) from Nutanix, as Nutanix and Veritas have partnered to certify protection of those workloads on HCI.

source


NetBackup takes care of utilizing all of the Nutanix storage controllers: it mounts the local storage controller for each node over NFS, removing the need for a proxy host.

For some general best practices with the first release of this plugin:

    * Apply a limit to back up at most n*4 VMs concurrently, where n is the number of nodes in the Nutanix cluster. A 16-node cluster would then have 64 VMs being backed up concurrently (see the quick calculation below).

    * Use the Media Server as the Backup Host if possible.

Note: If VMs are powered off, then the Prism Element VIP will be used.
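
A trivial sketch of that sizing rule (the 4-per-node multiplier comes from the guidance above):

NODES=16                                            # nodes in the Nutanix cluster
echo "Max concurrent VM backups: $((NODES * 4))"    # -> 64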

Oct 27

Mounting and Enabling NGT Results in an Error Message About the CD/DVD-ROM Drive

Nutanix has KB article 3455:

Mounting and Enabling NGT Results in an Error Message that Indicates that the VM does not have a CD/DVD-ROM Drive

If you enable Nutanix Guest Tools (NGT) in the Prism web console, the following error message is displayed.

This VM does not have a CD/DVD-ROM device.
OR
Guest Tools cannot be mounted as there are not enough empty CD-ROM(s)

This error message is displayed even though the VM has a CD/DVD device.

You can go ahead and read the KB, but it’s caused by newer VMware versions using a SATA controller instead of IDE for the CD-ROM. On my VM it kept switching back to SATA from IDE. I got around it by adding a second CD-ROM that was IDE.

Oct 02

Automatically Snap, Clone and Backup AFS (Acropolis File Services)

I wrote a script, posted on the Next community site, that automatically snaps and clones AFS shares so that any backup product that can read off an SMB share can back them up. The script can be used to always have the latest backup copy and avoid impacting your production users.

Read it here: https://next.nutanix.com/t5/Nutanix-Connect-Blog/24-Hour-Backup-Window-with-Nutanix-Native-File-Services/ba-p/23708

Hope you find it useful.

Sep 15

HYCU v1.5 – Nutanix AHV Backup gets new features

HYCU v1.5 has been released by Comtrade.

The biggest one for me is ABS support! Now you can use cheap-and-deep storage and drive all of the storage controllers. Remember that ADS works with ABS, so it’s a great solution.

The following new features and enhancements are available with HYCU version 1.5.0:

Backup window
It is now possible to set up a time frame when your backup jobs are allowed to run (a backup window). For example, this allows you to schedule your backup jobs to run during non-production hours to reduce loads during peak hours.

Concurrent backups
You can now specify the maximum number of concurrent backup jobs per target. By doing so, you can reduce the duration of backups and the number of queued backup jobs.

Faster recovery
HYCU enables you to keep snapshots from multiple backups on the Nutanix cluster. Keeping multiple snapshots allows you to recover a virtual machine or an application quickly, reducing downtime. If you know Commvault IntelliSnap, the benefits are very similar.

iSCSI backup target
A new type of backup target for storing the protected data is available: iSCSI. This also makes it possible for you to use a Nutanix volume group as a backup target. You can use the iSCSI backup target for backing up the HYCU backup controller as well.

Improved backup and restore performance
HYCU now utilizes Nutanix volume groups for backing up and restoring data, taking advantage of the load-balancing feature offered by Nutanix. Therefore, the new version of HYCU can distribute the workload between several nodes, which results in increased performance of your backup and restore operations and reduced I/O load on the Nutanix cluster and containers.

Support for AWS S3-compatible storage
HYCU enables you to store your protected data on AWS S3-compatible storage.

Shared location for restoring individual files
You can now restore individual files to a shared location so that recovered data can be accessed from multiple systems.

Support for Active Directory applications
In addition to SQL Server applications, HYCU can now also detect and protect Active Directory applications running on virtual machines. You can view all the discovered applications in the Applications panel.

Expiring backups manually

If there is a restore point that you do not want to use for a data restore anymore, you can mark it as expired. Expired backups are removed from the backup target within the next 24 hours, resulting in more free storage space and helping you keep your HYCU system clean.

Support for Nutanix API changes

HYCU supports the Nutanix API changes introduced with AOS 5.1.1.1.

Sep 14

AOS 5.1.2 Security Updates

A long list of updates, one-click upgrade yourself to safety.

CVE-2017-1000364 kernel: heap or stack gap jumping occurs through unbounded stack allocations (Stack Guard or Stack Clash)

CVE-2017-1000366 glibc: heap or stack gap jumping occurs through unbounded stack allocations (Stack Guard or Stack Clash)

CVE-2017-2628 curl: negotiate not treated as connection-oriented

CVE-2017-3509 OpenJDK: improper re-use of NTLM authenticated connections (Networking, 8163520)

CVE-2017-3511 OpenJDK: untrusted extension directories search path in Launcher (JCE, 8163528)

CVE-2017-3526 OpenJDK: incomplete XML parse tree size enforcement (JAXP, 8169011)

CVE-2017-3533 OpenJDK: newline injection in the FTP client (Networking, 8170222)

CVE-2017-3539 OpenJDK: MD5 allowed for jar verification (Security, 8171121)

CVE-2017-3544 OpenJDK: newline injection in the SMTP client (Networking, 8171533)

CVE-2016-0736 httpd: Padding Oracle in Apache mod_session_crypto

CVE-2016-1546 httpd: mod_http2 denial-of-service by thread starvation

CVE-2016-2161 httpd: DoS vulnerability in mod_auth_digest

CVE-2016-8740 httpd: Incomplete handling of LimitRequestFields directive in mod_http2

CVE-2016-8743 httpd: Apache HTTP Request Parsing Whitespace Defects

CVE-2017-8779 rpcbind, libtirpc, libntirpc: Memory leak when failing to parse XDR strings or bytearrays

CVE-2017-3139 bind: assertion failure in DNSSEC validation

CVE-2017-7502 nss: Null pointer dereference when handling empty SSLv2 messages

CVE-2017-1000367 sudo: Privilege escalation via improper get_process_ttyname() parsing

CVE-2016-8610 SSL/TLS: Malformed plain-text ALERT packets could cause remote DoS

CVE-2017-5335 gnutls: Out of memory while parsing crafted OpenPGP certificate

CVE-2017-5336 gnutls: Stack overflow in cdk_pk_get_keyid

CVE-2017-5337 gnutls: Heap read overflow in read-packet.c

CVE-2017-1000366 glibc: heap/stack gap jumping via unbounded stack allocations

CVE-2017-1000368 sudo: Privilege escalation via improper get_process_ttyname() parsing

CVE-2017-3142 bind: An error in TSIG authentication can permit unauthorized zone transfers

CVE-2017-3143 bind: An error in TSIG authentication can permit unauthorized dynamic updates

CVE-2017-10053 OpenJDK: reading of unprocessed image data in JPEGImageReader (2D, 8169209)

CVE-2017-10067 OpenJDK: JAR verifier incorrect handling of missing digest (Security, 8169392)

CVE-2017-10074 OpenJDK: integer overflows in range check loop predicates (Hotspot, 8173770)

CVE-2017-10078 OpenJDK: Nashorn incompletely blocking access to Java APIs (Scripting, 8171539)

CVE-2017-10081 OpenJDK: incorrect bracket processing in function signature handling (Hotspot, 8170966)

CVE-2017-10087 OpenJDK: insufficient access control checks in ThreadPoolExecutor (Libraries, 8172204)

CVE-2017-10089 OpenJDK: insufficient access control checks in ServiceRegistry (ImageIO, 8172461)

CVE-2017-10090 OpenJDK: insufficient access control checks in AsynchronousChannelGroupImpl (8172465, Libraries)

CVE-2017-10096 OpenJDK: insufficient access control checks in XML transformations (JAXP, 8172469)

CVE-2017-10101 OpenJDK: unrestricted access to com.sun.org.apache.xml.internal.resolver (JAXP, 8173286)

CVE-2017-10102 OpenJDK: incorrect handling of references in DGC (RMI, 8163958)

CVE-2017-10107 OpenJDK: insufficient access control checks in ActivationID (RMI, 8173697)

CVE-2017-10108 OpenJDK: unbounded memory allocation in BasicAttribute deserialization (Serialization, 8174105)

CVE-2017-10109 OpenJDK: unbounded memory allocation in CodeSource deserialization (Serialization, 8174113)

CVE-2017-10110 OpenJDK: insufficient access control checks in ImageWatched (AWT, 8174098)

CVE-2017-10111 OpenJDK: incorrect range checks in LambdaFormEditor (Libraries, 8184185)

CVE-2017-10115 OpenJDK: DSA implementation timing attack (JCE, 8175106)

CVE-2017-10116 OpenJDK: LDAPCertStore following referrals to non-LDAP URLs (Security, 8176067)

CVE-2017-10135 OpenJDK: PKCS#8 implementation timing attack (JCE, 8176760)

CVE-2017-10193 OpenJDK: incorrect key size constraint check (Security, 8179101)

CVE-2017-10198 OpenJDK: incorrect enforcement of certificate path restrictions (Security, 8179998)

Sep 14

Acropolis Dynamic Scheduler (ADS) for AHV (Compute + Memory + Storage)

The Acropolis Dynamic Scheduler (ADS) ensures that compute (CPU and RAM) and storage resources are available for VMs and volume groups (VGs) in the Nutanix cluster. ADS, enabled by default, uses real-time statistics to determine:

Initial placement of VMs and VGs, specifically which AHV host runs a particular VM at power-on or a particular VG after creation.

Required runtime optimizations, including moving particular VMs and VGs to other AHV hosts to give all workloads the best possible access to resources.

If a problem is detected, a migration plan is created and executed, thereby eliminating hotspots in the cluster by migrating VMs from one host to another. This feature only detects contentions that are currently in progress. You can monitor these tasks from the Task dashboard of the Prism web console, and you can click the VM link to view the migration information, which includes the migration path (the destination AHV host).

The Acropolis block services feature uses the ADS feature for balancing sessions of the externally visible iSCSI targets.

Jul 06

Recovery Points and Schedules with Near-Sync on Nutanix

Primer post on near-sync

For the GA release, near-sync will be offered only with a telescopic schedule (time-based retention). When you set the RPO anywhere from 15 minutes down to 1 minute, you will have the option to save your snapshots for a number of weeks or months.

As an example, if you set the RPO to 1 min with a 1-month retention schedule, it would look like this:

X = the RPO
Y = the retention schedule

Every X min, create a snapshot retained for 15 minutes (these are the light-weight snapshots; they appear as normal snapshots in Prism).
Every hour, create a snapshot retained for 6 hours.
Every day, create a snapshot retained for 1 week.
One weekly snapshot retained for 4 weeks (if you select a schedule retaining for 7 weeks, Y would be 7 weeks and no monthly snapshot would occur).
One monthly snapshot retained for Y months.
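
To make the ladder concrete, here is a back-of-the-envelope count of recovery points retained at steady state for X = 1 min and Y = 1 month (illustrative only; see the caveat below):

X=1                               # RPO in minutes
LWS=$((15 / X))                   # light-weight snaps kept for 15 minutes -> 15
TOTAL=$((LWS + 6 + 7 + 4 + 1))    # + hourly (6h) + daily (1 week) + weekly (4) + monthly (Y=1)
echo "~${TOTAL} recovery points, ${LWS} of them light-weight"   # -> ~33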

This is subject to change, as we’re still finalizing sizing and thresholds based on real-world testing, but the user will have an option to change these retention values via NCLI.

SCHEDULE