Why virtualize Hadoop nodes on the Nutanix Xtreme Computing Platform?


o Make Hadoop an App: Prism’s HTML5 user interface makes managing infrastructure painless with one-click upgrades. Integrated data protection can be used to manage golden images for Hadoop across multiple Nutanix clusters. Painful firmware upgrades are handled easily, saving time.
o No Hypervisor Tax: The Acropolis Hypervisor is included with all Nutanix clusters. Acropolis High Availability and automated Security Technical Implementation Guides (STIGs) keep your data available and secure.
o Hardware utilization: Bare-metal Hadoop deployments average 10-20% CPU utilization, a major waste of hardware resources and datacenter space. Virtualizing Hadoop allows for better hardware utilization and flexibility. Virtualization can also help you right-size your solution. If your job completion times already fit within their windows, there is no need to buy more hardware. If more resources are needed, they can easily be adjusted.
o Elastic MapReduce and scaling: Dynamic addition and removal of Hadoop nodes based on load lets you scale to your current needs, not to what you expect to need. This keeps supply in step with demand. Hadoop DataNodes can be cloned in seconds.
o DevOps: Big Data scientists demand performance, reliability, and a flexible scale model. IT operations teams rely on virtualization to tame server sprawl, increase utilization, encapsulate workloads, manage capacity growth, and reduce disruptive outages caused by hardware downtime. By virtualizing Hadoop, data scientists and IT operations can both meet their objectives while preserving autonomy and independence for their respective responsibilities.
o Sandboxing of jobs: Buggy MapReduce jobs can quickly saturate hardware resources, creating havoc for the remaining jobs in the queue. Virtualizing Hadoop clusters encapsulates MapReduce jobs and isolates them from other important sorting runs and general-purpose workloads.
o Batch Scheduling & Stacked workloads: Allow all workloads and applications to co-exist, e.g. Hadoop, virtual desktops, and servers. Schedule job runs during off-peak hours to take advantage of idle night-time and weekend hours that would otherwise go to waste. Nutanix can also bypass the flash tier for sequential workloads, which avoids the time otherwise needed to rewarm the cache for mixed workloads.
o New Hadoop economics: Bare-metal implementations are expensive, and their costs can spiral out of control. The downtime and underutilized CPU that come with physical server workloads can jeopardize project viability. Virtualizing Hadoop reduces complexity and helps sophisticated projects succeed with a scale-out, grow-as-you-go model, a good fit for Big Data projects.
o Blazing fast performance: Up to 3,500 MB/s of sequential throughput in a compact 2U, 4-node cluster. A TeraSort benchmark yields 529 MB/s in the same 2U cluster.
o Unified data platform: Run multiple data processing platforms alongside Hadoop YARN on a single unified data platform, the Acropolis Distributed File System (ADFS).
o Flash SSDs for NoSQL: The summaries that roll up to a NoSQL database like HBase are used to run business reports and are typically memory- and IOPS-heavy. Nutanix couples SSD tiers with dense memory capacities, and its automatic tiering technology can transparently move IOPS-heavy workloads to the SSD tier.
o Analytic High-density Engine: With the Nutanix solution you can start small and scale. A single Nutanix block packs up to 40TB of storage and 96 cores into a compact 2U footprint. Given the modularity of the solution, you can scale granularly per node (up to ~10TB/24 cores), per block (up to ~40TB/96 cores), or across multiple blocks, so you can match supply to demand accurately and minimize upfront CapEx.
o Change management: Maintain environmental control and separation between development, test, staging, and production environments. Snapshots and fast clones help share production data with non-production jobs without requiring full copies and unnecessary data duplication.
o Business continuity and data protection: Nutanix can replicate across sites to provide additional protection for the NameNode and DataNodes. Per-VM and container-based replication can be set up to avoid sending wasteful temporary data across the WAN.
o Data efficiency: The Nutanix solution is VM-centric for all compression policies. Unlike traditional solutions that perform compression mainly at the LUN level, the Nutanix solution provides these capabilities at the VM and file level, increasing efficiency and simplicity. Compression and decompression operate at a sub-block level for performance. While developers may or may not run jobs with compression, IT operations can ensure cold data is stored efficiently. Nutanix Erasure Coding can also be applied on top of compression for additional savings.
o Auto-Leveling and Auto-Archive: Nutanix spreads data evenly across the cluster so that local drives do not fill up and cause an outage while space is still available elsewhere. With Nutanix cold storage nodes, cold data can be moved off compute nodes, freeing room for hot data without consuming additional licenses.
o Time-sliced clusters: Like public cloud EC2 environments, Nutanix can provide a converged cloud infrastructure that lets you run Hadoop, server virtualization, and desktop virtualization on a single platform. Get the efficiency and savings you require from a converged cloud built on a converged architecture.
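
To make the per-node and per-block scaling figures above concrete, here is a minimal sizing sketch in Python. It assumes the upper-bound numbers quoted in the list (about 10TB and 24 cores per node, four nodes per 2U block); the function names are hypothetical, and the calculation ignores replication overhead and the difference between raw and usable capacity, so treat it as rough arithmetic rather than a sizing tool.

```python
import math

# Upper-bound figures quoted in the list above; actual node models vary.
NODE_TB = 10          # ~10TB of storage per node (assumed maximum)
NODE_CORES = 24       # 24 cores per node (assumed maximum)
NODES_PER_BLOCK = 4   # 4 nodes per 2U block (~40TB / 96 cores)


def nodes_needed(raw_tb: float, cores: int) -> int:
    """Smallest node count that covers both the storage and the core target."""
    by_storage = math.ceil(raw_tb / NODE_TB)
    by_cores = math.ceil(cores / NODE_CORES)
    return max(by_storage, by_cores, 1)


def blocks_needed(nodes: int) -> int:
    """Number of 2U blocks required to house the given number of nodes."""
    return math.ceil(nodes / NODES_PER_BLOCK)


if __name__ == "__main__":
    # Hypothetical target: 65TB of raw capacity and 120 cores.
    nodes = nodes_needed(raw_tb=65, cores=120)
    print(f"nodes: {nodes}, blocks: {blocks_needed(nodes)}")
    print(f"provisioned: {nodes * NODE_TB}TB, {nodes * NODE_CORES} cores")
```

In this hypothetical case storage is the binding constraint, so the core count ends up over-provisioned; adding nodes one at a time keeps that gap smaller than buying whole racks up front.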
