Cloud box does virtualization sans SAN

Practice safe SOCS

Cloud-computing appliance maker Nutanix is tackling a problem that has dogged the deployment of virtual servers and desktops: all the key hypervisors require storage area networks and centralized storage.

To overcome this limitation, Nutanix has created a virtualized controller that implements a clustered file system and embeds it in a cluster-compute appliance – the compute nodes essentially become a virtual SAN.

The beauty of the Nutanix Complete Cluster, says company cofounder and CEO Dheeraj Pandey, is that the server virtualization hypervisors that run atop the appliances still think they are talking to a SAN – all the nifty high-availability, snapshotting, and live-migration features baked into these hypervisors and their management consoles continue to work just as they did before.

By banning the SAN, says Pandey, Nutanix can simplify the rollout of compute clouds at large enterprises that are familiar with complexity and high cost, and therefore have been happy – however begrudgingly – to invest in SANs. In addition, it can make cloudy infrastructure more affordable for small and medium businesses that want high-end virtualization features that used to require SANs.

Nutanix was founded in 2009 by file-system experts from Google and cluster experts from Oracle. Pandey managed the early incarnations of the Exadata product and also created the storage engine for Oracle's database. More recently, he was vice president of engineering at Aster Data Systems, which was acquired by Teradata in March of this year for $263m so the data warehouser could get its mitts on the company's nCluster hybrid row-column database and SQL-MapReduce big-data chewer.

Nutanix cofounder Mohit Aron also hails from Aster Data – he was the chief architect at the firm and did a lot of the work on nCluster. Prior to his stint at Aster, he was at Google, leading the design and development of the Google File System, the original incarnation of Google's distributed file system that supported its MapReduce big-data crunching techniques.

Pandey and Aron were joined by Ajeet Singh as the third cofounder and the company's chief products officer. Also from Aster Data, where he was director of product management, Singh had previously been part of Oracle's early cloud computing efforts. The three cofounders got their seed funding from private investors in May 2010, and pulled in $13.2m in Series A funding in April 2011 from Lightspeed Venture Partners and Blumberg Capital.

How to fool an unsuspecting server

The Nutanix Complete Cluster begins with the basic building block in today's data center: a two-socket x64 server. Nutanix currently sources two-socket tray servers from Dell and Super Micro, which cram four nodes into a 2U rack-mounted chassis, and it plans to source machines from HP once HP delivers on-board 10 Gigabit Ethernet ports.

Pandey says that the company supports its software stack only on select hardware configurations because Nutanix has to do a lot of tuning on the server, flash storage, and disk storage that go into its appliances. Customers can't just run the Nutanix stack on whatever servers they have lying around their data center.

Each server node in the cluster is configured with two six-core Xeon 5600 processors, with eight cores allocated to run hypervisors and virtual machines, and four cores allocated to run the Nutanix virtual storage controller, called Scale-Out Converged Storage – SOCS for short. This controller virtualizes a pool of Intel and Fusion-io solid state disks and Seagate SATA drives, and presents virtual machines with block and file I/O access to data spread across these disks inside the cluster.

"The architecture is hypervisor agnostic," explains Pandey, with VMware's ESXi 4.1 hypervisor being the first one to get support on the cloud appliance. "ESX thinks it is talking to SAN storage and it is not."

The Nutanix Complete Cluster can fool multiple servers into thinking they're connected to a SAN

The storage cluster implemented by SOCS on the compute nodes uses 10GE ports to link the nodes to each other, and has Gigabit Ethernet links to provide access to VMs and their workloads. The 10GE link is necessary to make use of ESXi hypervisor features such as live migration, high availability, fault tolerance, and distributed resource management, which create a lot of chatter on the network. On a SAN, you can just flip some pointers to make a VM's file point to a different physical server during a live migration, but on a Nutanix appliance, you need to move data.

Stirring up some secret HOT sauce

The SOCS storage software includes Cluster RAID, which stripes data across disk drives within a server node for high performance. What Nutanix calls Heat-Optimized Tiering cache, or HOTcache, caches data in each cluster node on a local SSD and also puts a copy on a different node in the cluster as a backup. SOCS also includes a distributed metadata service, called Medusa, that spreads the metadata around to multiple nodes for performance and fault tolerance reasons. "The secret sauce in all of this is the metadata, and it is globally addressable," says Pandey.
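The HOTcache behavior described above (cache on the local SSD, keep a second copy on a different node) can be sketched roughly as follows. This is a toy model in Python, not Nutanix code: the class name, the rule for choosing the backup node (simply the next node in the list) and the dictionaries standing in for SSDs are all assumptions made for illustration, since the real placement policy has not been described.

```python
class ToyCluster:
    """Toy model: each node has a dict standing in for its local SSD cache."""

    def __init__(self, node_names):
        if len(node_names) < 2:
            raise ValueError("need at least two nodes to keep a backup copy")
        self.node_names = list(node_names)
        self.ssd = {name: {} for name in self.node_names}

    def backup_node_for(self, local_node):
        # Assumption: use the next node in the list as the backup target.
        i = self.node_names.index(local_node)
        return self.node_names[(i + 1) % len(self.node_names)]

    def write(self, local_node, key, data):
        """Cache data on the local node and copy it to one other node."""
        backup = self.backup_node_for(local_node)
        self.ssd[local_node][key] = data
        self.ssd[backup][key] = data
        return backup

    def read(self, local_node, key):
        """Prefer the local copy; otherwise fetch from any node that has it."""
        if key in self.ssd[local_node]:
            return self.ssd[local_node][key]
        for name in self.node_names:
            if key in self.ssd[name]:
                return self.ssd[name][key]
        raise KeyError(key)

    def fail_node(self, name):
        """Simulate losing everything cached on one node."""
        self.ssd[name] = {}


cluster = ToyCluster(["node-a", "node-b", "node-c"])
backup = cluster.write("node-a", "block-1", b"vm data")
cluster.fail_node("node-a")
recovered = cluster.read("node-a", "block-1")  # served from the backup copy
```

In this sketch the data survives the loss of the node that wrote it because the second copy sits on a different node, which is the point of the backup copy in the description above.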

The Nutanix Complete Cluster appliance, complete with massive logo

SOCS sports a distributed data maintenance service called Curator that uses MapReduce techniques to figure out what bits of data are being used by what VM when and where, and automatically migrates the coldest data to disks and the hottest data to the Fusion-io and Intel SSDs.
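A rough sketch of how MapReduce-style counting could drive that kind of tiering decision is shown below. It is an illustration only: the function names, the access-log format of (VM, block) pairs and the fixed number of SSD slots are hypothetical, and how Curator actually measures "heat" has not been disclosed.

```python
from collections import Counter
from functools import reduce


def map_access_log(log_chunk):
    """Map step: count accesses per block within one chunk of the log."""
    return Counter(block_id for _vm, block_id in log_chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial per-block counts."""
    return left + right


def plan_tiers(log_chunks, ssd_slots):
    """Put the most-accessed blocks on SSD and the rest on disk."""
    totals = reduce(reduce_counts, map(map_access_log, log_chunks), Counter())
    ranked = [block for block, _count in totals.most_common()]
    return {"ssd": ranked[:ssd_slots], "disk": ranked[ssd_slots:]}


# Hypothetical access log, split into two chunks as if gathered on two nodes.
chunks = [
    [("vm1", "b1"), ("vm1", "b2"), ("vm2", "b1")],
    [("vm2", "b1"), ("vm1", "b3"), ("vm1", "b2")],
]
plan = plan_tiers(chunks, ssd_slots=1)
```

Here block b1 is accessed three times, b2 twice and b3 once, so with a single SSD slot b1 lands on flash and the other two on disk. Blocks that never appear in the log are not ranked at all in this simplified version.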

Curator also rebalances data when nodes are added to the Nutanix cluster and moves data along with a VM when it is live-migrated, thus keeping data used by a VM as close to it as possible. SOCS includes snapshotting features, called QuickClone, like those found on real SANs and filers, as well as thin provisioning and converged backup, which makes backups of files onto the file system and allows them to be pushed out to external online backup services.

Name your hypervisor – eventually

Each Nutanix cloud server node has two Xeon 5600 processors, one 320GB Fusion-io flash disk for metadata and data that's plugged into a PCI Express 2.0 peripheral slot, one 300GB Intel SSD housing system software that's slid into a SATA disk bay, and five 1TB Seagate 2.5-inch SATA drives for customer data.

Each node comes with a base 48GB of DDR3 main memory, which can be expanded to 192GB as workloads dictate. At the moment, early customers are using 10GE switches from Arista Networks and Super Micro to link Nutanix appliances together, but any 10GE switch should work.

Exploded view of a Nutanix cloud appliance

A four-node block with eight processors and 48 cores that can be allocated to VMs, 192GB of memory (expandable to 768GB), 1.25TB of user-accessible SSD storage, and 20TB of disk capacity costs $115,000. Too much? Well, a starter kit with three server nodes in the Nutanix cloud appliance has a slightly discounted price of $75,000.

If you need more oomph, a full rack of Nutanix appliances – 18 blocks and 72 server nodes totaling 576 cores, 3.4TB of main memory, 18TB of SSD capacity, and 360TB of disk capacity – will run you just over $2m.
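The block and rack totals can be checked against the per-node figures quoted earlier with a few lines of Python. This is a back-of-the-envelope sketch: the constant names are made up, the rack price assumes 18 blocks at list price with no volume discount, and SSD capacity is left out because the article quotes user-accessible figures rather than raw drive sizes.

```python
# Per-node figures quoted in the article.
CORES_PER_NODE = 12            # two six-core Xeon 5600s
VM_CORES_PER_NODE = 8          # the other four run the SOCS controller
BASE_MEMORY_GB_PER_NODE = 48
SATA_TB_PER_NODE = 5           # five 1TB drives
NODES_PER_BLOCK = 4
BLOCKS_PER_RACK = 18
BLOCK_PRICE_USD = 115_000


def totals(nodes):
    """Scale the per-node figures up to a given number of nodes."""
    return {
        "cores_total": nodes * CORES_PER_NODE,
        "cores_for_vms": nodes * VM_CORES_PER_NODE,
        "memory_gb": nodes * BASE_MEMORY_GB_PER_NODE,
        "disk_tb": nodes * SATA_TB_PER_NODE,
    }


block = totals(NODES_PER_BLOCK)
rack = totals(NODES_PER_BLOCK * BLOCKS_PER_RACK)
# Assumption: a rack is priced as 18 blocks at list price, with no discount.
rack_price_usd = BLOCKS_PER_RACK * BLOCK_PRICE_USD

print(block)
print(rack)
print(rack_price_usd)
```

On these assumptions a block comes to 48 cores in total (32 of them set aside for VMs), 192GB of memory and 20TB of disk, while a rack comes to 576 VM cores, 3,456GB of memory, 360TB of disk and $2.07m. The 48-core block figure therefore lines up with total cores, and the 576-core rack figure with the cores reserved for VMs.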

That may sound a bit pricey, but Pandey points out that compared to a rack of servers and external SANs, this setup costs 40 to 60 per cent less and delivers somewhere around ten times the bang for the buck because of the flash tiering and other SOCS goodies.

The Nutanix Complete Cluster is available now. Pandey says that the company will listen to customers about whether it should next support Microsoft's Hyper-V or Red Hat's KVM hypervisor, but Nutanix will eventually support both, as well as ESXi from VMware. Xen will also no doubt eventually be supported – if customers ask for it. ®

Biting the hand that feeds IT © 1998–2021