Improve your business resiliency with disaster recovery in a hybrid cloud infrastructure

No need to be so DRaaStic?

Sponsored In a world of increasing ransomware attacks and other disruptions, whether intentional or unintentional, businesses need to plan for high uptime to make their applications more resilient.

Planning for disaster recovery (DR) when planned or unplanned downtime inevitably happens is a key task for most IT teams. Hybrid Cloud has emerged as a popular architecture for improved application uptime, and when compared with legacy setups, it’s easy to see why.

To ensure business continuity in a traditional scenario, the IT organization must dedicate ample failover infrastructure at a separate physical site to mirror important applications and services.

Your DR plan may work as intended when disaster strikes, or even when planned downtime needs to happen for regularly scheduled maintenance or upgrades. But the cost of the traditional setup is burdensome in terms of time and money. And let’s not forget that the DR equipment is largely redundant most of the time, hopefully.

In response to these shortcomings, a booming ecosystem of disaster recovery-as-a-service (DRaaS) providers has emerged. The DRaaS vendor provides a neat solution to the requirement for off-site infrastructure, enabling customers to duplicate and host servers in the third-party provider’s data center or in a public cloud.

However, while DRaaS solutions can relieve the customer of the onus of managing application uptime, this may not be ideal for businesses that want more control over application recovery during a DR event. What is true, though, is that the resources reserved for recovering your applications sit largely idle unless disaster strikes.

A modern hybrid cloud infrastructure gives organizations an alternative option - one that enables them to retain control over the DR process while minimizing the costs of operating a secondary physical DR capability. This option can also provide an on-ramp to broader adoption of cloud services. Let’s take a closer look.

Along comes HCI

Modern data center systems based on hyperconverged infrastructure (HCI) offer an alternative option for organizations looking to utilize the cloud. HCI offers cloud-like flexibility and automation and was developed as an easier way to implement infrastructure for running virtual machines, rather than building the infrastructure out of discrete servers and storage components. A built-in DR capability of some kind is a standard feature of a number of HCI platforms. For example, Nutanix, a pioneer of HCI, has from the very beginning offered a backup capability using snapshots that can be replicated to a cluster of Nutanix servers at a remote site.

“Customers that deploy Nutanix HCI basically get the ability to do replication between sites, and if you are getting the higher tier of the offering, you get all of the built in orchestration to go and be able to failover VMs and fail them back later, and so forth,” says Thomas Cornely, Senior Vice President of Product Management at Nutanix.

The Nutanix Cloud Platform supports several DR options to meet a variety of customer Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements. RPO is the maximum span of recent data an organization can afford to lose, which in practice is the interval between backups or replication points. RTO is the timeframe within which an organization wishes to be able to recover if disaster strikes.

The simplest level is Asynchronous DR, which backs up a group of VMs and volume groups locally to the Nutanix cluster and replicates them to one or more remote sites. This offers an RTO of minutes, but an RPO of one hour, which means up to an hour’s worth of data might be lost should disaster strike. To take one example, The Phia Group, a US health insurance consulting firm, said it achieved a 24x improvement in RPO, from 24 hours to one hour, when it moved its DR solution to Nutanix Xi Leap.

NearSync DR uses lightweight snapshots to enable a low RPO (as low as 20 seconds) over any distance without affecting performance on the primary site, and it also provides an RTO of minutes. Synchronous replication, by contrast, is constrained by the physical distance between the primary and secondary sites. With Metro or Sync DR, Nutanix offers a zero RPO at the VM level; this is supported between sites with under 5ms latency and offers an RTO of minutes.
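The trade-off between the three tiers can be sketched as a small selection function. This is a simplified illustration built only from the figures quoted above (one hour, 20 seconds, zero RPO under 5ms latency); the function name and thresholds-as-constants are hypothetical and do not represent a Nutanix API or official sizing guidance.

```python
from typing import Optional

# Figures taken from the article; illustrative only, not product limits.
ASYNC_RPO_S = 3600        # Asynchronous DR: RPO of one hour
NEARSYNC_RPO_S = 20       # NearSync DR: RPO of around 20 seconds
SYNC_MAX_LATENCY_MS = 5   # Synchronous replication: sites under 5 ms apart


def pick_replication_mode(required_rpo_s: float, site_latency_ms: float) -> Optional[str]:
    """Return the least demanding mode that meets the required RPO (hypothetical helper)."""
    if required_rpo_s >= ASYNC_RPO_S:
        return "async"
    if required_rpo_s >= NEARSYNC_RPO_S:
        return "nearsync"
    # Anything tighter needs zero-RPO synchronous replication, which is latency-bound.
    if site_latency_ms < SYNC_MAX_LATENCY_MS:
        return "sync"
    return None  # nothing in this simplified model satisfies the requirement


print(pick_replication_mode(required_rpo_s=60, site_latency_ms=40))
```

The function prefers the loosest tier that still meets the target, on the assumption that tighter replication costs more in bandwidth and site placement constraints.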

Going hybrid with Nutanix

Nutanix has evolved the Nutanix Cloud Platform to enable a hybrid cloud scenario by engineering its platform to run in public clouds, starting with AWS, with Azure support coming soon. The solution, called Nutanix Clusters, uses EC2 bare-metal instances provisioned in a customer’s Amazon Virtual Private Cloud (VPC) using their existing cloud networking, and runs applications on a cloud-based cluster of Nutanix nodes just as they would run on a cluster of on-premises physical nodes.

Nutanix Clusters makes it relatively simple to put in place a cloud-based DR strategy where the customer retains complete control, since the on-premises cluster of Nutanix nodes and the Nutanix Clusters on AWS are both managed through the same admin console. This also means that disaster recovery supports any workload, from virtual desktops (VDI) to virtual servers and databases, because it replicates everything to the customer’s cloud account. Nutanix also works transparently across hypervisors, so that on-premises Nutanix systems may be running with VMware’s ESXi while the DR cluster, running in AWS, might be operating using Nutanix’s own AHV hypervisor. The applications will see no difference.

But the advantages go beyond convenience. When developing Nutanix Clusters, Nutanix considered how the platform should operate in order to allow customers to realize the benefits of flexibility and cost saving that public cloud services have long promised.

This can be seen in the way that Nutanix Clusters allows customers to swiftly spin up new clusters of nodes on demand. This capability has been used to implement what Nutanix calls “Elastic DR”. The term refers to the ability to provision a DR cluster in the cloud using a bare minimum number of nodes (typically 3 nodes), and if failover becomes necessary, the DR cluster can rapidly scale to whatever size is required.

“So the idea here is that you may have an on-premises cluster of, say, 10 nodes, and a DR site which is also 10 nodes. Instead of duplicating capacity, you configure your second site with a pilot light Nutanix Cluster on AWS, and size it for just pulling in the data that you're sending to the cluster, typically three to four nodes,” says Cornely.

“When you have to do DR tests or a disaster happens, you can automatically and immediately expand your cluster to 10 nodes to match what you have on-premises to run your workloads. When you're ready to failback, you failback the workloads and then shrink the cluster back down to three nodes,” he explains. This arrangement should prove attractive to enterprise customers for several reasons. There is the obvious economic advantage of not needing to run a dedicated site for DR. Replacing that with a cloud solution that flexes as required will minimize operating costs.
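To make the economics concrete, here is a rough sketch that compares yearly node-hours for a pilot-light cluster against a fully duplicated DR site. The node counts (3 and 10) come from the quote above; the 72 hours per year spent in DR tests or failover is purely an assumption, and the helper name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def dr_node_hours(pilot_nodes: int, full_nodes: int, failover_hours: float) -> float:
    """Node-hours per year for a cluster that expands only during tests or failover."""
    if not 0 <= failover_hours <= HOURS_PER_YEAR:
        raise ValueError("failover_hours must fit within one year")
    steady_hours = HOURS_PER_YEAR - failover_hours
    return pilot_nodes * steady_hours + full_nodes * failover_hours


# Node counts from the quote; 72 hours of tests/failover per year is an assumption.
elastic = dr_node_hours(pilot_nodes=3, full_nodes=10, failover_hours=72)
dedicated = dr_node_hours(pilot_nodes=10, full_nodes=10, failover_hours=0)

print(f"elastic:   {elastic} node-hours per year")
print(f"dedicated: {dedicated} node-hours per year")
print(f"reduction: {1 - elastic / dedicated:.1%}")
```

This counts node-hours only. It deliberately leaves out pricing, since a cloud bare-metal node-hour and an owned on-premises node-hour are not priced the same way, so the result indicates capacity consumed rather than money saved.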

An additional benefit of Clusters-based DR, Nutanix says, is that when the instance is not in use for DR, it can be put to work on other scenarios such as dev-test and capacity bursting.

“We don't have to spend a ton of time doing spreadsheets and modelling to demonstrate to you that you're going to actually save costs as well as getting something that's more agile for you,” Cornely says.

On top of that, the customer stays in control, with everything managed from the Nutanix multicloud management plane (called Prism), which unifies private and public cloud infrastructure operations and day-to-day workload management. And in a nice touch, from a billing perspective, Nutanix Clusters run in the organization’s existing AWS account using their existing networking setup. This greatly simplifies the steps needed to get going with the hybrid cloud setup: there is no need to create a new silo in the form of another AWS account or to create new networking layers. Come as you are, and you are ready to go with Nutanix and AWS.

Nutanix also supports portability of software licenses for its platform, so that if a customer decommissions a Nutanix cluster, from a dedicated DR site or anywhere else, they can reapply those same licenses to the DR nodes they operate in AWS, thereby improving the utilization of their investment in Nutanix.

But what if the customer needs capacity beyond the number of licenses they own? The extra consumption is charged at a metered rate, Nutanix says, which accommodates organizations that need to burst capacity for a short time. Customers can get additional cost savings by using their existing Nutanix Clusters and AWS bare-metal investments to support dev/test in the cloud or on-demand capacity bursting when they are not in use for DR.

Where an application has less stringent RPO and RTO objectives, customers can reduce costs further by hibernating the entire DR cluster to Amazon S3 storage between sync events. This capability means that all the data and configuration details associated with the cluster are stored as a snapshot in S3, and all the resources used by it are released so the customer is no longer paying for them.

“Let’s say you are looking for a daily sync. You can do that sync, have your data from on-prem go to your cluster while it’s running, then pause synchronization and hibernate that cluster for the next 24 hours, until you need to spin up the cluster again,” explains Cornely.
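The daily sync-then-hibernate cycle Cornely describes can be summarized with a little arithmetic. In this sketch the 24-hour interval comes from the quote, while the two-hour running window is an assumed value and the helper is hypothetical; the worst-case data loss figure is approximate and ignores the time the sync itself takes.

```python
def hibernation_profile(sync_interval_h: float, running_window_h: float) -> dict:
    """Summarize one sync-then-hibernate cycle (hypothetical helper)."""
    if not 0 < running_window_h <= sync_interval_h:
        raise ValueError("running window must be positive and fit inside the interval")
    return {
        # Share of each cycle during which cluster resources are actually running.
        "running_fraction": running_window_h / sync_interval_h,
        "hibernated_h_per_cycle": sync_interval_h - running_window_h,
        # Data written just after a sync waits roughly a full interval for the next one.
        "approx_worst_case_data_loss_h": sync_interval_h,
    }


# 24-hour interval from the quote; the 2-hour running window is an assumption.
profile = hibernation_profile(sync_interval_h=24, running_window_h=2)
for key, value in profile.items():
    print(f"{key}: {value}")
```

The point of the sketch is the trade-off: the longer the cluster stays hibernated, the less it costs to run, but the effective RPO grows to about the length of the sync interval.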

Nutanix’s main goal is to facilitate customers’ hybrid cloud strategies by enabling customers to place their workloads in public or private clouds on their own terms, according to their business needs. Putting the Nutanix Cloud Platform into the cloud may seem a somewhat strange idea at first glance, but it means that it becomes a kind of compatibility layer that allows on-premises workloads to run in the cloud without modification.

Elastic DR is just one of many use cases for Nutanix Clusters and can be seen as a way to draw customers into taking that first step into hybrid cloud by making a compelling case on cost and convenience grounds. Once they have Nutanix Clusters operating in their VPC, the customer can easily expand their footprint, and from there access the broad catalog of AWS cloud services. Nutanix also plans to extend Nutanix Clusters to other cloud platforms such as Microsoft’s Azure, making a multi-cloud deployment possible in the very near future.

Sponsored by Nutanix.
