How to keep everything fluffy in your hybrid cloud world
Five steps to happiness
Cloud computing is a big deal these days. Old farts like me can debate ad nauseam whether cloud is just a new word for what we used to call managed or hosted services, and whether it is barking to call an on-premise virtualised infrastructure a private cloud.
The fact remains that whatever you call it, there is a vast amount of it about, and it is growing like mad.
The thing is, few organisations put their entire world in the cloud. It is common to see systems split between an on-premise (or self-managed data centre) installation and a cloud setup.
Even companies that go for a 100 per cent cloud service often spread their systems between two or more cloud providers to avoid single points of failure.
There are, of course, loads of names for these concepts: hybrid clouds, connected clouds, integrated clouds and so on.
I will use the terms “integrated” and “hybrid” synonymously to talk about the subject in hand: making your self-managed virtual setup work with a cloud setup and making two disparate cloud installations work with each other.
Step one: connectivity
The first thing you need to connect systems together is (quelle surprise) some kind of connection. Nine times out of ten this means a site-to-site VPN connection via the internet between the various locations, though more direct links are also possible.
You can run a leased line into an Amazon installation, for instance, and various telcos achieve similar results by connecting you to their managed MPLS or VPLS networks.
The nature of the connection is entirely irrelevant to how you use it, of course. Part of the point of a VPN is that the network guys can abstract its existence so that it looks to the rest of the system management team like just another network link. So regardless of how the physical connectivity works, you need to follow the same simple rules.
Routing: decide which parts of your network need to route to and from the cloud setup, and design your routing algorithms sensibly. If you are using VPNs (cheap internet connectivity) then by all means consider having several VPNs between the cloud service and your premises rather than a single bottleneck between you and the cloud.
Garbage prevention: use access control lists to restrict the flow of traffic between locations, and don't permit unnecessary junk to go over your integration link because it is bound to be a pinch point anyway and garbage will just slow things down.
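To make the ACL point concrete, the logic you are encoding on the firewall or VPN endpoint is a simple allow-list: named flows between specific on-premise and cloud ranges get through, and everything else is dropped before it can clog the link. Here is a rough sketch of that idea; the subnets and ports below are invented for illustration, not a real firewall configuration.

```python
import ipaddress

# Hypothetical allow-list: substitute your own on-premise and cloud subnets.
ALLOWED_FLOWS = [
    # (source subnet,   destination subnet,   destination port)
    ("10.1.0.0/24",     "172.16.10.0/24",     1521),  # app servers -> cloud DB replica
    ("10.1.2.0/24",     "172.16.20.0/24",     443),   # management hosts -> cloud web tier
]

def is_permitted(src_ip: str, dst_ip: str, dst_port: int) -> bool:
    """Return True only if the flow matches an explicit allow-list entry."""
    src, dst = ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip)
    for src_net, dst_net, port in ALLOWED_FLOWS:
        if (src in ipaddress.ip_network(src_net)
                and dst in ipaddress.ip_network(dst_net)
                and dst_port == port):
            return True
    return False  # default deny: anything unlisted stays off the integration link

print(is_permitted("10.1.0.15", "172.16.10.5", 1521))  # True: wanted replication traffic
print(is_permitted("10.1.5.99", "172.16.10.5", 445))   # False: stray SMB noise is dropped
```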
Optimisation: some cloud vendors use link optimisation hardware, and by putting a matching optimiser at your end you can get a significant speed-up (typically five to eight times in my experience). WAN optimisers are not cheap, but the cost can be offset by renting a slower, cheaper link.
Speed: before you take the plunge, make sure you have read step five (below) carefully or you will end up with a connection that is not fit for purpose.
Step two: proprietary mechanisms
Once you have a connection between your systems and the cloud location(s), you need to connect the infrastructures. This is where you have to consider the virtualisation platforms in use by your systems and those in the cloud.
Say you are using a VMware platform for internal virtualisation and the cloud provider you have chosen is also a VMware house. Assuming they present the necessary interfaces to their customers, you could use vCloud Connector to hook the two worlds together and manage them as a single whole.
I have seen it done and it is a great way of making your worlds manageable. And if you are a Microsoft house don't worry, you can use Hyper-V's assorted tools to achieve pretty much anything you want with regard to inter-site replication, management and the like.
If you have different platforms at your various sites all is not lost; you still have a couple of options. First, you could look to one of the third-party management systems that work with multiple cloud installations – RightScale Cloud Management, for example, or if you are feeling geeky then something like Opscode Chef.
Alternatively, you can go for a DIY approach, in which case you just need to think about it in the same way as you would if you had a mix of platforms within your self-managed infrastructure.
Step three: generic integration
This is all about standards. If you manage a heterogeneous environment on your premises, then why not look to extend that management to the cloud elements as well?
System monitoring is a classic example. SNMP is just SNMP, so there should be no problem dipping into the cloud service's elements so long as they are open to you (and on the elements you control such as your virtual servers and virtual firewalls this should be the case).
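As a small illustration, here is what polling one of those cloud-hosted elements might look like, shelling out to the standard net-snmp snmpget client. It assumes SNMPv2c with a read-only community string, and the hostname and community below are invented; in practice you would simply point your existing monitoring platform at the same OIDs.

```python
import subprocess

# Hypothetical target: a virtual firewall in the cloud that you control.
HOST = "fw01.cloud.example.com"
COMMUNITY = "monitoring-ro"       # read-only community string (assumption)
OIDS = [
    "1.3.6.1.2.1.1.3.0",          # sysUpTime
    "1.3.6.1.2.1.1.5.0",          # sysName
]

for oid in OIDS:
    # The same query any network management system would make.
    result = subprocess.run(
        ["snmpget", "-v2c", "-c", COMMUNITY, HOST, oid],
        capture_output=True, text=True, timeout=10,
    )
    print(result.stdout.strip() or result.stderr.strip())
```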
Similarly, any tool that you use to manage the operating system of an in-house server should be able to manage cloud-based servers in just the same way. And if you have extended your Active Directory (other directory services are available) between on-premise and the cloud, managing your server operating system world should be seamless.
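In the same spirit, here is one sketch of the "same tools on both sides of the link" idea: running an identical health check over SSH against an in-house server and a cloud server from a single inventory. The hostnames are invented, it assumes key-based SSH access, and it uses the paramiko library purely for illustration; in reality you would lean on whatever configuration management tooling you already run.

```python
import paramiko

# One inventory covering both sides of the hybrid setup (hostnames are invented).
SERVERS = [
    "app01.onprem.example.com",   # in-house virtual server
    "app02.cloud.example.com",    # same role, hosted in the cloud
]

for host in SERVERS:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username="ops", timeout=10)  # assumes key-based auth
    _, stdout, _ = client.exec_command("uptime")
    print(f"{host}: {stdout.read().decode().strip()}")
    client.close()
```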
Okay, you don't get the low-level management of a virtualisation layer in this model, but that doesn't prevent you from doing everything else in a coherent, integrated way.
Step four: know what you can't do
Unless you are really lucky you won't be able to integrate your world with the cloud setup at a very low level.
What can you do about this? Frankly, nothing. True, if your cloud provider is small and flexible then it may allow you such things as read-only access to infrastructure modules – but if a provider offered me this I'd instantly wonder whether it bought its ISO27001 certificate from Tesco and whether there's a line of horses tied up outside the tavern next door.
Cloud providers that give “important” customers low-level access to their infrastructure kit are to be feared: you don't let customers sniff at the bits of the world that are shared with other customers or might give clues to potential security configuration holes.
Accept, then, that you won't be able to set up span ports on network switches, or see what compression options are on the underlying storage, or check out the saturation on the uplinks between physical server hosts.
Instead be content with studying the service-level agreement to be sure you understand what you are getting and that if there is a problem the provider's team will diagnose the elements you don't have access to.
Step five: application integration
The reason you have an infrastructure at all is that you are going to layer applications on top of it.
So if you are going to have a hybrid cloud setup you need to be confident that the applications will work. Happily, there are a handful of simple rules to follow.
Replication speed: you are probably looking to have some of your applications replicated in both locations to give yourself the ability to fail over to the secondary service should the primary one keel over and die.
In that case, you need the primary server to be able to replicate in real time to the secondary – which means ensuring there is enough bandwidth between locations. Before you pick your link, work out the throughput your replication traffic will actually generate.
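A back-of-envelope sizing sketch makes the point; the change rate below is an invented figure, so plug in your own measurements.

```python
# Rough link sizing: how much bandwidth does real-time replication need?
# The figures below are illustrative assumptions, not measurements.

change_rate_gb_per_hour = 20   # peak data churn generated by the primary (assumption)
overhead_factor = 1.3          # protocol/encryption overhead plus burst headroom

bits_per_hour = change_rate_gb_per_hour * 8 * 1024**3
required_mbps = bits_per_hour / 3600 / 1_000_000 * overhead_factor

print(f"Sustained replication traffic: ~{required_mbps:.0f} Mbit/s")
# ~62 Mbit/s, so a 100 Mbit/s link is comfortable and a 10 Mbit/s VPN is not.
```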
Accessibility: in the event that you need to fail over to the secondary site, you need to understand how your users will access the systems there.
Let's take the example of a desktop app that is back-ended by an Oracle database, where that database is replicated between two sites. The Oracle desktop driver has a config file that describes the various database connections available to that machine.
It is a simple job to configure both your primary and secondary Oracle servers in that file, and if you have done so the driver will magically try the secondary if the primary is unavailable.
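For the Oracle case the config file in question is tnsnames.ora, and a minimal entry along these lines lists both servers so the client walks down the list until one answers. The hostnames and service name here are invented for illustration.

```
SALES =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = off)
      (FAILOVER = on)
      (ADDRESS = (PROTOCOL = TCP)(HOST = db-primary.onprem.example.com)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = db-secondary.cloud.example.com)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = sales)
    )
  )
```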
Consider all those little nuances and test your solutions up front rather than finding out empirically when something goes wrong.
Failback: failover in some applications is rather like falling through a trapdoor – it takes seconds to fail over but hours or days to bugger about resynchronising everything to fail back. Make sure you understand this and test both failover and failback as soon as you think it is working.
Nothing is simple
Cloud integration is no different in principle from general on-premise integration.
You figure out how you want your applications to interact with the users and with each other, you connect the systems together, you pick the most effective techniques for managing the integrated whole (though in the cloud you may be prevented from peeking into things at the low level you are used to), and you test it all thoroughly.
No, it is not simple, but nor are most things to do with infrastructure. It is, however, really not that much harder than running a self-managed setup. And that's what matters. ®