Network switches look different in the cloud
Adapting to changing patterns
Cloud computing takes more than just a philosophical shift. It requires new skills, processes and architectures.
In particular, traffic patterns in cloud networks can be quite different from those of the familiar enterprise network, and the scale of operation can be significantly larger.
That, according to experts in the field, means planning and building your network differently, both physically and logically.
One of the biggest changes is the allocation of bandwidth. In an enterprise network, traffic flow is primarily vertical, or north-south: that is, up to the core and down to the clients. It is also pre-defined to some degree: you know which apps run on which servers, and therefore how much bandwidth is needed and where.
By contrast, in a cloud, apps are virtualised and logically decoupled from the hardware layer, so an app can run pretty much anywhere and is moveable.
That means you have to build a standardised infrastructure where anything can run anywhere.
“Applications don't correspond to physical space any more,” says Ken Duda, vice-president of software engineering at cloud switching specialist Arista Networks.
He adds that it also means significant horizontal, or east-west, traffic, once you have automated systems moving those virtual machines from server to server as their demands for network capacity change.
Leaf and spine
Part of the solution to this is simply more capacity – for example, the latest switches offer wirespeed 10Gig Ethernet for as little as 2W per port, plus 40Gig and, in future, 100Gig for east-west interswitch links. But most agree it also demands a different network topology.
Many vendors propose replacing the familiar hub-and-spoke topology with a leaf-and-spine approach, similar to the non-blocking Clos network.
It is possible to build a high-bandwidth switching infrastructure that gives lots of east-west bandwidth using standard protocols. Most network experts also say that the network should be flatter with fewer tiers: two rather than three, and perhaps even just one.
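To make the leaf-and-spine idea concrete, here is a minimal Python sketch of such a fabric. It assumes every leaf switch has one link to every spine switch; the switch names and the counts (four spines, eight leaves) are illustrative assumptions, not figures from any vendor. It shows that any two leaves are always two hops apart, with one equal-length path per spine.

```python
def build_leaf_spine(num_spines, num_leaves):
    """Build a fabric in which every leaf links to every spine (assumed design)."""
    spines = [f"spine{i}" for i in range(num_spines)]
    leaves = [f"leaf{i}" for i in range(num_leaves)]
    links = {(leaf, spine) for leaf in leaves for spine in spines}
    return spines, leaves, links


def paths_between(src_leaf, dst_leaf, spines, links):
    """List every two-hop path of the form leaf -> spine -> leaf."""
    return [
        (src_leaf, spine, dst_leaf)
        for spine in spines
        if (src_leaf, spine) in links and (dst_leaf, spine) in links
    ]


# Hypothetical fabric size, chosen only for illustration.
spines, leaves, links = build_leaf_spine(num_spines=4, num_leaves=8)
paths = paths_between("leaf0", "leaf5", spines, links)

print(len(links))  # 32 leaf-to-spine links in total
print(len(paths))  # 4 parallel east-west paths, one per spine
```

Adding a spine in this model adds one more path between every pair of leaves, which is why the design scales east-west bandwidth without adding tiers.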
This is partly to simplify network management, but also to reduce latency and increase throughput, according to Bala Pitchaikani, senior director of product line management at Force10 Networks.
One popular idea is to replace spanning tree protocol and its limitations – including blocking off alternative paths, which does not sit well in a fully virtualised any-to-any environment – with a single big Layer 2 network where multiple non-blocking paths between end points can coexist.
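The cost of spanning tree in a meshed fabric can be illustrated with a short Python sketch. It is a deliberate simplification: a breadth-first search from an arbitrarily chosen root stands in for the protocol's root election and path-cost calculation, and the fabric size (four spines, eight leaves) is an assumption. The point is only that a loop-free tree can keep one link per non-root switch and must block the rest.

```python
from collections import deque


def leaf_spine_links(num_spines, num_leaves):
    """Every leaf connects to every spine (assumed fabric design)."""
    return [
        (f"leaf{l}", f"spine{s}")
        for l in range(num_leaves)
        for s in range(num_spines)
    ]


def spanning_tree(links, root):
    """Return the links kept by a breadth-first spanning tree from root.

    This approximates what spanning tree leaves forwarding; the real
    protocol elects its root by bridge ID and uses configured path costs.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                tree.add(frozenset((node, neighbour)))
                queue.append(neighbour)
    return tree


links = leaf_spine_links(num_spines=4, num_leaves=8)
tree = spanning_tree(links, root="spine0")
blocked = [link for link in links if frozenset(link) not in tree]

print(len(links), len(tree), len(blocked))  # 32 11 21
```

In this model 21 of the 32 links carry no traffic, which is the capacity that multi-path schemes aim to recover.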
Just a large cloud
The result is an Ethernet fabric analogous to today's virtualised storage fabrics, says Simon Pamplin, Brocade's UK and Ireland systems engineering manager.
“We've learned from the storage industry how to manage in virtualised environments,” he says. “A storage network is just a large cloud.”
The two leading non-proprietary options for replacing spanning tree are the IEEE's shortest path bridging and the IETF's Trill (transparent interconnect of lots of links).
The problem is that having two of them leaves plenty of room for vendor confusion and semi-proprietary alternatives. And both are likely to need new hardware.
But there are other schemes on offer. For example, Pitchaikani says that as well as backing Trill over L2, Force10 recently introduced a 64-port switch supporting a Layer 3 alternative called equal-cost multi-path routing.
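For readers unfamiliar with equal-cost multi-path routing, the general technique can be sketched in a few lines of Python: hash the fields that identify a flow and use the result to pick one of several equally good next hops, so that packets of one flow stay on one path while different flows spread out. This is not any vendor's implementation. Real switches use hardware hash functions; SHA-256, the addresses and the spine names here are assumptions made for illustration.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for a flow by hashing its identifying fields.

    flow is assumed to be a 5-tuple:
    (source IP, destination IP, IP protocol, source port, destination port).
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical equal-cost next hops and flow.
spines = ["spine0", "spine1", "spine2", "spine3"]
flow = ("10.0.1.5", "10.0.7.9", 6, 49152, 443)

chosen = ecmp_next_hop(flow, spines)
print(chosen)

# Flows that differ only in source port are spread across the next hops.
used = {
    ecmp_next_hop(("10.0.1.5", "10.0.7.9", 6, port, 443), spines)
    for port in range(49152, 50152)
}
print(sorted(used))
```

Because the choice depends only on the flow's own fields, every packet in a flow takes the same path and arrives in order, while the load is shared across all available links.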
Duda, on the other hand, prefers “network virtualisation – a service model virtualised at the edge, giving the appearance of lots of L2 networks”.
He adds: “Layer 2 scales well to a couple of thousand servers, which is enough for smaller clouds. Beyond that the simplest thing is what Google did: a Layer 3 cloud that includes a broadcast service.”
Whichever multi-path scheme you end up using, the next layer of automation – and complexity – is the ability to move the network connection along with the virtual machine.
Pamplin says that is why Brocade's Ethernet fabric technology also includes automatic migration of port profiles.
Other vendors do broadly similar things. Arista, for example, has provided hooks in its EOS switch operating system so cloud management tools can dynamically modify an 802.1Q virtual LAN.
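The idea of the network connection following the virtual machine can be modelled in a short Python sketch. Everything in it is hypothetical: the `PortProfile`, `Switch` and `migrate_vm` names, the fields in the profile and the VM name are invented for illustration and do not correspond to Brocade's or Arista's actual interfaces. The only fixed fact used is that 802.1Q VLAN IDs run from 1 to 4094.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PortProfile:
    """Hypothetical bundle of network settings attached to one VM."""
    vlan_id: int
    qos_class: str

    def __post_init__(self):
        # 802.1Q reserves VLAN IDs 0 and 4095, leaving 1-4094 usable.
        if not 1 <= self.vlan_id <= 4094:
            raise ValueError(f"invalid 802.1Q VLAN ID: {self.vlan_id}")


@dataclass
class Switch:
    """Hypothetical switch holding the profiles of locally attached VMs."""
    name: str
    port_profiles: dict = field(default_factory=dict)  # VM name -> PortProfile


def migrate_vm(vm, source, destination):
    """Move a VM's profile so its settings follow it to the new switch."""
    profile = source.port_profiles.pop(vm)
    destination.port_profiles[vm] = profile
    return profile


leaf_a = Switch("leaf-a", {"web-01": PortProfile(vlan_id=120, qos_class="gold")})
leaf_b = Switch("leaf-b")

migrate_vm("web-01", leaf_a, leaf_b)
print(leaf_b.port_profiles["web-01"].vlan_id)  # 120
print("web-01" in leaf_a.port_profiles)        # False
```

In a real deployment the trigger would come from the cloud management layer rather than a direct function call, but the effect is the same: the VM keeps its VLAN and policy wherever it lands.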
Finally, while we are still at the switching layer, the need to increase speed and minimise network latency also means doing more locally.
One solution is to have an application service module or blade within the switch so that some tasks, such as security and load balancing, can be done within the rack instead of being farmed out across the cloud, says Johan Ragmo, data business development manager for northern Europe at Alcatel-Lucent Enterprise.
This also ties in with the standardised block, pod or container approach favoured by Alcatel-Lucent, Brocade, Cisco and others, where servers, storage and switches are integrated into a pre-tested modular building block.
Shipped as a single standard item, this hooks up with the automated provisioning layer and automatically adds its resources to the cloud pool.