As server farms grow and their workload changes, the design and structure of the networks that serve them must also change. End-of-row switching is increasingly giving way to top-of-rack switching, and tiered networks may need to be replaced – or perhaps augmented – by more mesh-like Ethernet fabrics.
The increasing density of servers, both physical and virtual, is clear. What is less obvious, says Bala Pitchaikani, a product line management director at Force10 Networks, is just how much demand this can place on network bandwidth.
"For example, you could have four servers in 2U, each with two or four Xeons with four to six cores per chip: that could be as much as 96 cores in 2U," he says.
If each server has a 10 Gigabit Ethernet (GbE) link, "you are seeing 40G per 2U. Then you can have 20 times 2U in each rack, so that's 20 by 40G from the rack".
And once you add in converged networking, with storage traffic running over the same wire, even that may not be enough to keep all those cores and their virtual machines fed and watered.
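Pitchaikani's back-of-envelope sums can be checked in a few lines. The sketch below simply multiplies out the figures he quotes, taking the upper end of each range. The constant and function names are illustrative assumptions, and the totals count raw link speed only, not usable throughput or converged storage traffic.

```python
# Figures taken from the quoted example; upper end of each range assumed.
SERVERS_PER_2U = 4        # four servers in one 2U chassis
SOCKETS_PER_SERVER = 4    # "two or four Xeons" -> 4
CORES_PER_SOCKET = 6      # "four to six cores per chip" -> 6
LINK_GBPS = 10            # one 10GbE link per server
CHASSIS_PER_RACK = 20     # "20 times 2U in each rack"


def cores_per_chassis():
    """CPU cores packed into a single 2U chassis."""
    return SERVERS_PER_2U * SOCKETS_PER_SERVER * CORES_PER_SOCKET


def gbps_per_chassis():
    """Raw Ethernet bandwidth leaving one 2U chassis, in Gbit/s."""
    return SERVERS_PER_2U * LINK_GBPS


def gbps_per_rack():
    """Raw Ethernet bandwidth leaving a full rack, in Gbit/s."""
    return CHASSIS_PER_RACK * gbps_per_chassis()


print(cores_per_chassis())  # 96 cores in 2U
print(gbps_per_chassis())   # 40 Gbit/s per 2U
print(gbps_per_rack())      # 800 Gbit/s from the rack
```

On these assumptions a single rack presents 800Gbit/s of server-facing bandwidth before any storage traffic is added.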
“The next generation of Ethernet fabric adapters brings higher performance and more flexibility, with a single card having multiple personalities: 10G Ethernet, Fibre Channel, iSCSI and Fibre Channel over Ethernet,” says Simon Pamplin, Brocade's UK and Ireland systems engineering manager.
“Our current boxes offer 60 ports of 10G at the top-of-rack. We are trying to make sure that the development of the network keeps pace with server development, because you can put so much performance in servers.”
The second change factor that needs to be considered is the workload: flatter cloud-type architectures, more virtualisation and the need to move virtual machines around, and more server-to-server communication in general, all mean different traffic flows.
In particular, it means less traditional north-south traffic, going from server to core to client, or server to core to server, and more east-west traffic going server to server, or even client to client.
“Lots of traffic doesn't go to the core any more, except maybe for big database queries,” says Pitchaikani.
“The concept of tiering within software – for example, web, application, database, storage – is starting to fall apart. As unstructured data and the cloud become key, we no longer divide application and database.”
In pod we trust
A popular way to deal with this, and also to provide high scalability, is to create interlinked pods, racks, containers or compute blocks, says Johan Ragmo, data business development manager at Alcatel-Lucent Enterprise.
These provide large numbers of standardised hosts for virtualised servers, plus fabric-style fast network connections both within the pod and between pods, as well as back to the core.
Of course, core or data centre switches with plenty of wire-speed ports – 10GbE at least – are still needed, he says. But bandwidth-hungry applications will increasingly need higher speeds for inter-switch links too, which is why a device such as Alcatel-Lucent's modular OmniSwitch 6900 can offer six 40GbE uplinks alongside 40 10GbE ports.
"With unified communications, SMP and so on, you will have a lot of east-west traffic between servers," says Ragmo.
“The traditional way is into the core and back, but in a pod you could take three 40GbE links and connect them to three other pods in a star or ring, say.
“Yes, it needs more switch performance in the rack for this architecture, for example to do 40GbE inter-switch links, but it also means you can do with fewer core switches and you need less power and space.”
The alternative approach of providing this level of server-to-server connectivity via the core would require exponentially more core switches, he adds.
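To make the inter-pod wiring concrete, the sketch below counts the links needed to join a handful of pods directly, either as a full mesh or as a ring. It is one reading of Ragmo's example, not a description of any vendor's design: four pods with three uplinks each happens to work out to a full mesh. The pod names and helper functions are hypothetical.

```python
from itertools import combinations

UPLINK_GBPS = 40  # assumed speed of each inter-pod link (40GbE)


def full_mesh_links(pods):
    """Every pod is cabled directly to every other pod."""
    return list(combinations(pods, 2))


def ring_links(pods):
    """Each pod is cabled to its two neighbours; needs at least three pods."""
    if len(pods) < 3:
        raise ValueError("a ring needs at least three pods")
    n = len(pods)
    return [(pods[i], pods[(i + 1) % n]) for i in range(n)]


def uplinks_per_pod(links):
    """Count how many inter-pod ports each pod must dedicate."""
    counts = {}
    for a, b in links:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


pods = ["pod-a", "pod-b", "pod-c", "pod-d"]  # hypothetical pod names
mesh = full_mesh_links(pods)
ring = ring_links(pods)

print(len(mesh), uplinks_per_pod(mesh))  # 6 links, 3 uplinks on every pod
print(len(ring), uplinks_per_pod(ring))  # 4 links, 2 uplinks on every pod
print(len(mesh) * UPLINK_GBPS)           # 240 Gbit/s of direct pod-to-pod links
```

The trade-off is visible in the counts: a mesh gives every pair of pods a direct path at the cost of more uplink ports per rack, while a ring saves ports but sends some traffic through an intermediate pod.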
Putting this much capacity at the end of the row within a container or pod is possible, but challenging.
“Vendors that cannot offer a small form-factor switch will push end-of-row,” says Pitchaikani. But he thinks it is much better to distribute the network evenly, which tends to mean top-of-rack.
“You also need to provide, for example, load balancing at top-of-rack, so you need to push network compute functions into an appliance module,” he says.
“However, for scale you might want subcontainers with compute nodes and end-of-row switches. Or you could have 10G each to dual active-active Z9000 distributed-core switches, leaving plenty of spare ports for inter-switch links.
“I believe top-of-rack is the better way. The standard container takes ten racks, so you could have nine for servers and one for switching and so on, but that's not balanced.”
Pamplin says it is the fabric part of this that is key – the move to high-performance and high port-count flat networks that bypass the spanning tree protocol and instead work more like meshes, with multiple active-active data paths. This enables compute blocks to interlink without going via the core.
“It also comes down to the cabling. If it is organised for end-of-row switching, you can give high performance to that,” he says. ®