Ethernet reaches for the hyper-scale cloud

To infinity and beyond

What if the largest Ethernet networks we see today are just precursors, initial steps on the path to what's been called hyper-scale cloud networking?

The term "hyper" is used generally to describe something almost unfathomably large. We might, for example, say that a regional group of airports is a small air transport network, a national one is a large network, and the global air-transport system, with its hundreds of airports, thousands of planes and millions of flights a year carrying billions of passengers, is a hyper network.

A hyper-scale Ethernet network is global and embraces tens of thousands of cables and switches, millions of ports, and trillions, perhaps quadrillions or even more, of packets of data flowing across the network each year.

The Ethernet used for such a network has not been developed yet but it will be based on standards and speeds that are coming into use now.

Breaking the speed limit

Currently we are seeing 10Gbps Ethernet links and ports being used for high data throughput end-points of Ethernet fabrics. The inter-switch links, the fabric trunk lines, are moving to 40Gbps with backbones and network spines beginning to feature 100Gbps Ethernet.

For example, ISP iiNet is readying itself for the National Broadband Network being developed in Australia by using Juniper T1600 routers in the Asia-Pacific 100Gbps Ethernet backbone trial.

Dell-owned Force10 Networks has announced 40Gbps Ethernet switches. Brocade, Cisco and Huawei also have 100Gbps Ethernet product.

Both 40Gbps and 100Gbps Ethernet are specified in the IEEE 802.3ba standard, with frames transmitted in parallel across several 10Gbps or 25Gbps lanes.
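The lane arithmetic is simple enough to sketch. A minimal Python illustration of the lane layouts 802.3ba allows (the dictionary keys are descriptive labels, not formal standard names):

```python
# IEEE 802.3ba builds one high-rate link from parallel lower-rate lanes.
# Each entry maps a descriptive label to (lane count, per-lane Gbps).
lane_layouts = {
    "40GbE  (4 x 10G)":  (4, 10),
    "100GbE (10 x 10G)": (10, 10),
    "100GbE (4 x 25G)":  (4, 25),
}

for name, (lanes, gbps) in lane_layouts.items():
    print(f"{name}: {lanes * gbps} Gbps aggregate")
```

Either lane layout reaches the same 100Gbps aggregate; the 4 x 25G option simply needs fewer, faster lanes.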

Ethernet speeds do not increase in step with Moore's Law, because raising optical cable transmission speeds is not primarily a digital problem but an analogue one. Thus there is no easy path to 1,000Gbps Ethernet, even though hyper-scale Ethernet fabrics might well use it for backbone links.

Not good enough

Away from wire considerations, ordinary Ethernet is ludicrously mismatched to hyper-scale networking requirements: it is prone to losing packets and offers unpredictable packet delivery times, behaviour known as being non-deterministic.

Fortunately, in an attempt to layer Fibre Channel storage networking on Ethernet, the IEEE is developing Data Centre Ethernet to stop packet loss and provide predictable packet delivery.

This involves congestion notification, which tells sending devices to slow down before packets are dropped (802.1Qau); priority-based flow control, which pauses individual traffic classes rather than a whole link so important Ethernet traffic gets through a congested network (802.1Qbb); and Enhanced Transmission Selection (ETS), which allocates specific amounts of bandwidth to traffic types (802.1Qaz).

A further piece provides a way for the configuration data of network devices to be maintained and exchanged in a consistent manner, riding on the LLDP discovery protocol (802.1AB).

Where standardisation efforts seem to be failing is in coping with the limitations of Ethernet's Spanning Tree Protocol (STP).

As Ethernet sprays packets all over the network, this protocol is in place to prevent endless loops. It computes a loop-free subset of links (the spanning tree) inside a mesh network of connected Layer-2 bridges. Spare links are set aside for use as redundant paths if the main one fails, but this means that not all paths are used and network capacity is wasted.

Trill (Transparent Interconnection of Lots of Links) is an IETF standard aiming to get around this, and is supported by Brocade and Cisco. It provides for multiple path use in Ethernet and so does not waste bandwidth.

Brocade tells us that Trill:

• Uses shortest path routing protocols instead of STP

• Works at Layer 2, so protocols such as FCoE can make use of it

• Supports multi-hopping environments

• Works with any network topology and uses links that would otherwise have been blocked

• Can be used at the same time as STP

The main benefit of Trill is that it frees up capacity on your network that can’t be used with STP, allowing Ethernet frames to take the shortest path to their destination. Trill is also more stable than STP, offering faster recovery time following a hardware failure.
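The capacity STP leaves idle is easy to see on a toy topology. A minimal Python sketch, using an invented four-switch mesh (the switch names and links are illustrative, not from any real network): STP keeps only one loop-free tree of links, so everything outside that tree sits blocked, whereas a Trill fabric could forward on all of them.

```python
from collections import deque

# Hypothetical mesh of four switches, A-D, joined by five links.
links = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")}

def spanning_tree(links, root="A"):
    """BFS from the root, keeping only the first link that reaches each
    switch - a stand-in for STP's single loop-free tree."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    tree, seen, queue = set(), {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen:
                seen.add(v)
                tree.add(frozenset((u, v)))
                queue.append(v)
    return tree

tree = spanning_tree(links)
blocked = {frozenset(link) for link in links} - tree
print(f"{len(tree)} forwarding links, {len(blocked)} blocked")
# STP forwards on 3 of the 5 links; Trill could use all 5.
```

Two of the five links carry no traffic at all under the tree; scaled up to a hyper-scale fabric, that idle fraction is the bandwidth Trill reclaims.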

Late arrival

A problem with hyper-scale networks is latency. The longer it takes for network devices to respond to incoming traffic and send it on its way, the longer it takes for data to traverse the network.

Brocade's VDX 6730 data centre switch, a 10GbE fixed port switch, shrinks the time needed for traffic to pass through it with a port-to-port latency of 600 nanoseconds, which is virtually instantaneous.

This involves a single ASIC. Local switching across ASICs on a single VDX 6730 switch for intra-rack traffic has a latency of 1.8 microseconds. This helps designers build a network with no over-subscription for deterministic network performance and faster application response time.

Other companies such as Arista are specialising in delivering low-latency switches. Arista chief executive Andy Bechtolsheim says the company’s switches with a 3.5 microsecond latency are much faster than the 5-microsecond Nexus 5000, the 15-microsecond Nexus 7000 and the 20-plus microsecond latency of the Catalyst 6500.

Each switch crossing will contribute its own minuscule delay

We can see that as the number of switch crossings in a network rises, packet delivery will take longer and longer. A five-switch crossing with Arista will take 17.5 microseconds whereas it will take 100 microseconds or more with Cisco's Catalyst 6500.
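The arithmetic behind those figures is just per-switch latency multiplied by hop count. A small Python check, using the latencies quoted above and ignoring cable propagation and queueing delays:

```python
# Per-switch port-to-port latencies quoted in the article, in microseconds.
latency_us = {
    "Arista": 3.5,
    "Nexus 5000": 5.0,
    "Nexus 7000": 15.0,
    "Catalyst 6500": 20.0,
}

def path_latency(switch, hops):
    """Total switching delay for a packet crossing `hops` switches of one
    type (cable propagation and queueing excluded)."""
    return latency_us[switch] * hops

print(path_latency("Arista", 5))         # 17.5 microseconds
print(path_latency("Catalyst 6500", 5))  # 100.0 microseconds
```

The gap between vendors widens linearly with every extra switch crossing, which is exactly why hyper-scale designs care about hop counts.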

This is not to knock Cisco, which is as aware of the problem as anyone, but to point out the issue. A hyper-scale Ethernet network will involve many switch crossings and each will contribute its own minuscule delay.

It is possible, but unlikely, that new Ethernet standards will evolve to cope with this by specifying latency either at a switch level or at a network level.

To return to the air transport analogy, this would mean maximum times for aircraft turnaround at airports being adhered to, which is clearly impractical, however much passengers would love it.

What we can be sure of is that Ethernet networks will evolve towards using multiple links and, somehow, reduced switch counts to keep traffic speed high.

A note of warning: "hyper" is not a definitive term. It is marketing speak, and there is no network device count boundary which once crossed will put it in hyper-networking territory.

In fact "hyper" might always be just over the horizon. ®

Biting the hand that feeds IT © 1998–2021