
Graviton 3: AWS attempts to gain silicon advantage with latest custom hardware

Key to faster, more predictable cloud

RE:INVENT AWS had a conviction that "modern processors were not well optimized for modern workloads," the cloud corp's senior veep of Infrastructure, Peter DeSantis, claimed at its latest annual Re:invent gathering in Las Vegas.

DeSantis was speaking last week about AWS's Graviton 3 Arm-based processor, putting a bit more meat on the bones, so to speak – and in his comment the word "modern" is doing a lot of work.

The computing landscape looks different from the perspective of a hyperscale cloud provider; what counts is not flexibility but intensive optimization and predictable performance.

Custom hardware is the obvious solution and in 2015 Amazon acquired Israeli chip company Annapurna Labs, which remains a team within AWS working on Graviton and other custom CPUs.

"Nitro is the reason that AWS got started on building its own chips," said DeSantis, talking the Re:invent audience through the purpose of FTL (Flash Translation Layer) chips on SSDs, which make a collection of flash memory chips look like a storage drive while also optimising their wear.

AWS found that using vendor-supplied FTLs, which come in many variants, led to inconsistent performance and occasional pauses for garbage collection. Nitro cards took over this function, among other roles, giving AWS better and more predictable performance. Another advantage was that Nitro-managed encryption has near-zero overhead so that when encrypted data at rest is required, there is no penalty.

Nitro has storage, networking, security and hypervisor roles – but what of the CPU? In late 2018 AWS introduced Graviton, believing, like Apple, that the efficiency of the Arm architecture gives it an advantage over x86. Graviton 2 was first previewed in late 2019.

Now the company has introduced Graviton 3, still in preview, but already tried by select customers, like Twitter, whose head of platform Nick Tornow declared that: "We found Graviton3-based C7g instances deliver 20-80 per cent higher performance vs. Graviton2 based C6g instances, while also reducing tail latencies by as much as 35 per cent."

Graviton 3 core statistics


C7g is currently the only Graviton 3 VM instance, and supports Elastic Fabric Adapter (EFA), an AWS enhanced network interface that enables low-latency connection to other instances for applications such as HPC (High Performance Computing) workloads.

AWS said at Re:invent that Graviton servers feature a Nitro card capable of managing three Graviton 3 processors.

Graviton 3 specifications

  • 2.6 GHz clock speed
  • 300 GB/sec max memory bandwidth
  • DDR5 RAM
  • 64 cores
  • Chiplet-based design with seven silicon dies
  • 256-bit SVE (Scalable Vector Extension)
  • 55 billion transistors (Graviton 2: 30 billion)

DeSantis was careful to explain that the core statistics do not tell the whole story. Graviton 2 also had 64 cores and a 2.5 GHz clock speed, but the new processor has greater width, meaning that twice as much data can be processed in a single clock cycle, and that the number of instructions each core can work on concurrently has increased from 5 to 8 per cycle.

Graviton 2 had Arm's Neoverse N1 cores; it is not yet clear whether Graviton 3 has Neoverse N2 (which makes sense given the stated 256-bit SVE, as our sister title The Next Platform observes), or Neoverse V1 as SemiAnalysis insists. Each core has 50 per cent more memory bandwidth than Graviton 2.

Another feature of Graviton 3 is support for bfloat16 – a 16-bit format formed by truncating a 32-bit floating point value, ideal for machine learning where speed is more important than precision.
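To make the "truncated 32-bit" idea concrete, here is a minimal Python sketch that keeps only the top 16 bits of a float32 (the sign, the 8-bit exponent and 7 bits of mantissa). The function names are ours, chosen for illustration, and plain truncation is an assumption made for simplicity: real hardware and libraries generally round rather than chop, and nothing here reflects how Graviton 3 implements the format.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit pattern left after dropping the low 16 bits of a float32."""
    # Pack as an IEEE 754 single, then reinterpret the same bytes as a 32-bit integer.
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16  # keep sign, 8 exponent bits, top 7 mantissa bits


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a float by zero-filling the low bits."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def to_bfloat16(x: float) -> float:
    """Round-trip a value through the truncated format to show the precision loss."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for sample in (1.0, 3.14159, 1e30):
        print(sample, "->", to_bfloat16(sample))
```

Because the exponent field is untouched, the range matches float32; only the precision drops, to roughly two or three decimal digits.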

Security is also enhanced, with a new pointer authentication feature.

"Before return addresses are pushed on to the stack, they are first signed with a secret key and additional context information, including the current value of the stack pointer. When the signed addresses are popped off the stack, they are validated before being used. An exception is raised if the address is not valid, thereby blocking attacks that work by overwriting the stack contents with the address of harmful code," says AWS evangelist Jeff Barr in a blog, with an invitation to compiler and operating system developers to get in touch to learn how to take advantage.
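The sign-then-validate flow Barr describes can be modelled in a few lines of Python. This is a conceptual sketch only: on Arm the work is done by dedicated instructions with a hardware-held key, whereas here a truncated HMAC-SHA256 stands in for the signing function, and the 48-bit address width, 16-bit signature field and all names are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ADDR_BITS = 48                      # assumed usable address width
ADDR_MASK = (1 << ADDR_BITS) - 1
SECRET_KEY = os.urandom(16)         # stands in for the key kept by the hardware


def _signature(address: int, stack_pointer: int, key: bytes) -> int:
    """Derive a 16-bit code from the address and its context (the stack pointer)."""
    message = address.to_bytes(8, "little") + stack_pointer.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def sign(address: int, stack_pointer: int, key: bytes = SECRET_KEY) -> int:
    """Place the signature in the unused upper bits before the address is stored."""
    address &= ADDR_MASK
    return (_signature(address, stack_pointer, key) << ADDR_BITS) | address


def authenticate(signed: int, stack_pointer: int, key: bytes = SECRET_KEY) -> int:
    """Validate a stored address; raise if the signature does not match."""
    address = signed & ADDR_MASK
    if (signed >> ADDR_BITS) != _signature(address, stack_pointer, key):
        raise ValueError("pointer authentication failed")
    return address


if __name__ == "__main__":
    sp = 0x7FFF_0000_1000
    stored = sign(0x0040_1234, sp)
    print(hex(authenticate(stored, sp)))      # the untouched address validates
    tampered = stored ^ (1 << ADDR_BITS)      # flip one signature bit
    try:
        authenticate(tampered, sp)
    except ValueError as err:
        print("blocked:", err)
```

An attacker who overwrites the stored value without knowing the key has to guess the signature bits, which is what turns a silent hijack into an exception.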

How specifications translate into real-world performance is always variable. Some of the figures chucked out by AWS are that Graviton 3 is 60 per cent more efficient than its predecessor; it offers up to double the floating-point and cryptographic performance; and up to three times better ML performance. The overall performance claim is a more modest 25 per cent. A slide presented by DeSantis shows gains ranging from 25 per cent for a Redis application to 60 per cent for Nginx.

Real-world performance improvement over Graviton 2 varies according to the workload


Users will do the sums and migrate to Graviton 3 where applications are compatible and there is significant cost and/or performance benefit, which there appears to be. The AWS investment in custom chips is a challenge to x86 chip vendors, and to competitors relying on commodity servers.

We reported last year on Microsoft's rumoured plans for custom Arm-based processors, while GCP (Google Cloud Platform) already has custom chips for ML and said in March that more custom hardware will follow – which does not take away from the head start AWS has achieved here. ®
