In the Epyc center: More Zen server CPU specs, prices sneak out of AMD

And a quick look at the chips' encrypted RAM tech


Updated Here it is: the official lineup of AMD's Epyc processors, which will go toe to toe with Intel's Xeons that utterly dominate the data center world.

Epyc. That's not a typo. AMD's desktop and laptop chips are called Ryzen, and its server-class family is called Epyc. Welcome to the new AyyyyyyyMD. Both Ryzen and Epyc are built from AMD's x86 Zen microarchitecture.

Is this AMD's big comeback? Can it hope to compete against monopoly player Intel? Blah, blah, blah – we'll go through all that later. For now, let's skip the opinions and instead talk specs: each Epyc part is a 14nm system-on-chip (SoC) processor fabricated by GlobalFoundries. There are four silicon dies in each package, rather than one mega-die, which is cheaper and easier to manufacture. Up to eight cores can be enabled per die, or up to 32 in total per processor package; each core can run one or two hardware threads.

The dies are all connected internally using AMD's Infinity Fabric – an enhanced version of HyperTransport – which also links the two processors in a dual-socket Epyc system. Each processor package can support up to 2TB of DDR4 RAM over eight channels, and has 128 PCIe lanes. When you pair two Epyc SoCs, each gives up 64 of its PCIe lanes to carry the Infinity protocol between the sockets, so a dual-socket system has 128 usable lanes in total – not 256 – just like a single-socket box.
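If the lane accounting seems odd, the arithmetic is simple enough to sketch in a few lines of C – the constants below are just the figures quoted above, not anything read from the hardware:

    #include <stdio.h>

    /* Back-of-envelope model of Epyc PCIe lane accounting, as described
     * above: every package exposes 128 lanes, and in a two-socket system
     * each package repurposes 64 of them for the Infinity Fabric link. */
    int usable_pcie_lanes(int sockets) {
        const int lanes_per_package = 128;
        const int lanes_for_interconnect = 64; /* per package, 2P only */
        if (sockets == 1)
            return lanes_per_package;
        return sockets * (lanes_per_package - lanes_for_interconnect);
    }

    int main(void) {
        printf("1P: %d lanes\n", usable_pcie_lanes(1)); /* 128 */
        printf("2P: %d lanes\n", usable_pcie_lanes(2)); /* 128 */
        return 0;
    }

Either way you populate the board, you end up with 128 usable lanes.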

AMD is, by the way, really pushing Epyc as a great single-socket chip, with dual-socket capabilities if you need it.

Die four me ... inside an AMD Epyc chip (the little loops represent the Infinity fabric)

The Epyc family supports AMD's encrypted memory features [whitepaper, manual], which work in one of three modes. The first is a transparent mode in which all writes to RAM are encrypted, and reads decrypted, using a key held in the memory controller. The key is generated by the BIOS during power-up and never leaves the controller – it cannot be read by any software. This cryptography is invisible to the running operating systems, applications, hypervisors, and virtual machines.
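If you want to check whether a chip offers these memory-encryption features, AMD documents a CPUID leaf for it. Here's a minimal C probe based on our reading of the whitepaper – the leaf number and bit positions come from that document, so treat them as documentation-derived rather than tested-on-silicon:

    #include <stdio.h>
    #include <cpuid.h>

    /* Probe AMD's memory encryption support. Per AMD's SME/SEV whitepaper,
     * CPUID leaf 0x8000001F reports the features: EAX bit 0 = SME,
     * EAX bit 1 = SEV, and EBX[5:0] gives the position of the C-bit in a
     * page-table entry (47 on these parts). Run on non-Epyc hardware this
     * will simply report no support. */
    int main(void) {
        unsigned int eax, ebx, ecx, edx;
        if (!__get_cpuid(0x8000001F, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 0x8000001F not available");
            return 1;
        }
        printf("SME supported: %s\n", (eax & 1) ? "yes" : "no");
        printf("SEV supported: %s\n", (eax & 2) ? "yes" : "no");
        printf("C-bit position: %u\n", ebx & 0x3F);
        return 0;
    }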

The second mode is SME (Secure Memory Encryption), in which the underlying operating system can mark individual pages of memory as encrypted or non-encrypted; the controller again takes care of the cryptography, using a key only it knows that is regenerated on every boot. Setting bit 47 – the so-called C-bit – in the physical address mapping of a given page enables encryption for that page.
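To make that concrete, here's an illustrative C fragment showing what flipping that bit looks like. A real operating system would do this while building its page tables in kernel mode; this just manipulates a stand-alone, made-up 64-bit page-table entry:

    #include <stdint.h>
    #include <stdio.h>

    /* Illustration only: how an OS would mark a page as encrypted under
     * SME. The C-bit lands in the physical-address field of the
     * page-table entry - bit 47 on these chips. */
    #define SME_C_BIT 47ULL

    static uint64_t mark_encrypted(uint64_t pte)   { return pte |  (1ULL << SME_C_BIT); }
    static uint64_t mark_unencrypted(uint64_t pte) { return pte & ~(1ULL << SME_C_BIT); }

    int main(void) {
        uint64_t pte = 0x000000012345f067ULL;   /* made-up PTE: addr + flags */
        printf("plain:     %#018llx\n", (unsigned long long)pte);
        printf("encrypted: %#018llx\n", (unsigned long long)mark_encrypted(pte));
        printf("back:      %#018llx\n",
               (unsigned long long)mark_unencrypted(mark_encrypted(pte)));
        return 0;
    }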

Both of these modes are designed to prevent miscreants with physical access to a box from being able to sniff the contents of the RAM from the buses while the computer is running, or stop thieves from seizing non-volatile RAM DIMMs and extracting sensitive information held on them. This is for people who are paranoid that someone is going to literally break into their machines.

The other mode is fscking insane: it's SEV (Secure Encrypted Virtualization). It is AMD's courageous attempt to provide encrypted virtual machines that are protected from the hypervisor, the underlying operating system, other VMs, and any other code on the machine.

Each VM is assigned an address space ID (ASID) as normal by the hypervisor, and this ID is tied to an encryption key held in the controller. When CPU core time is given to a virtual machine, the controller takes the VM's ASID, looks up its private key, and uses that for encrypting and decrypting all memory accesses on the fly. The hypervisor has its own ASID – zero – and can never see the keys. Thus not even a rogue or hijacked hypervisor can make sense of a virtual machine's contents, let alone any other software running in other VMs, because all the data will appear scrambled. The hypervisor and host operating system simply don't have the keys.
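Here's a toy C model of that trust arrangement – emphasis on toy: we've swapped the controller's AES-128 for a XOR stand-in, and the keys and key generator are invented, purely to show why ASID 0 reading another VM's memory yields garbage:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy model of SEV's key handling - NOT the real cipher (the hardware
     * uses AES-128 in the memory controller). The point is the trust
     * structure: keys live in a table indexed by ASID, the hypervisor owns
     * ASID 0, and nothing outside the controller can read another
     * ASID's key. */
    #define MAX_ASIDS 16

    static uint64_t asid_keys[MAX_ASIDS];   /* held inside the controller */

    static void controller_generate_keys(void) {
        for (int i = 0; i < MAX_ASIDS; i++)
            asid_keys[i] = 0x9E3779B97F4A7C15ULL * (i + 1); /* stand-in RNG */
    }

    /* XOR as a stand-in for AES: the same call en/decrypts. */
    static uint64_t controller_xcrypt(unsigned asid, uint64_t data) {
        return data ^ asid_keys[asid];
    }

    int main(void) {
        controller_generate_keys();
        uint64_t guest_secret = 0xCAFEF00DDEADBEEFULL;

        /* VM with ASID 3 writes to RAM: ciphertext is what hits the DIMMs. */
        uint64_t in_ram = controller_xcrypt(3, guest_secret);

        /* Hypervisor (ASID 0) reads the same physical address: garbage. */
        printf("hypervisor sees: %#018llx\n",
               (unsigned long long)controller_xcrypt(0, in_ram));

        /* The owning VM reads it back: plaintext. */
        printf("VM 3 sees:       %#018llx\n",
               (unsigned long long)controller_xcrypt(3, in_ram));
        return 0;
    }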

Here's where it gets weird. SEV is designed for paranoid people who don't trust whoever is hosting their virtual machines. The technology verifies that a VM started as expected and wasn't tampered with before or during boot-up, and that the encryption system is working correctly. This involves AMD holding a database of signing keys for each platform, and yeah... we'll dig into this in detail later.

All the cryptography (it's AES-128) happens on the fly before the data leaves the SoC, adding about 7ns of latency to each access. That translates into a performance hit of 1.5 per cent, we're told, when enabled. It works across multiple cores, and even with DMA in certain circumstances. It's all powered by an ARM Cortex coprocessor and AMD's custom firmware, all encased in the Epyc SoC – and thus it all hinges on that small chunk of hidden code not being buggy. The coprocessor also provides services such as secure boot, ensuring only cryptographically signed operating systems start up, if required.
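That 7ns-to-1.5-per-cent conversion passes a napkin test, at least. In the model below, the baseline DRAM latency and the fraction of runtime spent stalled on memory are our assumptions, not AMD's figures:

    #include <stdio.h>

    /* Napkin math for the encryption overhead: if a DRAM access normally
     * costs ~90ns and memory stalls eat ~20% of a workload's runtime
     * (both our assumptions, not AMD figures), a 7ns bump per access
     * slows the whole program by roughly stall_fraction * (7 / 90). */
    int main(void) {
        const double dram_ns = 90.0;        /* assumed baseline DRAM latency */
        const double extra_ns = 7.0;        /* AMD's quoted AES penalty */
        const double stall_fraction = 0.20; /* assumed share of runtime */
        double slowdown = stall_fraction * (extra_ns / dram_ns);
        printf("estimated hit: %.2f per cent\n", slowdown * 100.0); /* ~1.56 */
        return 0;
    }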

Specifications

Below are the official stats for the new Epyc parts, announced today, paired with Intel CPUs that AMD is pitching each of its components against.

For example, according to a presentation AMD gave to analysts and journalists on Monday at its offices in Austin, Texas, the Epyc 7601 was compared to the Intel Xeon E5-2699A v4. In a SPECint_rate_base2006 integer-based benchmark run by AMD, the 7601 was, we're told, 47 per cent faster than the Intel part.

This particular benchmark tests for standard, everyday performance, rather than peak output. We're usually highly allergic to vendor-issued benchmarks, but we're publishing these to give you an idea of where AMD is trying to position itself in the data center market, and the sort of components it's gunning for. It's not indicative of real-world performance because it doesn't involve running realistic or demanding workloads – for example, virtual machines spanning multiple cores, which stress other parts of the chip, such as the inter-core interconnect.

No rival was suggested for the 7501, by the way, so we compared it to a Xeon E5-4669 v4 for the hell of it.

As for the headings: cores is the number of CPU cores per system-on-chip package; threads is the number of hardware threads; base and turbo are the normal and peak CPU clock frequencies; L3 is the last-level cache size; TDP is the maximum power draw; SPECint is the lead the AMD part claims over its Intel rival in AMD's own aforementioned benchmarks; and price is the recommended retail price. Where there are two TDP figures, the part can be configured to operate in either mode – higher power and performance, or lower power and lower performance.

AMD has split its Epyc SKUs into dual-socket and single-socket classes – although any of them can be used in either configuration unless it is a P-suffixed SKU, which is single-socket only – and a couple appear twice because they straddle both classes. So, for example, AMD recommends using the 7301 in a dual-socket system as an alternative to a pair of $800-plus Intel Xeon E5-2640 v4s, and the 7551P in a single-socket server versus a pair of Xeon E5-2650 v4s. Yes, AMD is pitching its single-socket-class SKUs against selected dual-socket Intel chips, claiming it can outperform them.

Dual-socket class

CPU SKU           | Cores / threads | Base / turbo GHz | L3 (MB) | TDP      | SPECint | Price
Epyc 7601         | 32 / 64         | 2.2 / 3.2        | 64      | 180W     | +47%    | $4000
Xeon E5-2699A v4  | 22 / 44         | 2.4 / 3.6        | 55      | 145W     | -       | $4938
Epyc 7551         | 32 / 64         | 2 / 3            | 64      | 180W     | +44%    | $3200
Xeon E5-2698 v4   | 20 / 40         | 2.2 / 3.6        | 50      | 135W     | -       | $3226
Epyc 7501         | 32 / 64         | 2 / 3            | 64      | 155/170W | N/A     | Unknown
Xeon E5-4669 v4   | 22 / 44         | 2.2 / 3          | 55      | 135W     | -       | $7007
Epyc 7451         | 24 / 48         | 2.3 / 3.2        | 48      | 180W     | +47%    | $2400
Xeon E5-2695 v4   | 18 / 36         | 2.1 / 3.3        | 45      | 120W     | -       | $2428
Epyc 7401         | 24 / 48         | 2 / 3            | 48      | 155/170W | +53%    | $1700
Xeon E5-2680 v4   | 14 / 28         | 2.4 / 3.3        | 35      | 120W     | -       | $1745
Epyc 7351         | 16 / 32         | 2.4 / 2.9        | 32      | 155/170W | +63%    | $1100
Xeon E5-2650 v4   | 12 / 24         | 2.2 / 2.9        | 30      | 105W     | -       | $1171
Epyc 7301         | 16 / 32         | 2.2 / 2.7        | 32      | 155/170W | +70%    | $800
Xeon E5-2640 v4   | 10 / 20         | 2.4 / 3.4        | 25      | 90W      | -       | $939
Epyc 7281         | 16 / 32         | 2.1 / 2.7        | 32      | 155/170W | +60%    | $600
Xeon E5-2630 v4   | 10 / 20         | 2.2 / 3.1        | 25      | 85W      | -       | $671
Epyc 7251         | 8 / 16          | 2.1 / 2.9        | 16      | 120W     | +23%    | $400
Xeon E5-2620 v4   | 8 / 16          | 2.1 / 3          | 20      | 85W      | -       | $422

Single-socket class

CPU SKU              | Cores / threads | Base / turbo GHz | L3 (MB) | TDP      | SPECint | Price
Epyc 7551P           | 32 / 64         | 2 / 3            | 64      | 180W     | +21%    | $2000
2 x Xeon E5-2650 v4  | 12 / 24         | 2.2 / 2.9        | 30      | 105W     | -       | $1171
Epyc 7401P           | 24 / 48         | 2 / 3            | 48      | 155/170W | +22%    | $1000
2 x Xeon E5-2630 v4  | 10 / 20         | 2.2 / 3.1        | 25      | 85W      | -       | $671
Epyc 7351P           | 16 / 32         | 2.4 / 2.9        | 32      | 155/170W | +21%    | $700
2 x Xeon E5-2620 v4  | 8 / 16          | 2.1 / 3          | 20      | 85W      | -       | $422
Epyc 7281            | 16 / 32         | 2.1 / 2.7        | 32      | 155/170W | +63%    | $600
2 x Xeon E5-2609 v4  | 8 / 8           | 1.7 / 1.7        | 20      | 85W      | -       | $310
Epyc 7251            | 8 / 16          | 2.1 / 2.9        | 16      | 120W     | +38%    | $400
2 x Xeon E5-2603 v4  | 6 / 6           | 1.7 / 1.7        | 15      | 85W      | -       | $213

(Intel specs and prices in this table are per chip; double them for the two-socket configurations AMD is comparing against.)

(The above prices are list prices according to our sister site The Next Platform – AMD officially says the 7601, the 7551 and the 7501 start from $3,400; the 7451 and 7401 start from $1,850; the 7351, 7301 and 7281 start from $650; and the 7251 starts from $475. The one-socket 7551P is priced at $2,100, the 7401P at $1,075, and the 7351P at $750.)

So, here are some initial thoughts on the above. The power figures might surprise you. Also, the above Xeons are all 14nm scale-out Broadwell E5-26xx parts from 2016, rather than beefy scale-up E7s or the full-fat Broadwell E5-46xx family. And don't forget, Intel is launching its Skylake-based Xeons this year, meaning we don't yet know how the fledgling Epyc will stand up against Chipzilla's next wave of server processors.

For now, AMD is comparing its Epyc products to the vast majority of server processors being bought and shipped today – scale-out workhorse Broadwells filling up data centers worldwide. Crucially, it will all come down to the price: the argument will be that you can buy an alternative to a given Xeon for less money. Ultimately, AMD has to look good on one key metric: performance per watt per dollar – it's all the big chip buyers, like Google and Facebook, care about after years of paying eye-watering prices for Intel chips. Something new has to come along to challenge Chipzilla's levies on the industry.
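For a back-of-envelope feel for that metric, here's the flagship matchup worked through in C, using the TDPs and list prices from the tables above and taking AMD's vendor-supplied +47% SPECint figure at face value – a generous assumption, as noted:

    #include <stdio.h>

    /* Rough perf-per-watt-per-dollar comparison of the Epyc 7601 and the
     * Xeon E5-2699A v4, using the table's TDPs and list prices and taking
     * AMD's +47% SPECint claim at face value (a big assumption - it's a
     * vendor benchmark). The Xeon is normalized to 1.0 on every axis. */
    int main(void) {
        double perf = 1.47;              /* relative SPECint, per AMD */
        double watts = 180.0 / 145.0;    /* TDP ratio */
        double price = 4000.0 / 4938.0;  /* list price ratio */
        printf("Epyc 7601 perf/W/$ vs Xeon: %.2fx\n", perf / (watts * price));
        return 0;
    }

On those numbers the 7601 comes out roughly 1.46x ahead – a margin that grows or evaporates depending on how much you trust the benchmark.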

Regarding power, AMD says its Epyc processors are system-on-chips: they contain the north and southbridges in the package, rather than as separate controllers, so all you have to do is add some RAM. And storage and networking and any GPUs, and so on. So, some of the chipset power is absorbed into the Epyc SoCs.

For what it's worth, the above Broadwell E5-26xx v4s each have 40 PCIe lanes, and support up to 1.54TB of RAM, per socket. Each Epyc core has 64KB of L1 instruction cache and 32KB of L1 data cache, versus 32KB apiece in the Broadwell family, and 512KB of L2 cache versus 256KB. AMD says Epyc matches the Broadwells in L2 and L2 TLB latencies, and has roughly half the L3 latency of Intel's counterparts.
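If you want to eyeball a machine's cache hierarchy yourself and compare it against those claims, Linux exposes the topology through sysfs. A small C reader – the paths are standard Linux sysfs, nothing AMD-specific:

    #include <stdio.h>

    /* Walk cpu0's cache indexes under sysfs and print the level, type,
     * and size of each cache. Stops at the first missing index. */
    int main(void) {
        for (int i = 0; ; i++) {
            char path[128], level[16] = "", type[16] = "", size[16] = "";

            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpu0/cache/index%d/level", i);
            FILE *f = fopen(path, "r");
            if (!f) break;                      /* no more cache levels */
            fscanf(f, "%15s", level); fclose(f);

            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpu0/cache/index%d/type", i);
            if ((f = fopen(path, "r"))) { fscanf(f, "%15s", type); fclose(f); }

            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpu0/cache/index%d/size", i);
            if ((f = fopen(path, "r"))) { fscanf(f, "%15s", size); fclose(f); }

            printf("L%s %-12s %s\n", level, type, size);
        }
        return 0;
    }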

We understand the Epyc chips are available from today, and will start shipping in July. People we know testing the hardware at the moment say they're expecting system firmware updates next month or so to hopefully iron out lingering performance issues in the launch silicon.

Basic Instincts

Finally, AMD is also talking up its Radeon graphics processors for accelerating AI software. Look out for the Radeon Instinct MI25 (Vega architecture, 16GB HBM2 RAM, 300W, dual PCIe slot) for training; the MI6 (Polaris architecture, 16GB GDDR5 RAM, 150W, single slot) for training and inference; and the MI8 (Fiji architecture, 4GB HBM1 RAM, 175W, dual slot) for inference.

We're told each MI25 can hit 12.3TFLOPS using 32-bit floating-point math, or 24.6TFLOPS using 16-bit FP, and has 484GB/s of memory bandwidth. The MI6 can top 5.7TFLOPS using 16 or 32-bit FP, with a memory bandwidth of 224GB/s. The MI8 can reach 8.2TFLOPS using 16-bit or 32-bit FP, and has a memory bandwidth of 512GB/s. They're all due to start shipping to "technology partners" in the third quarter of this year.
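One way to read those numbers is as a ratio: peak FLOPS divided by memory bandwidth tells you how many operations a kernel must perform per byte fetched before the card stops being memory-bound. A quick C calculation from AMD's peak claims – ceilings, not measurements:

    #include <stdio.h>

    /* Arithmetic intensity (FLOPs per byte) implied by AMD's quoted peak
     * FP16 rates and memory bandwidths for the Radeon Instinct line. */
    int main(void) {
        struct { const char *name; double tflops_fp16, gbs; } cards[] = {
            { "MI25", 24.6, 484.0 },
            { "MI6",   5.7, 224.0 },
            { "MI8",   8.2, 512.0 },
        };
        for (int i = 0; i < 3; i++)
            printf("%-5s %5.1f FP16 TFLOPS / %5.1f GB/s = %5.1f FLOPs/byte\n",
                   cards[i].name, cards[i].tflops_fp16, cards[i].gbs,
                   cards[i].tflops_fp16 * 1e12 / (cards[i].gbs * 1e9));
        return 0;
    }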

Check back later this week for a full dive into Epyc and Zen's architecture, with a roundup of AMD's latest desktop, server and GPU accelerator offerings, once we've escaped the sweltering Texas climate. ®
