Intel’s Falcon Shores XPU to mix ‘n’ match CPUs, GPUs within processor package

x86 giant now has an HPC roadmap, which includes successor to Ponte Vecchio

After a few years of teasing Ponte Vecchio – the powerful GPU that will go into what will become one of the fastest supercomputers in the world – Intel is sharing more details of the high-performance computing chips that will follow, and one of them will combine CPUs and GPUs in one package.

The semiconductor giant shared the details Tuesday in a roadmap update for its HPC-focused products at the International Supercomputing Conference in Hamburg, Germany.

Intel has only recently carved out a separate group of products for HPC applications because it is now developing versions of Xeon Scalable CPUs aimed at high-performance kit, starting with a high-bandwidth-memory (HBM) variant of the forthcoming Sapphire Rapids chips. This chip will sport up to 64GB of HBM2e memory, which will give it quick access to very large datasets.

The other driver for Intel's nascent HPC portfolio is its datacenter GPUs, which will start later this year with the much-hyped Ponte Vecchio chip that will compete against Nvidia's A100 and AMD's Instinct MI200 chips. It will tackle a mix of HPC and AI workloads with up to 128GB of HBM2e memory.

Slide: Intel's Super Computer Silicon Roadmap, which includes CPUs, GPUs, hybrid CPU-GPU chips, and deep learning accelerators. Intel plans to provide a broad range of options for processing high-performance computing and AI applications.

The Sapphire Rapids HBM variant and Ponte Vecchio will power Aurora, the US Department of Energy's in-development exascale supercomputer that is expected to fire up later this year. Intel once hoped that Aurora would become the first exascale supercomputer in the US, but AMD just beat it to the punch with the DOE's Frontier supercomputer.

Rialto Bridge the next GPU after Ponte Vecchio

In a roadmap update provided to journalists, Intel representatives were light on details for the successor to the Sapphire Rapids HBM variant, only referring to it as "Xeon Next" and saying that it will arrive in 2023 or later. Thankfully, we learned a good deal more about the successor to Ponte Vecchio and a new kind of hybrid CPU-GPU chip that will follow.

The successor to Ponte Vecchio is called Rialto Bridge, and it will sport up to 160 cores of Intel's Xe GPU architecture. Jeff McVeigh, the head of Intel's Super Compute Group, said this will help Rialto Bridge provide around 30 percent better performance than Ponte Vecchio for applications.

Slide: the expected features and performance targets for Intel's Rialto Bridge GPU. Intel is promising a step up in performance over Ponte Vecchio with Rialto Bridge.

To get the highest performance possible from Rialto Bridge, Intel plans to provide a power-hungry, 800-watt module that will be liquid cooled. Intel will also fit Rialto Bridge into the OAM 2.0 form factor that is used by so-called hyperscalers like Facebook parent company Meta and Microsoft.

Anyone developing applications to take advantage of Ponte Vecchio shouldn't have much trouble preparing them for Rialto Bridge since Intel is promising "software consistency." Intel said it expects to start sampling Rialto Bridge in mid-2023.

'Falcon Shores XPU' combines CPUs and GPUs

What comes after Rialto Bridge is much more interesting, especially since Intel is considering Falcon Shores a descendant of both Ponte Vecchio and Sapphire Rapids HBM. This is because Falcon Shores will combine x86 CPU cores and Xe GPU cores in a single package. As such, Intel is calling Falcon Shores an "XPU."

Intel said Falcon Shores will provide 5x higher performance-per-watt, memory capacity, and memory bandwidth than "current platforms," which we assume includes Nvidia's A100 combined with the latest server CPUs from Intel and AMD. The chipmaker also promised that Falcon Shores will have 5x greater compute density in an x86 socket than the best option available now, which we take to mean AMD's third-generation Epyc processor with 64 cores since Intel is currently behind in that department.

Slide: the expected features and performance targets for Intel's Falcon Shores XPU. Intel is promising a lot with Falcon Shores.

What's especially intriguing about Falcon Shores is how it will come in different configurations of x86 CPU cores and Xe cores, and some variants will only have x86 cores while others will only have Xe cores. This essentially means Intel is creating a super flexible chip design that can serve as a CPU, a GPU, or a combination of the two that will share memory at what it said will be an "extreme bandwidth." At the same time, Intel still plans to make more traditional Xeon CPUs.

Falcon Shores will be made using one of Intel's "Angstrom-era" manufacturing processes, which means it will likely use Intel 20A or Intel 18A since Falcon Shores is targeted for a 2024 launch.  

To avoid scaring developers away, Intel is promising a "simplified programming model" that will allow developers to decide how to map different parts of an application to the chip's x86 and Xe cores.

Tile-based design may help Intel respond faster to market

Intel's McVeigh said Falcon Shores' hybrid design is made possible by using tiles, also known as chiplets, which will give the chipmaker greater flexibility in how chips are configured much later in the design process.

This will allow Intel to respond faster to new and emerging applications that may benefit from different arrangements of x86 and Xe tiles, he added. "If new trends come along, we can more easily adapt and place those within the design," McVeigh said.

McVeigh called the use of tiles to enable more flexible design choices a "revolutionary" change in the way chips are created. For the past few years, Intel has teased that it will use this tile-based approach for several products, including Ponte Vecchio, Sapphire Rapids, and upcoming client processors like Meteor Lake.  

While companies like Intel and AMD have been using chiplet designs in products over the past several years, McVeigh said chiplets have traditionally been used to give more flexibility in which manufacturing processes are tapped for different parts of a chip.

"What's different here is making sure that you can pick and place different tiles into the same area and interfaces, so that you can adapt that over time as opposed to, well, I just use a disaggregated architecture to make it easier to manufacture for yield," he said.
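To make the pick-and-place idea concrete, here is a minimal sketch in Python. It is purely illustrative: Intel has not disclosed how many tiles Falcon Shores will hold or how they are arranged, so the four-slot package, the `Package` class, and the `enumerate_packages` helper are all assumptions made up for this example. It only shows how a fixed set of interchangeable slots could yield CPU-only, GPU-only, and mixed parts.

```python
from dataclasses import dataclass
from itertools import combinations_with_replacement

# Hypothetical tile kinds. Intel has not published Falcon Shores tile details.
TILE_KINDS = ("x86", "Xe")


@dataclass(frozen=True)
class Package:
    """A package whose identical slots each hold one tile (assumed model)."""
    tiles: tuple

    def role(self) -> str:
        # Classify the package by which kinds of tile it contains.
        kinds = set(self.tiles)
        if kinds == {"x86"}:
            return "CPU"
        if kinds == {"Xe"}:
            return "GPU"
        return "XPU"


def enumerate_packages(slots: int):
    """Yield every distinct mix of tiles for a package with `slots` slots.

    Slot order is ignored, so only the count of each tile kind matters.
    """
    for combo in combinations_with_replacement(TILE_KINDS, slots):
        yield Package(combo)


if __name__ == "__main__":
    # The slot count of 4 is an arbitrary assumption for illustration.
    for pkg in enumerate_packages(4):
        x86 = pkg.tiles.count("x86")
        xe = pkg.tiles.count("Xe")
        print(f"{x86} x86 + {xe} Xe tiles -> {pkg.role()}")
```

In this toy model, the two extremes are the CPU-only and GPU-only variants Intel described, and every mix in between is an XPU.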

If Falcon Shores' hybrid CPU-GPU design sounds familiar to you, that's likely because you've heard about Nvidia heading in a similar direction with its Arm-compatible Grace CPU.

Nvidia plans to make Grace available in the Grace Superchip, which will contain two Grace CPUs for a total of 144 cores. The CPU will also go in the Grace Hopper Superchip, which will combine one Grace CPU and a GPU using Nvidia's next-generation Hopper architecture.

Nvidia does seem further ahead than Intel, however, since the GPU maker has promised to launch its Superchips in the first half of 2023 and Intel expects to launch Falcon Shores in 2024. ®

Biting the hand that feeds IT © 1998–2022