SC13 From Intel's point of view, today's hottest trend in high-performance computing – GPU acceleration – is just a phase, one that will be superseded by the advent of many-core CPUs, beginning with Chipzilla's next-generation Xeon Phi, codenamed "Knights Landing".
"Tomorrow looks quite fundamentally different," Rajeeb Hazra, VP of Intel's Data Center Group and GM of its Technical Computing Group, told a reporters' roundtable at the SC13 supercomputing conference in Denver, Colorado.
Knights Landing, Hazra said, "does something quite remarkable, and can be seen as an inflection point in the evolution of heterogeneous computing. It actually takes the biggest problem of heterogeneous computing – which is offloading, or thinking about programs and applications as having this artificial structure of main body and something that has to be offloaded – away."
Knights Landing, Hazra said, "takes us back to the veritable homogeneous architecture, except that it does that with the benefits of many-core for very highly parallel applications."
Back to the future – or, maybe more correctly, forward to the past. Good-bye heterogeneous CPU-GPU mashups, and welcome back CPU-focused computing.
Knights Landing's release date has not yet been announced, but the smart money is on the 14-nanometer part appearing sometime in the next 12 to 18 months. No details as to core count or power requirements have yet been released, but Intel announced on Tuesday that the next-gen Xeon Phi will be available in a coprocessor/accelerator version on a PCIe card, as are members of the current "Knights Corner" Xeon Phi family, and also – and here's the kicker – as a CPU version that will fit into standard rack architectures.
Using Knights Landing as a bootable CPU running applications entirely natively, Intel says, "will significantly reduce programming complexity and eliminate 'offloading' of the data, thus improving performance and decreasing latencies caused by memory, PCIe and networking."
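To see why dropping the offload step matters, a back-of-the-envelope model helps. The sketch below is not Intel's math: the function names are invented for illustration, the figures in the usage note are placeholders rather than measurements, and it assumes (generously to the offload case) that the kernel computes equally fast whether it runs natively or on a card.

```python
def offload_time(bytes_moved, link_bytes_per_s, compute_s, link_latency_s=0.0):
    """Seconds to run an offloaded kernel: ship the data over the link, then compute.

    All parameters are hypothetical inputs supplied by the caller.
    """
    return link_latency_s + bytes_moved / link_bytes_per_s + compute_s


def native_time(compute_s):
    """Seconds to run the same kernel natively: no copy over the link at all."""
    return compute_s


def transfer_fraction(bytes_moved, link_bytes_per_s, compute_s, link_latency_s=0.0):
    """Share of the offloaded run spent moving data rather than computing."""
    total = offload_time(bytes_moved, link_bytes_per_s, compute_s, link_latency_s)
    return (total - compute_s) / total
```

Plug in a made-up 8GB working set, a made-up 8GB/s link, and one second of compute, and `transfer_fraction(8e9, 8e9, 1.0)` says half the offloaded run goes to shuffling data. That half is the overhead a bootable part would not pay.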
In Intel parlance, Knights Landing is both a "tock" – a new microarchitecture – and a "tick" – a process shrink. The new microarchitecture includes what Hazra characterized as "significant advances" in per-core performance and energy efficiency.
A new in-package memory scheme, separate from the Knights Landing core caches, will also debut. "A big challenge when you create these kinds of architectures that consume a lot of data very quickly is 'How do you feed the beast?'" Hazra said. "And one of the biggest innovations – industry-leading – in Knights Landing is the new memory architecture that it employs."
According to Hazra, the new architecture provides enough total in-package memory and enough bandwidth to allow developers to use Knights Landing in its CPU configuration just as they would use a Xeon processor today. But seeing as how Knights Landing will have many cores – the current "Knights Corner" Xeon Phi family has up to 61 cores – performance of well-optimized, highly parallelized code could increase dramatically.
Hazra claims that the amount and speed of the in-package memory are sufficient to accommodate "meaningful portions of workload or workloads themselves," and that it is backed up by "a very large amount" of standard DDR memory. The in-package memory can be used as a traditional flat memory space for apps to play in, or it can be used as cache.
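Treating fast memory as a cache pays off most when code is "blocked", or tiled, so that the chunk of data being worked on fits in that cache and gets reused before it is evicted. Here is a minimal sketch of the idea using a textbook blocked matrix multiply. The function names and the `block` default are our own illustrative choices, not anything Intel has published, and pure Python will not show the speed-up; it is the loop structure that would carry over to compiled code.

```python
def matmul_naive(a, b):
    """Plain triple-loop matrix multiply on lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=64):
    """Same product, computed tile by tile.

    `block` is a hypothetical tuning knob: in compiled code it would be
    chosen so that the tiles in flight fit in the cache being targeted.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The outer three loops walk over tiles; the inner three work inside one tile.
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, m)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

Both functions return the same matrix. The blocked version simply reorders the work so that each tile of `a`, `b`, and `c` is touched many times while it is still close at hand.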
"System software for decades now has known how to use caching levels of memory," he said, "and you can use the in-package memory as a very large cache – if you block into that cache you get stupendous performance of your workload." You're not limited to setting up the in-package "near memory" as either standard application RAM or cache, by the way – you can combine the two.
Hazra was not shy when describing what he believes to be the impact of Knights Landing. "The combination of the fact that we go back and provide the versatility of homogeneous many-core along with a processor with very high energy-efficient performance, industry-leading memory bandwidth, is going to be a game-changer," he enthused.
"Knights Landing will not be called an accelerator. It will be called a many-core CPU."
It will also, as Intel has said, be available in an accelerator/coprocessor version that will continue to compete in the high-performance market with Nvidia's quite successful Tesla cards and AMD's also-ran FirePro cards. But it is as a standalone, bootable CPU – one that Intel expects will also work its way down into the high-performance client workstation market – that it should have the most impact.
If Hazra and his team can produce reasonably cost-effective parts that live up to the hype, he may very well be correct: Knights Landing might be a game-changer. ®