Intel has revealed details of its 11th Generation Core technology, including its Tiger Lake laptop-friendly system-on-chip family featuring Willow Cove CPU cores and so-called SuperFin 10nm transistors.
Intel faces a delay in its 7nm production plans after years of stalling with 10nm, credible competition from AMD, an acknowledgement that external foundries might be needed for future components, and the swift ousting of chief engineering officer Venkata "Murthy" Renduchintala. Even so, Raja Koduri, Intel's chief architect, speaking during a media briefing on Tuesday, made it sound as if the semiconductor goliath had no reason to do anything differently.
Recalling the Six Pillars vision he laid out in 2018 – an acknowledgement that the transistor doubling described by Moore's Law is no longer enough – Koduri insisted Intel is staying its course.
"Today, I'm super happy to tell you that our architecture vision has not changed," he said. "In fact, we've been working hard on accelerating this vision."
Tiger Lake processors are in production and shipping to customers, according to Intel, and are expected to show up in notebook-like products for the upcoming Christmas shopping season. The official launch is expected on September 2, and the family will supersede Chipzilla's Ice Lake line of 10nm laptop chips.
The general-purpose chip family relies on a refinement of Intel's FinFETs called SuperFins. The 10nm transistor design incorporates a Super MIM (metal-insulator-metal) capacitor that provides a five-times increase in MIM capacitance. MIM capacitors can boost performance, and are not exclusive to Intel: TSMC plans to use super-high-density MIM caps in its 5nm node, for instance.
Gate and switch ... An Intel-provided slide illustrating the SuperFin design plus a simple block diagram of Tiger Lake's Willow Cove CPU core
"When compared to the industry standard, it delivers a five-times increase in capacitance within the same footprint, driving a voltage reduction that translates to dramatically improved product performance," said Ruth Brain, Intel Fellow and Director of Interconnect Technology and Integration. "This is an industry first technology that far exceeds the current capabilities of other manufacturers."
Intel claims that although Tiger Lake's 10nm process is an intranode step (an improved version of the 10nm process used by its predecessor, Ice Lake), the design change provides a performance boost equivalent to a full node change, such as going from 14nm to 10nm.
Brain said that Intel is working to further adapt its SuperFin processors to suit its data center customers with an eye toward delivering something server-grade and 10nm next year. "Servers benefit from enhancements to the large amount of data that needs to be shared across the chip," she said. "So in addition to continuing transistor optimization to deliver more internode performance, we also focused on improving the metal stack with interconnect layer optimizations that make data center-scale fabrics for CPU and GPU more easily routable."
Packaging the dies
Advances in Intel's packaging technologies – how it crams multiple dies and layers of electronics into one chip package – were also discussed. "Ultimately our packaging technologies are about increasing density and decreasing power, allowing chiplets to be connected in a package with functionality that matches or exceeds the functionality of a monolithic SoC," said Ramune Nagisetty, senior principal engineer in Intel's Product and Process Integration group.
The Tiger Lake arrangement: two dies, one system-on-chip package. The lower die contains the CPU cores, the GPU, math acceleration, and IO, and the upper die contains the platform controller hub
She also pointed to Intel's Lakefield hybrid processor as the tech giant's first product to take advantage of its Foveros 3D stacking technology, so called because it literally stacks RAM on top of 10nm CPU and GPU cores which sit on a 22nm base containing the IO chipset, all in one package. Lakefield is called a hybrid because it uses one big Sunny Cove CPU core and four smaller Tremont Atom cores, much like Arm's big.LITTLE architecture: the smaller, lower-power cores run your code most of the time, and the larger, beefier power-hungry core spins up to take on bursts of heavy work as needed.
Meanwhile, Alder Lake, the next iteration of Intel's hybrid architecture, is expected to integrate Chipzilla's 10nm big Golden Cove and small Gracemont CPU cores in the hope of eking out even better performance per watt.
Tiger Lake isn't a hybrid as it uses just the one type of CPU core. Specifically, it sports four Willow Cove CPU cores, which each boast a redesigned 1.25MB middle-level cache and Intel's Control Flow Enforcement Technology to block return, call, and jump-oriented programming exploitation of vulnerabilities. Willow Cove is based on Sunny Cove.
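For the curious, here is a toy Python model of the shadow-stack idea behind the return-oriented half of Control Flow Enforcement Technology. It is only a sketch: the `ToyCpu` class, its method names, and the addresses are made up for illustration, and real CET does this in hardware (with indirect branch tracking handling the call and jump side, which is not modeled here).

```python
class ControlProtectionFault(Exception):
    """Raised when a return address does not match its shadow copy."""


class ToyCpu:
    """Hypothetical CPU model: not Intel's design, just the general idea."""

    def __init__(self):
        self.stack = []          # ordinary stack, writable by buggy or hostile code
        self.shadow_stack = []   # protected copy holding return addresses only

    def call(self, return_address):
        # A call pushes the return address to both stacks.
        self.stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self):
        # A return pops both copies and refuses to continue if they differ.
        address = self.stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ControlProtectionFault(
                f"return to {address:#x}, expected {expected:#x}"
            )
        return address


if __name__ == "__main__":
    cpu = ToyCpu()
    cpu.call(0x401000)
    cpu.stack[-1] = 0xDEADBEEF   # simulate an attacker overwriting the return address
    try:
        cpu.ret()
    except ControlProtectionFault as fault:
        print("blocked:", fault)
```

The point of keeping a second copy is that a memory-corruption bug that scribbles over the ordinary stack cannot also rewrite the protected one, so a hijacked return is caught before it lands.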
"Willow Cove is better, faster and more efficient, enabling generational CPU gains in not only TDP (Thermal Design Power) limited performance, but also in unconstrained performance across the board," said Boyd S. Phelps, VP of Intel's Client Engineering Group. "Willow Cove was designed to optimize the entire range of the VF (voltage-frequency) curve."
Tiger Lake's points of interest, according to Intel, are:
- Four new Willow Cove CPU cores with significant frequency uplift over Sunny Cove, leveraging 10nm SuperFin technology advancements. We're talking greater than 4.5GHz if Intel's graphs are to be believed. Crucially, you can run Willow Cove at a greater frequency than Sunny Cove at the same voltage level.
- New Xe-LP graphics processor with up to 96 execution units (EUs) with significant performance-per-watt efficiency improvements, apparently. This sports a 3.8MB L3 cache.
- Power management – autonomous dynamic voltage frequency scaling in coherent fabric, increased fully integrated voltage regulator efficiency.
- Fabrics and memory – Two-times increase in coherent fabric bandwidth, with the fabric arranged in a dual ring structure; roughly 86GB/s of memory bandwidth; validated LP4x-4267, DDR4-3200, and LP5-5400 support. There's also support for RAM encryption.
- Gaussian and Neural Accelerator (GNA) 2.0 – a dedicated processing engine for offloading low-power neural inferencing from the CPU. Roughly 20 per cent lower CPU utilization when a noise suppression workload runs on the GNA rather than on a CPU core.
- IO – integrated Thunderbolt 4 and USB4 with up to 40Gb/s shifted per port, integrated PCIe Gen 4 on CPU die rather than the platform hub for low-latency (saving about 100ns), high-bandwidth (8GB/s) device access to memory.
- Display – up to 64GB/s of isochronous bandwidth to memory for multiple high-resolution displays. Dedicated fabric path to memory to maintain quality of service.
- IPU6 – up to six sensors with 4K30 video, 27MP image, up to 4K90 and 42MP image architectural capability.
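As a sanity check on the bandwidth figures in that list, the back-of-the-envelope Python sketch below derives peak numbers from the quoted transfer rates. Two inputs are our assumptions rather than anything Intel stated in the briefing: a 128-bit memory bus and a four-lane PCIe Gen 4 link. The function names are ours too.

```python
# Peak-bandwidth arithmetic. ASSUMPTIONS (not from Intel's briefing):
# a 128-bit memory bus and a four-lane PCIe Gen 4 link.

def memory_bandwidth_gbs(mega_transfers_per_s, bus_width_bits=128):
    """Peak memory bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * 1e6 * (bus_width_bits / 8) / 1e9


def pcie4_bandwidth_gbs(lanes=4):
    """Peak PCIe Gen 4 bandwidth in GB/s: 16 GT/s per lane with 128b/130b encoding."""
    return 16e9 * lanes * (128 / 130) / 8 / 1e9


if __name__ == "__main__":
    for name, rate in [("LP4x-4267", 4267), ("DDR4-3200", 3200), ("LP5-5400", 5400)]:
        print(f"{name}: {memory_bandwidth_gbs(rate):.1f} GB/s")
    print(f"PCIe Gen 4 x4: {pcie4_bandwidth_gbs():.2f} GB/s")
```

Under those assumptions, LP5-5400 works out to about 86.4GB/s and a four-lane Gen 4 link to just under 8GB/s, which lines up with the rounded figures above. These are theoretical peaks, not measured throughput.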
Chips 'n bits
Intel's Ice Lake server-grade processor, its first 10nm Xeon, is on track to hit data centers at the end of 2020, the silicon shifter said. It includes total memory encryption, PCIe Gen 4, and eight memory channels with an instruction set capable of accelerating cryptographic processing.
Sapphire Rapids, the subsequent 10nm chip generation and designated brains of the Aurora exascale supercomputer system being built for Argonne National Lab, will be based on enhanced SuperFin technology and is supposed to include DDR5, PCIe Gen 5 and Compute Express Link 1.1. Initial production shipments are planned for the second half of 2021.
Intel also delved into its Xe-LP graphics microarchitecture for mobile platforms, which, beyond its increase from 64 to 96 EUs, is accompanied by driver enhancements like a new DX11 path and a better optimized compiler.
"We've made changes to our compiler really to maximize the hardware-software design efficiency, and one significant example of this was the shift to software scoreboarding," said Lisa Pearce, VP of Intel Architecture, Graphics, and Software. "Some of the register dependency checking we moved to the compiler. This allowed us to simplify the hardware thread control logic, reduce gate count, and as a result, increase the power efficiency."
At its media briefing, Chipzilla also demoed its Xe-HP chip, a multi-tiled GPU coming next year that the company describes as "a media supercomputer on a PCIe card." And it discussed Xe-HPG, a microarchitecture optimized for gaming based on GDDR6, also supposedly shipping next year. The biz also talked up the Intel Server GPU (SG1), a GPU for the data center based on the Xe architecture that is slated to ship to customers later this year.
Patrick Moorhead, president and principal analyst of Moor Insights and Strategy, said that against the backdrop of Intel losing some unit market share in CPUs, its 7nm delay, and better than expected financial performance, there's some reason for optimism.
"I feel better about its future as it didn't ignore the obvious fab issues and at the same time gave reasons to believe architecturally, it could be back on top," he said in an email to The Register.
Moorhead said Tiger Lake looks like Intel has actually delivered on its Six Pillars strategy. "It showed how it could get a 13-25 per cent performance bump just in CPU core scaling with Willow Cove and I think we will see even more significant scaling in its GPU and ML [machine learning] performance," he said.
"Software will matter a lot here on ML and I'm looking forward to seeing more details on what software can actually take advantage of the new capabilities."
Moorhead said while it's still unclear how Tiger Lake will compare with AMD chips at this point, Intel does appear to have scaled it fairly well and to have added new technologies as needed. The future of chip design and packaging, he said, looks like chiplets from many different companies combined into a single 3D package. "I feel confident Intel is extremely competent, maybe even the current lead," he said.
"This is a long-term strategy and industry shift and I don't see Lakefield representative of what will be many designs in the future. The package commitments of power and bandwidth are compelling and were the most impressive." ®