Intel's 45nm 'Nehalem' processor architecture, due for release later this year, will see the chip maker adopt AMD's approach to cache structure: small per-core Level 1 and Level 2 caches connected to a big, shared Level 3 cache.
Nehalem, which will form the basis for two-, four- and eight-core processors, will contain 64KB of L1 cache per core, split 50:50 between memory reserved for program instructions and for data. That's currently how Core 2 CPUs work, but while today's desktop and mobile CPUs complement that with big, multi-megabyte L2 caches shared between pairs of cores, each Nehalem core will get 256KB of L2 cache of its own.
All two, four or eight cores will then be able to access a shared pool of up to 8MB of L3 cache memory, allowing them to take as much or as little as they need for the threads they're running up to the overall limit.
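To put those figures side by side, here is a rough sketch that totals the cache on a two-, four- and eight-core part. It assumes every configuration gets the full 8MB of L3, which is only the stated upper limit, and the function name is purely illustrative.

```python
L1_PER_CORE_KB = 64          # 32KB instructions + 32KB data, per the figures above
L2_PER_CORE_KB = 256         # private to each core
L3_SHARED_MAX_KB = 8 * 1024  # "up to 8MB"; assumed here for every core count

def nehalem_cache_kb(cores):
    """Return (private_kb, total_kb) for a hypothetical part with `cores` cores."""
    private_kb = cores * (L1_PER_CORE_KB + L2_PER_CORE_KB)
    return private_kb, private_kb + L3_SHARED_MAX_KB

for cores in (2, 4, 8):
    private_kb, total_kb = nehalem_cache_kb(cores)
    print(f"{cores} cores: {private_kb}KB private, {total_kb}KB including shared L3")
```

Even on an eight-core part, the private caches add up to only 2.5MB, so the shared L3 accounts for most of the on-chip cache under these assumptions.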
Intel's Nehalem: native quad-core
It's an approach AMD introduced with its Phenom chips. Earlier AMD processors gave each core its own L1 and L2 caches. Intel previously pooh-poohed this design, claiming better performance could be achieved using a shared L2. Whatever the merits of that argument, the Phenom CPU line introduced a third tier of cache, this time shared.
The Phenom 9600, for example, has 2MB of L2 divided into four 512KB blocks, each assigned to a single core. All four cores share a further 2MB of L3. Each core has 128KB of L1 cache.
It's a logical move for Intel, as it was for AMD. The exclusive L2 caches give each core a pool of fast-access memory, while the shared cache acts as a buffer that holds data and instructions other cores have already requested, which a core can then grab more quickly than by going out to main memory or peeking into other cores' private caches.
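The lookup order described above can be illustrated with a toy model. This is a sketch only: the class name is hypothetical, the caches have no capacity limit, and the fill policy (copying each fetched line into both the shared L3 and the requesting core's L2) is an assumption rather than anything Intel or AMD has stated.

```python
class ToyCacheHierarchy:
    """Toy model: per-core private L2 sets plus one shared L3 set (no capacity limits)."""

    def __init__(self, cores):
        self.l2 = [set() for _ in range(cores)]  # one private cache per core
        self.l3 = set()                          # shared by all cores

    def load(self, core, address):
        """Return where `address` was found, then fill the caches on the way back."""
        if address in self.l2[core]:
            return "L2"
        source = "L3" if address in self.l3 else "memory"
        # Assumed fill policy: keep a copy in the shared L3 and the requester's L2.
        self.l3.add(address)
        self.l2[core].add(address)
        return source

chip = ToyCacheHierarchy(cores=4)
print(chip.load(0, 0x1000))  # memory: nobody has touched this address yet
print(chip.load(1, 0x1000))  # L3: core 0's earlier request left a shared copy
print(chip.load(1, 0x1000))  # L2: core 1 now holds its own copy
```

The second load is the case the shared cache is there for: core 1 benefits from core 0's earlier request without a trip to main memory.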
More to the point, Nehalem is essentially Intel's first design that, like AMD's for some time, doesn't build four-core CPUs out of pairs of two-core dies. With no shared L3, the core-pairs in today's Core 2 Quad and Core 2 Extreme processors have to look in other core-pairs' caches, which can hinder performance.
Each Nehalem core uses Intel HyperThreading technology to handle up to two processing threads in execution simultaneously, allowing a four-core chip to appear to the host OS as an eight-core part.
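The arithmetic behind that is simple enough to sketch. The helper name below is illustrative, and the two-threads-per-core figure is the one quoted above; `os.cpu_count()` is included only to show that an operating system reports logical rather than physical processors.

```python
import os

THREADS_PER_CORE = 2  # HyperThreading, per the figure quoted above

def logical_processors(physical_cores, threads_per_core=THREADS_PER_CORE):
    """Number of processors the OS would see for a given physical core count."""
    return physical_cores * threads_per_core

for cores in (2, 4, 8):
    print(f"{cores} physical cores -> {logical_processors(cores)} logical processors")

# os.cpu_count() reports logical CPUs on the machine running this script (may be None).
print("This machine reports:", os.cpu_count())
```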
Nehalem will initially be a 'true' quad-core part, but Intel promised future, eight-core parts that are built natively rather than from a pair of quad-core CPUs bolted together.
The CPU design incorporates an out-of-order window running to 128 instructions, up from Core 2's 96 instructions. That allows the new chip to look ahead to a greater number of instructions to see which can be pulled out of the program sequence and processed without affecting the results of operations further down the line. It's also able to keep 33 per cent more micro-ops in flight at once than its predecessor could.
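As a quick sanity check, the growth in the out-of-order window works out to the same 33 per cent. Whether the two figures describe the same structure isn't stated here, so treat the match as arithmetic only; the helper name is illustrative.

```python
CORE2_WINDOW = 96      # out-of-order window, in instructions
NEHALEM_WINDOW = 128

def percent_increase(old, new):
    """Percentage growth from `old` to `new`."""
    return (new - old) / old * 100

print(f"{percent_increase(CORE2_WINDOW, NEHALEM_WINDOW):.1f}% larger window")
```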