The great wheel of semiconductor life continues to turn. 'Tis the season to start gearing up for the International Solid State Circuits Conference in San Francisco. The ISSCC event is the second event of each new year, following the Consumer Electronics Show, where new PC processors and sundry other computing gadgets are brought to market. ISSCC is where the hard-core techies get to show off their etchings and their IQs, particularly with processors used in servers.
The most interesting chip for commercial servers that will be detailed at the ISSCC event could turn out to be the "Poulson" Itanium processor. The chipheads from Intel's Fort Collins, Colorado, and Hudson, Massachusetts, chip design labs will be showing off the Poulson chip, which we now know will have eight cores (Intel has been saying "at least eight cores") and 3.1 billion transistors implemented on a 32 nanometer 9M process.
The current quad-core "Tukwila" Itanium 9300s, which were announced at ISSCC last year and which started shipping around the middle of this year in blade servers from Hewlett-Packard, are implemented in a 65 nanometer process. Even with only 2 billion transistors, the Tukwila is a considerably larger chip in terms of die area. But despite the shrink, the Poulson chip will be plug-compatible with the Tukwila chips. Or more precisely, the Tukwila was delayed so its socket could be made compatible with the Poulson chip in 2012 and its kicker, code-named "Kittson," for 2014, because server makers (meaning HP) balked at the short lifespan of the former Tukwila socket.
According to the ISSCC abstract, Intel will be showing off a "12-wide issue" Itanium chip with eight multi-threaded cores that has a ring-based system interface and a combined cache on the die of 50 MB. That's a weird number, with the Itanium 9300 having 512 KB of L2 instruction cache, 256 KB of L2 data cache, and 6 MB of L3 cache per core. Just keeping all of those cache sizes the same and ramping up to eight cores puts you at 54 MB.
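For anyone who wants to check that cache math, here is a minimal Python sketch. It counts only the L2 and L3 caches quoted above and assumes every core keeps the same sizes, which is our back-of-the-envelope assumption and not anything Intel has said; the function and constant names are made up for illustration.

```python
# Per-core cache sizes for the Itanium 9300, as quoted in the article, in KB.
L2_INSTRUCTION_KB = 512
L2_DATA_KB = 256
L3_KB = 6 * 1024


def total_cache_mb(cores: int) -> float:
    """Total L2 + L3 cache in MB, assuming each core keeps the same sizes."""
    per_core_kb = L2_INSTRUCTION_KB + L2_DATA_KB + L3_KB
    return cores * per_core_kb / 1024


if __name__ == "__main__":
    # Hypothetical eight-core chip with unchanged per-core caches.
    print(total_cache_mb(8))  # 54.0
```

That 54 MB is 4 MB more than the 50 MB in the abstract, so at least one of those per-core caches has presumably changed size.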
Intel says in the abstract that the high-speed links on the chip allow for peak processor-to-processor bandwidth of up to 128 GB/sec and memory bandwidth of up to 45 GB/sec. David Kanter, over at Real World Technologies, has a theory that Intel will be putting together a reasonably new Itanium microarchitecture with Poulson, perhaps switching from two-thread HyperThreading to four threads of real simultaneous multithreading per core. This sounds reasonable to me, given that both IBM and Oracle are putting four and eight threads, respectively, on their Power7 and Sparc T3 cores.
The Poulson Itanium will have a die size of 18.2mm by 29.9mm, or 544.2 square mm. The current quad-core Tukwila weighs in at 21.5mm by 32.5mm, or 698.8 square mm. I would guess that some of the headroom from the transistor shrink, and the area shrink that results from it, will go toward boosting the clock speed on the Itanium cores by anywhere from 25 to 30 per cent. Combined with the doubling of the core count, that would more than double the performance of Poulson over Tukwila systems, provided the DDR3 memory controllers and buffered memory cards can feed the extra clocks.
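The die-size and performance arithmetic can be sketched in a few lines of Python. The throughput model here is a deliberately naive assumption of ours (performance scales linearly with core count and clock speed, memory permitting), and the 25 to 30 per cent clock boost is the guess from the text, not an Intel figure.

```python
def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Die area in square millimetres from the two edge lengths."""
    return width_mm * height_mm


def relative_throughput(core_ratio: float, clock_boost: float) -> float:
    """Naive upper bound: throughput scales linearly with cores and clock."""
    return core_ratio * (1.0 + clock_boost)


poulson_area = die_area_mm2(18.2, 29.9)   # about 544.2 square mm
tukwila_area = die_area_mm2(21.5, 32.5)   # about 698.8 square mm
area_shrink = 1.0 - poulson_area / tukwila_area

if __name__ == "__main__":
    print(f"Poulson is {area_shrink:.1%} smaller than Tukwila")
    # Eight cores versus four, with the guessed clock boost range.
    for boost in (0.25, 0.30):
        print(f"{boost:.0%} clock boost: {relative_throughput(2, boost):.2f}x")
```

Under those assumptions the chip comes out at 2.5 to 2.6 times the throughput of Tukwila on a die that is roughly 22 per cent smaller.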
The other interesting tidbit from the ISSCC abstract is the target clock speed for Advanced Micro Devices' future "Bulldozer" cores used in a variety of workstation and server processors. El Reg has told you all about the Bulldozer designs here, there, and everywhere, and we were guessing that the 16-core "Interlagos" Opterons for two-socket and four-socket servers could hit somewhere around 2.75 GHz for standard parts.
The ISSCC abstract says "3.5 GHz+" is the target clock speed for the dual-core Bulldozer module when implemented in a 32 nanometer process. That is a big bump up in clock speed compared to the current 12-core "Magny-Cours" Opteron 6100s, which are baked in a 45 nanometer process and which top out at a measly 2.2 GHz. If AMD can ramp up the core count faster than Intel and also boost clock speeds to get into the 3 GHz zone where Xeon 5000 series processors play in two-socket boxes, then AMD could start being a contender again in the server racket.
The ISSCC teaser says that a two-core Bulldozer module will have 213 million transistors and will operate at voltages of between 0.8 and 1.3 volts. The Bulldozer module will take up 30.9 square mm of area, including 2 MB of L2 cache memory dedicated to the cores.
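To put those figures in context, here is a rough Python sketch of transistor density for the Bulldozer module alongside the two Itanium chips, using only the transistor counts and die sizes quoted in this article. It is not an apples-to-apples comparison: the Bulldozer number covers a single module while the Itanium numbers cover whole dies, and the mix of cache and logic differs. The helper name is ours.

```python
def density_millions_per_mm2(transistors: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors / area_mm2 / 1e6


# (transistor count, area in square mm), all taken from the article.
CHIPS = {
    "Tukwila, 65 nm, whole die": (2.0e9, 21.5 * 32.5),
    "Poulson, 32 nm, whole die": (3.1e9, 18.2 * 29.9),
    "Bulldozer module, 32 nm": (213e6, 30.9),
}

if __name__ == "__main__":
    for name, (transistors, area) in CHIPS.items():
        density = density_millions_per_mm2(transistors, area)
        print(f"{name}: {density:.2f} million transistors per square mm")
```

By this measure the Bulldozer module and Poulson land in the same ballpark, at a little under 7 million and a little under 6 million transistors per square mm respectively, and both are at least twice as dense as the 65 nanometer Tukwila.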
AMD will also be showing off the 40-entry unified out-of-order scheduler and integer execution unit for the Bulldozer cores at ISSCC. AMD says that the scheduler can issue up to four operations per cycle and supports single-cycle operation wakeup; the integer unit supports single-cycle bypass between four functional units.
Intel will also be showing off its future "Westmere-EX" ten-core processor for high-end Xeon servers, a kicker to the current Xeon 7500s that were announced back in March and that are helping to fuel heavy server configurations this year as companies implement virtualization on middle-tier and back-end systems. The "Sandy Bridge" CPU-GPU hybrids due to be launched in January at CES in PCs and laptops will also be detailed; these chips will sport up to four cores, an integrated GPU, and memory and PCI-Express controllers all on the same die.
Techies from IBM will be on hand to talk about the guts of the quad-core 5.2 GHz mainframe engine at the heart of its System zEnterprise 196 machines, which debuted in July and which IBM was chatting about at the Hot Chips conference in August. It is hard to imagine IBM saying anything new about the chip that we have not already told you in detail, but anything is possible.
Finally, the Chinese Academy of Sciences and its Loongson Technologies chip designing arm will trot out the Godson-3B variant of the MIPS RISC processor that will eventually be at the heart of China's supercomputers - and maybe other kinds of servers, too. The Godson-3B is implemented in a 65 nanometer process and sports eight cores running at 1.05 GHz. The chip delivers 128 gigaflops of double-precision floating point math performance in a 40 watt envelope; it has 582.6 million transistors and has an area of 299.8 square mm. ®
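For the curious, the Godson-3B figures above boil down to a couple of handy ratios. This is a minimal Python sketch using only the numbers quoted in the paragraph; the function names are ours, and the per-core figure is simply what falls out of the division, not a statement about how the chip's floating point units are organized.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Power efficiency in gigaflops per watt."""
    return gflops / watts


def flops_per_core_per_cycle(gflops: float, cores: int, clock_ghz: float) -> float:
    """Peak floating point operations each core must retire per clock cycle."""
    return gflops / (cores * clock_ghz)


if __name__ == "__main__":
    # Godson-3B figures as quoted: 128 gigaflops, 40 watts, 8 cores, 1.05 GHz.
    print(f"{gflops_per_watt(128, 40):.1f} gigaflops per watt")
    print(f"{flops_per_core_per_cycle(128, 8, 1.05):.2f} flops per core per cycle")
```

That works out to 3.2 gigaflops per watt and a bit over 15 double-precision flops per core per clock.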