There is no such thing as the last laugh in the server chip business. But you can get the next laugh, and Advanced Micro Devices thinks it is going to get that sequential chuckle on rival Intel in the x64 server racket with next year's launch of the "Bulldozer" family of Opteron processors.
The reason is simple. This year, AMD has struggled through a major socket upgrade and server platform redefinition, including its first homegrown chipsets (if you consider ATI homegrown), while the world was trying to recover from the Great Recession. Intel's March 2009 rollout of its substantially revamped Nehalem architecture with the two-socket Xeon 5500s might have been pushed out a bit because of technical issues, but the need to virtualize servers to cut costs during the recession, together with the substantially improved Nehalem design, allowed Intel to weather the downturn pretty well and to steal back some market share from AMD.
The next battle in the x64 server war, which AMD started in early 2009 to stay in the game and which it has continued with its "Lisbon" Opteron 4100s and "Magny-Cours" Opteron 6100s in early 2010, is shaping up for the late summer or early fall of 2011. That's when AMD will get its first Opteron server processors based on the Bulldozer core to market. It is also about the same time that we expect to see "Sandy Bridge" Xeon processors for servers in the field.
That's nearly a year away, and in the meantime, AMD is going to have to rely on offering better bang for the buck on the current six-core Opteron 4100s (for uniprocessor and dual-socket boxes) and twelve-core Opteron 6100s (for 2P and 4P machines) relative to the current six-core Xeon 5600s (for two-socket boxes) and eight-core Xeon 7500s (for four-socket and larger machines).
If AMD had a Facebook page for the current crop of Opteron processors, its friend map would look like this:
Acer – which bought Gateway and is a server wannabe and also-ran like Gateway was for a decade – came out swinging at the Opteron 6100 debut, and Dell and Hewlett-Packard have shown some enthusiasm for the chips. The Opteron 4100s, while offering compact and clever motherboard and systems options, were less enthusiastically adopted. IBM seemed to have to be bound and dragged back to the Opteron side of the field after putting its eX5 chipset for Intel's high-end Xeon 7500s out for blades and racks and doing the easy Xeon 5600 refresh on existing products.
Oracle, which acquired once-Opteron-loving Sun Microsystems in January, took the Sun Fire and Sun Blade Opteron products and is now using them as a boat anchor on Larry Ellison's America's Cup yacht. The uptake by server partners for the current Opterons has been slow because of the chipset and socket changes required of server makers, who were understandably stingy during the downturn and annoyed with AMD over Opteron delays from a few years back.
With Intel getting its own server chip design act together with the Nehalem family, server makers focused their engineering in 2008, when the global economy went into the crapper, on the easy sell for 2009. AMD has been suffering since then, and many believe it will continue to suffer despite demonstrable price/performance advantages for its chips.
According to Pat Patla, vice president and general manager of AMD's server and embedded products unit, that other x64 chip maker - the one that brought you 64-bit processing, integrated memory controllers, and multicore processors first - is spoiling for a new fight in 2011 with the Bulldozer-based chips, which at AMD's Financial Analyst Day this week he characterized as "a whole new approach to the ISA" and as the biggest architectural change that AMD has made with its chips in a decade.
El Reg walked you through the finer points of the Bulldozer architecture last December and gave you an update on their expected performance in August of this year. Without going over all the same details again, the Bulldozer concept is to design a chip that is halfway between cookie cutting whole computing elements and putting them on a die (as AMD does) and virtualizing instruction streams and threading across the virtual pipelines to boost performance (which is what Intel does with HyperThreading).
Intel, in fact, uses both techniques - cookie cutting and virtual threading (which is known more generically as simultaneous multithreading) - in its Xeon and Itanium chips; IBM uses similar techniques with eight-core, 32-thread Power7 chips and Oracle does likewise with its 16-core, 128-thread Sparc T3 chips.
With the Bulldozer chips, which are implemented in GlobalFoundries' 32 nanometer technologies, AMD wants to do what it calls "two strong threads," as the illustration below shows:
The Opteron Bulldozer core: Two strong threads, no HyperThreading
Each core - which means an integer unit and a floating point unit - has its own integer unit scheduler and L1 data cache, just as a single-core CPU did and as the cores on multicore processors do today. But the two cores in a module share fetch and decode units as well as a floating point scheduler and L2 cache memory. The Bulldozer modules are cookie-cuttered in two-core units, and the future "Valencia" Opteron 4200 chip will be four of these modules with a shared memory controller, L3 cache, and northbridge spanning the four modules and eight cores. Each integer unit has four pipelines, capable of executing one instruction per cycle.
Each Bulldozer module has two 128-bit floating point units, each of which can do two 64-bit double-precision operations per clock or four 32-bit single-precision operations. What is neat about the Bulldozer design is that either "core" in the module can grab the scheduler and, if the other core is not doing floating point, take all 256 bits and do four double-precision or eight single-precision ops in a clock using what AMD is now calling an AVX mode.
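The per-clock figures above are just a matter of dividing the vector width by the operand width, and a few lines of Python make the arithmetic explicit. This is only a back-of-the-envelope sketch: the function and constant names are made up for illustration, the 128-bit and 256-bit widths are the ones quoted above, and nothing here models how the real scheduler behaves.

```python
# Illustrative arithmetic only; names are hypothetical, widths are from the article.
FPU_WIDTH_BITS = 128   # width of one floating point unit in a Bulldozer module
FPUS_PER_MODULE = 2    # two such units per two-core module


def ops_per_clock(vector_bits: int, element_bits: int) -> int:
    """Return how many operands of element_bits fit in a vector_bits-wide unit."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_bits // element_bits


def module_fp_ops(element_bits: int, ganged: bool) -> int:
    """Ops per clock available to one core.

    ganged=False: the core uses a single 128-bit unit.
    ganged=True:  the other core is idle on floating point, so this core
                  takes both units as one 256-bit path.
    """
    width = FPU_WIDTH_BITS * FPUS_PER_MODULE if ganged else FPU_WIDTH_BITS
    return ops_per_clock(width, element_bits)


if __name__ == "__main__":
    for label, bits in (("double precision", 64), ("single precision", 32)):
        print(label,
              "| one 128-bit unit:", module_fp_ops(bits, ganged=False),
              "| both units, 256 bits:", module_fp_ops(bits, ganged=True))
```

Running it reproduces the two, four, four, and eight operations per clock cited above.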