There are a lot of different things that Advanced Micro Devices needs to do to get itself back on track, but one of them - and perhaps the most important - is to execute a flawless launch of the "Shanghai" quad-core Opteron chips for servers.
Not just in terms of having no bugs, but also having a smooth production ramp on the new 45 nanometer processes it uses, getting them priced correctly, and giving AMD's server customers the confidence that it can be relied upon to meet its roadmaps - and not drive them into the weeds like it did with the initial quad-core "Barcelona" processors.
Depending on how generous you want to be, the Barcelona chips should have arrived in the summer of 2007, when Intel delivered its quad-core "Harpertown" Xeon 5400s based on the Penryn cores and a 45 nanometer process. And maybe Barcelona should have arrived earlier to keep pace with Intel's quad-core "Clovertown" Xeon 5300s, which came out at the end of 2006.
Unlike the new "Dunnington" Xeon 7400 processors, which have four or six cores on a single die with their caches, the Clovertown and Harpertown Xeons leapfrogged from two to four cores per socket by packing two dual-core chips into a single piece of ceramic. Having seen Hewlett-Packard do this with single-core Itaniums when Intel was much delayed with its dual-core Itaniums, AMD should have had a backup plan like this for the Rev F Opterons. But that's all water under the southbridge.
As AMD begins its Shanghai ramp and continues to talk up its roadmap between now and 2011, the company is keen to prove to its server and workstation partners that it understands how badly the Barcelona slippage - including the bug in the translation lookaside buffer as well as market delays - screwed up their own server plans. And specifically, AMD wants to prove that it has changed its design and production processes so the errors that caused Barcelona to be delayed are not repeated.
"Shanghai is not Barcelona 2.0," explains Pat Patla, who is now general manager of the server and workstation chip business at AMD. "We have changed a lot of things, and we have learned from our mistakes."
One thing that AMD did was hold an off-site powwow with the entire chip engineering team in January 2008 so it could work out exactly what went wrong. At that meeting, the company appointed Raghuram Tupuri lead engineer and made him the single contact point for the Shanghai engineering project.
The first tape-out of Shanghai happened in the first quarter of this year, and based on the changes AMD had made in the engineering process, the company was confident enough in the Shanghai design that it brought its hardware and software partners in for code validation well ahead of where they would have normally entered the process. And that was after having delayed the tape-out for one month to really get down to root causes for some minor issues.
This new process of creating and testing the design has been the key factor that enabled AMD to pull Shanghai's delivery to server partners forward from Q1 2009 to its current Q4 release date. AMD has not said when Shanghai will launch, but Patla says that the current C2 stepping - the second stepping of the Shanghai design - is the production version of the chip. AMD is ramping Shanghai into full production right now as the fourth quarter is set to begin, and server partners have parts for final hardware validation in their hands. Well, they have sockets.
"We understand that it was very disruptive that we didn't meet our Barcelona delivery schedule," says Patla. "Our lack of execution created an opportunity for our competition, and we don’t want to have anything slow down our momentum again."
Patla says that the Shanghai chip and its related server platforms will have enough goodies to entice server makers to push it even though they have strong partnerships with Intel. It was easier to embrace AMD's Opterons back in 2003 when Intel was still pitching Itanium as its 64-bit chip, and AMD did well for a while as it added tier one server makers to its customer list, starting with IBM, then Sun Microsystems, then Hewlett-Packard, and then others.
But Intel embraced 64-bit on Xeon chips, and then just ahead of the transition from dual-core to quad-core chips, Intel woke up from its stupor and started getting respectable performance and thermals for the Xeons. Today, the gap in performance and thermals between Opterons and Xeons is much smaller than it once was.
The Shanghai Opterons are being manufactured using a 45 nanometer immersion lithography process and will have 6 MB of L3 cache, clock speeds higher than the current top-end 2.3 GHz parts, and tweaks in the instruction stream that will deliver somewhere between 15 and 30 per cent more performance for Opteron customers. Expect around 20 per cent for an average workload.
The Shanghai chips will also use 800-MHz DDR2 main memory, which runs about 10 per cent faster than the 667-MHz DDR2 main memory used with the Barcelonas, and will include support for the HT-3 HyperTransport interconnect. The Shanghais plug into the existing Socket F1 CPU sockets (also called the Rev F socket or the Socket 1207). nVidia nForce 3050 and 3600 chipsets and Broadcom HT-1000 and HT-2100 chipsets will therefore support the Shanghai chips, just as they supported the dual-core "Santa Rosa" and quad-core Barcelona Rev F chips.
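For a rough sense of scale on the memory bump, here is a minimal sketch of the nominal peak transfer rate of a single DDR2 channel at each speed. It assumes a standard 64-bit (8-byte) channel and treats the 667 and 800 figures as millions of transfers per second; the function and variable names are illustrative only, and the number of channels per socket is not modeled.

```python
# Nominal peak bandwidth of one DDR2 channel at the two memory speeds.
# Assumption: a standard 64-bit (8-byte) DDR2 channel, with 667 and 800
# taken as millions of transfers per second. Names here are illustrative.
BUS_WIDTH_BYTES = 8


def peak_bandwidth_gb_s(mega_transfers_per_s: float) -> float:
    """Return peak bandwidth in decimal GB/s for a single channel."""
    return mega_transfers_per_s * 1e6 * BUS_WIDTH_BYTES / 1e9


ddr2_667 = peak_bandwidth_gb_s(667)
ddr2_800 = peak_bandwidth_gb_s(800)

# Relative increase in nominal peak rate when moving from 667 to 800.
nominal_gain = ddr2_800 / ddr2_667 - 1

print(f"DDR2-667: {ddr2_667:.2f} GB/s per channel")
print(f"DDR2-800: {ddr2_800:.2f} GB/s per channel")
print(f"Nominal gain: {nominal_gain:.1%}")
```

This is a peak-rate calculation only. The nominal gap it shows is wider than the roughly 10 per cent quoted above, which is the figure as reported and is not derived here; delivered memory performance generally improves by less than the nominal transfer rate.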
The Shanghai processors will be available in standard, Special Edition (SE for short, and meaning higher clock speed and much hotter temperature), and Highly Efficient (HE, and meaning lower voltage and therefore lower heat for a given clock speed) variants. The 75 watt standard parts will ship in Q4 of this year, with HE (55 watt) and SE (105 watt) parts available in the first quarter. The Opteron 1000 Series variant of Shanghai, known as "Suzuka" and plugging into the AM2 socket, is coming in the second half of 2009.
In the first half of 2009, AMD will bring out its own chipsets, in the "Fiorano" platform, which implements the SR5690 chipset and related SP5100 southbridge, which were previously known as the RD890S and SB700S back in May - just to confuse you.
The chipset and southbridge include support for the next generation of PCI-Express peripherals as well as DDR2 main memory and HT-3 interconnects. The Fiorano chipset will support the Shanghai quad-core chip as well as, a few months later in Q4 2009, the "Istanbul" six-core kicker to Shanghai. On the latest AMD roadmap, the Fiorano chipset carries these Socket F chips through to 2011 and, if you believe the arrows pointing right, beyond.
After that, in the first half of 2010, two-socket and larger Opteron servers get a tweaked version of the six-core Istanbul design in a chip called "San Paolo," which will have 12 MB L3 cache, DDR3 main memory, four HT-3 HyperTransport links, and a bunch of other features that will give it a modest performance boost over Istanbuls.
In 2010, AMD will also deliver its first two-chip ceramic package with the "Magny-Cours" Opteron, which is two of these San Paolo chips in a single socket for a total of 12 cores in the same thermal and power envelope as today's Barcelona. The San Paolo and Magny-Cours Opterons will make use of AMD's "Maranello" platform, which has the G34 socket. Patla won't say anything more about the G34 socket, except that it will have more pins than the Rev F socket because of the extra HT links. ®