AMD lifts the veil on Opteron, ARM chip plans for 2014
Hey Samsung, let's see your chippery handle 64GB of RAM
AMD has unfolded its server-chip roadmap for next year, and the road ahead appears to be a sensible motorway with no hair-raising hairpin turns or unexpected switchbacks – although there is one bright shiny new vehicle on the road.
You can forgive AMD for being somewhat cautious – heck, you should congratulate them for their clear-headed approach. With Intel gearing up to refresh the midrange and high end of its Xeon E5 family of processors with "Ivy Bridge" updates in the second half, it can't be a particularly relaxing ride for CEO Rory Read and his minivan of chipmakers.
That said, fortunes can turn on a dime in the server-chip biz. Remember that the much-improved chips that have come out of Intel in the past four generations since AMD's Opteron woke the Chipzilla beast up would not have happened had it not been for AMD. Data center managers who never spent a dime on Opteron servers nonetheless owe AMD thanks due to the excellent engineering and gritty competition embodied in the "Hammer" family of x86 chips that took the IT market by storm a decade ago.
And truth be told, those same data center managers had better hope AMD rises again to keep Intel on its toes; the fight between these rivals doesn't just result in cheaper servers, but also better ones.
But if you think that AMD is looking for a rematch in the Xeon-Opteron fight, think again. The company is not opening a can of feisty whupass, but instead sensibly putting together a portfolio of Opteron chips based on x86 and ARM cores, while at the same time giving those customers who do like "Piledriver" Opteron 3300, 4300, and 6300 systems a little more bang for the buck.
The new roadmap is not a sledgehammer smashing in the doors to Santa Clara's corner offices, as was the case with the original Opterons back in 2003. It's a practical plan based on the needs of server buyers and what AMD can deliver through its fab partner, GlobalFoundries, in a timely fashion and without making mistakes.
Some people may react unenthusiastically to the lack of brashness in AMD's server-chip roadmap because they want a smashmouth brawl. They need to realize that is not going to happen in a global economy that remains skittish, and in a server market that's undergoing gut-wrenching transformations in the supply chain, manufacturing base, and processor and related networking and storage architectures.
It's in that environment that AMD is extending the existing "Piledriver" Opteron processors with a new chip code-named "Warsaw", and following it up with a new ceepie-geepie hybrid code-named "Berlin" for the Opteron X-Series. And, finally, AMD is lifting the veil a little bit on its future ARM server chips, code-named "Seattle."
"What we are doing with the Warsaw Opteron is ripping out cost and power and increasing performance, and it is compatible with the existing G34 socket," Andrew Feldman, general manager of the server business unit at AMD, explains to El Reg. The chips use the same Piledriver cores as the existing Opteron CPU lineup – with some tweaks, as is always the case – and are implemented in the same 32-nanometer process and etched by GlobalFoundries.
"It is designed for those Open Compute 3.0 boards," Feldman says, "to just drop in and go."
This is significant because the "Roadrunner" Open Compute 3.0 system boards that AMD created with motherboard partners are getting some traction, particularly among data center managers in the financial services area who are not particularly pleased that Facebook has denser and cheaper server infrastructure than they do.
It looks like Warsaw is just an upgrade of the existing Opteron 6300 chips, which came out in November 2012. The Warsaw chip will offer about 20 per cent higher performance per watt than the Opteron 6300, says Feldman, and will come in twelve-core and sixteen-core variants. Importantly, the Warsaw chips will slide into the exact same G34 sockets used in two-socket and four-socket servers and will not require recertification for software, since they are not really new chips at all – it's a deep bin sort and improving yields that are at work. (Shhhh.)
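For the arithmetically inclined, here is a rough sketch of what a 20 per cent gain in performance per watt could mean in practice. It assumes the gain applies uniformly across workloads, which AMD has not claimed, and the function names are made up for illustration.

```python
def same_power_speedup(perf_per_watt_gain):
    """Relative performance if power draw is held constant."""
    return 1.0 + perf_per_watt_gain


def same_work_power_fraction(perf_per_watt_gain):
    """Fraction of the original power needed to do the same work."""
    return 1.0 / (1.0 + perf_per_watt_gain)


# Feldman's figure: about 20 per cent better performance per watt.
gain = 0.20
print(f"Same power: {same_power_speedup(gain):.2f}x the work")
print(f"Same work:  {same_work_power_fraction(gain):.1%} of the power")
```

On those assumptions, a Warsaw chip either does 1.2 times the work in the same power envelope or the same work on roughly 17 per cent less juice.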
The Warsaw Opteron, presumably to be called the Opteron 6400, will be available in the first quarter of 2014 in the wake of Intel's "Ivy Bridge" Xeon processor rollout, which will be in the midrange in the third quarter and at the high-end in the fourth quarter.
It doesn't look like there will be an Opteron 4400 at that time, and the Opteron 3300 has been effectively replaced by the Opteron X-Series, launched in late May and aimed at both entry Atom and Xeon servers. If AMD needs an Opteron 4400, it can no doubt do another bin sort.
In the case of Warsaw, what companies are looking for is improvement in performance per dollar and throughput per dollar, and goosing the existing Piledriver chip is the easiest and cheapest thing to do. But it may not be sufficient to blunt the market-share gains that Intel is making with its current "Sandy Bridge" Xeon E5 server chips and its future Ivy Bridge kickers.
AMD does not have any new Opteron server processors coming until next year
In the wake of the Opteron X-Series APU launch only three weeks ago, AMD says it is preparing a kicker with more heft on both the CPU and GPU sides of this hybrid processor. This Berlin APU is going to shake some things up, and do so in traditional server computing and in any area where a GPU can be used to accelerate workloads.
"Berlin is cool, and it uses a new Steamroller core from us and delivers tremendous compute and power efficiency," says Feldman. "When you have a huge amount of compute in a single-socket part, this is ideal for workloads where performance per watt per dollar and compute density per dollar are paramount."
Steamroller is the brawny core designed to be used in high-end Opteron x86 processors, not the lighter "Jaguar" core used in the Kyoto server chips and other desktop and notebook chips. AMD is saying very little about the Steamroller core at this point, except that it offers double the performance of the Jaguar core and will max out with twice the memory capacity, too.
The Berlin Opteron chip will be aimed at single-socket machines and will also use the Graphics Core Next (GCN) graphics chip from AMD, either as a video card for workstations and servers or as an adjunct compute engine for the CPU.
The Berlin Opteron is significant because it fully embraces the Heterogeneous System Architecture (HSA) that AMD has been talking about for the past two years, and in which the CPU cores and the graphics engine have access to the same unified memory space implemented on the DRAM in the system.
"That means you can program the GPU in exactly the same way you program the CPU cores, and that is the big jump," says Feldman.
If you are wondering why single-socket machines (workstations or microservers) seem trapped at a 32GB limit, Feldman explained, "DRAM pinouts require a huge amount of pins because you are using a huge amount of communication between the DRAM and the CPU. So you put two channels down, that gives you two DIMM slots with 16GB each. If you put four DIMMs down, then your die area gets bigger, and that takes you out of some of the opportunities you want to be in. There are no 32GB DIMMs available in the mainstream, and as you get off the mainstream, the prices go through the roof anyway."
Besides, Feldman says, the workloads running on single-socket boxes do not need more than 32GB of memory at this point or 64GB of memory when 32GB parts do go mainstream.
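Feldman's arithmetic is easy enough to sketch. The snippet below assumes one DIMM per memory channel, as his two-channel, two-slot example implies; the function name is hypothetical.

```python
def max_memory_gb(channels, dimms_per_channel, dimm_gb):
    """Maximum memory capacity in GB for a single-socket machine."""
    return channels * dimms_per_channel * dimm_gb


# Two channels, one DIMM each (an assumption drawn from Feldman's example).
with_16gb_sticks = max_memory_gb(channels=2, dimms_per_channel=1, dimm_gb=16)
with_32gb_sticks = max_memory_gb(channels=2, dimms_per_channel=1, dimm_gb=32)

print(with_16gb_sticks)  # 32
print(with_32gb_sticks)  # 64
```

The same two slots get you from 32GB to 64GB only when the fatter sticks become affordable, which is exactly the trade-off Feldman describes.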
The 'Berlin' Opteron APU is the kicker to the just-announced 'Kyoto' Opteron X
The Berlin chip will have 7.8 times the gigaflops per watt of an Opteron 6386SE, which has sixteen cores running at 2.8GHz – thanks in no small part to the GCN graphics chip on the die. It is not clear how much extra oomph is coming from those Steamroller cores, but probably a little.
The Steamroller cores are etched in pairs on the die, and each pair has 2MB of L2 cache with error correction. The memory controller can feed four memory sticks (in either SODIMM or UDIMM form factors) and can handle transfer rates of 1,866 MT/sec. The chip includes the usual USB, SATA, PCI-Express, and video ports.
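As a back-of-the-envelope check on what that memory speed buys, the sketch below works out peak theoretical bandwidth. It reads the 1,866 figure as millions of transfers per second and assumes a standard 64-bit (8-byte) data path per channel; the channel count for Berlin is not given here, so the two-channel case is purely an illustrative assumption.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per memory channel


def peak_bandwidth_gb_per_sec(mega_transfers_per_sec, channels):
    """Peak theoretical memory bandwidth in GB/sec (1 GB = 10**9 bytes)."""
    bytes_per_sec = mega_transfers_per_sec * 1_000_000 * BYTES_PER_TRANSFER
    return bytes_per_sec * channels / 1_000_000_000


one_channel = peak_bandwidth_gb_per_sec(1866, channels=1)
two_channels = peak_bandwidth_gb_per_sec(1866, channels=2)  # hypothetical layout

print(f"{one_channel:.1f} GB/sec per channel")
print(f"{two_channels:.1f} GB/sec across two channels")
```

That works out to just under 15GB/sec per channel at the theoretical peak; real-world sustained figures will be lower.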
AMD used Taiwan Semiconductor Manufacturing Co (TSMC) to fab its Kyoto Opteron X-Series processors using the foundry's 28-nanometer processes, and the kicker Berlin Opteron X chips will also come out of the same wafer baker and use the same 28nm processes.
Neither the Warsaw nor the Berlin Opteron chips will have PCI-Express 3.0 peripheral controllers on the die, but the Berlin chip has one PCI-Express 2.0 controller on die and another one on the system controller hub. This Berlin part will hook into the Freedom interconnect used in the SeaMicro SM10000 and SM15000 microserver systems, but will not have Freedom ports on the die.
The Berlin Opteron X chips will debut in the first half of 2014.
ARMed and actually dangerous
That brings us to the "Seattle" Opteron ARM processors from AMD, which are based on the Cortex-A57 core from ARM Holdings. These are the brawnier of the 64-bit ARMv8 cores designed by ARM and licensed by the server chip upstarts for their 2014 products.
AMD jumped into the ARM server fray last October, and if you look at the Seattle feeds and speeds, you'll see that AMD is really going to use ARM chips to try to go after Intel.
"There are two separate questions, in my mind, that need to be asked," says Feldman. "Will ARM win in servers? And will AMD win with ARM? I think ARM wins in the long run. In the history of compute, small, lower cost, and higher volume always wins. And community always wins. ARM has lower power, but I don't think, in the end, that it wins because it is lower power. It wins because it is lower cost. It takes the process of making a CPU down from three and a half years and $350m and $400m down to 18 months and $30m."
So why will AMD win? Feldman said that although Samsung and Qualcomm have more experience on the client side, AMD has more experience on the server side than "any ARM licensee on Earth". IP such as a memory controller that can support 64GB of DRAM gives AMD the edge, he said. "This is hard stuff."
And, perhaps more importantly, a lot of the things that many have been suggesting AMD do with its Opteron chips – get PCI-Express 3.0 peripheral controllers, Ethernet ports, and Freedom fabric interconnects on the die, for example – are happening with the Seattle chip, which starts sampling in the first quarter of next year and will ship sometime in the fall.
The Seattle design will also be etched in TSMC's 28nm process, and will initially come out with eight Cortex-A57 cores that will offer two times the performance of an Opteron X-Series chip and also offer a "significant reduction" in the wattage it takes to get a unit of work done – exactly how much, AMD is not saying.
Seattle will have 10Gb/sec Ethernet ports on the die, as well as Freedom fabric ports and a high port-count storage interface that is "optimized for big data." The plan is to start with an eight-core part that runs at 2GHz and higher, and then add a sixteen-core version that will have four times the performance of the Opteron X-Series. The Seattle chip will also have offload engines for encryption and data compression.
And if they are cheaper to make and more profitable to sell, AMD might really be on to something here – provided the workload doesn't need a lot of floating point performance.
Don't count AMD out just yet in Server Land. ®