As expected, Intel is closing out the year in the server arena with the launch of its "Centerton" Atom S Series processor, what the company's top brass is billing as the first Atom-based processor aimed at servers.
The Atom S1200, as the first generation is known, is also the foundation from which Intel will build a bulwark against the onslaught of ARM-based server processors expected in late 2013 and early 2014 from a number of different vendors.
Some workloads are compute or memory intensive, and they need as many fast threads as any chip maker can throw at them. This is where the Xeon E5 processors for two-socket and now four-socket machines are aimed, as are the Xeon E7 processors for four-socket and larger boxes, where fat memory and SMP/NUMA scalability are more important than thermals or price.
For more modest workloads, Intel has geared up its Core desktops to create the Xeon E3 chips, and they are appropriate for a number of uses, including single-socket microservers with modest compute and even more modest memory needs.
But for some workloads, such as dedicated hosting for tiny web sites or big data analytics, even these Xeon chips have too much oomph, give off too much heat as they run, and cost too much. That is where the Atom comes in.
The Atom S1200 is not the first 64-bit Atom that has ended up in servers. SeaMicro, the upstart microserver maker that is now part of rival Advanced Micro Devices, launched its second-generation SM10000-64HD back in July 2011 based on the 64-bit Atom N570 processor.
This chip had two cores and a thermal design point of 8.5 watts, but it did not have ECC error scrubbing on its DDR3 memory controllers and it did not support the VT electronics that the Core and Xeon chips have to make virtualization hypervisors run faster. Still, you could run Windows Server and Linux operating systems on the N570, and many customers jonesing for microservers did just that.
Diane Bryant, GM of Intel's Data Center and Connected Systems Group, holds an Atom S1200
With the Atom S1200, Intel is taking some of the risk out of using Atoms in the data center, whether it is in servers or in low-end storage or switches. The Atom S1200 is etched in Intel's prior generation and fully ironed out 32 nanometer process technology.
The S1200 has 64-bit processing and memory addressing, just like the N570 before it, and it also has the in-order processor of the Atom chip. There's speculation that the future "Avoton" Atom S Series chip will have out-of-order execution like Xeons.
The big difference between the N570 and the S1200 is that the latter adds ECC to the memory and VT for hypervisors. The former feature is necessary for servers, which need greater reliability than PCs and tablets; the use of the latter can be debated, but no one can complain that the S1200 doesn't have VT, especially with future 64-bit ARM chips coming with their own flavor of virtualization-assisting circuits.
The two-core Atom S1200 also supports HyperThreading, the Intel implementation of simultaneous multithreading (SMT) that lets a single instruction pipeline look like two as far as the operating system is concerned. Depending on the workloads, SMT can boost overall chip performance by as much as 20 per cent, based on past history.
Intel has not said how well the HT implementation on the Atom S1200 works, and on what workloads in particular. The Atom S1200 does not support Turbo Boost clock speed acceleration.
Intel bolts server features onto an Atom system-on-chip
The S1200 chip has two active cores, presents four threads thanks to HyperThreading, and plugs into the BGA1286 socket. The S1200 chips have 64KB of L1 instruction and 64KB of L1 data cache memory, with an additional 1MB of L2 cache memory shared across the cores. The chip has one DDR3 memory channel, which runs at 1.33GHz and which currently maxes out at 8GB in a single slot.
The SoC design also includes a PCI-Express 2.0 I/O controller that can support up to eight lanes as well as a UART to drive USB ports. There are three different models of the Atom S1200 processor that are available starting today, with pricing below in 1,000-unit tray quantities:
- Atom S1260: 2.00GHz, 8.5W TDP, $64
- Atom S1240: 1.60GHz, 6.1W TDP, $64
- Atom S1220: 1.60GHz, 8.1W TDP, $54
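For a rough feel of how the three parts trade clock speed against thermals and price, the list above can be run through a few lines of Python. This is only a back-of-the-envelope sketch: clock speed is a crude proxy for performance, and the `ghz_per_watt` helper and the tuple layout are illustrative choices, not anything Intel publishes.

```python
# Figures from the price list above:
# (model, clock in GHz, TDP in watts, price in US dollars per chip in 1,000-unit trays)
ATOM_S1200_SKUS = [
    ("S1260", 2.00, 8.5, 64),
    ("S1240", 1.60, 6.1, 64),
    ("S1220", 1.60, 8.1, 54),
]


def ghz_per_watt(clock_ghz, tdp_watts):
    """Clock speed per watt of TDP; a crude proxy, not a benchmark."""
    return clock_ghz / tdp_watts


# Rank the parts from most to least clock per watt.
ranked = sorted(
    ATOM_S1200_SKUS,
    key=lambda sku: ghz_per_watt(sku[1], sku[2]),
    reverse=True,
)

for model, clock, tdp, price in ranked:
    print(f"{model}: {ghz_per_watt(clock, tdp):.3f} GHz/W, ${price / clock:.2f} per GHz")
```

On these numbers the S1240 comes out ahead on clock per watt, while the S1260 is the cheapest per gigahertz.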
At the Atom S1200 announcement event in San Francisco today, Diane Bryant, general manager of the Data Center and Connected Systems Group, said that the Atom S1200 processor already had 20 design wins across server, storage, and networking equipment makers. She made a point of singling out one win away from the PowerPC architecture in the networking arena and another win against an ARM architecture in the storage area.
The marketing message for the Atom S1200 will be the same as for the Xeon Phi x86-based coprocessor aimed at massively parallel supercomputing workloads: By sticking with the Intel architecture, you can have instruction set consistency across a wide variety of workloads and still be able to tailor systems for specific workloads. And Bryant made Intel's commitment here simple: "Whatever the workload is, it will run best on Intel architecture."
Intel trotted out Jeffrey Snover, distinguished engineer and Windows Server lead architect, to not only talk up the ability to run Windows Server on this processor, but also to pour cold water (well, it was a liquid for sure, but it was warm and somewhat beer-colored) on the idea that a 32-bit operating system was acceptable in the server space.
"The benefits of a 64-bit flat address space are so important that Microsoft stopped supporting 32-bit operating systems a few releases ago," explained Snover.
Bryant also said you needed 64-bit processing and memory addressing, and when asked during a Q&A session how this chip would stack up to an ARM server chip, she deflected it by saying that "today that is not an apples-to-apples comparison." She added that Intel had a good view into the competition and that Chipzilla believed it would have a compelling lead on the ARM competition for tiny servers.
Intel also asked Paul Santeler, vice president of the hyperscale business unit at Hewlett-Packard, to talk a bit about the future "Gemini" Atom-based hyperscale servers, which HP previewed back in June. Santeler didn't really add much about the Gemini systems, but did say they would launch in the first quarter of 2013 using the Atom S1200s. Customers have already been able to get early evaluation units of the Gemini machines using the new Intel chips.
HP compares Atom S1200 versus Xeon E3 1200 performance
He also threw out some performance data to show how a 2GHz Atom S1200 compared on scale-out applications and compute-intensive applications versus the Xeon E3-1200 v2 chips, which are based on the "Ivy Bridge" architecture.
For simple content delivery, large distributed caching, simple search, and MapReduce applications, an Atom S1200 can deliver around twice the bang per watt. But on heavier work, the Xeon E3 has a two-to-one advantage, based on HP's internal tests. "You're not going to run an Atom head-to-head and get more performance than a Xeon," said Santeler. "But you will get more performance per watt."
What HP did not do is make the obvious comparison to a workhorse Xeon E5 server.
Frank Frankovsky, vice president of hardware design and supply chain at Facebook, did not commit the social network to using Atom S1200-based servers, but he did show some performance comparisons that Facebook has done to compare wimpy cores (such as Atom) against brawny cores (such as Xeon):
Facebook stacks up wimpy Atom against brawny Xeon cores
Frankovsky said that the only metric Facebook cared about when it installed a server is what the company calls a Relative Compute Unit, which is simply a unit of work per watt per dollar. Based on how well these wimpy cores do, he said, you can get the same unit of work done with about half the watts. The chart above did not show the important cost metric, by the way.
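To see why the missing cost figure matters, here is a minimal sketch of a work-per-watt-per-dollar metric. Facebook did not give its actual formula, so the function below is just the literal reading of the phrase, and every number fed into it is a hypothetical, normalized value rather than anything Frankovsky showed.

```python
def work_per_watt_per_dollar(work, watts, dollars):
    """Literal reading of 'work per watt per dollar': work / watts / dollars."""
    if watts <= 0 or dollars <= 0:
        raise ValueError("watts and dollars must be positive")
    return work / watts / dollars


# Hypothetical baseline: a brawny server normalized to 1 unit of work,
# 1 unit of power, and 1 unit of cost.
brawny = work_per_watt_per_dollar(work=1.0, watts=1.0, dollars=1.0)

# Same work at about half the watts, as claimed, assuming the same cost.
wimpy_same_cost = work_per_watt_per_dollar(work=1.0, watts=0.5, dollars=1.0)

# Same work at half the watts, but assuming the wimpy gear costs twice as much.
wimpy_double_cost = work_per_watt_per_dollar(work=1.0, watts=0.5, dollars=2.0)

print(brawny, wimpy_same_cost, wimpy_double_cost)
```

Under this reading, halving the watts doubles the score only if the cost stays put; if the wimpy setup costs twice as much for the same work, the advantage disappears entirely.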
One place where Facebook might consider using wimpy cores is to drive its photo sharing application, which currently holds more than 220 billion photos and which is adding more than 300 million photos per day. This app only needs a modest amount of compute per core to run, but obviously with a billion users, you need a lot of cores to process photos.
Facebook was very careful not to admit to actually using the Atom S1200 in its data centers and Frankovsky did not launch an Open Compute server based on the Atom S1200, either. But you can probably expect such a thing has already been done and will be featured at the Open Compute Summit early next year.
Intel's own rough reckoning of Atom-versus-Xeon pitted the S1200 against a low-volt E3-1200 v2. With a full rack of hyperscale servers, the Atom-based rack could hold 560 nodes and delivered around $35,800 of processor revenue to Intel. A full rack of the Xeon E3 servers would have about 100 nodes, which is not much, but would offer about two times the throughput performance while bringing Intel around $32,900 in processor revenue.
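The rack figures quoted above are easier to compare per node. The sketch below works that out, assuming the "two times" claim means the Xeon rack delivers twice the throughput of the Atom rack; the variable names are illustrative and the inputs are Intel's approximate numbers.

```python
# Approximate per-rack figures quoted above.
ATOM_NODES = 560
ATOM_REVENUE = 35_800   # US dollars of processor revenue
XEON_NODES = 100
XEON_REVENUE = 32_900   # US dollars of processor revenue

# Assumption: throughput is relative to the Atom rack (Atom rack = 1.0).
ATOM_RACK_THROUGHPUT = 1.0
XEON_RACK_THROUGHPUT = 2.0

# Processor revenue per node.
atom_revenue_per_node = ATOM_REVENUE / ATOM_NODES
xeon_revenue_per_node = XEON_REVENUE / XEON_NODES

# How much more work one Xeon node does than one Atom node.
node_throughput_ratio = (XEON_RACK_THROUGHPUT / XEON_NODES) / (
    ATOM_RACK_THROUGHPUT / ATOM_NODES
)

# Processor dollars per unit of rack throughput.
atom_dollars_per_throughput = ATOM_REVENUE / ATOM_RACK_THROUGHPUT
xeon_dollars_per_throughput = XEON_REVENUE / XEON_RACK_THROUGHPUT

print(f"Atom: ${atom_revenue_per_node:.2f} per node")
print(f"Xeon: ${xeon_revenue_per_node:.2f} per node")
print(f"One Xeon node is worth about {node_throughput_ratio:.1f} Atom nodes")
```

The Atom figure works out to roughly $64 per node, which lines up with the tray price of the top two S1200 parts, while each Xeon node brings in about $329 and does the work of roughly eleven Atom nodes. Per unit of throughput, the Atom rack costs the customer about twice as much in processors.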
Intel is making no bones about making Atom a peer in the data center, which it has to do if it wants to blunt the ARM attack. And from these numbers, Intel can make the revenue either way. Jason Waxman, who runs Intel's hyperscale business unit, said that rather than try to over-tailor and differentiate the Atoms and Xeons, Intel decided to "let the lines blur and let the customers choose."
Roadmap for Atom Series processors
Bryant said that Intel would not be holding back the performance of either the Atom or Xeon processors as it moved ahead. Next year's "Avoton" Atom S Series chip will be implemented in the 22 nanometer process that Ivy Bridge and "Haswell" chips use, and as El Reg has previously reported, will include on-chip networking.
What we have been told is that the Avoton chip will have Ethernet network interfaces, but not a full-blown switch like some ARM chips are getting. Over time, all of the Xeon chips will get integrated networking, Bryant said, but she did not elaborate on the manner of that networking and she did keep calling it the "integration of fabric" even though that is probably not what Intel is doing.
That said, plans change and maybe Intel is going to plunk a distributed network chip onto Atoms and Xeons. For a lot of customers, that would radically simplify the data center. The Avoton chip is also expected to get out-of-order execution to help boost performance per core, and it seems likely that with the process shrink it will get more cores eventually.
Intel is not saying if the Atom S Series chip due in 2014 is a tick or a tock, but if Avoton sports a new core and integrated Ethernet ports, then it stands to reason that the 2014 chip, whatever it is called, will use the 14 nanometer process shrink to boost the core count and maybe introduce a ring interconnect across the cores and caches like modern Xeons and Itaniums have.
The future Atoms could sport modest L3 caches, too. But then again, that would make them look like Xeons, and what would be the point? ®