Facebook's first custom-built data center – the Prineville, Oregon facility that it just "open sourced" – uses standard x86 chips from Intel and AMD. But there's little doubt the social networking giant is exploring the use of so-called massively multicore servers packed with hundreds of low-power ARM chips, or even silicon from new-age chip maker Tilera.
On Thursday, after Facebook released the specifications and design documents for its servers, racks, and other data-center equipment under the aegis of the Open Compute Project, we asked Jonathan Heiliger, the company's vice president of technical operations, whether it plans to eventually move to ARM or Tilera machines, or even the Intel Atom–based machines from SeaMicro. He stopped short of saying there are definite plans to install such systems in a live data center, but it's clear the company is doing its homework in this hot-button area.
When we pointed out that he has publicly mentioned Tilera and SeaMicro, Heiliger said he "could not recall" whether he had – "I have been known to make off-the-cuff remarks" – indicating that he had mentioned that Facebook was exploring "alternative architectures".
"We are always open to new ways of thinking about things and new ways of doing things," Heiliger told us, pointing to the Open Compute Project as an example of that. "For now, this is our way of innovating and moving the industry forward."
Asked if there are any specific plans to move to ARM, he paused before saying: "No, there are no specific plans at this point. But as I said, we're always looking at new technologies and new architectures."
The idea is that a server that splits its work across hundreds of lower-power ARM chips would save both power and money. But at the moment, it seems, ARM can't address enough memory for the sort of workloads Facebook runs: it limits you to 4GB of memory per server node. Future ARM chips will offer a larger memory footprint, and this could lead to a Facebook break with Intel.
On Thursday, we also asked Intel high density–computing man Jason Waxman about a possible Facebook move to massively multicore machines, and he said that Intel is developing its own massively multicore chips, but that these sorts of designs are suited to high-performance scientific applications rather than a highly scalable web back-end. Intel has discussed an 80-core processor, known as its "Teraflops Research Chip", and a 48-core design known as the, er, "Single Chip Cloud Computer".
Intel's website describes the Single Chip Cloud Computer as "a microcosm of a cloud datacenter." It includes 24 "tiles", each offering two cores, and there's a built-in 24-router mesh network. In other words, it resembles the sorts of chips Tilera is building. Tilera chips, for what it's worth, are now being used in servers manufactured by Quanta, the Taiwanese outfit that built the machines for Facebook's Prineville facility.
Clearly, ARM and Tilera are a potential threat to Intel's server business. But it should be noted that even Google has called for caution when it comes to massively multicore systems. In a paper published in IEEE Micro last year, Google senior vice president of operations Urs Hölzle said that chips that spread workloads across more energy-efficient but slower cores may not be preferable to processors with faster but power-hungry cores.
"So why doesn’t everyone want wimpy-core systems?" Hölzle writes. "Because in many corners of the real world, they’re prohibited by law – Amdahl’s law. Even though many Internet services benefit from seemingly unbounded request- and data-level parallelism, such systems aren’t above the law," he said. "As the number of parallel threads increases, reducing serialization and communication overheads can become increasingly difficult. In a limit case, the amount of inherently serial work performed on behalf of a user request by slow single-threaded cores will dominate overall execution time."
Hölzle also pointed out that if you switch to a massively multicore setup, you have to rejigger your software to take advantage of extreme parallelism. "Wimpy-core systems can require applications to be explicitly parallelized or otherwise optimized for acceptable performance. For example, suppose a Web service runs with a latency of one second per user request, half of it caused by serial CPU time. If we switch to wimpy-core servers, whose single-threaded performance is three times slower, the response time doubles to two seconds and developers might have to spend a substantial amount of effort to optimize the code to get back to the one-second latency."
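Hölzle's arithmetic checks out if you assume that the half of the request that isn't serial CPU time takes just as long on the wimpy box. Here is a back-of-the-envelope sketch in Python with that assumption spelled out; the function and parameter names are ours, not his.

```python
def wimpy_latency(total_latency_s: float,
                  serial_cpu_fraction: float,
                  slowdown: float) -> float:
    """Estimate request latency after moving to slower cores.

    Assumption (ours): only the serial CPU portion is stretched by the
    slowdown factor; the rest of the request is unchanged.
    """
    serial_cpu_s = total_latency_s * serial_cpu_fraction
    everything_else_s = total_latency_s - serial_cpu_s
    return everything_else_s + serial_cpu_s * slowdown


if __name__ == "__main__":
    # Figures from the quoted example: a one-second request, half of it
    # serial CPU time, on cores three times slower per thread.
    print(wimpy_latency(1.0, 0.5, 3.0))
```

The half-second of serial CPU work stretches to 1.5 seconds, the other half-second stays put, and the total lands on the two seconds Hölzle quotes.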
The need to rework software may be holding Facebook back as well, but we get the feeling it will only hold the company back for so long. ®