Company you never heard of builds 3.4 petaflops super for DOE
The Wizards of Phi follow the yellow brick road to Pacific Northwest
Nature abhors a vacuum as well as an oligopoly, which is why upstart supercomputer maker Atipa Technologies may find itself having an easier time getting its foot into the data center door now that Cray has eaten supercomputer-maker Appro International.
The company you've never heard of is Atipa, a division of PC and server maker Microtech Computers, based in Lawrence, Kansas. Microtech was founded in 1986 to build PCs and servers for the local Midwest market, and it was one of many PC makers who made a decent living peddling to local businesses before the industry massively consolidated. The company did so well in the pre-consolidation era that it expanded across the US up until the 1990s.
Atipa Linux Solutions was founded in 1994 as a Linux workstation and server distributor and graduated into building Linux clusters. The company raised $30m in funding, but was hurt by the recession in the wake of the dot-com bust and was snapped up by Microtech in 2001.
In 2002, Atipa built its first capacity-class supercomputer cluster for Louisiana State University, a machine that ranked number 17 on the Top500 supercomputer list that is compiled twice a year. Since then, 14 of Atipa's machines have been on the Top500 list and two on the Green500 list, a different ranking that uses performance per watt rather than the system's raw floating point computing power (as measured by the LINPACK benchmark) as the ranking metric.
Atipa was able to close the deal with the US Department of Energy (DOE) to build the Higher Performance Computer System-4 (HPCS-4) supercomputer at the DOE's Pacific Northwest National Laboratory, which is located in Richland in southeast Washington state. The bidding for the HPCS-4 project at PNNL was competitive, and you can read all about it here, because the US government publishes docs on this stuff.
The DOE runs all the big supercomputer centers outside of the military and spooks in the US, and it is focused on scientific research related to energy production and consumption in its many forms. The DOE's Hanford site, which used to be run by General Electric and which still makes nuclear fuel, is nearby.
The HPCS-4 deal is actually being broken into two parts. Atipa has won the first part, called HPCS-4A, which will be installed in July of this year. The second phase will be specified and sent out for bids in 2014 with delivery of an upgrade to the system sometime in 2015.
The PNNL data center awaiting the unnamed ceepie-phibie supercomputer to be made by Atipa. (Source: PNNL)
The HPCS-4A machine has not yet been named - PNNL is planning to run a contest to do so - but the specs are ironed out. The machine has to run Linux, and according to Dan Mantyla, an HPC programmer at Atipa who spoke to El Reg about the PNNL box, it will ship with the CentOS clone of Red Hat Enterprise Linux on its nodes. It will then be switched to Scientific Linux, the Tri-Labs DOE variant of Red Hat Enterprise Linux that is used on the existing "Chinook" Xeon cluster that HP built for PNNL a few years back, and that this Atipa machine will replace.
The most important thing for PNNL is that a single job running in its NWChem computational chemistry application can scale across at least 75 per cent of the compute capacity of the machine and do so with 97.5 per cent uptime across a year. The lab also had a 2.5 megawatt power budget for the data center and a 2 megawatt power limit on the HPCS-4A machine itself.
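To put those requirements in more concrete terms, here is a back-of-the-envelope sketch in Python. It assumes the 97.5 per cent uptime target is measured over a 365-day year of wall-clock hours, which is our reading and not necessarily how the bid documents define it. The function names are ours, purely for illustration.

```python
# Rough sketch: what the PNNL uptime and power requirements work out to.
# Assumption (ours): uptime is measured against a 365-day year of wall-clock hours.

HOURS_PER_YEAR = 365 * 24  # non-leap year


def allowed_downtime_hours(uptime_fraction, hours=HOURS_PER_YEAR):
    """Hours per year the machine may be unavailable and still meet the target."""
    return (1.0 - uptime_fraction) * hours


def power_headroom_mw(data_center_mw, machine_mw):
    """Power left in the data center budget once the machine hits its limit."""
    return data_center_mw - machine_mw


if __name__ == "__main__":
    print(f"Allowed downtime: {allowed_downtime_hours(0.975):.0f} hours per year")
    print(f"Power headroom:   {power_headroom_mw(2.5, 2.0):.1f} MW")
```

Under that assumption, the uptime target leaves a little over nine days a year for outages and maintenance, and the data center keeps half a megawatt for everything that is not the supercomputer.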
Atipa is not an unknown to PNNL: the lab tapped the Kansan cluster maker for its "Olympus" supercomputer cluster, which is ranked number 295 on the November 2012 Top500 list. That machine has 19,200 cores and is based on two-socket servers using sixteen-core Opteron 6272 processors running at 2.1GHz. The Olympus machine has a peak theoretical performance of 161.3 teraflops and delivers 102.2 teraflops on the Linpack Fortran benchmark test used to rank machines on the Top500 list.
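The Olympus numbers can be sanity-checked with a few lines of Python. The core count, clock speed, and Linpack result come from the figures above. The four double-precision flops per core per clock is our assumption about the Opteron 6272, not something Atipa or PNNL supplied.

```python
# Rough sketch: Olympus peak performance and Linpack efficiency.
OLYMPUS_CORES = 19200
OLYMPUS_GHZ = 2.1
OLYMPUS_LINPACK_TFLOPS = 102.2

# Assumption (ours): peak double-precision flops per core per clock cycle.
OPTERON_FLOPS_PER_CYCLE = 4


def peak_tflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in teraflops: cores x clock (GHz) x flops per cycle."""
    return cores * ghz * flops_per_cycle / 1000.0


def linpack_efficiency(sustained_tflops, peak):
    """Fraction of theoretical peak actually delivered on Linpack."""
    return sustained_tflops / peak


if __name__ == "__main__":
    peak = peak_tflops(OLYMPUS_CORES, OLYMPUS_GHZ, OPTERON_FLOPS_PER_CYCLE)
    eff = linpack_efficiency(OLYMPUS_LINPACK_TFLOPS, peak)
    print(f"Peak: {peak:.1f} teraflops, Linpack efficiency: {eff:.1%}")
```

With that assumption the peak lands on the quoted 161.3 teraflops, and Olympus turns a bit under two thirds of it into sustained Linpack performance.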
Chinook is based on earlier quad-core Opteron processors and has 18,176 cores delivering 97.1 teraflops sustained performance on Linpack. It is a cluster of HP ProLiant DL servers.
The HPCS-4A machine is supposed to pack a lot more wallop into a small thermal envelope, and by definition that means using GPU coprocessors from Nvidia or AMD or x86 coprocessors from Intel to boost the performance.
Atipa pitched the combination of Intel's Xeon E5 processors and its passively cooled Xeon Phi 5110P coprocessors to meet the performance goals, and estimates that the machine it will build after winning the bid will deliver 3.4 petaflops of aggregate peak performance, which is more than a factor of 21 boost over the Chinook system.
HPCS-4A will pair two eight-core Xeon E5-2670 processors running at 2.6GHz with two of Intel's Xeon Phi 5110P coprocessors, which were announced last November and which have 60 cores running at 1.05GHz that the Xeon can offload calculations to.
The Xeon Phi has 8GB of GDDR5 local memory and each server node has 128GB of main memory that can be shared by the CPUs and coprocessors. The HPCS-4A machine has a total of 1,440 nodes across 42 computer racks, and if you add it all up, there are 23,040 cores on the Xeon side and 172,800 cores on the Xeon Phi side, which yields 195,840 cores in total. That is 10.8 times the core count of the Chinook machine, but it delivers 21 times the raw floating point oomph.
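For anyone who wants to add it all up themselves, here is a rough Python sketch. The node, socket, and core counts and the clock speeds are the ones quoted above. The flops-per-cycle figures (8 for the Xeon E5, 16 for the Xeon Phi) are our assumptions about peak double-precision throughput per core, not numbers from Atipa or PNNL.

```python
# Rough sketch: HPCS-4A core counts and aggregate peak performance.
NODES = 1440
XEON_SOCKETS_PER_NODE = 2
XEON_CORES_PER_SOCKET = 8
XEON_GHZ = 2.6
PHI_CARDS_PER_NODE = 2
PHI_CORES_PER_CARD = 60
PHI_GHZ = 1.05
CHINOOK_CORES = 18176

# Assumptions (ours): peak double-precision flops per core per clock cycle.
XEON_FLOPS_PER_CYCLE = 8
PHI_FLOPS_PER_CYCLE = 16


def core_counts():
    """Return (Xeon cores, Xeon Phi cores, total cores) for the whole machine."""
    xeon = NODES * XEON_SOCKETS_PER_NODE * XEON_CORES_PER_SOCKET
    phi = NODES * PHI_CARDS_PER_NODE * PHI_CORES_PER_CARD
    return xeon, phi, xeon + phi


def peak_petaflops():
    """Aggregate theoretical peak in petaflops under the assumptions above."""
    xeon_cores, phi_cores, _ = core_counts()
    xeon_gflops = xeon_cores * XEON_GHZ * XEON_FLOPS_PER_CYCLE
    phi_gflops = phi_cores * PHI_GHZ * PHI_FLOPS_PER_CYCLE
    return (xeon_gflops + phi_gflops) / 1e6


if __name__ == "__main__":
    xeon, phi, total = core_counts()
    print(f"Xeon cores: {xeon:,}  Xeon Phi cores: {phi:,}  total: {total:,}")
    print(f"Core ratio vs Chinook: {total / CHINOOK_CORES:.1f}x")
    print(f"Aggregate peak: {peak_petaflops():.2f} petaflops")
```

Under those assumptions the Xeon Phi cards supply roughly six sevenths of the peak flops, which is why porting codes to the coprocessors matters so much for this machine.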
The nodes in the HPCS-4A system are linked together with 56Gb/sec FDR InfiniBand switches from Mellanox Technologies, which beat out Intel's QLogic unit for the deal. Super Micro is the motherboard supplier for the system, and the servers also take FDR InfiniBand mezzanine adapter cards from Mellanox. The Chinook machine used 20Gb/sec DDR InfiniBand networking, and Olympus uses 40Gb/sec QDR InfiniBand switching.
An Atipa tech builds a supercomputer cluster in Kansas
Here's the amazing bit. Including a 2.7PB shared file system that will be delivered by DataDirect Networks, a specialist supercomputing storage supplier, the HPCS-4A system will cost the DOE's Office of Science a mere $17m. And that is for a memory-heavy cluster, which sports about four times as much main memory per node as you see in a typical HPC cluster these days.
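A quick Python sketch shows why that price raises eyebrows. It uses the $17m price tag, the 3.4 petaflops peak estimate, and the 1,440 nodes with 128GB apiece quoted in this story. Because the price also covers the file system, the cost per teraflops is an upper bound on what the compute alone costs, and it is measured against peak rather than sustained performance.

```python
# Rough sketch: price/performance and total main memory for HPCS-4A.
SYSTEM_COST_USD = 17_000_000   # includes the 2.7PB file system
PEAK_TFLOPS = 3400             # 3.4 petaflops aggregate peak (Atipa's estimate)
NODES = 1440
MEMORY_PER_NODE_GB = 128


def cost_per_peak_teraflops(cost_usd, peak_tflops):
    """Dollars per peak teraflops; an upper bound, since storage is in the price."""
    return cost_usd / peak_tflops


def total_memory_gb(nodes, memory_per_node_gb):
    """Aggregate main memory across all nodes, excluding Xeon Phi card memory."""
    return nodes * memory_per_node_gb


if __name__ == "__main__":
    usd_per_tf = cost_per_peak_teraflops(SYSTEM_COST_USD, PEAK_TFLOPS)
    print(f"Cost per peak teraflops: ${usd_per_tf:,.0f}")
    print(f"Total main memory: {total_memory_gb(NODES, MEMORY_PER_NODE_GB):,} GB")
```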
The parts for the HPCS-4A system will start showing up this July, and the plan is to have it up and running by October. That is just in time to get onto the November 2013 Top500 list, where the cluster should break into the top twenty machines. About 400 scientists from all over the country use the Chinook system, and they will be trying to figure out how to port their codes to run in hybrid ceepie-phibie mode. ®