Nagoya University is the latest academic institution in Japan to take a slice of the K supercomputer design and put it on its campus, where it can run applications that could, in theory, scale up to the monstrous 10.51 petaflops box.
And in an interesting twist, the new machine is a hybrid Sparc64-Xeon-Xeon Phi cluster that will eventually push up into the multiple-petaflops performance level.
Remind me what a K supercomputer is
The number-crunching power of the K super is not what makes it monstrous, at least not compared to hybrid supers marrying GPU or x86 coprocessors to CPUs to goose their performance.
The K machine was built by Fujitsu for the Japanese government and is housed at the Rikagaku Kenkyusho (Riken) research lab in Kobe, Japan. The machine, which was ranked at the pinnacle of the Top500 supercomputer rankings for a short time, uses Fujitsu's eight-core, 2GHz Sparc64-VIIIfx processors, linked together by the Tofu 6D mesh/torus interconnect. The K super has 22,032 blade servers, and this, not its aggregate floating point performance, is what makes it a monster.
What is the university building?
The new Nagoya machine will use a PrimeHPC FX10, an upgraded K-compatible box built on sixteen-core Sparc64-IXfx processors. The FX10 scales up to 1,024 racks and a maximum of 23 petaflops; it has not yet been upgraded to use the sixteen-core "Athena" Sparc64-X processor, which is used in the Sparc M series commercial server line from Fujitsu.
These Athena machines were launched in January in Japan, and in April Oracle also decided to resell the boxes to its commercial customers. Oracle has shown little interest in the traditional HPC market and does not resell the PrimeHPC FX10 machines or clusters based on Fujitsu's Primergy Xeon boxes, either.
The Nagoya deal is good for Fujitsu, which needs to recoup the substantial investments it has made in Sparc64 processors and the Tofu interconnect.
The wonder, of course, is why Fujitsu has not created Xeon blade servers that can plug into the Tofu interconnect. This could be particularly useful for certain kinds of workloads that run well on a torus. It would be very interesting to see Fujitsu plug Tofu into PCI-Express 3.0 ports and thus break it free from a tight link with the Sparc64 processors.
Cray broke free of the Opteron's HyperTransport point-to-point interconnect with the "Aries" Dragonfly interconnect, giving it better options to lash together CPUs, FPGAs, GPU coprocessors like Nvidia's Tesla cards, and x86 coprocessors like Intel's Xeon Phi cards.
What's in the big red box?
The initial Nagoya hybrid supercomputer being built by Fujitsu will have 384 PrimeHPC FX10 server nodes, which works out to 6,144 cores and around 90 teraflops of floating point crunching. This FX10 box is linked to a cluster of Primergy CX400 tray servers that feature two-socket Xeon E5 nodes – a total of 552 of them.
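For those who like to check the sums, here is a quick back-of-the-envelope sketch of the FX10 partition. It uses only the figures quoted above; the per-node and per-core numbers are derived from the rounded "around 90 teraflops" figure and are not vendor specifications, and the variable names are ours.

```python
# Figures quoted in the article. The per-node and per-core values below are
# derived from a rounded aggregate, so treat them as estimates, not specs.
FX10_NODES = 384
CORES_PER_NODE = 16        # one sixteen-core Sparc64 chip per node
FX10_PEAK_TFLOPS = 90.0    # "around 90 teraflops"

total_cores = FX10_NODES * CORES_PER_NODE
gflops_per_node = FX10_PEAK_TFLOPS * 1000 / FX10_NODES
gflops_per_core = gflops_per_node / CORES_PER_NODE

print(f"cores: {total_cores}")
print(f"implied peak per node: {gflops_per_node:.0f} gigaflops")
print(f"implied peak per core: {gflops_per_core:.1f} gigaflops")
```

That works out to the 6,144 cores cited, at roughly 234 gigaflops per node and a bit under 15 gigaflops per core.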
The CX400 setup allows for up to 84 nodes to be crammed into a rack. And pushing the hybrid nature up another notch, 184 of the CX400 nodes have an Intel Xeon Phi coprocessor, each delivering about a teraflops of double-precision math.
All told, this initial hybrid machine is rated at 561.4 teraflops of aggregate peak theoretical performance across its three computing units. And, incidentally, some of the Xeon nodes are running ScaleMP's vSMP aggregation hypervisor, turning them into a virtual SMP box for running large, shared-memory applications.
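The 561.4 teraflops total can be roughly split across the three computing units. The sketch below assumes the rounded figures given above (about 90 teraflops for the FX10 and about one teraflops per Xeon Phi) and attributes whatever is left to the Xeon E5 nodes, so the Xeon share is an estimate rather than a number from Fujitsu or Nagoya.

```python
# Quoted figures; the Sparc and Phi values are rounded in the article,
# so the Xeon remainder computed here is only an estimate.
TOTAL_PEAK_TFLOPS = 561.4
SPARC_TFLOPS = 90.0        # FX10 partition, "around 90 teraflops"
XEON_NODES = 552
PHI_CARDS = 184
TFLOPS_PER_PHI = 1.0       # "about a teraflops" each

phi_tflops = PHI_CARDS * TFLOPS_PER_PHI
xeon_tflops = TOTAL_PEAK_TFLOPS - SPARC_TFLOPS - phi_tflops
xeon_tflops_per_node = xeon_tflops / XEON_NODES

for name, tflops in [("Sparc64", SPARC_TFLOPS),
                     ("Xeon Phi", phi_tflops),
                     ("Xeon E5", xeon_tflops)]:
    share = 100 * tflops / TOTAL_PEAK_TFLOPS
    print(f"{name:9s} {tflops:6.1f} teraflops ({share:.0f}% of peak)")
print(f"implied peak per Xeon node: {xeon_tflops_per_node:.2f} teraflops")
```

On these assumptions the Xeon E5 nodes account for roughly 287 teraflops, or about half a teraflops per two-socket node, with the Sparc64 side contributing well under a fifth of the peak.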
This is not a particularly powerful machine by modern standards, but Nagoya says it plans to upgrade the hybrid machine to 3.66 petaflops of compute capacity.
The university did not say how it would accomplish this, but putting Xeon Phi coprocessors on all of the Xeon nodes would only add another 368 teraflops, so that isn't it.
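The gap is easy to quantify. This sketch uses the quoted node counts and assumes, as above, about one teraflops per Xeon Phi card; the names are illustrative.

```python
# How far would filling every Xeon node with a Xeon Phi get Nagoya?
CURRENT_PEAK_TFLOPS = 561.4
TARGET_PEAK_TFLOPS = 3660.0     # 3.66 petaflops
XEON_NODES = 552
NODES_WITH_PHI = 184
TFLOPS_PER_PHI = 1.0            # rounded, "about a teraflops" per card

nodes_without_phi = XEON_NODES - NODES_WITH_PHI
added_tflops = nodes_without_phi * TFLOPS_PER_PHI
peak_after_phi = CURRENT_PEAK_TFLOPS + added_tflops
shortfall_tflops = TARGET_PEAK_TFLOPS - peak_after_phi

print(f"extra Phi cards: {nodes_without_phi}")
print(f"peak after adding them: {peak_after_phi:.1f} teraflops")
print(f"still short of target by: {shortfall_tflops:.1f} teraflops")
print(f"fraction of target reached: {peak_after_phi / TARGET_PEAK_TFLOPS:.0%}")
```

Even fully loaded with Phi cards, the machine would sit at about 929 teraflops, roughly a quarter of the target, leaving some 2.7 petaflops to come from somewhere else.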
Nagoya has installed several parallel machines based on Fujitsu M9000 big iron boxes and its FX1 single-socket servers as well as Opteron-based clusters in the past, and it stands to reason that the future system it installs will continue to be hybrid, spreading work across Sparc, Xeon, and Xeon Phi nodes.
In effect, the cheap Xeon Phi FLOPS mean Nagoya can afford to indulge in relatively expensive Sparc64 FLOPS and not have to port some of its older applications. But then again, the rest of the box could be built with mostly Xeon and Xeon Phi chips with really large Sparc jobs being pushed out to the K-compatible super. This is probably the most cost-effective tactic provided there is spare capacity on the K machine.
The Nagoya hybrid super will use Fujitsu's Technical Computing Suite, a set of compilers and cluster libraries, to run applications and will also put the Fujitsu Exabyte File System (FEFS), a variant of the Lustre parallel file system announced in November 2011, on a 6PB storage cluster to feed data to the computing beast. ®