Ahead of the SC11 supercomputing conference in Seattle next week, Japanese IT conglomerate Fujitsu says it's not only going to commercialize the K supercomputer that just busted through the 10 petaflops barrier, but that early next year it will double-stuff the design with a new Sparc64 chip, and sell it to entities other than the Japanese government.
With the launch of the Sparc64-IXfx processor, and the PrimeHPC FX10 machines that will use it, Fujitsu says that it is making a commitment to a "Human Centric Intelligent Society by contributing to a prosperous future for society and customers through the continued development of supercomputers." (Perhaps this sounded less weird in Japanese?)
Fujitsu did not release many details on the new Sparc64-IXfx processor that will be the brains of the massively parallel supercomputer, but did confirm that it has 16 cores and will run at 1.85GHz clock speeds. The chip is expected to do 236.5 gigaflops of double-precision math, and deliver more than 2 gigaflops per watt of performance at the chip level. If you do the math, that means the Sparc64-IXfx is burning at around 115 watts.
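For those who want to check that math, here is a back-of-the-envelope sketch using only the figures quoted above. The variable names are made up for illustration, and the 115 watt figure is the estimate from the text, not a number Fujitsu has published.

```python
# Figures quoted for the Sparc64-IXfx; names are illustrative only.
CORES = 16
CLOCK_GHZ = 1.85
PEAK_GFLOPS = 236.5
MIN_GFLOPS_PER_WATT = 2.0  # "more than 2 gigaflops per watt"
ESTIMATED_WATTS = 115      # the rough estimate given in the text

# Peak flops each core must retire per clock cycle to hit the quoted peak.
flops_per_cycle_per_core = PEAK_GFLOPS / (CORES * CLOCK_GHZ)

# At exactly 2 gigaflops per watt the chip would draw this much;
# "more than 2" means the real figure sits somewhere below it.
watts_ceiling = PEAK_GFLOPS / MIN_GFLOPS_PER_WATT

# Efficiency implied by the 115 watt estimate.
implied_gflops_per_watt = PEAK_GFLOPS / ESTIMATED_WATTS

print(f"flops per cycle per core: {flops_per_cycle_per_core:.2f}")
print(f"power ceiling: {watts_ceiling:.2f} watts")
print(f"efficiency at {ESTIMATED_WATTS} watts: {implied_gflops_per_watt:.2f} gigaflops/watt")
```

That works out to about 8 flops per cycle per core, a ceiling of roughly 118 watts, and a little under 2.06 gigaflops per watt if the chip really does land at 115 watts.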
The current K supercomputer, built for the Japanese government, is running at the Rikagaku Kenkyusho (Riken) research lab in Kobe, Japan. It is based on the "Venus" Sparc64-VIIIfx processor, which spins at 2GHz and delivers 128 gigaflops per chip, has a thermal efficiency of around 2.2 gigaflops per watt, and dissipates around 58 watts. The K super has 22,032 four-socket blade servers fitted into 864 server racks to bring 705,024 cores to bear on parallel computation jobs. The fully loaded K machine at Riken was just tested using the Linpack Fortran benchmark test and rated 10.51 petaflops, against a peak theoretical performance of 11.28 petaflops.
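The K numbers hang together if you multiply them out. The sketch below does so using only the figures in the paragraph above; the eight cores per chip falls out of the division rather than being stated, and the variable names are illustrative. Note that the peak lands in petaflops territory, in line with the 10 petaflops barrier the machine just broke.

```python
# Figures quoted for the K super; names are illustrative only.
BLADES = 22_032
SOCKETS_PER_BLADE = 4
TOTAL_CORES = 705_024
GFLOPS_PER_CHIP = 128
LINPACK_PFLOPS = 10.51

sockets = BLADES * SOCKETS_PER_BLADE
cores_per_chip = TOTAL_CORES // sockets

# Peak is chips times per-chip gigaflops; 1 petaflops = 1,000,000 gigaflops.
peak_pflops = sockets * GFLOPS_PER_CHIP / 1_000_000

# Share of theoretical peak actually delivered on Linpack.
linpack_efficiency = LINPACK_PFLOPS / peak_pflops

print(f"sockets: {sockets:,}")
print(f"cores per chip: {cores_per_chip}")
print(f"peak: {peak_pflops:.2f} petaflops")
print(f"Linpack efficiency: {linpack_efficiency:.1%}")
```

That is 88,128 sockets at eight cores apiece, a peak of 11.28 petaflops, and a Linpack run that captures a bit over 93 per cent of it.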
The PrimeHPC FX10 system that Fujitsu is gearing up to sell to commercial customers will presumably use the same quad-socket blades; we know it uses the same "Tofu" 6D mesh/torus interconnect that the K super uses. Fujitsu is doubling up the core count with the Sparc64-IXfx processors, but is running the chips slightly slower, too, so it needs to add cabinets and stretch the Tofu interconnect to double up the number-crunching oomph of the machine. And that is precisely what it has done.
The PrimeHPC FX10 machine will scale from 4 to 1,024 cabinets, sporting between 384 and 98,304 nodes. In the K architecture, each socket on the four-socket blade is a unique node in the cluster. This is also true for the FX10 super.
Each FX10 node can be equipped with either 32GB or 64GB of DDR3 main memory, and each node has 85GB/sec of memory bandwidth coming into the processor, and a 5GB/sec bi-directional Tofu link. A four-rack FX10 machine based on the 16-core Sparc64-IXfx will have 12TB of main memory and deliver around 90.8 teraflops of peak theoretical performance.
A fully loaded 1,024-rack FX10 will have 6.1PB of main memory across its 98,304 nodes and deliver a peak performance of 23.25 petaflops. So now Fujitsu, IBM, and Cray have all set their sights on breaking the 20 petaflops barrier on their way to the exascale heavens.
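The small and large FX10 configurations can be reproduced from the per-node figures. The sketch below is a rough check, not Fujitsu's own sizing: it assumes 96 nodes per cabinet (384 nodes over four cabinets), 32GB nodes in the small machine and 64GB nodes in the big one (the only combination that matches the quoted memory totals), and 1,024GB to the terabyte. The function and variable names are hypothetical.

```python
# Per-node figures quoted for the FX10; names are illustrative only.
NODES_PER_RACK = 384 // 4   # assumption: derived from 384 nodes in 4 cabinets
GFLOPS_PER_NODE = 236.5     # one Sparc64-IXfx per node
GB_PER_TB = 1024            # assumption: binary terabytes

def fx10_config(racks, gb_per_node):
    """Return (nodes, memory in TB, peak in teraflops) for a given size."""
    nodes = racks * NODES_PER_RACK
    memory_tb = nodes * gb_per_node / GB_PER_TB
    peak_tflops = nodes * GFLOPS_PER_NODE / 1000
    return nodes, memory_tb, peak_tflops

# Assumption: 32GB nodes in the small box, 64GB nodes in the big one.
small = fx10_config(racks=4, gb_per_node=32)
large = fx10_config(racks=1024, gb_per_node=64)

for label, (nodes, memory_tb, peak_tflops) in (("4 racks", small), ("1,024 racks", large)):
    print(f"{label}: {nodes:,} nodes, {memory_tb:,.0f}TB, {peak_tflops:,.1f} teraflops peak")
```

Under those assumptions the four-rack machine comes out at 384 nodes, 12TB, and about 90.8 teraflops, while the 1,024-rack monster comes out at 98,304 nodes, 6,144TB (the 6.1PB quoted), and roughly 23,249 teraflops, or 23.25 petaflops.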
Based on the specs, it looks like the FX10 will run quite a bit hotter than the K super because it has fatter chips. The thermals also suggest there hasn't been much of a process shrink (if any) from the 45 nanometer processes Fujitsu used in its own fabs to make the Sparc64-VIIIfx processors.
Fujitsu's long-term goal is to move its chip making operations to Taiwan Semiconductor Manufacturing Corp, and it could be that the Sparc64-IXfx has already been moved over to TSMC and is using its 40 nanometer process, the same one that Sparc partner Oracle is using to make the eight-core Sparc T4 processors.
However Fujitsu made the new Sparc64-IXfx processors, one thing is clear: The company is almost ready to go. Fujitsu says it will start selling the FX10 supers in January 2012 and expects to shift around 50 of these systems in the next three years. The idea is to allow companies to buy a computer that is compatible with the K super, so they can do application development for the K machine, as well as to let governments and corporations buy their own variant of the K machine for their own workloads.
A single rack of the FX10 will sell for ¥50m, or about $640,000. So if you have $656m lying around, you can buy a fully loaded FX10. And you could probably even get it for a little bit less than that, after discounting. Still, the FX10 runs Linux on Sparc, so you won't be able to play Crysis without doing a lot of hacking first. ®