Fujitsu takes trip to Venus
The octocore Sparc64
With Sun Microsystems and Oracle hogging all of the debate about the Sparc architecture these days, you can't blame Fujitsu for wanting to get a word in edgewise. So today, somewhere in Japan, Fujitsu reminded everyone that even though it's getting out of the chip manufacturing racket, it does have an eight-core Sparc64 chip in development.
Dubbed "Venus," this future Sparc64 processor was first talked about publicly back in August 2008, but since then, Fujitsu has said very little about it and Sun has given no indication that it was planning to use the processor in any of its future machines. (That doesn't mean there wasn't a plan. It just means that since the economic meltdown, neither Sun nor Fujitsu has talked seriously about its product roadmap and how it would be extended beyond about the middle of 2009.)
Anyway, according to a report from the Associated Press, Fujitsu has gotten a prototype of the Venus chip up and running and has successfully demonstrated that it can hit the 128 gigaflops of performance per chip, which a Fujitsu spokesperson said made it 2.5 times faster than the best thing that Intel can put into the field.
Of course, Intel can put its Nehalem EP chips into systems today, and the Venus chip will not be here for at least a year. So the comparison is not exactly fair. But then again, such comparisons rarely are in the IT world.
The Venus chip is, according to Fujitsu's presentations from last year, an eight-core Sparc V9-compatible processor with SIMD extensions aimed at boosting performance for parallel supercomputing workloads. The extensions are known collectively as HPC-ACE, apparently, and what they do isn't clear. The Venus chip will support DDR3 main memory and will include an on-chip memory controller, an L2 cache that is shared by all of the cores, and L1 data and instruction caches on each Sparc64 core.
The chip is implemented in Fujitsu's own 45 nanometer CMOS processes and is expected to deliver 128 gigaflops of number-crunching power per processor socket, and it's aimed at petascale supercomputing. To be even more precise, Fujitsu is trying to push up to the 10 petaflops level with Venus-based machines using general purpose processors.
This approach stands in stark contrast to the embedded processor approach that IBM uses in its BlueGene PowerPC-based supers or in the hybrid Opteron-Cell bladed supercomputers based on the "Roadrunner" design. Fujitsu says that it's giving priority to application migration (by which it presumably means compatibility), when neither BlueGene nor Roadrunner does (at least compared to earlier Power-based SMP server clusters).
The current Sparc64 VII chip, which is used in the Sparc Enterprise servers sold by Sun and Fujitsu and in the FX1 supercomputer clusters sold in Japan, is a quad-core chip that delivers 40 gigaflops of floating point performance across those four cores running at 2.5 GHz. The FX1 nodes are blade-like modules with processor and memory cards that are mounted in rack chassis. Late last year, the Japan Aerospace Exploration Agency took delivery of an FX1 super that had a fat Sparc Enterprise SMP server node and 3,392 of the single-socket FX1 blade nodes all ganged up using InfiniBand switches to deliver 135 teraflops of aggregate power. Fujitsu apparently made the InfiniBand switch itself. The top-end Sparc64 VII parts burn at 135 watts, which ain't exactly cool.
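The 135 teraflops figure can be sanity-checked with a few lines of Python. This is only a sketch: it assumes each FX1 blade node carries one 40-gigaflops Sparc64 VII chip and ignores the SMP server node, and the function name is invented for illustration.

```python
# Figures quoted in the article.
GFLOPS_PER_SPARC64_VII = 40   # quad-core chip running at 2.5 GHz
FX1_BLADE_NODES = 3_392       # blade nodes in the JAXA machine


def aggregate_teraflops(nodes: int, gflops_per_node: float) -> float:
    """Peak teraflops, assuming one chip per node (an assumption, not a spec)."""
    return nodes * gflops_per_node / 1_000


jaxa_peak_tf = aggregate_teraflops(FX1_BLADE_NODES, GFLOPS_PER_SPARC64_VII)
print(f"{jaxa_peak_tf:.2f} teraflops")
```

Under that assumption the blades alone account for the quoted aggregate, give or take rounding.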
To get to that 128 gigaflops level with Venus, Fujitsu could be adding lots of mathematical functions. Or it could be boosting clock speeds to 4 GHz or so. (The odds favor special instructions as well as a boosted clock speed and the doubling of the core count to get that extra performance.) The Sparc64 VII chip has 6 MB of shared L2 cache, and Fujitsu says it has cooked up a technology it calls hardware barrier synchronization (or sometimes "Impact"). It allows a four-core chip to look like a much faster single-core chip as far as compilers are concerned. This means programmers don't have to do as much work coping with application parallelization to get performance.
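For the curious, here is where a figure like 4 GHz comes from. The sketch assumes peak gigaflops is simply cores times clock times floating point operations per core per cycle, which is a simplification rather than anything Fujitsu has spelled out, and the function names are made up for illustration.

```python
def flops_per_core_per_cycle(gflops: float, cores: int, ghz: float) -> float:
    """Per-core, per-cycle floating point rate implied by a peak rating."""
    return gflops / (cores * ghz)


def clock_needed_ghz(gflops: float, cores: int, per_cycle: float) -> float:
    """Clock speed required to reach a peak rating at a given per-cycle rate."""
    return gflops / (cores * per_cycle)


# Sparc64 VII: 40 gigaflops across four cores at 2.5 GHz (from the article).
vii_rate = flops_per_core_per_cycle(40, 4, 2.5)

# Option 1: Venus keeps the same per-cycle rate and only raises the clock.
venus_clock = clock_needed_ghz(128, 8, vii_rate)

# Option 2: Venus stays at 2.5 GHz and widens the math units instead.
venus_rate_at_same_clock = flops_per_core_per_cycle(128, 8, 2.5)

print(vii_rate, venus_clock, venus_rate_at_same_clock)
```

On these assumptions, doubling the cores alone gets to only 80 gigaflops, so the rest has to come from clock speed, wider math units, or some blend of the two.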
As for the future supercomputers that will be based on the Venus chip, Fujitsu is cooking up its own variant of a 3D torus, switchless interconnect that can scale to over 100,000 nodes, which puts the machine at around 12.8 petaflops. That Venus supercomputer is expected to use about one-tenth the power per floating point instruction of the FX1 super clusters. You might be thinking that this must be a system-level measurement. If it were a chip-level metric, then an eight-core Venus chip would burn about 45 watts, and that just sounds way too low, right?
But the AP report says that the Venus chip has twice the number of transistors of the Sparc64 VII chip but has one-third the power consumption. The Venus supers will employ some form of liquid cooling, and they will therefore allow ten times the density of the FX1 machines.
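The arithmetic behind those numbers is easy to reproduce. The sketch below treats the one-tenth figure as if it were a chip-level watts-per-gigaflops ratio and assumes one Venus chip per node; both are assumptions made for illustration, and the variable names are invented.

```python
# Figures quoted in the article.
VII_WATTS, VII_GFLOPS = 135, 40   # top-end Sparc64 VII part
VENUS_GFLOPS = 128                # per Venus socket
NODES = 100_000                   # torus interconnect scale

# System peak, assuming one Venus chip per node.
peak_petaflops = NODES * VENUS_GFLOPS / 1_000_000

# Chip power if Venus needed one-tenth the watts per gigaflops of the VII.
vii_watts_per_gflops = VII_WATTS / VII_GFLOPS
venus_watts_from_tenth = vii_watts_per_gflops / 10 * VENUS_GFLOPS

# Chip power if Venus simply drew one-third of what the VII does.
venus_watts_from_third = VII_WATTS / 3

print(peak_petaflops, venus_watts_from_tenth, venus_watts_from_third)
```

Both routes land in the neighborhood of 45 watts per chip, which is why the chip-level reading of the power claim is at least self-consistent.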
As you might imagine, Fujitsu is relying heavily on the Japanese government for support for the development of the Venus chips and the related systems. The chip will, in fact, be deployed as part of something called the Japanese Next Generation Supercomputing project, which will mix scalar Venus systems manufactured by Fujitsu with a vector-based system co-created by NEC and Hitachi. This hybrid scalar/vector machine is supposed to hit 10 petaflops, and a prototype is expected to be in the field by the end of Fujitsu's fiscal 2010 year, which ends in March 2010. The full hybrid machine is expected to be operational by March 2011.
Fujitsu has not said how the Venus chip will be deployed inside of commercial Sparc Enterprise servers, and Sun has not said if it will use the chips. That decision will ultimately be made by Oracle now, not Sun, unless the $5.6bn acquisition of Sun by Oracle somehow unwinds. ®