Fujitsu parades 16-core Sparc64 super stunner

Top of the FLOPS

SC11 Ahead of the SC11 supercomputer conference in Seattle last week, recently awakened supercomputing giant Fujitsu rolled out the kicker: a commercialized version of the K supercomputer that is at the top of the flops charts in the world right now.

A whole lot of details on the Sparc64-IXfx processor and the PrimeHPC FX10 systems were missing, but El Reg has chased them down just as Fujitsu has announced its first paying customer for the FX10 machines.

The K supercomputer is the first machine in the world to break through the 10 petaflops performance barrier as gauged by the Linpack Fortran benchmark test. It was built by Fujitsu for the Japanese government and is installed at the Rikagaku Kenkyusho (RIKEN) research lab in Kobe, Japan.

The K super is based on the "Venus" Sparc64-VIIIfx processor designed by Fujitsu and fabbed by Taiwan Semiconductor Manufacturing Corp. The eight-core Venus chip clocks at 2GHz and delivers 128 gigaflops per chip, has a thermal efficiency of around 2.2 gigaflops per watt, and dissipates around 58 watts.

Some nodes of Fujitsu's PrimeHPC FX10 supercomputer

The K super has 22,032 four-socket blade servers fitted into 864 server racks to bring 705,024 cores to bear on parallel computation jobs. The K machine delivered 10.51 petaflops of sustained performance on the Linpack test, which is 93.2 per cent efficiency as lined up against its peak theoretical performance of 11.28 petaflops. The Torus Fusion, or Tofu, 6D mesh/torus interconnect that Fujitsu has cooked up is no doubt one of the secret sauces in the K and FX10 supers.
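For anyone who wants to check the K numbers, here is a minimal back-of-envelope sketch using only the figures quoted above (blade count, four sockets per blade, eight cores and 128 gigaflops per chip, and the 10.51 petaflops Linpack result). The function and constant names are ours, purely for illustration.

```python
# Figures quoted in the article for the K supercomputer.
BLADES = 22_032
SOCKETS_PER_BLADE = 4
CORES_PER_CHIP = 8
GFLOPS_PER_CHIP = 128.0   # peak per Sparc64-VIIIfx chip
LINPACK_PFLOPS = 10.51    # sustained Linpack result


def k_totals():
    """Return (chips, cores, peak petaflops, Linpack efficiency)."""
    chips = BLADES * SOCKETS_PER_BLADE
    cores = chips * CORES_PER_CHIP
    peak_pflops = chips * GFLOPS_PER_CHIP / 1e6  # gigaflops -> petaflops
    efficiency = LINPACK_PFLOPS / peak_pflops
    return chips, cores, peak_pflops, efficiency


if __name__ == "__main__":
    chips, cores, peak, eff = k_totals()
    print(f"{chips:,} chips, {cores:,} cores")
    print(f"peak {peak:.2f} petaflops, efficiency {eff:.1%}")
```

The core count, the peak, and the efficiency all fall out of the blade count and the per-chip rating, which is a handy sanity check on the units.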

The PrimeHPC FX10 super uses double-stuffed 16-core Sparc64 processors, also designed by Fujitsu and fabbed by TSMC, and increases the rack count to 1,024.

Most of the feeds and speeds of the Sparc64-IXfx processor were not available two weeks ago when Fujitsu jumped the gun on the SC11 conference. We knew that the chip has 16 Sparc cores that run at 1.85GHz and delivers 236 gigaflops of double-precision floating point number crunching. Now we know what the chip looks like and some more stuff about it.

Fujitsu's Sparc64-IXfx processor

The Sparc64-IXfx chip has 85GB/sec of memory bandwidth and includes 12MB of L2 cache memory on the chip that is shared by all 16 of those cores. Fujitsu is not implementing a ring interconnect for those cores, as Intel is doing for future Xeon and Itanium processors, but rather is plunking a big L2 cache memory controller in the dead center of the chip and wrapping four banks of L2 cache memory around it. Two banks of cores are on the chip, top and bottom, with a DDR3 main memory controller implemented on each side of the L2 cache banks with memory interfaces out to the memory DIMMs.

The cores on the Sparc64-IXfx processor have 32KB of L1 data cache and 32KB of L1 instruction cache. The core has two integer units, two load/store units, and four floating point units that can execute two add or multiply instructions per clock. The chip can also allow a fat SIMD instruction to span two floating point units. The 16-core chip can do 128 floating point operations per clock, and at just a hair under 1.85GHz, you get 236 gigaflops peak theoretical performance.
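The peak figure is simple multiplication, sketched below. The eight flops per core per clock is derived from the 128 flops per clock quoted for 16 cores, and the 1.848GHz clock is our assumption for "a hair under 1.85GHz", not a number from Fujitsu.

```python
def peak_gflops(cores, flops_per_core_per_clock, clock_ghz):
    """Peak theoretical gigaflops: cores x flops/core/clock x clock in GHz."""
    return cores * flops_per_core_per_clock * clock_ghz


# 128 flops per clock across 16 cores works out to 8 per core.
FLOPS_PER_CORE_PER_CLOCK = 128 // 16

# Assumption: 1.848GHz stands in for "a hair under 1.85GHz".
ixfx = peak_gflops(16, FLOPS_PER_CORE_PER_CLOCK, 1.848)

# The same per-core rate reproduces the eight-core, 2GHz Venus chip's
# 128 gigaflops quoted earlier in the article.
venus = peak_gflops(8, FLOPS_PER_CORE_PER_CLOCK, 2.0)

if __name__ == "__main__":
    print(f"Sparc64-IXfx: {ixfx:.1f} gigaflops")
    print(f"Sparc64-VIIIfx: {venus:.1f} gigaflops")
```

At a full 1.85GHz the same sum gives 236.8 gigaflops, which is why the quoted 236 implies a clock slightly below that.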

The Sparc64-IXfx chip is implemented in a 40 nanometer process from TSMC and the die is nearly perfectly square at 21.9 millimeters by 22.1 millimeters. The chip has 1.87 billion transistors and 1,442 signal pins. During normal operations, Fujitsu says that the Sparc64-IXfx processor will burn about 110 watts.

At the top of the chip is an interface to the Tofu interconnect. Each processor socket in the K or FX10 machine has one of its own Tofu interconnect chips. This interconnect chip has a processor bus to link back to the Sparc64-IXfx processor and four Tofu network interfaces that handle packets coming off the processor and also provide remote direct memory access (RDMA), as InfiniBand does.

The interconnect chip has a Tofu barrier interface that handles collective operations, and a Tofu network router that has ten Tofu links. These links are used to hook the Tofu interconnect chips to up to ten other interconnect chips in the cluster, implementing the 6D mesh/torus when all the links are used.
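To see how ten links can add up to a 6D mesh/torus, here is a hedged sketch. The article does not give the axis sizes, so the layout below is an assumption for illustration only: three wraparound (torus) axes x, y and z of size 4, plus three short axes a, b and c of sizes 2, 3 and 2, where a and c are plain meshes and b wraps around. The `AXES` table and `neighbors` function are hypothetical names, not anything from Fujitsu.

```python
from itertools import product

# Assumed axes: (name, size, wraps_around). Sizes are illustrative only.
AXES = [
    ("x", 4, True),
    ("y", 4, True),
    ("z", 4, True),
    ("a", 2, False),  # mesh of two nodes: one link
    ("b", 3, True),   # ring of three nodes: two links
    ("c", 2, False),  # mesh of two nodes: one link
]


def neighbors(coord):
    """Return the distinct nodes one hop away from coord in the 6D mesh/torus."""
    found = set()
    for i, (_, size, wraps) in enumerate(AXES):
        for step in (-1, 1):
            value = coord[i] + step
            if wraps:
                value %= size          # torus axis: wrap around the ring
            elif not 0 <= value < size:
                continue               # mesh axis: no link past the edge
            candidate = coord[:i] + (value,) + coord[i + 1:]
            if candidate != coord:
                found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    origin = (0, 0, 0, 0, 0, 0)
    print(len(neighbors(origin)), "links from", origin)
    sizes = [size for _, size, _ in AXES]
    counts = {len(neighbors(c)) for c in product(*(range(s) for s in sizes))}
    print("link counts seen across all nodes:", counts)
```

Under these assumed sizes every node ends up with exactly ten neighbours: two on each of x, y and z, two on b, and one each on a and c.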

The interconnect chip also has a PCI-Express 2.0 peripheral controller for linking out to storage and other peripherals. The interconnect chip is implemented in a fairly ancient 65 nanometer process and runs at 312.5MHz, which is roughly one sixth the clock speed of the processor, and has ten bi-directional ports running at 5GB/sec, thus delivering a peak of 100GB/sec of switching capacity.
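The switching capacity and the clock ratio can be checked with a few lines. This sketch assumes the 5GB/sec figure is per direction per port, which is the reading that makes ten ports add up to 100GB/sec; the variable names are ours.

```python
PORTS = 10
GB_PER_SEC_PER_DIRECTION = 5   # assumption: 5GB/sec each way on every port
DIRECTIONS = 2                 # bi-directional links

INTERCONNECT_MHZ = 312.5
PROCESSOR_MHZ = 1850.0         # Sparc64-IXfx at roughly 1.85GHz

switching_gb_per_sec = PORTS * GB_PER_SEC_PER_DIRECTION * DIRECTIONS
clock_ratio = PROCESSOR_MHZ / INTERCONNECT_MHZ

if __name__ == "__main__":
    print(f"peak switching capacity: {switching_gb_per_sec}GB/sec")
    print(f"processor clock is {clock_ratio:.2f}x the interconnect clock")
```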

You have to think that Fujitsu wants to put the Tofu controller on the future Sparc64-Xfx processor, if there is such a thing. Or at least get it on the same chip package to further increase the density of the PrimeHPC clusters.

The PrimeHPC blade server with Tofu interconnect chips on the left

As with the K supers, there are four Sparc64-IXfx processors on each blade in the FX10 machine, with four matching Tofu interconnect chips. All eight chips on the blade are cooled with water blocks, which are attached to rear-door water jackets on the PrimeHPC racks.

The base PrimeHPC FX10 machine has 64 racks, as it turns out, and a loaded-up rack costs about ¥50m, or about $650,000 (£414,000). Those 64 racks have 6,144 compute nodes (four per blade) with 384TB of memory and 1.4 petaflops of peak number-crunching power; this configuration also has 384 I/O nodes, which have a total of 1,536 expansion slots.

This machine has about the same power efficiency as the K super, and burns 1.4 megawatts. A fully loaded 1,024-rack system would have 98,304 compute nodes, 6PB of main memory, and deliver 23 petaflops of oomph while burning 23 megawatts. Such a box would cost $655.4m at list price, but we're pretty sure Fujitsu will cut you a deal.
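The jump from the 64-rack base machine to the full 1,024-rack system is linear scaling, as the rough sketch below shows. The per-rack values are derived by dividing the 64-rack figures above by 64, and the 236.5 gigaflops per node is our assumption for the chip's "236 gigaflops" peak; `scale` is a hypothetical helper, not a Fujitsu sizing tool.

```python
# Per-rack values derived from the 64-rack base configuration.
NODES_PER_RACK = 6_144 // 64        # 96 compute nodes
MEMORY_TB_PER_RACK = 384 / 64       # 6TB
POWER_MW_PER_RACK = 1.4 / 64
GFLOPS_PER_NODE = 236.5             # assumption: one Sparc64-IXfx per node


def scale(racks):
    """Scale the base configuration linearly to the given rack count."""
    nodes = racks * NODES_PER_RACK
    return {
        "nodes": nodes,
        "memory_tb": racks * MEMORY_TB_PER_RACK,
        "peak_pflops": nodes * GFLOPS_PER_NODE / 1e6,
        "power_mw": racks * POWER_MW_PER_RACK,
    }


if __name__ == "__main__":
    for racks in (64, 1_024):
        print(racks, "racks:", scale(racks))
```

On this straight-line scaling the full system comes out at about 23.2 petaflops and 22.4 megawatts, close to the rounded figures quoted, and 6,144TB of memory is the 6PB in binary units.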

Fujitsu is ready to ship the PrimeHPC FX10 machines starting in January 2012, and the University of Tokyo's supercomputing division is the first customer to buy a PrimeHPC FX10 machine. The university is buying a 50-rack setup with 4,800 Sparc64-IXfx nodes with 150TB of memory and 1.13 petaflops of oomph. The FX10 machine at the University of Tokyo is front-ended by 16 Primergy RX200 S6 and 58 Primergy RX300 S6 servers that are being used as access controllers to the 1.13 petaflops monster.

The cluster is backed by 150 Eternus DX80 S2 RAID 5 storage arrays with 1.1PB of capacity, which are connected to the nodes directly, and 80 Eternus DX410 S2 arrays that are implemented using RAID 6 protection across their collective 2.1PB of capacity and shared by all nodes in the cluster.

The whole shebang runs the Fujitsu Exabyte File System, which also made its debut ahead of the SC11 show. FEFS is a variant of the open-source Lustre file system, and Fujitsu has committed to giving its enhancements to Lustre back to the community through a partnership with Whamcloud.

The latter company is offering third-party support for Lustre, which is technically controlled by Oracle since its acquisition of Sun Microsystems nearly two years ago. But Oracle doesn't care about HPC and therefore Whamcloud has forked Lustre and is offering support services to keep the big supercomputing labs of the world happy.

Fujitsu said it wanted to sell 50 of the PrimeHPC FX10 systems in the next three years, predominantly as a development machine for institutions that want to deploy applications on the K machine. One down, 49 to go. ®


Biting the hand that feeds IT © 1998–2022