SC07 One topic - even more so than cheap shrimp - dominated this year's Supercomputing conference in Reno: Accelerators.
The server and chip industries have felt the rise of the accelerator coming for some time. Last year's conference, for example, had the hardware heads pitted against the coders. The software set wondered if enough developers would ever exist to write custom code for tricky FPGAs and GPGPUs (graphics processors pressed into general purpose computing). The hardware crowd countered such skepticism by forcing pitches about their super silicon down the throats of anyone who showed even a faint sign of interest in the technology. Meanwhile, customers in the financial services, oil and gas, and media markets hoped these two sides would work out their differences. They want dramatic performance gains, and they want them now.
Reno's monument to cheap shrimp
Much of the reticence around server accelerators vanished at Supercomputing '07. Yes, the software folks still have their concerns. And, yes, grizzled veterans complain that they've seen accelerator fads come and go. Ultimately, however, it's clear to us that enough big vendors, ISVs, start-ups and customers have coalesced around the accelerator idea to push the technology forward in a profound way.
So, let's take a look at a number of the accelerator options out there from both the hardware and software sides and try to see where things stand.
ClearSpeed - The Floater
ClearSpeed is one of the more mature accelerator players in the server market, having specialized in speeding up floating point operations for some time. The start-up enjoys strong ties to companies such as HP and Sun Microsystems and has its accelerator cards sitting in a couple of the world's top supercomputers.
At Supercomputing, ClearSpeed rolled out a new, complete accelerator system that complements the company's X620 and e620 cards, which plug into PCI-X and PCIe slots respectively. The CATS (ClearSpeed Accelerated Terascale System) unit takes up 1U of rack space and can reach up to a teraflop.
ClearSpeed manages to cram 12 of the e620 cards into the CATS box, leaving a system that should cost around $70,000. For the moment, ClearSpeed will sell the CATS units in limited volumes. The company hopes to see larger OEMs offer the product during 2008.
During a demonstration, ClearSpeed linked 12 of the CATS units with an HP ProLiant DL360 server. It then sent all 144 of its 96-core CSX600 chips at the quantum chemistry code Molpro and reached 11.6 Teraflops while consuming about 6.6kW of power (550 watts per CATS node).
(ClearSpeed hits this kind of performance on the back of its CSX600 chip, which can sustain 25 gigaflops on 64-bit floating point work. Two of the chips sit on each e620 card.)
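For readers who want to check the sums, here is a back-of-the-envelope sketch that uses only the figures quoted above. The constant and function names are ours, made up for illustration, and the "implied" per-card and per-chip rates are simple divisions of the quoted 11.6 teraflops, not numbers supplied by ClearSpeed.

```python
# Figures quoted in the article for the Supercomputing '07 demo.
CATS_UNITS = 12                  # CATS boxes linked to the HP ProLiant DL360
CARDS_PER_UNIT = 12              # e620 cards per CATS box
CHIPS_PER_CARD = 2               # CSX600 chips per e620 card
WATTS_PER_UNIT = 550             # quoted power draw per CATS node
DEMO_GFLOPS = 11_600             # quoted 11.6 teraflops on Molpro
SUSTAINED_GFLOPS_PER_CHIP = 25   # quoted sustained 64-bit rate per CSX600


def cats_demo_summary() -> dict:
    """Derive totals and ratios from the quoted demo figures."""
    cards = CATS_UNITS * CARDS_PER_UNIT
    chips = cards * CHIPS_PER_CARD
    power_watts = CATS_UNITS * WATTS_PER_UNIT
    return {
        "cards": cards,
        "chips": chips,
        "power_kw": power_watts / 1000,
        "gflops_per_watt": DEMO_GFLOPS / power_watts,
        # What each card or chip would need to deliver to hit the quoted total.
        "implied_gflops_per_card": DEMO_GFLOPS / cards,
        "implied_gflops_per_chip": DEMO_GFLOPS / chips,
        # Total if every chip ran at the quoted sustained rate.
        "sustained_estimate_tflops": chips * SUSTAINED_GFLOPS_PER_CHIP / 1000,
    }


if __name__ == "__main__":
    for name, value in cats_demo_summary().items():
        print(f"{name}: {value:.2f}")
```

Run as written, the power figure comes out at 6.6kW, matching the text, and the demo works out to roughly 1.76 gigaflops per watt. The count of 144 matches the number of cards (12 x 12). At two chips per card, the chip count would be 288. The implied per-chip rate of about 40 gigaflops also sits above the 25 gigaflops sustained figure, which suggests the 11.6 teraflops number is closer to a peak rating than a sustained one. That last point is our inference from the arithmetic, not something ClearSpeed stated.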
Most of ClearSpeed's critics whine about the amount of software work that needs to be done to port a floating point-heavy application over to these custom chips. ClearSpeed has worked hard to rebut these claims.
For one, it notes that a number of applications such as Matlab and Mathematica can run on the CSX600 chips without any changes to the underlying code. That's thanks to joint work by ClearSpeed and the software makers, and to the presence of friendly ClearSpeed libraries.
ClearSpeed also continues to work on its core software package, which is meant to ease the porting process for customers and partners. It has released a beta of Version 3.0 that includes support for the 64-bit versions of Red Hat Enterprise Linux 5 and SUSE Linux Enterprise Server 10. In addition, customers will find support for a wider range of BLAS and LAPACK functions, new library functions and a preview of an Eclipse IDE.
Another nice feature in the ClearSpeed software is a set of graphical tools that highlight the spots in code that would benefit from acceleration.
It's easy for rivals to relegate ClearSpeed to the floating point niche and say that its hardware is a pain, but growing mainstream acceptance should make it tougher for end users to ignore the company. HP has folded ClearSpeed into its accelerator helper program, while Sun relies on ClearSpeed for some of its most prominent high performance computing wins.
ClearSpeed's story should get stronger in the coming months when it releases revamped silicon that improves overall performance and when it takes care of a lingering denormalized operands issue.
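ClearSpeed has not spelled out here what the denormalized operands issue involves, so the following is only background on the term. Denormalized (or subnormal) numbers are the very small IEEE 754 values that sit between zero and the smallest normal floating point number. The sketch below shows them in ordinary 64-bit Python floats on a typical CPU. The helper name is_subnormal is ours, and nothing here models the CSX600 itself.

```python
import sys

# Smallest positive *normal* double, 2**-1022.
smallest_normal = sys.float_info.min

# Halving it drops below the normal range and yields a subnormal, 2**-1023.
subnormal = smallest_normal / 2

# Smallest positive subnormal double, 2**-1074.
smallest_subnormal = 5e-324


def is_subnormal(x: float) -> bool:
    """True if x is nonzero but smaller in magnitude than any normal double."""
    return x != 0.0 and abs(x) < sys.float_info.min


if __name__ == "__main__":
    print(is_subnormal(smallest_normal))    # a normal value
    print(is_subnormal(subnormal))          # a subnormal value
    print(smallest_subnormal / 2 == 0.0)    # below this, results underflow to zero
```

Hardware that handles these values differently from a general purpose CPU, for instance by treating them as zero, can return slightly different answers on codes that work with extremely small magnitudes, which is why the topic comes up when floating point applications are ported.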
The Tesla Foil
Not to be outdone by some upstart, Nvidia has been hammering away at its accelerator play too.
While at Supercomputing, Nvidia found time to hype up Version 1.1 of CUDA and its Tesla acceleration systems.