Grid and cluster computing management software company Platform Computing has inked a distribution agreement with graphics chip maker nVidia. Under the deal, the HPC expert will bundle nVidia's CUDA programming environment with its cluster management tools and integrate it into those tools.
The move will allow Platform Computing's Load Sharing Facility (LSF), the backbone of its open and closed source products, to dispatch applications to nVidia's Tesla GPU co-processors much as it dispatches work to regular CPUs inside HPC clusters. Such clusters tend to be built using x64 processors with either Gigabit Ethernet or InfiniBand interconnects, except in the most exotic supercomputer labs.
Platform is integrating the CUDA programming environment with two of its products: its open source Platform Cluster Manager and HPC Workgroup Manager, which adds in LSF, a workload management and dispatching tool. Workgroup Manager is a bundle of tools aimed at companies with 32 nodes or fewer and has lower prices than the regular Cluster Manager plus LSF tools. Cluster Manager is distributed for free, but one year of support costs $150 per server node and three years of support costs $300 per node.
HPC Workgroup Manager includes LSF. A one-year support contract for the bundle on a two-socket server node costs $250, and a three-year contract costs $640. There is no incremental charge for the CUDA development kit or support for it.
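To put those per-node figures in cluster terms, here is a rough back-of-the-envelope sketch. It assumes support is billed strictly per node with no volume discounts, which the price list quoted above does not confirm, and the function and table names are made up for illustration.

```python
# Per-node support prices quoted in the article, in US dollars,
# keyed by product and then by contract length in years.
SUPPORT_PRICES = {
    "Cluster Manager": {1: 150, 3: 300},
    "HPC Workgroup Manager": {1: 250, 3: 640},
}


def support_cost(product, nodes, years):
    """Total support bill, ASSUMING cost scales linearly with node count."""
    return SUPPORT_PRICES[product][years] * nodes


def cost_per_node_year(product, years):
    """Effective annual cost of one node under a contract of the given length."""
    return SUPPORT_PRICES[product][years] / years


if __name__ == "__main__":
    nodes = 32  # the ceiling Workgroup Manager is aimed at
    for product in SUPPORT_PRICES:
        for years in (1, 3):
            total = support_cost(product, nodes, years)
            yearly = cost_per_node_year(product, years)
            print(f"{product}, {years}-year, {nodes} nodes: "
                  f"${total:,} (${yearly:.2f} per node per year)")
```

Under those assumptions, a full 32-node cluster would run $4,800 for one year of Cluster Manager support and $8,000 for one year of HPC Workgroup Manager support.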
The CUDA kit that Platform is distributing includes an extended C runtime that allows routines to be dispatched to the 240-core, 1 teraflop Tesla GPUs instead of to central processors. BLAS, FFT, and a number of other math libraries are also added to the kit. It also includes a GPU hardware debugger, profiling tools, and code samples for dozens of popular HPC algorithms.
HPC Workgroup Manager takes the open source cluster manager and adds LSF, its graphical workload management tool, and its own message-passing interface (MPI) clustering protocol stack, which it bought from Scali.
As I explained back in March, when supercomputer maker Penguin Computing launched its own CPU-GPU bundles, nVidia needs to get the CUDA environment supporting C++ and Fortran to make it truly useful for HPC customers. There are some interfaces, so C++ and Fortran applications can tickle the GPUs.
As of the spring, nVidia had shipped over 100 million CUDA-compatible GPUs, although most of them were probably not in supercomputer clusters. The nVidia Tesla GPU co-processors are supported in both Linux and Windows environments.
The LSF tool provisions and monitors workloads running on clusters, and can do so down to the CPU core level on server nodes and down to the GPU level on the co-processors. This is thanks to the integration of the CUDA development kit with the Platform cluster manager products.
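To illustrate what tracking resources at the core and GPU level means for a dispatcher, here is a toy first-fit placement sketch. It is not LSF's algorithm or API, neither of which is described here. The Node and Job classes and the dispatch function are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cores: int  # CPU cores not yet claimed by a job
    free_gpus: int   # GPU co-processors not yet claimed by a job


@dataclass
class Job:
    name: str
    cores: int = 1
    gpus: int = 0


def dispatch(jobs, nodes):
    """Toy first-fit placement (an assumption, not how LSF works):
    each job lands on the first node with enough free cores and GPUs."""
    placement = {}
    for job in jobs:
        for node in nodes:
            if node.free_cores >= job.cores and node.free_gpus >= job.gpus:
                node.free_cores -= job.cores
                node.free_gpus -= job.gpus
                placement[job.name] = node.name
                break
        else:
            placement[job.name] = None  # nothing fits, so the job stays pending
    return placement


if __name__ == "__main__":
    nodes = [Node("n1", free_cores=8, free_gpus=0),
             Node("n2", free_cores=8, free_gpus=2)]
    jobs = [Job("cpu_only", cores=4),
            Job("gpu_job", cores=2, gpus=1),
            Job("big_gpu", cores=1, gpus=2)]
    print(dispatch(jobs, nodes))
```

In this sketch the third job stays pending because only one GPU is left on the second node. A scheduler that counted only CPU cores would not catch that.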
Tripp Purvis, vice president of business development at Platform, hints that other co-processors may get similar support in the cluster management tools, including Advanced Micro Devices' Firestream, Intel's Larrabee, and IBM's Cell chips.
"As these technologies become more commercially available, we will pursue them," says Purvis.
Well, two out of three nVidia GPU alternatives mentioned above are technically available, but they are not exactly going mainstream like the Teslas seem to be doing. And Platform's support could, in fact, drive adoption of these alternatives and help foster a little competition.
AMD's Firestream SDK includes a programming language called Brook+, which is a Firestream-enabled version of the Brook open source C compiler.
IBM's Power-based Cell co-processors (technically known as the PowerXCell 8i chips and running at 3.2 GHz) have been around for a few years now. They are notably deployed on blades alongside and linked to Opteron blades in the 1.1 petaflops "Roadrunner" supercomputer installed at the U.S. Department of Energy's Los Alamos National Laboratory. This is the fastest supercomputer in the world, at the moment.
Cray is giving it a run for its money with its own XT5 Opteron clusters. IBM has its own multicore acceleration SDK that is packaged up with the Cell processors, and Platform could integrate this today if it wanted to, much as it has done with nVidia's CUDA. IBM only supports Red Hat Enterprise Linux 5.2 and Fedora 9 with its Cell SDK, however, which limits its appeal somewhat.
According to a report this week in SemiAccurate, Intel is expected to put the 32-core Larrabee chips into machines as graphics co-processors during the "Haswell" generation of chips in 2012. That's three generations away from the current Nehalem cores used in PCs and servers.
There is also talk that Larrabee will debut early next year in external graphics cards in a 32-core chip rated at around 2 teraflops of single-precision floating point performance, and hence could appear in GPU-based co-processors for HPC workloads.
No money is changing hands between Platform and nVidia. Platform wants to leverage integration to sell its software, and nVidia wants the company to do that to help peddle its GPU co-processors. The HPC resellers and cluster makers obviously need to have their own partnership with nVidia and the Tesla products to be able to support Platform's integrated CUDA environment. ®