Platform Computing, which has carved out a niche for itself managing supercomputer clusters and dispatching applications on HPC gear, has been expanding up the stack. It continued this process today when it acquired the HP-MPI stack created by HP for its own servers, as well as others used in HPC clusters.
MPI, short for message passing interface, is the protocol used to parcel up jobs and spread data around as a cluster runs a parallelised application. It's the backbone of the cluster and is therefore a key component of any parallel application. There are plenty of MPI stacks available, some of them open source, some of them not, and each is either tuned for a particular architecture or built to span multiple types of platforms and network interconnects.
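For readers who have not seen message passing in action, the pattern described above can be sketched in a few lines of Python. This is a toy, in-process simulation and not real MPI: threads stand in for cluster nodes, standard-library queues stand in for the interconnect, and the names `worker` and `scatter_and_reduce` are made up for illustration. A real application would make the equivalent calls through an MPI library.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Stand-in for one cluster node: receive a chunk, work on it, reply."""
    chunk = inbox.get()               # blocking receive of this rank's data
    outbox.put((rank, sum(chunk)))    # send the partial result back to the root


def scatter_and_reduce(data, n_workers):
    """Split data across n_workers, then combine their partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Scatter: rank r gets every n_workers-th element, starting at index r.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])

    for t in threads:
        t.join()

    # Reduce: gather one (rank, partial sum) message per worker and add them up.
    partials = dict(results.get() for _ in range(n_workers))
    return sum(partials.values()), partials


if __name__ == "__main__":
    total, partials = scatter_and_reduce(list(range(1, 101)), 4)
    print(total, partials)
```

The root hands each worker a slice of the data, every worker computes on its own slice, and the root adds up the replies. That scatter-then-reduce shape is what an MPI stack carries out across real nodes and real network links.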
To help build up its business - which got its start with the Load Sharing Facility (LSF) tool for managing gridded applications decades ago - Platform has open-sourced its software and contributed to key open source cluster-management projects, while at the same time buying up MPI stacks and adding other goodies into its tools.
Platform Cluster Manager - formerly known as the Open Cluster Stack and in its fifth release - includes an open source implementation of the LSF job-scheduling tool called Lava and developed under a project called Kusu. OCS also includes Nagios for system monitoring, Cacti for node and cluster monitoring, Ganglia for workload monitoring, and other software that's needed to run an x64-based supercomputer cluster based on Linux.
HP started reselling its own bundle of the Platform cluster tools, called Platform HPC for Insight Control Environment for Linux, in March. This followed Red Hat's own Red Hat HPC Solution, which debuted in October 2008, and Dell's own twist on the Platform stack, called OCS Dell Edition, which came out two weeks later. Companies can also download the Cluster Manager tools from Platform directly and pay for support contracts if they want to build their own HPC setups.
To make its cluster tools more useful and relevant, Platform bought the HPC management software stack from Scali in October 2007. Then in August 2008 it acquired the Scali-MPI stack to weave it into its cluster tools.
Just last week, Platform inked a deal with nVidia that will see the CUDA programming environment for Tesla GPU co-processors incorporated into its cluster management tools. This means that both Cluster Manager and LSF can seamlessly dispatch work to Tesla engines, just as they can dispatch work to x64 processor cores inside a cluster.
While the Scali-MPI stack that Platform acquired last year was tuned for Linux, it had some limitations in that it was only supported by a half-dozen independent software vendors. According to Tripp Purvis, VP of business development at Platform, the company had only done a little work making Scali-MPI work with Windows and had done no work porting it to various flavours of Unix.
The HP-MPI stack, by contrast, has a long history at Digital, Compaq, and then HP, and is currently supported on HP-UX, Linux (Red Hat Enterprise Linux and SUSE Linux Enterprise Server), and Windows (the 2003 and 2008 HPC Editions). It runs on Itanium, PA-RISC, and x64 iron, using either Xeons or Opterons in the latter case. HP-MPI doesn't just work on HP's ProLiant or Integrity servers, but any other brand of box that supports these hardware and software combinations.
And according to Scott Misage, director of solutions research and development in HP's cross-divisional Scalable Computing and Infrastructure organization, the HP-MPI stack has two other attributes that Platform found valuable. The first is that HP-MPI can support various speeds of Myrinet, Ethernet, and InfiniBand technologies and do so without having to recompile applications for each change in networking technology. Second, the current HP-MPI 2.3 edition is supported by 32 HPC application providers.
HP reckons that it has sold over 40,000 licenses to its MPI implementation, which dwarfs the installed base of Scali-MPI and which competes with the other popular vendor-sponsored MPI stack - the one from Intel. There are several open source MPI stacks, which are also very popular, and some niche ones, such as those created by Cray for its XT5 supers and IBM for its BlueGene supers. Misage reckons that HP has the most ISVs supporting a particular MPI stack.
Given this, you might be wondering why HP would want to sell this business, and you might also be wondering why HP isn't buying Platform Computing to build up its presence in the HPC business. For whatever reason, this deal is going the other way, and HP seems content to sell its MPI stack to Platform and resell its cluster tools. The financial details of the deal were not disclosed.
For its part, Platform intends to integrate HP-MPI with its Cluster Manager and LSF tools, and over the next few months it will be merging the Scali-MPI and HP-MPI stacks to create a single product. As part of the deal, the key software development engineers working on HP-MPI have been moved over to Platform. They will be keeping their jobs and will be working on the merged product line along with their peers from Platform.
The current Scali-MPI and HP-MPI releases will be maintained and supported as well, so customers are not being forced to move. ®