HPC mavens sell excess super cycles
The Cheshire HPC utility
Supercomputing labs the world over are feeling the budget pressure like the rest of us, especially those supported by public funds. With this in mind, OCF – an HPC system integrator based in Sheffield – has come up with a plan that will help labs make a little money to cover their costs: sell capacity to outsiders for a fee.
To test out the idea, OCF has partnered with Daresbury Laboratory, located in Cheshire and the former home of the world's first high-energy synchrotron radiation source. Daresbury Lab is not a huge supercomputing center - it currently does not have a box on the Top 500 supercomputing list, for instance - but it did have a BlueGene/P parallel super from IBM with 8,192 processors rated at 2.78 teraflops on the November 2009 list. Daresbury used parallel supers based on Intel i860 RISC processors back in the early 1990s, but it eventually shifted over to IBM's RS/6000 PowerParallel SP2 machines and then to Big Blue's BlueGene boxes.
More recently, Daresbury has installed a 2.5 teraflops cluster based on IBM's iDataPlex bladish-rackish hybrids. According to Julian Fielden, managing director of OCF, which supplies servers, storage, and networking to the public sector in the United Kingdom, Daresbury has an 84-unit iDataPlex rack that currently has 20 two-socket server nodes in it using 2.6 GHz Xeon 5600 processors from Intel.
Some of the nodes have Nvidia GPU co-processors in them for some extra number-crunching. The system can be configured with either Linux or Windows HPC Server 2008 and uses Platform Computing's Infrastructure Sharing Facility to manage the server images and its Load Sharing Facility to schedule multiple jobs on the cluster. IBM's General Parallel File System is used on a back-end clustered storage array.
"We don't think that HPC is ready for cloud as such," explains Fielden, "but we do think it is ready for infrastructure on demand."
Under the deal that OCF has cut with Daresbury Lab, OCF will sell excess capacity on this baby cluster next year under a utility pricing scheme called enCore. Nodes will be carved up and loaded with outside workloads that are kept absolutely separate from internal workloads running at the lab. HPC labs are extremely paranoid about security, so data on the machines is encrypted, storage arrays are locked down, and remote access into and out of the machines is encrypted as well. The GPFS storage is partitioned and no one outside of the lab's firewall can get to the data used by Daresbury's researchers.
OCF has not finalized pricing on the HPC utility yet, but Fielden says the plan is to charge a modest annual fee of around a couple of hundred pounds to gain access to enCore capacity plus 15 pence per core hour on top of that. OCF is not expecting the enCore service to be used on big jobs that require thousands of cores and months to run, but rather is focusing on modest jobs that might otherwise require small companies or research organizations to spend tens to hundreds of thousands of pounds to build their own clusters. The goal of the enCore service is for a typical job to cost somewhere between hundreds and thousands of pounds to run.
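To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It takes the 15 pence per core hour figure at face value and assumes an annual fee of exactly £200, which is only a guess at "a couple of hundred pounds". OCF has not finalized pricing, and the function names and the example job size are made up for illustration.

```python
from decimal import Decimal

# Figure quoted by Fielden; pricing is not final.
PENCE_PER_CORE_HOUR = Decimal("15")
# Assumption: "a couple of hundred pounds" taken as exactly 200.
ANNUAL_FEE_POUNDS = Decimal("200")


def job_cost_pounds(cores, hours):
    """Usage charge in pounds for one job, excluding the annual fee."""
    if cores < 0 or hours < 0:
        raise ValueError("cores and hours must be non-negative")
    core_hours = Decimal(cores) * Decimal(str(hours))
    pounds = core_hours * PENCE_PER_CORE_HOUR / Decimal(100)
    return pounds.quantize(Decimal("0.01"))


def first_year_cost_pounds(jobs):
    """Annual fee plus usage charges for a list of (cores, hours) jobs."""
    usage = sum((job_cost_pounds(c, h) for c, h in jobs), Decimal("0"))
    return ANNUAL_FEE_POUNDS + usage


if __name__ == "__main__":
    # Hypothetical job: 64 cores for 48 hours = 3,072 core hours.
    print(job_cost_pounds(64, 48))             # 460.80
    print(first_year_cost_pounds([(64, 48)]))  # 660.80
```

On those assumptions, a two-day job on 64 cores comes to £460.80 in usage charges, or £660.80 once the assumed annual fee is added, which sits inside the hundreds-to-thousands range OCF is aiming for.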
The enCore service will be in beta on the Daresbury kit in January and will be generally available in February. OCF is working now to bring other labs in the United Kingdom into the fold. ®