ScaleMP shared memory clusters tuned for Xeon 7500s
Upgrade for added juice
If you bought ScaleMP's shared memory clustering software to create a virtual symmetric multiprocessing system out of machines using Intel's eight-core "Nehalem-EX" Xeon 7500 processors, you might want to think about upgrading to the new 3.5 release of vSMP Foundation, which ships today.
The reason, says Shai Fultheim, the company's founder and chief executive officer, is that with vSMP Foundation 3.5, the company's techies have gone back into the virtual machine manager and clustering software and tuned it to work a lot better on processors with lower clock speeds, fatter caches, and larger main memories.
ScaleMP only announced vSMP Foundation 3.0 back in May, a blockbuster release for the company in that it offered significantly better scalability than the prior 2.X releases, spanning up to 128 server nodes (8,192 cores) linked by InfiniBand switches in a single memory footprint and up to 64 TB of shared memory across those nodes.
The prior releases supported up to 16 nodes and up to 4 TB of memory across those nodes - puny by comparison and frankly only offering about twice the scalability of a physical eight-socket server using the Xeon 7500s that came out in March.
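To put those ceilings in per-node terms, here is a rough back-of-the-envelope sketch. The node, core, and memory figures are the ones quoted above; the helper name `per_node` and the choice of 1,024 GB to the terabyte are assumptions made for illustration, not anything ScaleMP has published.

```python
# Figures quoted in the article for vSMP Foundation 3.0 and the 2.X releases.
V3_NODES, V3_CORES, V3_MEMORY_TB = 128, 8192, 64
V2_NODES, V2_MEMORY_TB = 16, 4

GB_PER_TB = 1024  # assumption: binary terabytes


def per_node(total, nodes):
    """Average share of a cluster-wide total that falls on each node."""
    return total / nodes


v3_cores_per_node = per_node(V3_CORES, V3_NODES)
v3_gb_per_node = per_node(V3_MEMORY_TB * GB_PER_TB, V3_NODES)
v2_gb_per_node = per_node(V2_MEMORY_TB * GB_PER_TB, V2_NODES)

print(f"3.0 maximum: {v3_cores_per_node:.0f} cores and {v3_gb_per_node:.0f} GB per node")
print(f"2.X maximum: {v2_gb_per_node:.0f} GB per node")
print(f"Node ceiling grew {V3_NODES // V2_NODES}x, memory ceiling {V3_MEMORY_TB // V2_MEMORY_TB}x")
```

The 64 cores per node that falls out of the 3.0 maximum is what an eight-socket box stuffed with eight-core chips delivers, so the top-end configuration implies fat nodes rather than two-socket ones.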
Given that ScaleMP sells systems software that allows companies to build virtual SMPs out of lower-cost two-socket servers, the company was not all that concerned about the initial performance of vSMP Foundation 3.0 on the boxes. But then, after the first Xeon 7500 machines hit the street in June, right after ScaleMP started to ship its significantly extended software, 20 per cent of licenses sold by ScaleMP for its code were going on to Xeon 7500 boxes. Some initial customers were taking four eight-socket Xeon 7500 machines and making virtual 32-socket machines out of them.
"The customer is always right," says Fultheim with a laugh, adding that some customers want to manage fewer nodes even though four-socket and eight-socket x64 servers are much more expensive than two-socket x64 machines. "And some of the assumptions that we made about clock speeds, cache sizes, and core counts needed to be retuned for the Xeon 7500s."
In general, the Xeons and Opterons used in two-socket servers have had lower core counts and higher clock speeds than the Xeons and Opterons used in fatter boxes. This generalization is getting a bit muddled with the way AMD split the Opteron 4100 and 6100 processors this year, with both being used in two-socket machines and the Opteron 6100s being also used in four-socket boxes.
According to benchmark tests performed by ScaleMP comparing vSMP Foundation 3.0 to 3.5, the bandwidth coming out of the virtual machine monitor is 74 per cent higher on two-socket boxes, but is as much as five times greater on four-socket machines. And latencies on message passing between the nodes are reduced by almost a factor of four on four-socket boxes and by 63 per cent on two-socket machines. The comparisons were made on two configurations: a cluster of two four-socket nodes using eight-core processors, and a cluster of eight two-socket nodes with four cores each.
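Those results mix percentages and multiples, which makes the two-socket and four-socket numbers hard to compare at a glance. The sketch below puts them on one scale, using only the figures ScaleMP quoted; the function names are made up for illustration, and "as much as five times" and "almost a factor of four" are treated as upper bounds rather than measured values.

```python
def speedup_from_percent_increase(percent):
    """A metric that is `percent` per cent higher is (1 + percent/100) times the old value."""
    return 1 + percent / 100


def speedup_from_percent_reduction(percent):
    """A latency cut by `percent` per cent is 1 / (1 - percent/100) times lower."""
    return 1 / (1 - percent / 100)


# Two-socket boxes: both figures were quoted as percentages.
two_socket_bandwidth = speedup_from_percent_increase(74)
two_socket_latency = speedup_from_percent_reduction(63)

# Four-socket boxes: quoted directly as multiples, and only as upper bounds.
FOUR_SOCKET_BANDWIDTH_MAX = 5.0
FOUR_SOCKET_LATENCY_MAX = 4.0

print(f"Two-socket:  {two_socket_bandwidth:.2f}x bandwidth, {two_socket_latency:.2f}x lower latency")
print(f"Four-socket: up to {FOUR_SOCKET_BANDWIDTH_MAX:.0f}x bandwidth, "
      f"almost {FOUR_SOCKET_LATENCY_MAX:.0f}x lower latency")
```

On that footing, the two-socket boxes get roughly 1.7 times the bandwidth and 2.7 times lower latency, so the four-socket machines gain more on both counts.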
This is not the only performance improvement that comes with vSMP Foundation 3.5. Fultheim says that while the company was down in the guts of its VMM, monkeying around with things, it created a virtual MPI offload engine and snapped it into the VMM and its related InfiniBand drivers. A lot of ScaleMP's customers in the HPC racket use vSMP so they can manage their clusters as fat nodes, but then layer the Message Passing Interface protocol on top of it so their parallel applications can run on these virtual SMPs and also span multiple vSMPs if need be.
ScaleMP has not released performance tests on this MPI offload feature yet, and Fultheim says that this is because the ratio of compute versus communication in HPC applications varies widely and, to put it bluntly, the software is new enough that ScaleMP does not want to be pinned down yet on making performance claims.
That said, Fultheim says that even with InfiniBand being slower than Silicon Graphics' NUMAlink 5 interconnect used in its Altix UV 1000 shared memory systems, vSMP Foundation can give it a run for the HPC money because the VMM at the heart of the ScaleMP product is basically a bit of software that watches memory usage patterns on many server nodes very carefully. MPI is just passing data from node to node, so a clever VMM can do a bunch of clever things to start processing data in the background while the MPI stack is working, speeding it up. "The trick is how to do it transparently to the applications," Fultheim says.
At the moment, ScaleMP's vSMP Foundation shared memory clustering software works with Red Hat Enterprise Linux 4 and 5 and Novell SUSE Linux Enterprise Server 10 and 11. Ubuntu, CentOS, and Oracle Enterprise Linux are used by some customers, but they are not certified. Prior to the Oracle acquisition of Sun Microsystems in January, ScaleMP was in discussions to port Solaris to the environment, but that fell by the wayside. Windows Server 2008 (in either its plain or HPC variant) is not supported yet.
"As for Windows, we are not seeing demand for that right now," says Fultheim. "That may be because we are so focused on the HPC market, admittedly."
In the wake of the shipping of vSMP Foundation 3.0 in June, ScaleMP broke through 200 customers, and Fultheim boasts that the company has more deployments than Sequent, RNA Networks, Virtual Iron, 3Leaf Networks, NUMAscale, and SGI's Altix UV 1000 had combined at their peaks.
Back in October, ScaleMP certified the 3.0 release on IBM's System x3850 X5 server, the company's fat Xeon 7500 box. This machine is certified to run vSMP Foundation 3.5, and so are Hewlett-Packard's DL580 G7 and Dell's PowerEdge R910. These three and a bunch of downstream channel partners of Super Micro with HPC specialization resell the shared memory software. You can see the latest hardware compatibility list for vSMP here, as well as benchmark results for various tests that show how vSMP scales.
Thus far, ScaleMP has not created a variant of the clustering software that can run database and back-end OLTP applications, but that would sure be an interesting development. The software is used mostly in supercomputing and financial transaction applications where message passing - not transaction processing - is the heart of the application.
Pricing on vSMP Foundation 3.5 is the same as on the 3.0 release. A license and support contract for a two-socket server using four-core x64 processors costs $1,750 (down from $2,500 with the 2.X releases). A two-socket node using six-core x64 chips costs the same as it did with the 2.X releases, at $2,500 per server. If you are using eight-core chips in four-socket boxes, a license costs $7,500 per node (same as the 2.X price for six-core Xeons and Opterons), but if you drop down to the cheaper six-core x64 chips used in four-socket servers, then the license drops to $5,000 per box. ®
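For anyone pricing out a cluster, the per-node list prices above can be turned into a quick estimate. This is only a sketch: the table holds just the four configurations quoted here, the function name `cluster_license_cost` is invented for illustration, and it assumes the cost scales linearly with node count, ignoring hardware, InfiniBand gear, and any volume discounts.

```python
# Per-node list prices quoted in the article, keyed by (sockets, cores per chip).
PRICE_PER_NODE_USD = {
    (2, 4): 1750,
    (2, 6): 2500,
    (4, 6): 5000,
    (4, 8): 7500,
}


def cluster_license_cost(nodes, sockets, cores_per_chip):
    """License and support cost for `nodes` identical servers, in US dollars."""
    try:
        per_node = PRICE_PER_NODE_USD[(sockets, cores_per_chip)]
    except KeyError:
        # The article gives no price for other configurations, so do not guess.
        raise ValueError(
            f"no quoted price for {sockets}-socket, {cores_per_chip}-core nodes"
        ) from None
    return nodes * per_node


# The two 64-core setups from ScaleMP's benchmark comparison.
thin_nodes = cluster_license_cost(nodes=8, sockets=2, cores_per_chip=4)
fat_nodes = cluster_license_cost(nodes=2, sockets=4, cores_per_chip=8)

print(f"Eight two-socket, four-core nodes: ${thin_nodes:,}")
print(f"Two four-socket, eight-core nodes: ${fat_nodes:,}")
```

Both of those configurations add up to 64 cores, and at list price the software bill for the two routes differs by only $1,000.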