ScaleMP: Use RAM plus vSMP, not flash, to boost server performance

Partners with Big Blue, chases SGI UV2 shared memory systems


There are hypervisors that chop a single server into virtual bits, and other hypervisors that take multiple servers and make them look like one big virtual one. ScaleMP's vSMP hypervisor is the latter kind, and can be used to create a shared memory x86-based system running Linux, the sort of machine that would normally require special processors and chipsets. And a much higher price tag.

ScaleMP started out peddling vSMP to customers as an alternative to big SMP machines like those from Silicon Graphics, IBM, Hewlett-Packard, and Oracle. But with the hype around big data these days, Shai Fultheim, founder and CEO at ScaleMP, says the company was seeing more need for larger memories than for larger compute capacities. So with vSMP 5.1, it has rolled up a new SKU of the product that is tuned specifically to take a bunch of cheap server nodes and use them as memory expansion boxes for a big fat node. The end result is an asymmetric shared memory system that is significantly less expensive than an SMP box based on high-end x86 or RISC processors and special chipsets to handle terabytes of main memory.

The dirty little secret out there in the data centers of the world is that most of the database, middleware, and application code is not designed to scale across lots of cores and threads. "This software is not really written to use all of the processing power in a modern machine," says Fultheim. "The problem is not the CPUs. The problem is the memory."

Meaning, this software can run in ever-embiggening chunks of main memory and get a big performance bump. The trouble is, CPU and memory capacities in modern servers are pretty much locked down. The memory hangs off a processor socket and its controllers are on the processor die, and there are very prescribed memory capacities for machines with one, two, four, or eight sockets. In general, as a machine increases in aggregate CPU performance, main memory capacity also increases, but so does the cost of the processors and the memory sticks in the machine. And even if your job is not compute bound and you don't need all the cores and threads, if you want to make a fat memory system you are pushed into buying a big bad box whether you like it or not.

ScaleMP uses its vSMP aggregation hypervisor and cheap skinny nodes to boost main memory

To serve the needs of analytics and other kinds of big data jobs where memory matters a lot more than compute, ScaleMP has ginned up the vSMP 5.1 aggregation hypervisor into two flavors. The first is called vSMP System Expansion, which is tuned to scale up both processing and memory in a balanced fashion, as a regular SMP server based on a physical chipset does. This is the vSMP that ScaleMP has been peddling for many years. The new flavor is called vSMP Memory Expansion, and it is designed and tuned explicitly for machines that are going to be unbalanced – but in a good way.

There are a lot of ways to play this asymmetric configuration game with vSMP Memory Expansion, but the basic idea is outlined in the scenario in the first chart in this story. Rather than try to figure out how to put flash in a server to accelerate a database or big data workload, Fultheim says keep it simple and build the biggest memory space you can. A Fusion-io flash card delivers half the I/O operations per second of a chunk of memory for the same dollar, according to Fultheim, so memory is the better option. (SGI, trying to push its Xeon E5-based UV2 shared memory supers with their NUMAlink 6 interconnect, would agree with this approach, as would IBM, HP, and Oracle with their big RISC or Itanium iron and fat SMP chipsets.) You can do a direct connect between up to four nodes using 56Gb/sec FDR InfiniBand host adapters and cables, using the DC2 interconnect that has been coded into the vSMP hypervisor since November 2009. Or if you want to scale up to 128 server nodes, you plug the servers into an InfiniBand switch.

In the example above, the workload in question only needs a four-socket Xeon E5-4600 or Xeon E7-4800 in terms of processing capacity, but the 48 to 64 memory sticks in this box do not offer enough main memory capacity. Moreover, the fat memory sticks needed to build up terabytes of memory space are very expensive. So instead of buying an eight-socket box to get more memory slots, you get the four-socket box and put in the fastest Xeon or Opteron processors you can afford. Then you buy a bunch of skinny server nodes with 24 memory sticks each, and you turn off the cores and leave on the memory controllers and memory in the boxes as well as the InfiniBand ports. Now the FDR links are effectively a backplane for an SMP based on the vSMP hypervisor.

Fat memory, lots of skinny nodes

Depending on what memory capacities you choose for the memory sticks in these memory expansion nodes, you can get somewhere between 3.75TB and 7.5TB of main memory that is all directly addressable by the four-socket machine. All you threw away was around $200 per socket for the unused computing cores in the expansion machines.
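The capacity range is just multiplication, and it can be sanity-checked in a few lines. This is a minimal sketch, not ScaleMP's sizing method: the 24 sticks per skinny node come from the example above, but the count of ten memory-contributing nodes is an assumption, picked only because it happens to reproduce the quoted 3.75TB and 7.5TB figures with 16GB and 32GB sticks. The actual configuration in the chart may differ.

```python
# Back-of-envelope aggregate memory for a set of memory expansion nodes.
STICKS_PER_NODE = 24      # from the article: skinny nodes with 24 memory sticks each
ASSUMED_NODE_COUNT = 10   # ASSUMPTION: not stated in the article, chosen to match the quoted range


def cluster_memory_tb(nodes: int, sticks_per_node: int, stick_gb: int) -> float:
    """Return aggregate memory in TB, counting 1TB as 1,024GB."""
    return nodes * sticks_per_node * stick_gb / 1024


if __name__ == "__main__":
    for stick_gb in (16, 32):
        tb = cluster_memory_tb(ASSUMED_NODE_COUNT, STICKS_PER_NODE, stick_gb)
        print(f"{stick_gb}GB sticks: {tb:.2f}TB")
```

Doubling the stick size doubles the footprint without adding a single node, which is why the choice of stick capacity dominates the sizing.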

This turns out to be a good trade-off, as you can see:

Provided your workload runs well on vSMP, an asymmetric cluster is considerably cheaper than a real fat SMP box

That is a comparison that Fultheim cooked up showing the cost of the processor, memory, and Oracle 11g Enterprise Edition running on machines with two, four, or eight sockets with specific memory capacities. Oracle software is a lot more expensive on four-socket machines than on two-socket boxes. It also costs more to beef up the memory on any given server size because you are moving from 8GB to 16GB to 32GB memory sticks, and generally, memory prices do not rise linearly as you get fatter sticks because the cost of producing the denser memory chips is much higher than for lower-density chips. So if you want fat memory on a Xeon E7 server with eight sockets, it can get very pricey indeed, like well over $330,000 for a 4TB box.

Using vSMP memory expansion nodes linked to a four-socket machine, you could build the same 4TB system for under $200,000, or build an 8TB machine – that's twice the memory footprint of the physical eight-socket box – for around $270,000. That's about 20 per cent less money for twice as much addressable memory. And this comparison assumes, of course, that you are memory bound, not compute bound, with your database and analytics workload and that this workload runs on Linux and is amenable to the underlying messaging architecture of the vSMP aggregation hypervisor.
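For those who like to check the sums, here is a small sketch of the comparison using only the two headline figures quoted above. Both numbers are approximations in the article ("well over" and "around"), so treat the output as ballpark; the dictionary and function names are illustrative, not anything from ScaleMP.

```python
# Headline figures from the article; both are approximate.
PHYSICAL_8_SOCKET = {"memory_tb": 4, "cost_usd": 330_000}
VSMP_CLUSTER = {"memory_tb": 8, "cost_usd": 270_000}


def cost_per_tb(system: dict) -> float:
    """Dollars per TB of addressable main memory."""
    return system["cost_usd"] / system["memory_tb"]


def saving_pct(baseline_usd: float, alternative_usd: float) -> float:
    """Percentage saved by choosing the alternative over the baseline."""
    return 100 * (baseline_usd - alternative_usd) / baseline_usd


if __name__ == "__main__":
    print(f"Physical box: ${cost_per_tb(PHYSICAL_8_SOCKET):,.0f} per TB")
    print(f"vSMP cluster: ${cost_per_tb(VSMP_CLUSTER):,.0f} per TB")
    saving = saving_pct(PHYSICAL_8_SOCKET["cost_usd"], VSMP_CLUSTER["cost_usd"])
    print(f"Saving on total cost: {saving:.1f} per cent")
```

On these numbers the saving is a little over 18 per cent, and more if the physical box really does cost "well over" $330,000. Per terabyte, the vSMP setup comes in at well under half the price.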

vSMP is available in two different editions, vSMP Foundation and vSMP Foundation Advanced Platform, with Memory Expansion and System Expansion variants within each edition. vSMP Foundation scales up to 32 nodes in a single system image and addresses up to 32TB of memory across those nodes. In the Memory Expansion variant, you can only have one fat node with the CPUs turned on, and all of the processing on the other nodes is deactivated (with the memory controllers, memory, and I/O controllers obviously remaining on once the vSMP hypervisor is booted into memory and running). The Memory Expansion variant is priced based on the amount of memory addressed in the cluster, and you can get a perpetual license for $10,240 per TB or an annual subscription for $6,144 per TB.

With the System Expansion variant, all of the processors in the 32 nodes can be activated and do computing along with addressing the main memory. A perpetual license costs $400 per socket and an annual subscription costs $240 per socket.

With vSMP Foundation Advanced Platform, the vSMP aggregation hypervisor can scale up to 128 nodes and up to 256TB of addressable memory across the cluster of servers. This Advanced Platform edition can also support active-active multi-rail InfiniBand links between servers and switches, with up to four host channel adapters yielding up to 224Gb/sec of bandwidth into and out of the node for that virtual SMP memory addressing. The Advanced Platform also allows for the virtual SMP to be partitioned into virtual machines. With the Memory Expansion variant, Advanced Platform once again only allows for one node to do actual computing; it costs the same as vSMP Foundation, at $10,240 per TB for a perpetual license or $6,144 per TB for an annual subscription. With the System Expansion variant, Advanced Platform costs twice as much as vSMP Foundation, at $800 per socket for a perpetual license and $480 per socket for an annual subscription.
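The price list is easy to turn into a quick quote calculator. The sketch below uses only the list prices quoted in this story. It assumes that a multi-year subscription simply costs the annual price times the number of years, with no discounts or support fees, which the article does not confirm. The names `PRICES`, `license_cost`, and `break_even_years` are made up for illustration.

```python
# List prices from the article, in US dollars.
# Memory Expansion is priced per TB of addressed memory, System Expansion per socket.
PRICES = {
    ("Foundation", "Memory Expansion"): {"perpetual": 10_240, "annual": 6_144},
    ("Foundation", "System Expansion"): {"perpetual": 400, "annual": 240},
    ("Advanced Platform", "Memory Expansion"): {"perpetual": 10_240, "annual": 6_144},
    ("Advanced Platform", "System Expansion"): {"perpetual": 800, "annual": 480},
}


def license_cost(edition: str, variant: str, units: int,
                 term: str = "perpetual", years: int = 1) -> int:
    """Cost for `units` (TB or sockets, depending on variant).

    ASSUMPTION: an N-year subscription costs N times the annual price.
    """
    unit_price = PRICES[(edition, variant)][term]
    multiplier = years if term == "annual" else 1
    return unit_price * units * multiplier


def break_even_years(edition: str, variant: str) -> float:
    """Years of subscription after which the perpetual license is cheaper."""
    price = PRICES[(edition, variant)]
    return price["perpetual"] / price["annual"]


if __name__ == "__main__":
    print(license_cost("Foundation", "Memory Expansion", units=8))
    print(license_cost("Foundation", "Memory Expansion", units=8, term="annual", years=3))
    print(round(break_even_years("Advanced Platform", "System Expansion"), 2))
```

In every case the annual subscription is 60 per cent of the perpetual price, so under the assumption above the perpetual license pays for itself before the end of the second year.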

Within the next month or so, ScaleMP will roll out a variant of vSMP Foundation called Memory Expansion Free, which will, as the name suggests, be available for download at no cost. Memory Expansion Free will have one compute node and up to a total of eight nodes in a cluster; it is also limited to four sockets of processing in a machine and 1TB of aggregate main memory across the server nodes in the cluster. You can use SUSE Linux Enterprise Server 11, Red Hat Enterprise Linux 5 or 6, or Oracle Linux 5 with the freebie version, just like the full-on version.
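To see which edition a planned cluster would fit into, the node and memory ceilings quoted in this story can be checked mechanically. This is a hedged sketch, not a ScaleMP tool. The limits are taken from the text, the four-socket cap in the free edition is read here as applying to the compute node (an interpretation), and `check_cluster` and its parameters are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EditionLimits:
    max_nodes: int
    max_memory_tb: int


# Ceilings as quoted in the article.
LIMITS = {
    "Foundation": EditionLimits(max_nodes=32, max_memory_tb=32),
    "Advanced Platform": EditionLimits(max_nodes=128, max_memory_tb=256),
    "Memory Expansion Free": EditionLimits(max_nodes=8, max_memory_tb=1),
}


def check_cluster(edition: str, nodes: int, memory_tb: float,
                  compute_nodes: int = 1, memory_expansion: bool = True,
                  compute_sockets: int = 4) -> List[str]:
    """Return a list of reasons the cluster does not fit; empty means it fits."""
    limits = LIMITS[edition]
    problems = []
    if nodes > limits.max_nodes:
        problems.append(f"{nodes} nodes exceeds the {limits.max_nodes}-node limit")
    if memory_tb > limits.max_memory_tb:
        problems.append(f"{memory_tb}TB exceeds the {limits.max_memory_tb}TB limit")
    is_free = edition == "Memory Expansion Free"
    if (memory_expansion or is_free) and compute_nodes != 1:
        problems.append("Memory Expansion allows exactly one compute node")
    # ASSUMPTION: the free edition's four-socket cap applies to the compute node.
    if is_free and compute_sockets > 4:
        problems.append("the free edition is limited to four sockets of processing")
    return problems


if __name__ == "__main__":
    print(check_cluster("Foundation", nodes=11, memory_tb=8))
    print(check_cluster("Memory Expansion Free", nodes=11, memory_tb=8))
```

A four-socket box with ten expansion nodes and 8TB fits comfortably in vSMP Foundation, but trips both the node and the memory ceilings of the freebie.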

The Memory Expansion Free edition will rely on community support, not 24x7 tech support from ScaleMP, and will have limited expandability. But it should still be useful for small installations and proofs of concept, and you can't beat the price.

Biting the hand that feeds IT © 1998–2022