So you want to build the next Google. Who ya gonna call? Er, Big Blue?
IBM's cluster scheduler kicks OpenStack's Nova in teeth, eyes VMware
Analysis IBM has announced a new version of its Platform Resource Scheduler (PRS), which lines up jobs and resources in mammoth OpenStack Havana environments.
In doing so, Big Blue hopes to give enterprises a shot at achieving the same levels of efficiency as Google's highly tuned servers.
Though the tech competes against VMware's Distributed Resource Scheduler, it could become a credible general-purpose job scheduler to rival Google's secretive Borg and Omega systems, and the Apache Mesos project.
A resource scheduler and workload placer is a system that takes jobs, and figures out when to run them and where to run them to maximize IT utilization. It must also leave some spare capacity, rather than consume all the available infrastructure, to ensure there's redundancy to pick up from any failures. And it must hit its deadlines.
Google's Borg system is rumored to have been so good at this task-juggling act that it saved the ad-slinger from building an entire data center.
IBM's resource scheduling tech is designed as a drop-in replacement for the scheduler within the Nova component of the open-source cloud manager OpenStack. Nova makes scheduling decisions according to information it stores during its setup, and it places jobs on compute nodes whose configurations pass various filters.
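For the curious, filter-style placement can be sketched in a few lines of Python. This is a toy illustration rather than Nova's actual code: the `Host` and `Request` types, the two filters, and the "most free RAM wins" tie-breaker are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Capacity as recorded ahead of time, not measured live (hypothetical fields).
    name: str
    free_vcpus: int
    free_ram_mb: int


@dataclass
class Request:
    vcpus: int
    ram_mb: int


def enough_cpu(host, req):
    return host.free_vcpus >= req.vcpus


def enough_ram(host, req):
    return host.free_ram_mb >= req.ram_mb


# Hypothetical filter chain; a real deployment would configure its own.
FILTERS = [enough_cpu, enough_ram]


def filter_hosts(hosts, req, filters=FILTERS):
    """Keep only the hosts that pass every filter."""
    return [h for h in hosts if all(f(h, req) for f in filters)]


def pick_host(hosts, req):
    """Return the surviving host with the most free RAM, or None if none fit."""
    candidates = filter_hosts(hosts, req)
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.free_ram_mb)


HOSTS = [
    Host("node-a", 2, 4096),
    Host("node-b", 8, 16384),
    Host("node-c", 16, 2048),
]
```

The point of the sketch is what it lacks: nothing here looks at how busy a host is right now, which is the gap IBM says it fills.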
PRS, by contrast, uses the distributed agent framework in Big Blue's Platform Computing products, which considers realtime "machine and hypervisor loads" among other information when making decisions. Thus, PRS can look at the available compute capacity in realtime and make ongoing judgements when placing workloads. It can shift things around as needed using the underlying hypervisor's live migration ability.
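A rough sketch of what load-driven re-balancing might look like, in Python. IBM has not published PRS's algorithm, so everything here is an assumption: the `rebalance_step` function, the single load number per VM, and the threshold rule are made up to show the general idea of moving a VM off a hot hypervisor.

```python
def rebalance_step(hosts, threshold):
    """Move one VM off the busiest host if that host is over the threshold.

    hosts maps a hypervisor name to a dict of {vm_name: current_load}.
    Returns (vm, source, destination), or None if no move is needed or possible.
    The dict update stands in for a hypervisor live migration.
    """
    totals = {name: sum(vms.values()) for name, vms in hosts.items()}
    src = max(totals, key=totals.get)
    dst = min(totals, key=totals.get)
    if src == dst or totals[src] <= threshold:
        return None
    # Try the lightest VM first, and never push the destination over the threshold.
    for vm, load in sorted(hosts[src].items(), key=lambda kv: kv[1]):
        if totals[dst] + load <= threshold:
            hosts[dst][vm] = hosts[src].pop(vm)
            return (vm, src, dst)
    return None
```

Run in a loop against fresh load readings, a step like this keeps nudging VMs away from over-subscribed machines until every host sits under the threshold or no legal move remains.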
"This means that as workloads and resources evolve, workload placement is automatically re-balanced," IBM marketing chap Gord Sissons told The Reg via email.
"The key benefits are: better quality of service in terms of performance and availability, because hypervisors are less likely to be over-subscribed; better utilization, since [virtual machines] can be packed more optimally while respecting service level requirements; and reduced administrator workload, since the re-balancing is automated.
"This is important as OpenStack environments get large. The real 'intellectual property' in the offering is in the pre-configured policies - the idea is that a cloud administrator can simply specify a policy like 'load balancing' or 'packing', and the scheduler will automatically seek to achieve the goal of the policy."
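To make the two named policies concrete, here is a minimal Python sketch of how "packing" and "load balancing" pull in opposite directions. The `place` function, the policy strings, and the percentage-based loads are assumptions for illustration; IBM's pre-configured policies are proprietary and will be far more involved.

```python
def place(vm_load, host_loads, capacity, policy):
    """Return the index of the host chosen for a new VM, or None if nothing fits.

    vm_load and host_loads are percentages of a host's capacity.
    """
    fits = [i for i, load in enumerate(host_loads) if load + vm_load <= capacity]
    if not fits:
        return None
    if policy == "packing":
        # Fill the busiest host that still has room, leaving others empty.
        return max(fits, key=lambda i: host_loads[i])
    if policy == "load_balancing":
        # Spread work by choosing the least busy host.
        return min(fits, key=lambda i: host_loads[i])
    raise ValueError(f"unknown policy: {policy}")
```

With hosts at 60, 20 and 90 per cent and a new VM needing 30, packing picks the 60 per cent host while load balancing picks the 20 per cent one.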
It'll babysit your 50,000 cores. If you can afford it
It's worth noting that this system is unlikely to have the capabilities of Google's Omega system, which is believed to draw on CPU-core-level telemetry from a system named CPI2, along with other Chocolate Factory innovations.
However, by drawing on other IBM technology such as Platform Symphony, it is able to gain some advanced abilities, such as the aforementioned distributed agent-based scheduling, which (we're told) lets IBM's tech "opportunistically 'borrow' resources not in use by different tenants - loaning, borrowing and pre-emption policies are specified in flexible resource sharing plans that can vary with time."
The whole system can also sit on top of IBM's well-regarded General Parallel File System, which gives it some capabilities more advanced than the main open-source equivalent, the Hadoop Distributed File System. Google is likely to field its own tech in this arena, but has published very little on it.
From what we understand, these capabilities mean IBM's PRS is more advanced than parts of the open-source Apache Mesos project – though at the cost of being proprietary and hence only having one major developer (IBM) driving the project.
One drawback of Big Blue's approach is its dependence on full virtualization, which means when passing information between two VMs on the same server there is an overhead. This compares with kernel-level direct transfers within Omega and Mesos thanks to containerization via cgroups, and so on.
IBM says it already has some customers running in the range of 50,000 cores – hardly Google, but not insignificant.
Though the technology strikes this hack as being handy for the few companies out there with boisterous, instance-filled OpenStack environments not already under some kind of scheduler, it seems unlikely it can maintain feature parity with the open-source scheduler and resource placer Apache Mesos.
Mesos is already in wide use at Twitter – the company hired Benjamin Hindman, co-creator of the tech, recently – and has also been used by trendy room-renting network Airbnb. IBM argues that the Mesos project as it stands is immature – true, but with hefty resources behind it, that may not remain the case.
The prerequisites for enterprises wanting to have a nibble at IBM's answer to Google's most advanced system are IBM Power Systems or IBM System x hardware (including iDataPlex), Red Hat Enterprise Linux 6.3, and IBM SmartCloud Entry V3.2.
Though many view IBM's recent OpenStack love-in as more marketing than substance, this release shows that in some parts of Big Blue's titanic organization, some very clever people are working to supercharge the open-source project – for a price. ®