Microsoft and Novell tag team on HPC

Windows on the supercomputing world


Comment The old Microsoft strategy of "embrace, extend, and extinguish" is just not going to fly in the snooty high performance computing market. Microsoft needs partners and Windows needs to coexist with Linux if the company wants to get anything more than a token share of real HPC work, which is why the company is talking up its interoperability work with Novell at the International Supercomputing Conference (ISC) 2010 in Hamburg, Germany this week.

As we all know, Microsoft got to where it is today by crushing the competition in that triple-E bear hug. But with the Windows HPC Server 2008 variant of its server platform, the company has little choice but to adopt an approach we'll call "interoperate, cooperate, and perhaps dominate," and that means cooperating not just with Linux suppliers such as Red Hat and Novell (which each have interoperability partnerships with Microsoft), but also with cluster management tool providers, so it can get into position to be a second boot option on x64-based clusters.

Microsoft and Novell were at ISC talking up the work they have done in their joint interoperability lab in Cambridge, Massachusetts, and the 33 joint customers they have running both Windows and Linux on HPC clusters. The two companies have worked with cluster management software maker Adaptive Computing to come up with a rapid dual-boot setup that lets clusters quickly shift nodes from Linux to Windows and back as workloads shift. The Rocky Mountain Supercomputing Center in Butte, Montana (which has a modest 3.2 teraflops cluster supporting Red Hat Enterprise Linux and Windows HPC Server) and the Centre for High Performance Computing (CHPC) in Cape Town, South Africa (which runs a mix of Linux, Windows, and Unix clusters under control of Adaptive Computing's Moab 5.4 workload management tool) were singled out in Microsoft's interoperability blog as examples of Windows and Linux getting along.

While it is not polite to call someone's supercomputer cluster puny - size is all relative to the job that needs to get done, of course - Microsoft is cooperating with Novell, Adaptive Computing, and others because at this point, there is not really a good technical reason why the vast majority of x64 clusters running Linux could not be converted from static Linux machines to dynamic Linux-Windows images - provided there are applications driving HPC shops to consider Windows. The ability to do quick dual-booting is a first step toward getting a broader portfolio of HPC apps running on Windows and then seeing more use of Windows on clusters.
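
To make the idea of "dynamic Linux-Windows images" concrete, here is a rough sketch of the scheduling logic involved - a simplification of our own in Python, not Adaptive Computing's actual Moab interface - in which the scheduler checks which operating system each queued job needs and reboots idle nodes into the matching image before dispatching work to them. The class and function names are invented for illustration.

    # Hypothetical sketch of workload-driven dual-boot provisioning; the names
    # (Node, Job, reprovision, dispatch) are illustrative, not any real scheduler's API.
    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        os: str = "linux"      # image currently booted: "linux" or "windows"
        busy: bool = False

    @dataclass
    class Job:
        name: str
        needs_os: str          # OS the application stack requires
        nodes_wanted: int

    def reprovision(node, target_os):
        # Stand-in for rebooting the node into the other image (in practice a
        # network reinstall or a bootloader default switch driven by the scheduler).
        print(f"rebooting {node.name}: {node.os} -> {target_os}")
        node.os = target_os

    def dispatch(job, cluster):
        idle = [n for n in cluster if not n.busy]
        if len(idle) < job.nodes_wanted:
            return False       # not enough free nodes, the job stays queued
        for node in idle[:job.nodes_wanted]:
            if node.os != job.needs_os:
                reprovision(node, job.needs_os)   # only flip the image when needed
            node.busy = True
        print(f"started {job.name} on {job.nodes_wanted} {job.needs_os} nodes")
        return True

    cluster = [Node(f"node{i:02d}") for i in range(4)]
    for job in [Job("cfd-solver", "linux", 2), Job("excel-cluster", "windows", 2)]:
        dispatch(job, cluster)

Moab does the real-world equivalent with its own node-provisioning policies; the point is simply that the operating system becomes another schedulable resource rather than a fixed property of the node.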

In theory, that gives Microsoft more money and power. It is unclear how snobby HPC shops are about closed source Windows after more than a decade of endorsing open source Linux (and dumping closed source Unixes), but history has shown that at the right price (rapidly approaching zero), HPC customers will happily switch hardware architectures and software platforms.

What does Microsoft get out of this? Every second Windows is running on an HPC cluster node is a second it is not running Linux. What does Novell get out of it? Continued association with Microsoft and its marketing machine and a hope that Novell can become the preferred Linux in the dual-boot cluster world Microsoft is trying to foment.

As El Reg previously reported, the latest Top 500 supercomputer rankings came out this week. Of the 500 machines on the list, 403 of them use x64 processors from Intel, 47 use x64 processors from Advanced Micro Devices, and five use Itanium processors from Intel. All of these machines, which represent an aggregate of 26.3 petaflops of number-crunching power (or 81.2 per cent of the total oomph embodied in the Top 500 list), could in theory support Windows HPC Server 2008.
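
For the record, here is the arithmetic behind those figures (a back-of-the-envelope calculation of our own, using only the numbers quoted above), which also shows where the roughly 32.4 petaflops list total used below comes from:

    # Back-of-the-envelope check of the Top 500 share figures quoted above
    intel_x64, amd_x64, itanium = 403, 47, 5
    machines = intel_x64 + amd_x64 + itanium        # 455 of the 500 systems
    x64_and_itanium_pflops = 26.3                   # their aggregate Linpack petaflops
    share_of_list = 0.812                           # 81.2 per cent of the list's total
    list_total_pflops = x64_and_itanium_pflops / share_of_list
    print(machines, round(list_total_pflops, 1))    # 455 machines, about 32.4 petaflops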

Five of the machines actually do use Windows HPC Server as their dominant OS, as you can see from this clever graphic put together by the BBC in its coverage of the Top 500 rankings, but Linux is by far the dominant operating system across all processor architectures used in the supers comprising the list. Windows accounts for about 412.6 teraflops of aggregate performance as measured by the Linpack Fortran matrix math test used to do the Top 500 rankings, or about 1.3 per cent of the 32.4 petaflops on the list. Linux accounts for 91 per cent of the flops (27.2 petaflops), Unix gets 4.6 per cent (1.6 petaflops), and another 3.4 per cent comes from mixed environments (generally a mix of Unix and Linux).

Windows HPC Server has a long, long way to go to get even a threatening share of installs on the Top 500 list, but with the R2 update of this code, Microsoft says the performance of Windows on Message Passing Interface (MPI) clustering software will be close to parity with Linux when it ships later this year. (The code went into its second beta in early April). Microsoft says that it has thousands of customers who have Windows clusters running real HPC work and that nearly 100 of the key HPC software houses have their code ported to Windows HPC Server too.
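
For the uninitiated, MPI is the message-passing library that stitches cluster nodes together into a single parallel job, and the point of the parity claim is that the same MPI source code runs on either operating system. Here is a minimal sketch using the mpi4py Python bindings (our choice for brevity; production HPC codes are typically C or Fortran) in which each rank sums its own slice of a range and rank 0 gathers the grand total with a reduction:

    # Minimal MPI sketch with mpi4py: split a sum across ranks, reduce to rank 0.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()     # this process's ID within the parallel job
    size = comm.Get_size()     # total number of processes in the job

    n = 1_000_000
    chunk = n // size
    start = rank * chunk
    stop = n if rank == size - 1 else start + chunk
    partial = sum(range(start, stop))

    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"total = {total} across {size} ranks")

Launched with something like mpiexec -n 4 python sum_mpi.py, the same script runs atop Open MPI or MPICH on Linux and atop Microsoft's MS-MPI on Windows HPC Server, which is what makes dual-boot clusters a workable proposition.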

The Windows revolution in HPC, if there is indeed one, seems to be coming from the bottom up. But there's some action now at the top too. The Tokyo Institute of Technology is building a 2.4 petaflops hybrid CPU-GPU cluster called Tsubame 2.0 that will dual-boot Windows and Linux, and this could be the wave of the future. (The 180.6 teraflops "Magic Cube" Opteron cluster at the Shanghai Supercomputer Center in China is currently the largest Windows cluster in the world).

As the rapid rise of Linux in the HPC community shows, if there is any compelling advantage for using a different piece of software or hardware, these HPC folks are just the ones who will drop any technology like a hot potato and move on to something else. If Microsoft can come up with tools that do a better job of dispatching work to GPUs than the Linux stack, this could do the trick. ®
