HPC bar goes lower and wider
Parallel computing pitches at the mainstream
The more things stay the same, the more things are likely to change, and clear evidence of that could be seen today at the announcement of the latest Top500 supercomputer league tables at the International Supercomputing Conference in Dresden.
The tables, compiled every six months, rank the fastest systems installed anywhere in the world according to the Linpack benchmark, which measures performance in floating point operations per second (Flop/s). The most notable feature of the latest results, however, was not which systems won the World, Europe and Asia categories. In fact, nothing changed: the NEC/Sun/ClearSpeed/Voltaire collaborative system, Tsubame, still holds the top slot in Asia; IBM’s Barcelona-based Mare Nostrum system sits atop the European table; and Blue Gene/L, IBM’s last world leader, hiding away at the Lawrence Livermore National Laboratory in the USA, still tops the world rankings with a performance rated at 280.6 TFlop/s.
Blue Gene/L will no doubt be displaced in the next listing by Blue Gene/P, just announced by IBM. But the more interesting results could be seen further down the list, and in the surrounding statistics. What they point to is that High Performance Computing (HPC) has reached the tipping point where it stops being an esoteric corner occupied by scientists and propeller heads and starts to move towards the mainstream of computing.
For example, IBM has dominated supercomputing for years, and still does at the high end, but for the first time HP has installed more of the Top500 systems: 40.6 per cent to IBM’s 38.4 per cent. HP does not appear in the top 50 at all, however, and IBM has supplied 41 per cent of the cumulative performance of the Top500, while HP manages only 24.3 per cent.
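The gap between those two sets of shares can be made concrete with a little arithmetic. The sketch below is purely illustrative: it uses only the four percentages quoted above, and the function and variable names are invented for the example. Dividing a vendor's share of cumulative performance by its share of installed systems gives the size of its average machine relative to the average Top500 entry, where 1.0 means exactly average.

```python
# Shares quoted in the article, in per cent of the Top500 list.
# The dictionary layout and all names here are illustrative assumptions.
SHARES = {
    "HP": {"systems": 40.6, "performance": 24.3},
    "IBM": {"systems": 38.4, "performance": 41.0},
}


def relative_system_size(systems_pct, performance_pct):
    """Average performance per system, relative to the list-wide average (1.0)."""
    return performance_pct / systems_pct


def summarise(shares):
    """Return each vendor's relative average system size, rounded to 2 places."""
    return {
        vendor: round(relative_system_size(s["systems"], s["performance"]), 2)
        for vendor, s in shares.items()
    }


if __name__ == "__main__":
    for vendor, ratio in summarise(SHARES).items():
        print(f"{vendor}: {ratio:.2f}x the average Top500 system")
```

On these figures the average HP entry comes out at roughly 0.6 times the list average and the average IBM entry at roughly 1.07 times, which fits the picture of HP supplying many smaller systems lower down the list.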
One way of interpreting these figures is that HP can’t hack it as an HPC vendor. But the statistics show a related trend: 59 per cent of the Top500 systems use dual-core x86 processors, mainly Intel Woodcrest devices, but with 18 per cent coming from AMD’s Opteron. Intel’s Itanium managed 5 per cent of the total, while IBM’s Power 4, Power 5 and PowerPC together managed 12.2 per cent.
The trend, widely acknowledged at the conference, is that as HPC systems come to be based on commodity hardware, so the technology moves down and out into the mainstream of computing. It is already being used in financial circles for tasks such as risk analysis, and now mainstream companies such as Microsoft are showing a distinct interest in the area. The company has already got itself into the Top500 with its Compute Cluster, used by Mitsubishi UFJ Securities in Japan on a 448-node IBM BladeCenter HS21 cluster. Yet according to Kyril Faenov, Microsoft’s general manager of high performance computing, Compute Cluster is aimed as much at low-end, mainstream applications as at appearances in the traditional HPC league tables. To him, a ‘cluster’ is anything with more than one node. And a node is typically a dual multicore-processor server or, as he put it, “that which is managed by a single memory controller.”
HPC technology is indeed heading for a much wider user base and more mainstream ground, and the arrival of the multicore processor is the primary driving force behind what many at the conference see as a fundamental shift in the core computing paradigm. Burton Smith, an HPC veteran who was chief scientist at Cray and is now a Technical Fellow at Microsoft charged with investigating and developing parallel computing, described it in a keynote presentation as the beginning of the end for the assumptions surrounding the single-threaded von Neumann architecture.
And multicore devices are arriving from several directions. Indeed, there is already a new class of them, according to Dr Erich Strohmaier of the Lawrence Berkeley National Laboratory in the USA, who announced the Top500 results. These are the many-core devices, with 100 or more cores. Their cores are simpler than those of the x86 architecture, but offer more flexibility because of that, especially for anyone with parallel programming skills.
Names already in the many-core frame are Intel’s Polaris, with 80 cores per processor, ClearSpeed with 96 cores, nVidia’s G80 with 128 cores, and Cisco’s Metro, with 188 cores.