
HPE donates 3 mini-supercomputers to UK universities boning up on Arm

Muscling in Arm supers message on road to exascale


HPE is donating three Apollo mini-supercomputer clusters to a trio of UK universities to help build Arm supercomputing expertise and promote its Apollo gear.

The universities are the University of Edinburgh's Edinburgh Parallel Computing Centre (EPCC), the University of Bristol, and the University of Leicester. Installation should be completed in summer 2018 as part of a three-year project called Catalyst UK.

The largely identical clusters at each university – designed, built and supported by HPE – will consist of 64 HPE Apollo 70 systems, each with two 32-core Cavium ThunderX2 processors and 128GB of memory composed of 16 DDR4 DIMMs, linked by Mellanox InfiniBand interconnects.

The OS is SUSE Linux Enterprise Server for HPC. Each cluster is expected to occupy two computer racks and draw a total of around 30kW of power.

It means 4,096 cores per installation and 12,288 cores in total.
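For readers who want to check the sums, here is a minimal sketch of the arithmetic behind those core counts. The variable names are illustrative, and the memory figure assumes the quoted 128GB applies to each Apollo 70 system rather than to each processor.

```python
# Figures from the article; names are illustrative, not HPE terminology.
NODES_PER_CLUSTER = 64     # HPE Apollo 70 systems per university
SOCKETS_PER_NODE = 2       # Cavium ThunderX2 processors per system
CORES_PER_SOCKET = 32
CLUSTERS = 3               # Edinburgh, Bristol, Leicester
MEMORY_PER_NODE_GB = 128   # assumption: 128GB is per system

cores_per_cluster = NODES_PER_CLUSTER * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = cores_per_cluster * CLUSTERS
memory_per_cluster_gb = NODES_PER_CLUSTER * MEMORY_PER_NODE_GB

print(f"Cores per cluster: {cores_per_cluster:,}")        # 4,096
print(f"Cores in total: {total_cores:,}")                 # 12,288
print(f"Memory per cluster: {memory_per_cluster_gb:,}GB") # 8,192GB
```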

Professor Simon McIntosh-Smith, Bristol University’s Head of the HPC Research Group, said: “Bristol's early experience with Arm via the EPSRC-funded GW4 Isambard project, and the European FP7-funded Mont-Blanc 2 project gave us the confidence to explore deploying Arm-based supercomputers for real workloads in a production environment... the HPE Apollo 70 HPC systems will, for the first time, enable us to apply that experience to explore scaling across InfiniBand. We expect these results to be of great interest to our industrial and academic partners.”

Professor Mark Parsons, director at the EPCC at the University of Edinburgh, added in a canned quote: “EPCC is really pleased to be involved in the Catalyst UK programme … this will be our first large-scale Arm-based supercomputer. If Arm processors are to be successful as a supercomputing technology we need to build a strong software eco-system and EPCC will port many of the UK’s key scientific applications to our HPE Apollo 70 system.”

The Leicester University Science & Technology Facilities Council DiRAC HPC Facility director, Dr Mark Wilkinson, said the Catalyst UK initiative would allow the facility to explore the potential of Arm-based systems to support HPC workflows: “including simulations of gravitational waves and planet formation, earth observation science models and fundamental particle physics calculations.”

Having an Arm-based cluster in its training portfolio will help “ensure that the next generation of UK HPC experts, both in industry and academia, have the necessary skills to exploit the most appropriate and cost-effective hardware when solving the most complex research problems.”

HPE says the Catalyst UK programme will cooperate with the UK industry to jointly develop applications and workflows to exploit Arm system capabilities. It will provide training for researchers, equipping them with knowledge and skills to work with Arm-based systems in the future, with a specific focus, HPE says, on exascale computing, ie, computers that can execute a billion billion calculations per second.

UK Exascale supercomputer

Exascale machines might appear a million miles away in scale from these 64-node Apollo clusters, but the clusters provide a skills and expertise path to these more complicated machines for UK-based researchers and HPC workers.

Fujitsu’s monster Arm-powered exascale supercomputer Post-K, for example, will use ARMv8 + extensions, scalable custom CPU cores that support FP16 half-precision maths operations (more about that at our sister publication The Next Platform), a node count greater than 10,000, and a power consumption approaching 30MW. Compare that to the 30kW draw of the Apollo clusters.

Professor Parsons said a UK exascale computer could consume 30MW of power and cost between £450m and £500m over five years, needing 200 to 300 racks. To put that in context, he said the UK's CERN contribution was £132m in 2017. Could the UK afford an exascale system? "It would be a major increase in the current UK academic investment in HPC."
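To get a feel for the gap, the rough sketch below sets Parsons' exascale figures against the two-rack, 30kW Apollo clusters described earlier. It assumes the five-year cost is spread evenly across the years, which the professor did not say, and the names are illustrative.

```python
# Figures quoted in the article; the even five-year cost split is an assumption.
EXASCALE_POWER_KW = 30_000           # 30MW
CLUSTER_POWER_KW = 30                # one Apollo cluster
EXASCALE_RACKS = (200, 300)          # low and high estimates
CLUSTER_RACKS = 2
EXASCALE_COST_GBP_M = (450, 500)     # low and high estimates, in £m
COST_PERIOD_YEARS = 5
CERN_CONTRIBUTION_2017_GBP_M = 132

power_ratio = EXASCALE_POWER_KW / CLUSTER_POWER_KW
rack_ratio = tuple(r / CLUSTER_RACKS for r in EXASCALE_RACKS)
annual_cost_gbp_m = tuple(c / COST_PERIOD_YEARS for c in EXASCALE_COST_GBP_M)
share_of_cern = tuple(a / CERN_CONTRIBUTION_2017_GBP_M for a in annual_cost_gbp_m)

print(f"Power: {power_ratio:.0f}x one Apollo cluster")
print(f"Racks: {rack_ratio[0]:.0f}x to {rack_ratio[1]:.0f}x")
print(f"Cost per year: £{annual_cost_gbp_m[0]:.0f}m to £{annual_cost_gbp_m[1]:.0f}m")
print(f"Share of 2017 CERN contribution: {share_of_cern[0]:.0%} to {share_of_cern[1]:.0%}")
```

On those assumptions the exascale machine draws 1,000 times the power of one Apollo cluster, fills 100 to 150 times the racks, and costs £90m to £100m a year, or roughly 68 to 76 per cent of the 2017 CERN contribution.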

Can you make the argument industrially and scientifically for this? "Yes. And it needs to be made."

The prof told us he thinks the UK could have an exascale system by the mid-2020s, with tens of millions if not hundreds of millions of cores. It could run 100 million to 500 million threads: “No one knows how to use such a system,” he said – and that’s why learning now is so important.

Prof Parsons believes there has been a huge lack of innovation in HPC, both hardware and software, and said innovation is needed for exascale. The use of Arm processors will help spur that.

We asked if x86 development had been stalled. He said: “Any big company needs a challenger.”

He reckons the UK isn’t competitive in terms of current HPC spending: Germany, for example, is spending far more. Japan, China, the US and Europe are all getting their act together. Parsons told us: “I would argue the UK should have one or two exascale systems or we will be left behind.”

David Lecomber, senior director Infrastructure/HPC Tools at SoftBank-owned Arm, said: “I think it will provide value for money - it should be something we should afford.”

Before you can run one of these things, though, if you can afford to buy it, you have to learn to walk, and that’s what the Catalyst UK programme is about for the three universities. ®

Biting the hand that feeds IT © 1998–2020