HPE is donating three Apollo mini-supercomputer clusters to a trio of UK universities to help build Arm supercomputing expertise and promote its Apollo gear.
The universities are the University of Edinburgh's Edinburgh Parallel Computing Centre (EPCC), the University of Bristol, and the University of Leicester. Installation should be completed in summer 2018 as part of a three-year project called Catalyst UK.
The largely identical clusters at each university – designed, built and supported by HPE – will consist of 64 HPE Apollo 70 systems, each with two 32-core Cavium ThunderX2 processors, 128GB of memory composed of 16 DDR4 DIMMs, and Mellanox InfiniBand interconnects.
The OS is SUSE Linux Enterprise Server for HPC. Each cluster is expected to occupy two computer racks and draw a total of around 30kW of power.
That means 4,096 cores per installation and 12,288 cores in total.
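For anyone who wants to check the sums, the core counts follow directly from the spec above: 64 systems per cluster, two processors per system, 32 cores per processor, and three clusters. A minimal sketch of that arithmetic, with constant and function names that are ours for illustration rather than anything from HPE:

```python
# Figures taken from the cluster spec quoted in the article.
# The names below are illustrative only.
NODES_PER_CLUSTER = 64   # HPE Apollo 70 systems per cluster
SOCKETS_PER_NODE = 2     # ThunderX2 processors per system
CORES_PER_SOCKET = 32    # cores per ThunderX2 processor
CLUSTERS = 3             # Edinburgh, Bristol, Leicester


def cores_per_cluster(nodes=NODES_PER_CLUSTER,
                      sockets=SOCKETS_PER_NODE,
                      cores=CORES_PER_SOCKET):
    """Return the number of CPU cores in one cluster."""
    return nodes * sockets * cores


def total_cores(clusters=CLUSTERS):
    """Return the number of CPU cores across all clusters."""
    return clusters * cores_per_cluster()


print(cores_per_cluster())  # 4096
print(total_cores())        # 12288
```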
Professor Simon McIntosh-Smith, Bristol University’s Head of the HPC Research Group, said: “Bristol's early experience with Arm via the EPSRC-funded GW4 Isambard project, and the European FP7-funded Mont-Blanc 2 project gave us the confidence to explore deploying Arm-based supercomputers for real workloads in a production environment... the HPE Apollo 70 HPC systems will, for the first time, enable us to apply that experience to explore scaling across InfiniBand. We expect these results to be of great interest to our industrial and academic partners.”
Professor Mark Parsons, director at the EPCC at the University of Edinburgh, added in a canned quote: “EPCC is really pleased to be involved in the Catalyst UK programme … this will be our first large-scale Arm-based supercomputer. If Arm processors are to be successful as a supercomputing technology we need to build a strong software eco-system and EPCC will port many of the UK’s key scientific applications to our HPE Apollo 70 system.”
The Leicester University Science & Technology Facilities Council DiRAC HPC Facility director, Dr Mark Wilkinson, said the Catalyst UK initiative would allow the facility to explore the potential of Arm-based systems to support HPC workflows: “including simulations of gravitational waves and planet formation, earth observation science models and fundamental particle physics calculations.”
Having an Arm-based cluster in its training portfolio will help “ensure that the next generation of UK HPC experts, both in industry and academia, have the necessary skills to exploit the most appropriate and cost-effective hardware when solving the most complex research problems.”
HPE says the Catalyst UK programme will cooperate with UK industry to jointly develop applications and workflows that exploit Arm system capabilities. It will provide training for researchers, equipping them with the knowledge and skills to work with Arm-based systems in the future, with a specific focus, HPE says, on exascale computing, ie, computers that can execute a billion billion calculations per second.
UK Exascale supercomputer
Exascale machines might appear a million miles away in scale from these 64-node Apollo clusters, but the clusters provide a skills and expertise path to those more complicated machines for UK-based researchers and HPC workers.
Fujitsu’s monster Arm-powered exascale supercomputer Post-K, for example, will use ARMv8 + extensions, scalable custom CPU cores that support FP16 half-precision maths operations (more about that here at our sister publication The Next Platform), a node count greater than 10,000, and a power consumption approaching 30MW. Compare that to the 30kW draw of the Apollo clusters.
Professor Parsons said a UK exascale computer could consume 30MW of power and cost between £450m and £500m over five years, needing 200 to 300 racks. To put that in context, he said the UK's CERN contribution was £132m in 2017. Could the UK afford an exascale system? "It would be a major increase in the current UK academic investment in HPC."
Can you make the argument industrially and scientifically for this? "Yes. And it needs to be made."
The prof told us he thinks the UK could have an exascale system by the mid-2020s, with tens of millions if not hundreds of millions of cores. It could run 100 million to 500 million threads: “No one knows how to use such a system,” he said – and that’s why learning now is so important.
Prof Parsons believes there has been a huge lack of innovation in HPC, both hardware and software, and said innovation is needed for exascale. The use of Arm processors will help spur that.
We asked if x86 development had stalled. He said: “Any big company needs a challenger.”
He reckons the UK isn’t competitive in terms of current HPC spending: Germany, for example, is spending far more. Japan, China, the US and Europe are all getting their act together. Parsons told us: “I would argue the UK should have one or two exascale systems or we will be left behind.”
David Lecomber, senior director Infrastructure/HPC Tools at Softbank-owned Arm, said: “I think it will provide value for money - it should be something we should afford.”
Before you can run one of these things, though, if you can afford to buy it, you have to learn to walk, and that’s what the Catalyst UK programme is about for the three universities. ®