US seeks exascale systems 10 times faster than current state-of-the-art computers
China claims to have 10 in the pipeline and may pull ahead in HPC arms race
The US Department of Energy is looking for vendors to help build supercomputers up to 10 times faster than the recently inaugurated Frontier exascale system, to come on stream between 2025 and 2030, with even more powerful systems to follow in the 2030s.
These details were disclosed in a request for information (RFI) issued by the DoE for computing hardware and software vendors, system integrators and others to "assist the DoE national laboratories (labs) to plan, design, commission, and acquire the next generation of supercomputing systems in the 2025 to 2030 time frame."
Vendors have until the end of July to respond.
For this RFI, the DoE says it is interested in the deployment of one or more supercomputers that can solve scientific problems five to 10 times faster than current state-of-the-art systems "or solve more complex problems, such as those with more physics or requirements for higher fidelity."
The current state of the art is perhaps best represented by the Frontier exascale system installed at Oak Ridge National Laboratory, which was declared operational at the end of May. It was clocked at 1.102 exaFLOPS on the Linpack benchmark, and is expected to reach a theoretical peak performance in excess of 2 exaFLOPS in future.
In line with this, the DoE states that its rough estimate, based on trends covering the past 20 years, includes traditional HPC systems at the 10-20 exaFLOPS level or beyond in the 2025+ time frame, and 100 exaFLOPS or more in the 2030+ time frame, which it expects to be delivered "through hardware and software acceleration mechanisms."
Any such supercomputer will be expected to operate within a power envelope of 20-60MW, according to the DoE. For comparison, the Frontier system already consumes about 20MW, and can apparently hit a peak of over 30MW.
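To put those numbers in perspective, the short Python sketch below works out the energy efficiency implied by the DoE's targets, using only the figures quoted above. It is a back-of-the-envelope illustration, not anything from the RFI: it assumes Frontier's 1.102 exaFLOPS Linpack result and roughly 20MW draw can be compared directly with the DoE's "10-20 exaFLOPS" estimate, and the four scenarios are hypothetical corner cases of the stated ranges.

```python
# Figures quoted in the article; treating them as directly comparable
# is an assumption made for illustration only.
FRONTIER_EXAFLOPS = 1.102   # measured Linpack result
FRONTIER_MW = 20.0          # approximate power draw


def gflops_per_watt(exaflops: float, megawatts: float) -> float:
    """Convert a performance/power pair into GFLOPS per watt."""
    flops = exaflops * 1e18
    watts = megawatts * 1e6
    return flops / watts / 1e9


baseline = gflops_per_watt(FRONTIER_EXAFLOPS, FRONTIER_MW)

# Hypothetical corner cases of the DoE's ranges (10-20 exaFLOPS, 20-60MW).
scenarios = {
    "10 EF at 60MW": (10.0, 60.0),
    "20 EF at 60MW": (20.0, 60.0),
    "10 EF at 20MW": (10.0, 20.0),
    "20 EF at 20MW": (20.0, 20.0),
}

results = {}
for label, (exaflops, megawatts) in scenarios.items():
    efficiency = gflops_per_watt(exaflops, megawatts)
    results[label] = (efficiency, efficiency / baseline)
    print(f"{label}: {efficiency:.0f} GFLOPS/W, "
          f"{efficiency / baseline:.1f}x Frontier")
```

Under these assumptions, Frontier comes out at about 55 GFLOPS per watt. Even the most lenient case (10 exaFLOPS in 60MW) would need roughly three times that efficiency, and the most demanding (20 exaFLOPS in 20MW) roughly 18 times.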
Systems should also be "sufficiently resilient" to hardware and software failures to minimize requirements for user intervention.
Interestingly, the DoE says that it is seeking to move away from "monolithic acquisitions" towards a model that would allow more rapid upgrade cycles of deployed systems to enable faster innovation on hardware and software.
One possible strategy is greater reuse of existing infrastructure so that upgrades are modular, with an acquisition process that allows continuous injection of technological advances into facilities, perhaps every 12 to 24 months rather than on a four- or five-year cycle.
This sounds somewhat similar to the approach that has been adopted for Europe's first publicly declared exascale computer, the Jupiter system being constructed in Germany by the European High Performance Computing Joint Undertaking (EuroHPC JU).
Jupiter will be based on a "dynamic, modular supercomputing architecture," according to the Forschungszentrum Jülich where it will be based, comprising a universal cluster module paired with a GPU accelerator module and storage modules, but it is planned to be expanded in future with additional modules that may include a quantum processing unit or a neuromorphic processing module.
The responses to this RFI will help the DoE and the national labs to update their long-term advanced computing roadmaps, as well as inform the requirements for one or more DoE requests for proposals (RFPs) for systems to be delivered in the 2025-2030 time frame.
The level of information requested from vendors by the DoE is quite detailed, covering not just the kind of processors, memory, storage, and interconnect options the vendors foresee using within the 2025 to 2030 time frame, but also what manufacturing processes they expect the chips to be made with, whether the processors will be some form of APU/XPU system-on-chip combination, the expectations for the bandwidth and power of interconnects, the potential node configuration, and so on.
Perhaps the DoE has been spurred on by fears that the US may fall behind China in the supercomputing arms race, as reported earlier this year.
While the US has three exascale systems in the pipeline, China aims to have up to 10 operational systems by 2025, and reports in the Financial Times claimed that US experts feared that China could beat them to important science and technology breakthroughs by fielding a larger number of exascale machines. ®