DoE drops $23M in effort to reinvigorate supercomputing
Challenges span energy efficiency, memory, programmability, and national security
The US Department of Energy has launched a $23 million program aimed at overcoming a litany of supercomputing performance bottlenecks.
Under the New Frontiers initiative, the agency will solicit the help of private companies to develop technologies that could help to scale compute more efficiently for the next generation of post-exascale supercomputers.
"There is a growing consensus that urgent action is needed to address the array of bottlenecks in advanced computing, including energy efficiency, advanced memory, interconnects, and programmability to maintain economic leadership and national security," Ceren Susut, associate director of the DoE's Office of Science for Advanced Scientific Computing Research, explained in a statement.
Modern DoE supercomputers are built using tens of thousands of accelerators — more than 60,000 in the case of Argonne National Laboratory's Aurora system, which crossed the exaFLOP barrier this spring. But while the peak output of these systems can be easily calculated by tallying up the floating point performance of each chip, bandwidth bottlenecks in the interconnects used to stitch them together mean these figures are rarely, if ever, achieved in the real world.
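To see why the peak and delivered figures diverge, here's a minimal back-of-the-envelope sketch in Python. The chip count echoes the article's ballpark for Aurora; the per-chip throughput and the efficiency factor are purely illustrative assumptions, not Aurora's actual specs.

```python
# Back-of-the-envelope peak vs. delivered FLOPS.
# All figures below are illustrative assumptions, not real system specs.

accelerators = 60_000       # chip count, per the article's ballpark for Aurora
flops_per_chip = 3.3e13     # hypothetical 33 FP64 teraFLOPS per accelerator

# Theoretical peak: simply tally the per-chip numbers.
peak = accelerators * flops_per_chip

# Delivered performance: interconnect and memory bottlenecks mean real
# workloads only realize a fraction of peak (hypothetical 55% here).
achieved = 0.55 * peak

print(f"Theoretical peak:        {peak / 1e18:.2f} exaFLOPS")
print(f"Delivered (hypothetical): {achieved / 1e18:.2f} exaFLOPS")
```

The gap between those two numbers, scaled across tens of thousands of chips, is exactly the kind of waste the New Frontiers program is meant to attack.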
What's more, the power required to achieve generational performance gains keeps growing, according to Oak Ridge National Laboratory's Christopher Zimmer, who will head up the New Frontiers project.
"With Dennard scaling long dead and the slowing of Moore's law, we're seeing technologies critical to HPC consuming more power that partially offset increases in application performance due to improvements in silicon process nodes and improved packaging techniques," Zimmer said.
To reverse these trends, the DoE is investing in new and emerging technologies that could enter production within the next five to 10 years; the funding opportunity includes a 40 percent cost-share option.
While the funding opportunity doesn't specify which technologies the program is targeting, likely candidates include photonic interconnects and advanced packaging, both of which have already shown potential for mitigating the interconnect and memory bandwidth bottlenecks that prevent modern systems from achieving peak performance.
Over the past few years, we've seen a flurry of development around co-packaged optical interconnects, like those being developed by Broadcom, Intel, Ayar Labs, and others, to bolster the bandwidth and reach of chip-to-chip and even chip-to-memory communications. Last month, Broadcom shared its work on co-packaging optical interconnects with GPUs at the annual Hot Chips conference at Stanford.
- Oak Ridge boffins enlist Quantum Brilliance to make supercomputers sparkle at room temp
- DoD spins up supercomputer to accelerate biothreat defense
- Japan's Fugaku supercomputer released in virtual version that runs in AWS
- China stops worrying about lack of GPUs and learns to love the supercomputer
While $23 million may not seem like much, especially when split among multiple companies, these technologies also have practical applications for scaling AI clusters, which will no doubt make them prime candidates for VC funding, if they aren't already.
The DoE's request for proposals comes just days after the agency awarded $68 million over three years to 43 AI projects. Those projects encompass a wide variety of use cases, from the development of foundation models for computational research to lab automation and accelerating scientific programming. ®