SambaNova injects a little AI mojo into US supercomputer lab's nuke sims

LLNL harnesses DataScale platform with aim to improve predictive models

AI systems developer SambaNova Systems has announced that Lawrence Livermore National Laboratory (LLNL) is integrating its platform into the lab's supercomputing facilities to boost its cognitive simulation capabilities. The move follows deployments of SambaNova technology at other top-tier research laboratories.

LLNL is one of the major federal research sites focused on the safety and effectiveness of the US nuclear deterrent, which involves a great deal of high-performance modeling and simulation work.

The facility is also set to become the home for the National Nuclear Security Administration's (NNSA) first exascale supercomputer system, El Capitan, which is due to come online sometime this year.

According to SambaNova, the institution has been studying how neural networks may be used to accelerate traditional physics-based simulations as part of the NNSA's Advanced Simulation and Computing program, and this is where SambaNova comes in.

The company's DataScale platform is an integrated hardware and software system designed specifically for machine learning workloads, and one SambaNova has previously claimed is six times faster than GPU-based systems, including Nvidia's DGX A100 servers.

Cognitive simulation, according to LLNL, is about using machine learning to improve predictive models. The need for this is driven by a desire to improve and advance predictive simulations, which increasingly rely on experiments that produce huge volumes of highly complex data.

"Multi-physics simulation is complex," said LLNL computer scientist and Informatics Group Lead Brian Van Essen. He said that the lab's inertial confinement fusion (ICF) experiments generate huge volumes of data, but connecting the underlying physics to the experimental data can prove extremely difficult.

"AI techniques hold the key to teaching existing models to better mirror experimental models and to create an improved feedback loop between the experiments and models," he explained.

SambaNova has been working with LLNL since 2020, when the two organizations integrated DataScale hardware directly into the Corona supercomputer. While that kicked off the use of machine learning to improve productivity, this next stage sees the system less tightly integrated with the supercomputing clusters, delivering a more generalized solution that expands the possible use cases, SambaNova claimed.

"SambaNova has a different architecture than CPU or GPU-based systems, which we are leveraging to create an enhanced approach for CogSim that leverages a heterogeneous system combining the SambaNova DataScale with our supercomputing clusters," said Bronis de Supinski, CTO for Livermore Computing (LC), which operates LLNL's Computing Center.

Each DataScale system is built around what SambaNova calls a Reconfigurable Dataflow Unit (RDU) chip. This comprises a grid of compute and memory elements linked by an on-chip communication fabric, all of which can be configured so that the flow of data through the chip mirrors the dataflow graph of the machine learning algorithm it is running.
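SambaNova's SambaFlow software handles the actual graph-to-silicon mapping and isn't reproduced here, but the notion of a dataflow graph itself is straightforward to illustrate. The hypothetical Python sketch below represents a single dense layer (matmul, bias add, ReLU) as a graph of operations and evaluates it in dependency order, which is roughly the structure the RDU's fabric is configured to match:

```python
# Conceptual illustration only: a tiny dataflow graph for a dense layer,
# executed by resolving each node after its inputs. This shows what a
# "dataflow graph" means here; it is not SambaNova's compiler or runtime.
import numpy as np

# Each node: (operation, list of input node names); None marks a graph input.
graph = {
    "x":      (None, []),                              # input activations
    "w":      (None, []),                              # weights
    "b":      (None, []),                              # bias
    "matmul": (lambda x, w: x @ w, ["x", "w"]),
    "add":    (lambda t, b: t + b, ["matmul", "b"]),
    "relu":   (lambda t: np.maximum(t, 0.0), ["add"]),
}

def run(graph, feeds, output):
    """Evaluate `output` by walking the graph in dataflow (dependency) order."""
    values = dict(feeds)
    def resolve(name):
        if name not in values:
            op, deps = graph[name]
            values[name] = op(*(resolve(d) for d in deps))
        return values[name]
    return resolve(output)

out = run(graph,
          {"x": np.random.randn(4, 8), "w": np.random.randn(8, 16), "b": np.zeros(16)},
          "relu")
print(out.shape)  # (4, 16)
```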

Earlier this year, SambaNova's DataScale platform was also picked by Japan's RIKEN scientific research institute to be integrated with the Fugaku supercomputer for a fusion of HPC simulations and AI. It has also been adopted by the Argonne National Laboratory in the US for a similar purpose.

In related HPC news, the world's first exascale supercomputer system, Frontier, is now open to full user operations, according to the Oak Ridge National Laboratory (ORNL) in Tennessee, where it is based.

Frontier debuted in May last year as the fastest computer on the planet and the first to break the exascale barrier, at 1.1 exaFLOPS. Tuning has since added 92 petaFLOPS, lifting its High-Performance Linpack score to 1.194 exaFLOPS, and ORNL engineers believe the system's performance may ultimately exceed 1.4 exaFLOPS.
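For anyone checking the arithmetic, those figures line up if Frontier's debut Linpack score is taken as 1.102 exaFLOPS, the commonly reported number that gets rounded to 1.1. That debut figure is the only assumption in the trivial sanity check below:

```python
# 1 exaFLOPS = 1,000 petaFLOPS; debut figure assumed to be 1.102 exaFLOPS.
debut_eflops = 1.102
current_eflops = 1.194
print((current_eflops - debut_eflops) * 1000)  # ~92 petaFLOPS gained through tuning
```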

Some of the studies under way on Frontier include ExaSMR, which aims to use exascale compute power to simulate modular nuclear reactors that would be smaller and safer than today's nuclear power plants; work on more accurate and detailed predictions of climate change and its impacts; and simulations of the physics and tectonic conditions that cause earthquakes. ®
