We need a 20MW 20,000-GPU-strong machine-learning supercomputer to build EU's planned digital twin of Earth

And this machine will be used to *checks notes* study climate change. Study it or cause it?

Computer scientists attempting to build computational replicas of Earth to tackle climate change and environmental disasters reckon they will need a 20MW supercomputer with 20,000 GPUs to run a full-scale simulation.

Starting mid-2021, the boffins will embark upon a seven-to-ten-year mission to create and deploy Destination Earth, or DestinE for short, which will be part of a €1tn (£868bn, $1.2tn) investment in green technologies by the European Union.

And it is at the core of DestinE, we're told, that you'll eventually find that GPU-stuffed supercomputer: a federated system capable of running artificial intelligence, data analytics, and other applications. Crucially, this super will bring together so-called digital twins of Earth, which are numerical models of our home world that simulate and forecast the weather and climate, ocean currents and polar caps, food and water supplies, the effect of humans on the environment, and so on.

The aim is to help scientists, politicians, and the public understand the role nature and humans will play in shaping the planet's future, and to help the EU reach its goal of becoming carbon neutral by 2050 through policy decisions. By 2025, the team hopes to have four or five digital twins running, and by 2030, "a full digital twin of the Earth through a convergence of the digital twins already offered."

"If you are planning a two-metre high dike in The Netherlands, for example, I can run through the data in my digital twin and check whether the dike will in all likelihood still protect against expected extreme events in 2050," said Peter Bauer, deputy director for Research at the European Centre for Medium-Range Weather Forecasts (ECMWF) and co-initiator of the Destination Earth initiative, this week.

Computer scientists will be able to tweak the digital twins to assess hypothetical scenarios, such as the outcome of adding wind farms in various locations across Europe or where crops would best be grown in the future as conditions change. The DestinE team wants to map all the processes unfolding on Earth’s surface, and be able to inspect the model down to a kilometre-by-kilometre scale. For the digital twins to be accurate recreations, they will need to ingest data from numerous sources.

“First and foremost, a digital twin would become a data assimilation instrument that continuously cycles real-time, highly detailed, high-resolution Earth system simulations and ingests observational information from all possible instruments — including novel observatories like miniaturized satellites, drones in the Arctic, undersea cables and buoys, smart sensor arrays in crop fields and mobile phones within the expanding internet of things — also to estimate uncertain model parameters and surrogate missing process details,” states a paper describing the project published in Nature.

The sheer amount of data from these instruments and devices will require a non-trivial level of compute to process, Bauer and his colleagues at the ECMWF and ETH Zurich warned in a second study also published in Nature.

More power!

The team figured that primarily GPU-accelerated machines are the way to go, rather than purely CPU-based systems or a balanced CPU-GPU mix. After weighing up benchmarks and considering what technology will be developed in the near future, they believe that running a full-scale digital twin of Earth will require something four times the scale of today's Piz Daint, a 25-plus petaFLOPS supercomputer with 5,000 Nvidia Tesla P100 GPU accelerators.

“Extrapolating this to near-future technology produces an estimate of a remaining shortfall factor of four thus requiring about 20,000 GPUs to perform the digital-twin calculations with the necessary throughput,” they wrote. “This machine would have a power envelope of about 20MW.”
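
For the curious, the back-of-the-envelope scaling behind those numbers can be sketched in a few lines of Python. This is only an illustration built from the figures quoted above; the function names are ours, and the per-GPU power figure is derived here rather than stated by the researchers.

```python
# Figures quoted in the article. The per-GPU power budget computed below is
# our own derivation, not a number given in the paper.
PIZ_DAINT_GPUS = 5_000      # Nvidia Tesla P100 accelerators in Piz Daint
SHORTFALL_FACTOR = 4        # "remaining shortfall factor of four"
POWER_ENVELOPE_MW = 20      # quoted power envelope of the proposed machine


def required_gpus(baseline_gpus: int, shortfall_factor: int) -> int:
    """Scale a baseline GPU count by the remaining shortfall factor."""
    return baseline_gpus * shortfall_factor


def power_per_gpu_kw(total_mw: float, gpu_count: int) -> float:
    """Average share of the power envelope per GPU, in kilowatts."""
    return total_mw * 1_000 / gpu_count


gpus = required_gpus(PIZ_DAINT_GPUS, SHORTFALL_FACTOR)
print(f"GPUs needed: {gpus:,}")
print(f"Power budget per GPU: {power_per_gpu_kw(POWER_ENVELOPE_MW, gpus):.1f} kW")
```

That works out to 20,000 GPUs at an average of 1 kW apiece. Bear in mind this is a whole-system average: the envelope also has to cover everything else in the machine, not just the accelerators.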

For comparison, America's 10MW Summit, the second-most-powerful known super in the latest Top500 rankings, has a peak theoretical performance of about 200 petaFLOPS and today sports more than 27,000 Nvidia Tesla V100 GPUs. It was built by IBM for the Oak Ridge National Laboratory.

The irony of building an energy-intensive system to tackle climate change was not lost on the researchers. They noted in their paper that the future super should be built at a location where its nodes can run on more renewable energy sources: “According to the US Environmental Protection Agency, which accounts about 1,000 lb CO2 output per MWh, such a simulation machine, if it was built in places where only ‘dirty’ power is available, would produce substantial amounts of CO2 per year.”
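
To put a rough number on "substantial", here is a minimal sketch of the arithmetic using the quoted 20MW envelope and the 1,000 lb of CO2 per MWh figure. The assumptions are ours, not the paper's: the machine draws its full power envelope around the clock for a 365-day year, and the `utilisation` parameter is a hypothetical knob for relaxing that.

```python
LB_TO_KG = 0.45359237        # exact definition of the international pound
HOURS_PER_YEAR = 24 * 365    # assumption: non-leap year, running non-stop


def annual_co2_tonnes(power_mw: float,
                      lb_co2_per_mwh: float = 1_000,
                      utilisation: float = 1.0) -> float:
    """Estimate yearly CO2 output in metric tonnes.

    power_mw       -- power envelope of the machine in megawatts
    lb_co2_per_mwh -- emissions intensity of the supply (article quotes ~1,000)
    utilisation    -- assumed fraction of the envelope actually drawn (0 to 1)
    """
    energy_mwh = power_mw * HOURS_PER_YEAR * utilisation
    kg_co2 = energy_mwh * lb_co2_per_mwh * LB_TO_KG
    return kg_co2 / 1_000


print(f"{annual_co2_tonnes(20):,.0f} tonnes of CO2 per year")
```

Under those assumptions the machine would burn through 175,200 MWh a year and emit roughly 79,000 tonnes of CO2. Halve the utilisation and the figure halves with it; plug in a cleaner grid's emissions intensity and it drops accordingly.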

To keep the digital twin running as efficiently as possible, the model’s software will need to be boosted by machine learning, we're told. The boffins believe it will take a mix of traditional climate modelling techniques and AI algorithms to operate the various physical models. It's likely neural networks will process and crunch the incoming data before it is passed on to the mathematical models.

“The majority of the weather and climate community remains skeptical regarding the use of black-box deep-learning tools for predictions and aims for hybrid modeling approaches that couple physical process models with the versatility of data-driven machine learning tools to achieve the best results,” the study noted. ®

Biting the hand that feeds IT © 1998–2021