Future supers pop up on $636m cash wishlist to get exascale beasts prowling on US soil

Oak Ridge, Lawrence Livermore labs slated for beastmode kit in 2021-2023


The two new mystery exascale computing systems known only as Frontier and El Capitan popped up on a budget request last week. They are being developed by the US government and have been slated for deployment in 2022 and 2023.

If American Congresspeople were to say yes to the budget, the DoE would get $636m towards current R&D work on exascale computing, including a $23m cushion to prep the Lawrence Livermore National Laboratory for the coming of El Capitan in five years' time.

However, as our sister publication The Next Platform has noted, scientists shouldn't start mentally spending the cash:

The proposal is not law or policy, and over the past decade, Congress has tended to essentially ignore the budget proposals from presidents and create its own spending plans. And in a highly divided and increasingly partisan Congress, even doing that has been difficult, with government funding often being done through weeks- or months-long emergency continuing resolutions that tend to keep budgets at previous levels.

The administration's FY2019 budget request last week included $636m in funding for the Department of Energy's Exascale Computing Project, $376m up on FY2017 enacted levels.

There are three exascale systems mentioned:

  • Aurora – Intel/Cray-based, to be delivered in 2021 at Argonne National Laboratory (ANL), and already known about
  • Frontier – for 2021-2022 delivery to Oak Ridge National Laboratory (ORNL)
  • El Capitan – to be delivered to the Lawrence Livermore National Laboratory (LLNL) around 2023, with funding under the National Nuclear Security Administration's (NNSA) Advanced Simulation & Computing Capital

Some Exascale Computing Project (ECP) presentations shed further light on the matter.

The 180 petaFLOPS Aurora was supposed to go live this year but has been delayed since Intel stopped developing the Knights Hill gen 3 Phi processors upon which it depended. The US government pushed back the deadline to 2021, by which time Chipzilla must rework its processors to bring Aurora up to 1,000 petaFLOPS.

Frontier and El Capitan are mentioned in an ECP Update slide deck (PDF):

[Slide: ECP new exascale supers, March 2018]

We understand El Capitan is in an initial development phase. It and Frontier have no defined architecture or suppliers yet. We might suppose that, since Intel and Cray are working on Aurora, combinations of the other four hardware suppliers to ECP – AMD, HPE, IBM and Nvidia – might be involved.

[Slide: ECP PathForward]

The DoE has two departments with significant supercomputing spending – the Office of Science and the NNSA.

The budget request splits a $578m pool between further research funding for exascale and quantum computing - the latter scoops $105m. It is earmarked to “address the emerging urgency of building U.S. competency and competitiveness in the developing area of quantum information science, including quantum computing and quantum sensor technology.”

The DoE will also develop a software stack for both exascale platforms, and support additional co-design centres in preparation for exascale deployment in 2021.

Performance measurement

According to a DoE presentation (PDF), an exascale system – a computer that can hit at least one exaFLOPS, or a billion billion floating-point math calculations per second – will deliver 50x the performance of today's 20 petaFLOPS systems, operate in a 20-30MW power envelope, and have a perceived fault rate of one a week or less.
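For the arithmetically inclined, here is a rough back-of-envelope check of those figures in Python. It confirms that 50 times 20 petaFLOPS is indeed one exaFLOPS, and works out what the 20-30MW envelope would imply in GFLOPS per watt. That efficiency number is our own derivation rather than anything in the DoE deck, and it assumes the whole machine sustains one exaFLOPS inside the stated power budget. The function and variable names are ours, purely for illustration.

```python
# Back-of-envelope check of the exascale figures quoted in the DoE presentation.
PETA = 10**15
EXA = 10**18

baseline_flops = 20 * PETA   # "today's 20 petaFLOPS systems"
speedup = 50                 # "50x the performance"
target_flops = speedup * baseline_flops

# Power envelope from the article: 20-30MW, expressed in watts.
power_envelope_watts = (20e6, 30e6)


def gflops_per_watt(flops, watts):
    """Implied efficiency (our derived figure, not a DoE target)."""
    return flops / watts / 1e9


print("50 x 20 petaFLOPS is one exaFLOPS:", target_flops == EXA)
for watts in power_envelope_watts:
    efficiency = gflops_per_watt(target_flops, watts)
    print(f"{watts / 1e6:.0f}MW envelope implies {efficiency:.1f} GFLOPS/W")
```

On those assumptions, the tighter the power budget, the more work each watt has to do: the 20MW end of the envelope demands half as much again per watt as the 30MW end.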

There is no LINPACK or peak FLOPS target. Instead, Figures of Merit (FOMs) will be defined:

[Slide: ECP FOMs, March 2018]

+RegComment

Intel has staggered the ECP's schedule with its Knights Hill gen 3 cancellation. There is now no clear understanding of the hardware elements and architecture for the Aurora, Frontier and El Capitan exascale triplets. There are just three years until Aurora sets down at Argonne. Care to bet it will be on time? ®

