A single HPC-AI software environment is less desirable than you might think
Every possible thing that can be tuned must be tuned – and tuned well
Register Debate
Welcome to the latest Register Debate in which writers discuss technology topics, and you the reader choose the winning argument. The format is simple: we propose a motion, the arguments for the motion will run this Monday and Wednesday, and the arguments against on Tuesday and Thursday. During the week you can cast your vote on which side you support using the poll embedded below, choosing whether you're in favour or against the motion. The final score will be announced on Friday, revealing whether the for or against argument was most popular.
This week's motion is: A unified, agnostic software environment can be achieved. We debate the question: can the industry ever have a truly open, unified, agnostic software environment in HPC and AI that can span multiple kinds of compute engines?
Our contributor today debating AGAINST the motion is Timothy Prickett Morgan, co-editor of The Next Platform.
Nothing in this life is simple, is it? Perhaps one day the simulation, modeling, and machine learning sectors of the broader high performance computing market will be a unified field, as quantum mechanics and relativity may someday be, and perhaps there will be a single programming environment that can span it all. But for now, in a post-Moore’s Law world where every transistor counts, every bit moved and processed counts, and every joule of energy counts, there is no room for any inefficiency in the hardware and software stack.
And that means there is going to be complexity in the programming stack. It is an unavoidable trade-off between application performance and application portability, which we have seen play out over more than five decades of commercial and HPC computing.
HPC and AI shops value performance over everything – for simulations and AI training, performance is the factor that matters most – and that is why we think that, for the foreseeable future, there will be many different languages in use, and possibly many different compilers for each language, across a widening array of devices.
How many different C compilers do we need? Throughout history, over fifty different C and C++ compilers have been created, and at any given time there are probably a half dozen to a dozen popular ones. We do not think this will change. If we can’t get down to one C compiler that can optimize code for all CPUs, how are we going to do this across multiple languages and not just CPUs, but GPUs, FPGAs, and custom ASICs?
It is easier in the enterprise than it is in the HPC and AI markets, because enterprises value price/performance and the longevity of their code more than they do raw performance. This is about making money – and keeping as much of it as possible – and not about solving the mysteries of the universe or making our phones give us cogent backtalk.
Enterprises value programmer compatibility – can they get people to build and maintain their code – equally with program portability – can they run it on different iron and operating systems. Hence the popularity of Java, which is not precisely a high performance language. And sometimes, despite their reputations for all being C++ programmers, the hyperscalers and cloud builders choose Java to create an analytics stack or a database.
This was never going to be the right answer in HPC and it won’t be in AI, either. The reality of the 21st century is that every possible thing that can be tuned must be tuned, and tuned well, and that means having many more tools to cover a cornucopia of devices.
And this variety is a good thing, despite all the benefits that come from having a single standard – the big one being eliminating complexity. But a single standard also eliminates competition. Having competing standards for anything drives innovation, weeds out bad ideas, and keeps good ones. To be fair, the lack of standards at any layer of the software stack creates islands of incompatibility, and you pay a heavy price for betting wrong when you pick your software tools.
It is easy to get standards on plumbing in the IT stack – well, it is relatively easier – and we have seen a number of successful ones, such as the many different peripheral standards (PCI and PCI-Express and now the latter’s CXL overlay) as well as interconnect standards (Ethernet shines, and so does InfiniBand). But applications are not pipes, they are the liquid that flows through them. Getting people to agree on what the recipe for that liquid should be is much more complex.
Rather than have one standard programming environment for HPC and AI, maybe what we need is cooperation so we can agree on the plumbing and then have a way to convert from one liquid to another as necessary. Having AMD’s ROCm environment kick out CUDA code compatible with Nvidia’s software environment that can run natively on Nvidia GPUs is one example; we expect that, should Intel’s oneAPI effort take off, ROCm will be able to absorb or output SYCL code as well.
There is always a performance penalty that comes with conversion, but if the industry leaders work together on this – and the big HPC and AI centers insist on it – there should be a sharing of technology. Think of it as a set of inter-connected moats, like Venice, instead of isolated castles far from each other and with isolated moats around them.
We want to connect the moats and let the code flow, back and forth. ®