Register Debate Welcome to the latest Register Debate in which writers discuss technology topics, and you the reader choose the winning argument. The format is simple: we propose a motion, the arguments for the motion will run this Monday and Wednesday, and the arguments against on Tuesday and Thursday. During the week you can cast your vote on which side you support using the poll embedded below, choosing whether you're in favour of or against the motion. The final score will be announced on Friday, revealing whether the for or against argument was more popular.
This week's motion is: A unified, agnostic software environment can be achieved. We debate the question: can the industry ever have a truly open, unified, agnostic software environment in HPC and AI that can span multiple kinds of compute engines?
Our first contributor arguing FOR the motion is Nicole Hemsoth, co-editor of The Next Platform.
Here’s a novel idea: how about having the smallest number of tools support the widest array of hardware, covering both high performance computing and AI and, for that matter, hyperscale applications too?
On the surface, this should sound like a no-brainer, especially for application owners who have spent years cobbling together the optimal combination of compilers and tools out of necessity, or worse yet, working with sub-optimal tools because that is what they had and that is what they knew how to use.
Creating a unified HPC and AI software stack that is both open and agnostic sounds like common sense to us, and the reason it sounds like that is because it is. What is preventing us from bringing all minds to bear on solving problems instead of endlessly untangling a matrix of tools and code? Egos, near-religious adherence to preferred platforms, not-invented-here syndrome, and a lack of cooperation are the roots of this particular evil, and the outcome is a vicious cycle of reinventing the wheel. Over and over.
The fact that the hyperscalers and the HPC community have developed absolutely unique and absolutely incompatible distributed computing frameworks is the best example of how not to do things as far as we are concerned.
We can look to the early days of Linux clusters in HPC for a good example of how to create a common, high-performance computing platform.
The Message Passing Interface (MPI), which is at the heart of most HPC simulations and models today, took a while to gather steam but was born from necessity. Collaboration on a single standard gave the HPC community a base to build on and some much-needed stability. Then multi-threaded and multi-core processors came on the scene, and how many different ways were there to program them before the OpenMP standard was proposed and adopted by a significant share of the HPC community?
Looking ahead, the combination of MPI plus OpenMP is emerging as the parallel programming standard at a number of HPC centers, and they are setting the pace for how this is done. The rest of the HPC community can benefit from all of that work, stop reinventing wheels, and focus on solving problems.
Wouldn’t you think that, given all of this, the MPI and OpenMP approaches would have translated easily to other areas with high thread counts, high core counts, and lots of node-to-node communication? Especially since these are the outcomes of years and years of software development and tuning? It appears not. Rather than adopt the HPC stack, the AI world decided to build its own, casting aside software tools that have been, time and again, proven at scale. To what end?
Here is the point:
Perhaps it’s time to realize that it is not so important to have some monolithic tool or compiler at the heart of cross-AI/HPC/hyperscale efforts as it is for the software brains behind some of the world’s largest systems to find a way to kick out and absorb each other’s code. AMD has been able to do this, for example.
Its ROCm environment for its CPUs and GPUs can generate native code for AMD’s compute engines and can also convert that code to Nvidia’s CUDA and run it natively on Nvidia GPUs (and probably Intel’s GPUs and FPGAs in the near future). Maybe this is all that we need now, and all that we can hope for. Maybe Nvidia can do the same with its CUDA environment, and maybe Intel will do the same with oneAPI, too. There is a non-zero, but admittedly small, chance that a unified software environment could be created, and it would take a lot of arm twisting by end users to accomplish this.
So, to all of you in HPC, AI, and hyperscale, some advice that works on a certain three-year-old in close proximity: “Learn how to share, how to work together. Many minds in harmony are better than one.” ®
Cast your vote below. We'll close the poll on Thursday night and publish the final result on Friday. You can track the debate's progress here.