Intel CTO suggests using AI to port CUDA code to – surprise! – Intel chips
This is about ending Nvidia's vendor lock-in, insists Greg Lavender
Saddled with a bunch of legacy code written for Nvidia's CUDA platform? Intel CTO Greg Lavender suggests building a large language model (LLM) to convert it to something that works on other AI accelerators – like maybe its own Gaudi2 or GPU Max hardware.
"I'll just throw out a challenge to all the developers. Let's use LLMs and technologies like Copilot to train a machine learning model to convert all your CUDA code to SYCL," he quipped during his Innovation keynote Wednesday, referring to Intel's accelerator-centric programming tool.
One of the challenges Intel, AMD, and other accelerator-makers face when pushing the adoption of their AI hardware is that plenty of code written for Nvidia's CUDA runtime must be refactored before it can be used on alternative platforms.
In some cases, it might be just a couple lines of code calling CUDA libraries that are key to an application's function, but that "makes it relatively sticky to one micro-architecture," Joe Curley, VP of software products and ecosystem at Intel, told The Register.
Intel has already made some headway on this effort. The silicon titan has invested heavily in its cross-platform parallel programming model called oneAPI and its AI inferencing offering called OpenVINO. In his speech, Lavender boasted that oneAPI's install base had grown 85 percent since 2021, which he opined demonstrates growing enthusiasm toward the platform.
While developed by Intel, it's worth noting that both oneAPI and OpenVINO are open source and aren't limited to the chipmaker's hardware.
Chipzilla has also released dozens of open source reference kits to address a variety of common AI/ML workloads, ranging from chatbots and other generative AI to more traditional ML workloads like object detection, voice generation, and financial risk prediction.
SYCL is a more recent part of Intel's efforts to break CUDA's stranglehold on the AI software ecosystem. As our sister site The Next Platform reported early last year, SYCL is a royalty-free, cross-architecture abstraction layer that underpins Intel's Data Parallel C++ (DPC++) language, while SYCLomatic is the open source tool Intel offers for migrating CUDA code to it.
In a nutshell, SYCLomatic handles most of the heavy lifting – purportedly up to 95 percent – of porting CUDA code to a format that can run on non-Nvidia accelerators. But, as you might expect, there's usually some fine-tuning and tweaking required to get applications running at full speed.
"If you want to get the last mile out of an Intel GPU – versus an AMD GPU versus an Nvidia GPU – you'll do something whether it's through an extension mechanism to SYCL or simply how you structure your code,” Curley explained.
It's this fine-tuning that Lavender seems to be suggesting could be further automated by an LLM.
"There's certainly going to be research on doing exactly this," Curley predicted. "What we think of as a low code, no code world today, five years from now is going to be utterly and completely different. So, that is not only a good idea, that's an idea that's going to happen."
The challenge, Curley believes, will be determining the appropriate source data on which to train your model.
However, it's also worth noting that SYCL is by no means the only way to write code that's accelerator agnostic. Curley pointed to frameworks like OpenAI's Triton or Google's JAX as just two examples.
"If you don't like where we're headed with SYCL for some reason, embrace one of these other ways that are also standard. We'll all as an industry generate the compilation chains for our hardware and give you the same benefit," Curley said.
Beyond software runtimes like SYCL, Intel is making a lot of resources available in the form of software, support, and accelerators running in its Developer Cloud to help AI startups optimize their code for Gaudi2, GPU Max, or the Advanced Matrix Extensions in its latest Xeons.
Intel's Liftoff Program – which aims to woo fledgling AI software startups by providing technical expertise to help them build applications that run on its products – was also promoted.
Intel is far from the only one grappling with these challenges. This week the Linux Foundation, in partnership with Arm, Fujitsu, Google Cloud, Imagination Technologies, Intel, Qualcomm Technologies, and Samsung, formed the Unified Acceleration (UXL) Foundation. The working group aims to evolve oneAPI to address a wide swath of accelerators from various vendors.
According to Lavender, "The industry benefits from open standardized programming languages for programming accelerator hardware that everyone can contribute to and collaborate on without vendor lock-in." ®