Multicore Expo Yes, OpenCL is an open standard. But from where Nvidia's sitting, it's also a way to make money.
OpenCL - the open, royalty-free standard for parallel processing across collections of CPUs, GPUs, DSPs, and other silicon - was further detailed, discussed, and demoed during the run-up to the Multicore Expo, which opens Tuesday in Santa Clara, California.
In a nutshell, OpenCL allows a developer to create code that treats all of a system's computational resources as peers. An OpenCL-enabled operating system will distribute that code to the resources that can best handle it.
For example, a system's CPU could handle the code elements that demand the most complex programming, the GPU could handle the massively parallel needs of media processing, and a DSP could handle the specialized tasks for which it was created, such as audio processing.
In this cooperative manner, performance could be increased far beyond what any single chunk of silicon could accomplish on its own.
During the tutorial, Nvidia's Neil Trevett - who wears yet another hat as the OpenCL chair - explained the reason for the open standard: "We're in this to make money, but we don't make money by selling standards." OpenCL exists to build markets, not to gain licensing revenue.
The OpenCL 1.0 specification was, as we reported, released back in December. Monday, Trevett and other members of the OpenCL group explained why they think OpenCL is a Very Big Deal.
John Roberts, an Nvidia engineer, gave a detailed overview of the OpenCL parallel-processing APIs, which are transferable from the heights of high-performance computing (HPC) down through desktops all the way to handhelds.
Roberts also mentioned that OpenCL's kernel language - essentially its processing innards - is "very similar to [Nvidia's] CUDA" parallel-processing architecture. This may help explain both why Nvidia was so quick to jump aboard the OpenCL bandwagon, and why its new market-seeding GPU Ventures Program will accept entrants based on both CUDA and OpenCL.
Nokia's Kari Pulli discussed the OpenCL 1.0 Embedded Profile for handhelds, pointing out that one of the advantages of this profile is that it's a "pure subset" of the full OpenCL 1.0 specification. Code written for the Embedded Profile could thus be run on more-powerful OpenCL-capable devices such as desktops and HPC systems.
Pulli gave as one example an image-processing app for a handheld which used the CPU to process the lower half of an image and the GPU to process the top. In Nokia's tests, performance nearly doubled, with only a small percentage of overhead due to the need to coordinate the two processors.
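To make the split concrete, here is a minimal Python sketch of the idea, not Nokia's code and not the OpenCL API. Two worker threads stand in for the GPU and CPU, `brighten` is a made-up image operation, and `estimated_speedup` assumes two equally fast devices and an even split. The 5 per cent overhead in the usage note is a hypothetical figure, not one Pulli quoted.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(rows, amount=10):
    """Stand-in image operation: add `amount` to every pixel, clamped to 255."""
    return [[min(255, p + amount) for p in row] for row in rows]


def process_split(image):
    """Process the top and bottom halves on two workers, then rejoin them."""
    mid = len(image) // 2
    top, bottom = image[:mid], image[mid:]
    with ThreadPoolExecutor(max_workers=2) as pool:
        top_job = pool.submit(brighten, top)        # hypothetical "GPU" half
        bottom_job = pool.submit(brighten, bottom)  # hypothetical "CPU" half
        # Rejoining the halves is the coordination step that costs overhead.
        return top_job.result() + bottom_job.result()


def estimated_speedup(overhead_fraction):
    """Speedup over a single device, assuming two equal devices and an even split.

    overhead_fraction is the coordination cost as a share of the
    single-device running time.
    """
    return 1.0 / (0.5 + overhead_fraction)


# A tiny 6x8 grayscale test image.
image = [[(x + y) % 256 for x in range(8)] for y in range(6)]
result = process_split(image)
```

Under those assumptions, `estimated_speedup(0.05)` comes out at roughly 1.8, which is the "nearly doubled" territory Pulli described. Python threads only illustrate the partitioning here; they will not deliver a real speedup on CPU-bound work.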
He also explained that OpenCL can enable "pipelined computation." For example, a DSP might begin a process, then pass it off to the CPU, which then would pass it to the GPU. Each would perform the part of the task it's best suited for, then free itself for other work.
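The hand-off pattern can be sketched in a few lines of Python, again as an illustration, not OpenCL code. Each stage runs in its own thread and passes results downstream through a queue, so a stage is free to pick up the next item as soon as it hands one off. The stage functions `dsp_decode`, `cpu_prepare`, and `gpu_render` are hypothetical stand-ins for work a DSP, CPU, and GPU might do.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream; real items must not be None


def stage(work, inbox, outbox):
    """Pull items from inbox, apply `work`, and push results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # tell the next stage to stop too
            return
        outbox.put(work(item))


def run_pipeline(items, stages):
    """Run items through the stages in order, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(work, queues[i], queues[i + 1]))
        for i, work in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Hypothetical stand-ins for the three processors' share of the task.
def dsp_decode(x):
    return x * 2


def cpu_prepare(x):
    return x + 1


def gpu_render(x):
    return x * x


frames = run_pipeline([1, 2, 3], [dsp_decode, cpu_prepare, gpu_render])
```

Because each stage handles its queue in first-in, first-out order, results come out in the order the inputs went in, even though all three stages can be busy at the same time.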
After the tutorial, Nvidia's Trevett told The Reg that Intel's Larrabee crew - the engineers creating that GPU/CPU hybrid - are deeply involved in the evolving OpenCL spec.
He also said that he's anxiously awaiting Apple's release of the next version of Mac OS X, aka Snow Leopard, which will incorporate OpenCL and bring its cooperative goodness to a substantial installed base.
So are we - and although Apple hasn't yet announced a release date for its latest big cat, we're still hoping that Snow Leopard will be uncaged in early June. ®