Xeon today, MIC tomorrow
Intel’s line is that if you have an application that takes advantage of parallel programming on the CPU today, it can easily be adapted for MIC, since the MIC processors use the familiar x86 instruction set and programming model.
“We are trying to provide the common tools and programming models for the Xeon and x86 architecture and for the MIC architecture so you can use C++, Fortran, OpenMP, TBB, Cilk Plus, MKL; not only for Xeon but for MIC as well,” said Intel technical consulting engineer Levent Akyil at the company’s Software Conference last month in Istanbul. “You can develop for Xeon today and scale your investment to the future Intel MIC architecture.”
The advantage over CUDA, Intel argues, is that developers do not have to learn a new language. The company quotes Dan Stanzione, deputy director at TACC (Texas Advanced Computing Center): “Moving a code to MIC might involve sitting down and adding a couple of lines of directives [which] takes a few minutes. Moving a code to a GPU is a project.”
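To give a sense of what “a couple of lines of directives” can look like, here is a minimal sketch using a standard OpenMP pragma in C. The `saxpy` function and its parameters are hypothetical names chosen for illustration and do not come from Intel or TACC. The sketch shows only generic OpenMP loop parallelism, not any MIC-specific offload syntax.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel for illustration: y[i] = a * x[i] + y[i]. */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    /* The single added directive. An OpenMP-enabled compiler splits the
       loop iterations across threads; a compiler without OpenMP support
       ignores the pragma and runs the loop serially. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body is untouched, which is the heart of the directive-based pitch. Whether one pragma is enough to get good performance is a separate question.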
Intel’s Knights Ferry prototype MIC accelerator board
That said, NVIDIA has partnered with CAPS, Cray and PGI to create OpenACC, a directive-based approach to programming GPU accelerators. Compiler support is currently limited to compilers from those companies, but the expectation is that OpenACC will eventually merge with OpenMP. Adding directives to existing C or C++ code is easier for programmers than learning CUDA C or OpenCL.
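For comparison, a directive-based GPU version of a simple loop might look like the sketch below. The `vector_add` function and its arguments are hypothetical and exist only to illustrate the idea. The pragma uses the OpenACC combined `parallel loop` construct with data clauses, and how it is compiled and mapped to hardware depends on the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel for illustration: c[i] = a[i] + b[i]. */
void vector_add(int n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* copyin: inputs are copied to the accelerator before the loop.
       copyout: the result is copied back to the host afterwards.
       A compiler without OpenACC support ignores the pragma and
       runs the loop on the CPU. */
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

As with OpenMP, the loop itself stays plain C. The data clauses show the extra consideration a GPU brings: the programmer still has to think about what moves between host and accelerator memory.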
Why use NVIDIA GPUs rather than Intel MIC? Jack Wells, director of science at the Oak Ridge National Laboratory (ORNL) in Tennessee, is doubtful that Intel’s “same code” approach will deliver optimum results. ORNL is responsible for the Jaguar supercomputer, which is the fastest in the USA and third in the world, according to the Top 500 list. The Titan project underway at ORNL involves adding 15,000 Kepler K20 GPUs to achieve over 20 petaflops performance.
“At the NSF [National Science Foundation] computing center at our facility, there is an Intel center of excellence where they have early versions of the MIC,” Wells says. “The director of that supercomputing center has justified that approach based on a belief that the thousands of users associated with NSF computing center might not want to port their codes.
“But this is a delicate issue. In supercomputing, just porting codes and getting them to run is not the goal. If it doesn’t run well, it’s a bug. So our best judgment is that the same process one needs to go through to get the codes running on a GPU hybrid machine would be similar to what you would do on a MIC hybrid machine, if it’s in a hybrid mode... It is not credible to me that, even if MIC delivers good performance, that just compiling your code and running it will be satisfactory.”
Watts call the shots
Is Wells concerned that CUDA is a proprietary standard? “NVIDIA has embraced OpenACC, and that’s a development that we’re thrilled about,” he says.
Another factor that mitigates the proprietary nature of CUDA is NVIDIA’s support for the open-source LLVM compiler project. The LLVM compiler for CUDA – which in this context means the CUDA architecture and not just the CUDA language – opens up the possibility both of supporting other languages on NVIDIA GPUs and of compiling CUDA code to target other GPUs or x86 CPUs.
The key question is: Will Intel’s MIC, in the form of Knights Corner, or NVIDIA’s Kepler offer the best performance per watt? That is an analysis that will have to wait until the production release. ®