ARM64 gets better GPU support in CUDA release

CUDA 6.5 eyes HPC market

NVIDIA has launched the next upgrade to its parallel computing and programming platform, with CUDA 6.5 going live as a production release.

The free download puts 64-bit ARM platforms on a par with x86 by letting them take advantage of GPU acceleration. NVIDIA claims in a blog post that the combination of low-power ARM64 architectures with ultra-fast GPU compute is “a compelling solution for HPC”.

Fast Fourier Transform performance is improved, NVIDIA says, with cuFFT device callbacks implemented so as to run FFTs in a single memory roundtrip: “cuFFT can transform the input and output data without extra bandwidth usage above what the FFT itself uses”, the company says.
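For readers wondering what a device callback looks like in practice, here is a minimal sketch of a load callback that scales each input element as cuFFT reads it, so no separate pre-processing kernel (and no extra pass over memory) is needed. The names `scaleOnLoad`, `attachScaleCallback` and `scaledForwardFft` are made up for illustration, and the sketch assumes the program is built with relocatable device code and linked against the static cuFFT library, which is what this style of callback requires.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>
#include <cufftXt.h>

// Hypothetical load callback: cuFFT calls this for every input element
// instead of reading memory directly. callerInfo points at a float scale
// factor in device memory (an assumption of this sketch).
__device__ cufftComplex scaleOnLoad(void *dataIn, size_t offset,
                                    void *callerInfo, void *sharedPtr) {
    float scale = *static_cast<float *>(callerInfo);
    cufftComplex v = static_cast<cufftComplex *>(dataIn)[offset];
    v.x *= scale;
    v.y *= scale;
    return v;
}

// Device-side function pointer; the host copies its value out below.
__device__ cufftCallbackLoadC d_scaleOnLoadPtr = scaleOnLoad;

// Attach the callback to an existing plan. Returns true on success.
bool attachScaleCallback(cufftHandle plan, float *d_scale) {
    cufftCallbackLoadC h_ptr;
    if (cudaMemcpyFromSymbol(&h_ptr, d_scaleOnLoadPtr, sizeof(h_ptr)) != cudaSuccess)
        return false;
    return cufftXtSetCallback(plan, (void **)&h_ptr, CUFFT_CB_LD_COMPLEX,
                              (void **)&d_scale) == CUFFT_SUCCESS;
}

// In-place forward FFT of n complex values, scaled on load.
bool scaledForwardFft(cufftComplex *d_data, int n, float scale) {
    float *d_scale = nullptr;
    if (cudaMalloc(&d_scale, sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(d_scale, &scale, sizeof(float), cudaMemcpyHostToDevice);

    cufftHandle plan;
    bool ok = cufftPlan1d(&plan, n, CUFFT_C2C, 1) == CUFFT_SUCCESS;
    if (ok) {
        ok = attachScaleCallback(plan, d_scale) &&
             cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD) == CUFFT_SUCCESS &&
             cudaDeviceSynchronize() == cudaSuccess;
        cufftDestroy(plan);
    }
    cudaFree(d_scale);
    return ok;
}
```

A store callback works the same way on the output side (registered with `CUFFT_CB_ST_COMPLEX`), which is how both input and output transforms can ride along with the FFT's own memory traffic.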

Fortran support is improved across the developer tools: the cuda-gdb debugger, the nvprof command-line profiler, cuda-memcheck and the NVIDIA Visual Profiler.

Host compiler support now includes Microsoft Visual Studio 2013 for Windows; various math libraries have better double precision performance; and there are various new static CUDA libraries to reduce dynamic library dependencies.

Other features include a new occupancy calculator API, so programmers don't have to configure GPU kernel launches for each architecture; and a utility called nvprune that slices out device code that's not needed in the target architecture. ®
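As a rough illustration of the occupancy API mentioned above, the sketch below asks the runtime for a block size via `cudaOccupancyMaxPotentialBlockSize` instead of hard-coding one per GPU generation. The `saxpy` kernel and the helpers `suggestedBlockSize` and `runSaxpy` are hypothetical names invented for this example, and the block size returned is a heuristic that maximises occupancy, not a guarantee of best performance.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Hypothetical kernel used only for illustration: y = a * x + y.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Ask the runtime which block size gives the best occupancy for saxpy
// on the current device. Returns -1 if the query fails.
int suggestedBlockSize() {
    int minGridSize = 0, blockSize = 0;
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, saxpy, 0, 0);
    return err == cudaSuccess ? blockSize : -1;
}

// Launch saxpy over n > 0 elements with the suggested block size and
// check the result on the host. Returns true if every element is correct.
bool runSaxpy(int n) {
    int blockSize = suggestedBlockSize();
    if (n <= 0 || blockSize <= 0) return false;
    int gridSize = (n + blockSize - 1) / blockSize;  // round up to cover n

    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    size_t bytes = n * sizeof(float);
    float *dx = nullptr, *dy = nullptr;
    if (cudaMalloc(&dx, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dy, bytes) != cudaSuccess) { cudaFree(dx); return false; }
    cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice);

    saxpy<<<gridSize, blockSize>>>(n, 3.0f, dx, dy);

    // The blocking copy also surfaces any error from the kernel launch.
    cudaError_t err = cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    if (err != cudaSuccess) return false;

    for (float v : hy)
        if (std::fabs(v - 5.0f) > 1e-6f) return false;  // 3 * 1 + 2
    return true;
}
```

Calling `runSaxpy(1 << 20)` from `main` would exercise the whole path; the same source should then pick a sensible launch configuration on different GPU architectures without edits.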

 
