HPC Blog The upcoming HPC Advisory Council conference in Lugano will be much more than just a bunch of smart folks presenting PowerPoints to a crowd. It will feature a number of sessions designed to teach HPCers how to make better use of their gear, do better science, and generally humble all those around them with their vast knowledge and perspicacity.
The first "best practices" session will feature Maxime Martinasso from the Swiss National Supercomputing Center discussing how MeteoSwiss (the national weather forecasting institute) uses densely populated accelerated servers as its basic server for computing weather forecast simulations.
However, when you have a lot of accelerators attached to the PCI bus of a system, you're going to generate some congestion. How much congestion will you get and how do you deal with it? They've come up with an algorithm for computing congestion that characterises the dynamic usage of network resources by an application. Their model has been validated as 97 per cent accurate on two different eight-GPU topologies. Not too shabby, eh?
Another best practice session also deals with accelerators, discussing a dCUDA programming model that implements device-side remote memory access (RMA). What's interesting is how they hide pipeline latencies by over-decomposing the problem and then over-subscribing the device by running many more threads than there are hardware execution units. The result is that when a thread stalls, the scheduler immediately proceeds with the execution of another thread. This fully utilises the hardware and leads to higher throughput.
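For a feel of why over-subscription helps, here is a toy Python model of the idea. It is not dCUDA and says nothing about how the presenters implemented it: the function name, the single execution unit, and the fixed stall length are all assumptions made purely for illustration. Each thread does one cycle of work and then stalls for a fixed number of cycles, and the scheduler simply runs the first thread that is ready.

```python
def simulate_utilisation(num_threads, stall_cycles, work_items_per_thread):
    """Toy model (hypothetical, not dCUDA): one execution unit, many threads.

    Each thread issues one cycle of compute, then stalls for `stall_cycles`
    cycles (standing in for a memory or network wait) before it is ready again.
    Returns the fraction of cycles in which the unit did useful work.
    """
    ready_at = [0] * num_threads                       # cycle at which each thread may run next
    remaining = [work_items_per_thread] * num_threads  # work items left per thread
    cycle = 0
    busy = 0
    while any(remaining):
        # Run the first thread that still has work and is not stalled.
        for t in range(num_threads):
            if remaining[t] and ready_at[t] <= cycle:
                remaining[t] -= 1
                ready_at[t] = cycle + 1 + stall_cycles
                busy += 1
                break
        # If no thread was ready, the unit sits idle for this cycle.
        cycle += 1
    return busy / cycle


if __name__ == "__main__":
    for n in (1, 2, 10, 20):
        print(n, "threads:", round(simulate_utilisation(n, 9, 100), 3))
```

With a nine-cycle stall, a single thread keeps the unit busy only about 10 per cent of the time in this model, while ten or more threads keep it busy on every cycle. Real hardware schedulers are far more involved, but the basic trade is the same: extra threads in flight cover for the ones that are waiting.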
We will also see a best practices session covering Spack, an open-source package manager for HPC applications. Intel will present a session on how to do deep learning on their Xeon Phi processor. Dr Gilles Fourestey will discuss how Big Data can be, and should be, processed on HPC clusters.
Pak Lui from Mellanox will lead a discussion on how to best profile HPC applications and wring the utmost scalability and performance out of them. Other session topics include how to best deploy HPC workloads using containers, how to use the Fabriscale Monitoring System, and how to build a more efficient HPC system.
Tutorials include a twilight session on how to get started with deep learning (you'll need to bring your own laptop to this one), using EasyBuild and Continuous Integration tools, and using SWITCHengines to scale horizontally campus-wide.
Phew, that's a lot of stuff... and it's all free, provided you register for the event and get yourself to Lugano by 10 April. I'll be there covering the event, so be sure to say hi if you happen to see me.