Meet the beautiful minds vying to be crowned the Student Cluster champs
40 per cent less tedium and weepy backstories than the Olympics
HPC Blog So who are the kids competing in the SC16 Student Cluster Competition this year? What are their hopes, their dreams – what makes them tick? Why did they travel all the way to Salt Lake City to chase the Cluster Crown? (There is no actual crown, but there should be, right?)
In these videos, we get to know the kids and get a personal feel for each of the teams. Sort of like the Olympics when they do the "up-close-and-personal" features – but with lower production values, 40 per cent less tedium, and no weepy backstories.
EAFIT: The kids from Colombia are driving their Power 8-based cluster for all it's worth, and were seeing good results when we finally got a chance to interview them. They had a bit of trouble finding and compiling some of the applications, but all is well now. This is the first time the team has used GPUs (they have four NVIDIA K80s) and they're highly impressed with the performance they're getting out of them.
Friedrich-Alexander University: They are one of two teams from Germany in the cluster competition this year. As usual, I make the German teams pronounce their school names with true German feeling – and FAU doesn't disappoint on this score. The team is pushing an HPE Moonshot cluster that utilises 37 power-efficient nodes, but no accelerators this year – which is a bit frustrating for the team. They were hoping for a more CPU-centric competition. But, as I point out in the video, many apps are optimised for GPUs, which makes it unlikely that they'll see a slate of cluster competition apps that don't have GPU versions. I also try to talk them into applying for the ASC competition – they'll think about it, but aren't making any promises.
Huazhong University of Science & Technology: For once, I sort of pronounce the word "Huazhong" semi-correctly, which is a big step for me. A fair amount of time is spent straightening out the interpreter situation, but I speed up the translation cycles to 400x in order to save time. This is a powerful team; they won the ASC16 competition in their home town of Wuhan (pronounced "Woo Han"!). The team reports that their hardware is performing "exactly" as they expected and that the applications are going well. Check out the video to get a look at them.
University of Illinois, Champaign-Urbana: While the team prefers "Urbana-Champaign", I'm sticking to the old school "Champaign-Urbana" moniker. In the video, the team talks about how they're approaching the applications and how their Paraview application expert pulled an all-nighter working on the app. This team is using a five-node cluster with 200 CPU cores and five NVIDIA K80 accelerators. We talk about how the team will deal with the upcoming power outage. They're confident that it won't pose a problem for them.
Nanyang Technological University: This is the first time the lads from Singapore are competing in the US SC competition, although they've participated in a couple of ASC competitions and an ISC competition in the past. They think the applications in the SC competition are a bit more challenging than what they've seen before, particularly since they'll be doing all of the applications at the same time. The team is riding a four-workstation cluster with 176 CPU cores that is jam-packed with eight P100 GPUs. They have a competitive box, so we'll see what happens.
National Tsing Hua University: The team rep says that everything is going great for the team. But he would say that anyway, as I pointed out in the video. The team is utilising a four-node, 176-CPU-core system with eight NVIDIA P100 GPUs. We discuss how the team will cope with the power outage; they don't seem too worried about the consequences.
Technical University of Munich: To start the video, I have the team pronounce their university name with feeling – and they oblige. They're the first team to run a Knights Landing cluster with the KL chips as hosts. This gives them a total of 576 cores spread over eight nodes. Everything is running very well according to the team, including their Omni-Path interconnect. The team is having fun with their system, having founded "Phi Club".
These kids have a great sense of humour, but also pack a technical punch. They won the LINPACK competition at SC15 despite having some hardware problems along the way.
You can see the "Rules of Phi Club" in the video – it's pretty funny. The kids are also running Twitter polls during the competition, which are pretty funny too. While everything is good on the hardware front, the team itself might be running even better. They're a fun-loving bunch of kids and one of the most engaging teams in this competition.
University of Utah: As the home team, the Utah Utes are doing their best to represent Salt Lake City and Utah. They've been helping other teams out with spare parts and stuff, which is a great exhibition of student cluster competition spirit. This is the first time the team has had a chance to work out on a large cluster.
For the competition, they're driving a dual-node Dell cluster with only 40 CPU cores, but with a full brace (8) of GPUs. There's some question as to whether they can get their cluster up to the 3,000-watt power limit; so far they've only reached 2,800 watts. This is OK, since they don't want to go over the limit, but most teams will be riding that 3,000-watt line in order to get the most bang out of their cluster. The analogy I used to describe this is "like having a race car with the best brakes, but not so much speed".
San Diego State University: This first-time competitor went with ultra-small Kennedy Pass motherboards, four of 'em, holding a whopping 348 CPU cores, which is well above the 213-core average. Their cluster is so small that it will easily fit into a laptop bag, along with their switch. This is a LOUD system, with fans that literally scream when they're pushing at full speed. Even though it's a small cluster, it can still hit the 3,000-watt power cap. Good time, nice to have them at the competition this year. Check out the video for more.
University of Science & Technology of China: This team travelled all the way from China to participate in the 10th Anniversary SC Student Cluster Competition. They're working in shifts like coal miners, only emerging from the cluster to sleep and eat. Their cluster is an accelerated monster with 18(!) accelerators – 10 NVIDIA P100s and 8 NVIDIA GTX 1080s.
So why so many accelerators and why the two types? The team explains that the P100s will be used for LINPACK and other applications, while the GTX 1080s will be used for the rendering required by Paraview. They won't be using all of the accelerators at the same time, since this would drive them way over the power limit. What's interesting is that these cards only use 3 watts at idle, so why not bring a whole bunch of them, right? It's a solid move by the team.
Boston Green Team: This team we've seen before, although the personnel has changed over time. They are also known as Mass Green, Team Boston, and Team Chowder. When we talk to them, they're definitely feeling the fatigue that comes with any SC competition.
Cluster-wise, they should name this thing "Team Unity". They've combined Intel Xeon and AMD CPUs, along with dual NVIDIA P100s and four AMD R9 Nano accelerators – making these strange bedfellows perform together. In the video, we discuss their cluster, how they're approaching the applications, and their progress so far in the competition. We also learn that the woman in charge of Paraview has a "top-secret idea" that should give them a leg up on the competition.
University of Texas/Texas State U: In this video, we meet the blended team of Texas State and the University of Texas, Austin. When we catch up to them, the team seems to be doing well with the applications and isn't having any obvious problems. While UT has competed in many competitions and was the first team to win three SC competitions in a row, this is an entirely new set of personnel.
One of the really cool things that we find out in this video is the impact of the cluster competition on team members. They've had multiple people come by their booth, drop off business cards, and inquire about how they can hire the kids either on an intern or full-time basis. In short, this competition is a real résumé builder for the students, which is a great pay-off for the months that the kids have devoted to the competition.
Peking University: This is the first time that Peking University has competed in any cluster competition. They're one of two teams running OpenPOWER hardware. This team has the full kit, with two Power-based nodes, NVLink-connected NVIDIA P100s (eight – a double brace), and two FPGAs to help on the password-cracking application.
As you'll see in the video, overall the machine is working for the kids very well, but they couldn't get the FPGAs to function perfectly. It could be a mismatch between the hardware and what they were trying to do, or they could have simply run out of time to program/optimise the chips. In the video, I tried to stir up some trouble between them and Tsinghua University (a perennial cluster competition top-echelon team located very close to Peking University), but wasn't very successful.
Northeastern University/Auburn U: This team had the misfortune of having their machine not arrive in time for the competition; it was stuck somewhere in UPS Land when we interviewed them for the video. But the team persevered, going around to various vendors and begging, borrowing, and maybe even stealing hardware to cobble into a cluster.
What they ended up with was a couple of years-old workstations plus a real Xeon node, connected together with Mellanox FDR InfiniBand. They also have a couple of older K40 cards coupled with four AMD R9 Nano accelerators. This won't be a winning combination for the team, but they're showing the true student cluster competition spirit by scrounging up enough hardware to compete. Good on them.
Next up, we'll talk about LINPACK results, show the students competing in the SUSE Space Pirate challenge, and analyse the final results. Stay tuned...