Nvidia, AMD tout 12GB GPUs for supers ... But ONE question remains

Can they run Crysis?


SC13 The battle of the big-memory GPU cards targeted at HPC and the data center is now well underway, with AMD having unveiled its FirePro S10000 12GB Edition card, and Nvidia announcing its 12GB Tesla K40 GPU accelerator card today.

Nvidia Tesla K40 ... Double your pleasure, double your GDDR5 memory

The K40's 12GB of memory is double that of its predecessor, the K20X. That older card boosts the performance of 38 systems on the Top500 supercomputers list, which was announced on Monday morning at the SC13 supercomputing conference in Denver, Colorado.

Referring to that memory boost, the general manager of Nvidia's Tesla wing, Sumit Gupta, told The Reg: "That obviously is the biggest benefit of this card," pointing out that doubling the memory expands the number of applications that can take advantage of GPU acceleration.

Chart comparing specifications of the Nvidia Tesla K20X with the new Tesla K40

One day you're king of the hill, then a newcomer arrives to knock you down a peg

The K40 has an additional performance-enhancing trick up its sleeve, one that Nvidia calls "GPU Boost". As its name implies, this capability can boost the GPU's clock if there's sufficient power headroom to accommodate it.

If this sounds a lot like the "Turbo Boost" tech used in Intel CPUs, Gupta says that it is, but with differences. "It's a similar concept, but different in the sense that Turbo Boost opportunistically boosts one CPU core at a time, but GPU Boost boosts all 2,880 cores."

According to Gupta, boosting all the cores in the K40's Kepler GK110B GPU at once matters because it lets the user keep performance consistent when running enterprise data center or HPC workloads. "Every time you run an application, you want the same performance," he said, "and in every server node that you have, you want the same performance. That's very critical – if you're boosting, you boost all the GPUs to the same level."

Application-test benchmarks chart comparing the Nvidia Tesla K20X to the new K40 with and without GPU Boost

Running any of these on a K20X system? It may be time for a cost-benefit analysis

To accomplish this consistency, he said, the user controls when GPU Boost is enabled or disabled – it's not up to the GPUs. "The user says, 'Boost the following 100 cards in my data center'," he explained. And, as you might imagine, such boosting can be done dynamically with a command-line call.
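Gupta did not spell out the command, so what follows is only a sketch of how an admin might script it. It assumes the call goes through Nvidia's nvidia-smi tool and its application-clocks option (-ac, taking a memory,graphics clock pair in MHz). The helper name boost_commands is hypothetical, and the 3004,875 clock pair is an assumption drawn from Nvidia's published K40 figures, not from this story. The script only builds the command lines; it does not run them.

```python
# Sketch: build one nvidia-smi call per card so every GPU gets the same clocks.
# Assumptions (not from the article): the -ac option is the command Gupta means,
# and 3004 MHz memory / 875 MHz graphics is a valid clock pair for the K40.

def boost_commands(gpu_indices, mem_mhz=3004, graphics_mhz=875):
    """Return one argv list per GPU index; nothing is executed here."""
    clocks = f"{mem_mhz},{graphics_mhz}"
    return [["nvidia-smi", "-i", str(i), "-ac", clocks] for i in gpu_indices]


if __name__ == "__main__":
    # Print the commands for the first four cards in a node.
    for argv in boost_commands(range(4)):
        print(" ".join(argv))
```

Handing every card the same clock pair is the point: it mirrors Gupta's "boost all the GPUs to the same level" rather than letting each card decide for itself.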

On the hardware side – in addition to doubling the memory – the K40 upgrades a number of specs from its K20X predecessor: core count and clock, memory bandwidth and clock, and its PCIe connection from Gen-2 to Gen-3. Despite the hardware upgrades, the card's power budget stays the same at 235W. These enhancements help raise single-precision performance from the K20X's 3.93 to the K40's 4.25 teraflops, and double-precision from 1.31 to 1.43 teraflops.
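For a sense of scale, here is a back-of-the-envelope sketch using only the teraflops and wattage figures quoted above. The function names are illustrative, and peak numbers like these say nothing about how a real application will behave.

```python
# Sketch: relative gain and peak efficiency from the figures quoted in the story.
# K20X -> K40 peak teraflops, and the 235W power budget shared by both cards.

def percent_gain(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


def gflops_per_watt(teraflops, watts):
    """Peak gigaflops delivered per watt of board power."""
    return teraflops * 1000.0 / watts


SPECS = {"single": (3.93, 4.25), "double": (1.31, 1.43)}  # (K20X, K40)
POWER_W = 235

if __name__ == "__main__":
    for precision, (k20x, k40) in SPECS.items():
        gain = percent_gain(k20x, k40)
        eff = gflops_per_watt(k40, POWER_W)
        print(f"{precision}-precision: +{gain:.1f}%, K40 at {eff:.1f} GFLOPS/W")
```

On those numbers the K40 offers roughly 8 per cent more peak single-precision and 9 per cent more peak double-precision throughput within the same power envelope.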

"It's the world's fastest accelerator targeted at supercomputing and big-data analytics," Gupta promised. "We already had the fastest with the K20X, but K40 comes in as our flagship new product."

Now if only Nvidia and AMD can convince the supercomputing community that their new, fatter-memory cards can help them run all the workloads they want faster and more efficiently, perhaps the "ceepie-geepie" HPC revolution can get out of second gear. ®


