Nvidia accused of cheating in big-data performance test by benchmark's umpires: Workloads 'tweaked' to beat rivals in TPCx-BB

GPU giant says it'll play ball soon


Nvidia has been accused of cheating in a big-data performance benchmark, and thus unfairly coming out on top, by the very umpires of the test.

At its GPU Technology Conference last year, Nvidia claimed that a cluster of its DGX A100 systems was 19.5x faster than the best-performing system on the TPCx-BB benchmark, devised by the Transaction Processing Performance Council (TPC). This week, the TPC rebuked the chip biz, accusing it of not only breaking the test's terms of use but also circumventing the benchmark's constraints to artificially inflate its score.

Michael Majdalany, administrator of the TPC, told The Register Nvidia “tweaked the workloads” in its tests to make its DGX A100 systems seem more powerful than they really are.

“There are constraints in the TPCx-BB benchmark that Nvidia circumvented in order to make the claim: ‘Nvidia outperformed by nearly 20x the record for running the standard big data analytics benchmark, known as TPCx-BB,’” he told us.

"In effect, they weren’t running the same benchmark, so all corresponding claims are invalid."

The TPCx-BB benchmark measures the performance of Hadoop-based big-data systems, which can be accelerated by graphics processors, such as those made by Nvidia. It involves running SQL queries on structured data, and uses machine-learning algorithms for unstructured data, to mimic analysis work typically performed by retail giants, be they online, offline, or both. The test comprises a specification to follow and tools to run.

“Using the RAPIDS suite of open-source data science software libraries powered by 16 Nvidia DGX A100 systems, Nvidia ran the benchmark in just 14.5 minutes, versus the current leading result of 4.7 hours on a CPU system,” Nv boasted in a blog on its dotcom in June. “The DGX A100 systems had a total of 128 Nvidia A100 GPUs and used Nvidia Mellanox networking.”
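For readers who want to check where the "nearly 20x" figure comes from, here is a minimal sketch of the arithmetic, using only the two run times quoted in Nvidia's blog post. The function name `speedup` is ours, purely for illustration, and both inputs are rounded figures, so the result is approximate.

```python
def speedup(baseline_hours: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    # Convert the baseline to minutes so both times share a unit.
    baseline_minutes = baseline_hours * 60
    return baseline_minutes / accelerated_minutes


# Run times as quoted in the blog post: 4.7 hours on CPUs, 14.5 minutes on DGX A100s.
ratio = speedup(baseline_hours=4.7, accelerated_minutes=14.5)
print(f"{ratio:.1f}x")  # prints 19.4x
```

With these rounded inputs the ratio lands at roughly 19.4x, in line with the 19.5x Nvidia claimed. The sum only compares two wall-clock times; it says nothing about whether both runs executed the same workload, which is the TPC's complaint.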

However, the performance council is unimpressed by Nvidia simply quoting this non-certified top-line figure as an official TPCx-BB score, which goes against the fair-use rules of the benchmark. That policy states that organizations can only use the name TPC with a benchmark score if the results have been reviewed and published on the TPC website as an official score for people to compare with other vendors' results. Unofficial scores must be referred to as "non-TPC" results.

"In the paper 'State of RAPIDS: Bridging the GPU Data Science Ecosystem,' presented at Nvidia's GPU Technology Conference (GTC) 2020, and in associated company blogs and marketing materials, Nvidia claims that it has 'outperformed by nearly 20x the record for running the standard big data analytics benchmark, known as TPCx-BB,'" the council said on Wednesday.

"Since Nvidia has not published official TPC results, and instead compared results from derived workloads to official TPC results, the comparisons are invalid."

"The TPC actively encourages publicizing of TPC results by all organizations, including the press, market researchers, financial analysts and non-profit organizations," added Mike Brey, chairman of the TPC Steering Committee. "However, to ensure that users and readers of TPC results are given a fair and complete representation of TPC data, the TPC requests that all users follow the Fair Use rules, outlined in TPC policies, when publishing or publicizing results."

A spokesperson for the council told us the org has been trying to get Nvidia to retract its claims or issue a correction. The marketing blog post is still up on Nvidia’s site unchanged.

Behind closed doors, the GPU giant said it was interested in formally participating in the benchmark and releasing its results, we understand, and even said it may join the TPC as a member. The council is supported by its members, which include AMD, Intel, and Microsoft.

The discussions between the TPC and Nvidia, however, went nowhere. And as Nvidia is a non-member, there's seemingly not a lot else the council can do to sanction the Silicon Valley goliath. With no options left, the TPC decided to go public.

“Nvidia has been working in good faith with the Transaction Processing Performance Council to prepare an official submission and comply with their requests,” an Nvidia spokesperson told The Register.

“We are confident in the performance of our software and hardware and look forward to sharing our results.” Nvidia PRs declined to comment on how the biz measured the performance of its chips, and whether it was going to take down its blog post or issue a correction.

“We don’t have any additional comments here,” the spokesperson said. ®
