AI, AI, Pure: Nvidia cooks deep learning GPU server chips with NetApp

Pure Storage's AIRI reference architecture probably a bit jelly


NetApp and Nvidia have introduced a combined AI reference architecture system to rival the Pure Storage-Nvidia AIRI system.

It is aimed at deep learning and, unlike FlexPod (Cisco and NetApp's converged infrastructure), has no brand name. Unlike AIRI, it doesn't have its own enclosure either.

A NetApp and Nvidia technical whitepaper – Scalable AI Infrastructure Designing For Real-World Deep Learning Use Cases (PDF) – defines a reference architecture (RA) for a NetApp A800 all-flash storage array and Nvidia DGX-1 GPU server system. There is a slower and less expensive A700 array-based RA.

The topline RA supports a single A800 array (high-availability pair config) with 5 x DGX-1 GPU servers hooked up across 2 x Cisco Nexus 100GbitE switches. The slower A700 all-flash array RA supports 4 x DGX-1s across 40GbitE.

The A800 system uses a 100GbitE link connecting to the DGX-1, which supports RDMA as a cluster interconnect. The A800 scales out to a 24-node cluster and 74.8PB.

The A800 is said to deliver 25GB/sec of read bandwidth at sub-500μsec latency.

NetApp Nvidia DL RA config diagram

Pure Storage and Nvidia's AIRI has a FlashBlade array supporting 4 x DGX-1s. The FlashBlade array offers 17GB/sec at sub-3ms latency. This looks slow next to the NetApp/Nvidia RA system, but the A800 is NetApp's fastest all-flash array, whereas Pure's FlashBlade is more of a capacity-optimised flash array.
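For a rough sense of scale, the quoted read bandwidth can be divided by the number of DGX-1s each system supports. The sketch below assumes bandwidth is shared evenly across servers, which neither vendor claims; the function name `per_server_bandwidth` is ours, made up for illustration.

```python
# Figures quoted in the article. The even split per server is an
# assumption for illustration, not a vendor-published number.
systems = {
    "NetApp A800 RA": {"read_gb_per_sec": 25.0, "dgx1_servers": 5},
    "Pure AIRI": {"read_gb_per_sec": 17.0, "dgx1_servers": 4},
}


def per_server_bandwidth(read_gb_per_sec, dgx1_servers):
    """Return the read bandwidth per DGX-1, assuming an even split."""
    return read_gb_per_sec / dgx1_servers


for name, spec in systems.items():
    share = per_server_bandwidth(**spec)
    print(f"{name}: {share:.2f} GB/sec per DGX-1")
```

On these numbers the per-server gap is narrower than the headline gap, since the NetApp RA spreads its bandwidth over one more DGX-1.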

The NetApp Nvidia DL RA scales out, like Pure's AIRI Mini, starting out at one DGX-1 and growing to five. The A800 typically has 364.8TB of raw capacity. Pure's AIRI has 533TB of raw flash.

Pure has published an AIRI RA document, and its config diagram looks like this:

Pure Nvidia AIRI config diagram.

Both NetApp and Pure have run benchmarks on their respective systems, and both include ResNet-152 and ResNet-50 runs using synthetic data, NFS, and a batch size of 64.

NetApp provides graphs and numbers while Pure just supplies graphs, making comparison difficult. Still, we can do a rough and ready estimate by putting those charts next to each other.

The resulting overall chart ain't pretty but does provide a means of comparison:

NetApp and Pure Resnet performance comparison

It appears from these charts, at least, that the NetApp Nvidia RA performs better than AIRI but, to our surprise, not by much, given the NetApp/Nvidia DL system's higher bandwidth and lower latency (25GB/sec read bandwidth and sub-500μsec) compared to the Pure AIRI system (17GB/sec and sub-3ms).
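To put numbers on that spec gap, here is a back-of-the-envelope sketch using only the headline figures quoted above. Both latency figures are "sub-" ceilings rather than measurements, so the latency ratio compares upper bounds and should be read loosely; the variable names are ours.

```python
# Headline figures quoted in the article. Both latencies are "sub-"
# ceilings, so the latency ratio compares upper bounds, not measurements.
netapp = {"read_gb_per_sec": 25.0, "latency_ceiling_us": 500.0}
pure = {"read_gb_per_sec": 17.0, "latency_ceiling_us": 3000.0}

# How many times more read bandwidth the NetApp RA quotes than AIRI.
bandwidth_ratio = netapp["read_gb_per_sec"] / pure["read_gb_per_sec"]

# How many times higher AIRI's quoted latency ceiling is than NetApp's.
latency_ratio = pure["latency_ceiling_us"] / netapp["latency_ceiling_us"]

print(f"Read bandwidth: NetApp RA quotes {bandwidth_ratio:.2f}x AIRI's figure")
print(f"Latency ceiling: AIRI's is {latency_ratio:.0f}x NetApp's")
```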

A price comparison would be good but no one's talking dollars to us here. We might expect Nvidia to announce more deep learning partnerships like the ones with NetApp and Pure. HPE and IBM are obvious candidates as are the newer NVMe-oF-class array startups such as Apeiron, E8 and Excelero. ®


Biting the hand that feeds IT © 1998–2022