Codename Brainwave: Microsoft reveals tricks and tips for whipping cloud FPGAs into shape

Pipelining, on-die memories exploited in Azure

Hot Chips Microsoft today teased chip designers with Brainwave, its cloud-hosted pool of FPGAs designed to perform AI stuff in real time.

The Windows giant has previously spoken of bunging FPGAs in its Azure cloud. It's been using the programmable logic gate arrays in its data centers for a few years now.

The chips are typically used in Redmond's servers as accelerators attached to CPUs via PCIe 3, with a 40Gb/s QSFP channel to the network controller, so each can access packets, and another channel to a special network bus of FPGAs. These accelerators can be programmed to tackle tasks from calculating webpages' search rankings to machine-learning workloads, in dedicated silicon at high speed.

Brainwave seeks to enhance the performance of FPGAs used in Microsoft's cloud by turning each of the arrays into hardware microservices. Yes, the dreaded M word. There are a few key steps that have been taken to maximize efficiency in order to achieve on-the-fly processing for AI applications. These techniques were presented at the Hot Chips conference in Silicon Valley earlier today.

One step is that, by using the latest Intel Stratix chips, Redmond's machine-learning models are stored entirely in on-die memory within the gate array and not in external RAM. That allows the model to persist within the chips, with the attached DRAM used only for buffering incoming and outgoing data.

Another step is to optimize the FPGA design so that every single array resource and memory is used to process an incoming query. That increases throughput and avoids having to crunch queries in batches, which ruins latency and hampers real-time analysis. In other words, it's possible to maintain a stream of data processing.
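To see why batching hurts latency, here is a minimal sketch in Python. It is a toy timing model, not Microsoft's scheduler: the function names, the arrival times, and the compute times are all made-up assumptions for illustration.

```python
def batch_latencies(arrival_times, batch_size, batch_compute_time):
    """Per-query latency when the device waits to fill a batch before computing.

    Hypothetical model: a batch launches once its last query has arrived
    and the device is free, and every query in it finishes together.
    """
    latencies = []
    free_at = 0
    for start in range(0, len(arrival_times), batch_size):
        batch = arrival_times[start:start + batch_size]
        launch = max(batch[-1], free_at)
        free_at = launch + batch_compute_time
        latencies.extend(free_at - t for t in batch)
    return latencies


def streaming_latencies(arrival_times, query_compute_time):
    """Per-query latency when each query is processed as soon as it arrives."""
    latencies = []
    free_at = 0
    for t in arrival_times:
        start = max(t, free_at)
        free_at = start + query_compute_time
        latencies.append(free_at - t)
    return latencies


# Assumed workload: one query per time unit, eight queries in total.
arrivals = list(range(8))
print(max(batch_latencies(arrivals, batch_size=4, batch_compute_time=2)))  # 5
print(max(streaming_latencies(arrivals, query_compute_time=1)))            # 1
```

In this toy setup the first query of each batch sits idle while the batch fills, so its latency is five time units against one for the streamed query.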

The next step is to pool the FPGAs as a collection of microservices to tackle a task. For example, let's say a process requires eight stages, or eight matrix-based equations performed on some data. Brainwave allocates eight FPGAs to form an eight-stage pipeline, flowing data from one chip to the next via the array network. Any two FPGAs on this network are two microseconds apart, in terms of latency. Thus, the data is scheduled through the eight stages; as one stage is finished, it is allocated to another pipeline.
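To put rough numbers on such a pipeline, here is a back-of-the-envelope sketch in Python. The two-microsecond hop comes from the figure above; the 10-microsecond per-stage compute time and the function name are assumptions for illustration, not anything Microsoft has published.

```python
def pipeline_finish_times(num_items, stage_times_us, hop_us=2.0):
    """Finish time (in microseconds) of each item flowing through a linear pipeline.

    Assumptions: all items are ready at time 0, each stage handles one item
    at a time, and moving between adjacent stages costs hop_us.
    """
    stage_free = [0.0] * len(stage_times_us)  # when each stage next becomes idle
    finish_times = []
    for _ in range(num_items):
        t = 0.0
        for s, stage_time in enumerate(stage_times_us):
            if s > 0:
                t += hop_us                    # network hop to the next chip
            start = max(t, stage_free[s])      # wait if the stage is still busy
            t = start + stage_time
            stage_free[s] = t
        finish_times.append(t)
    return finish_times


# Eight stages at an assumed 10 us each, with 2 us hops between chips.
print(pipeline_finish_times(3, [10.0] * 8))  # [94.0, 104.0, 114.0]
```

Under these assumptions the first result takes 8 × 10 + 7 × 2 = 94 microseconds to emerge, after which a new result appears every 10 microseconds because all eight stages are busy at once.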

This approach can also be used to perform matrix math in parallel, running, say, a single dense matrix through eight FPGAs at the same time.
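One way to picture that split is a row-wise partition of a matrix-vector product, sketched below in Python. This is an illustrative assumption rather than Brainwave's actual partitioning scheme: the function names are hypothetical, and the eight "devices" are emulated by an ordinary loop rather than real hardware.

```python
def split_rows(matrix, num_parts):
    """Split the rows of a matrix into num_parts contiguous, near-equal chunks."""
    base, extra = divmod(len(matrix), num_parts)
    chunks, start = [], 0
    for part in range(num_parts):
        size = base + (1 if part < extra else 0)
        chunks.append(matrix[start:start + size])
        start += size
    return chunks


def matvec(rows, vector):
    """Plain dense matrix-vector product over a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in rows]


def partitioned_matvec(matrix, vector, num_parts=8):
    """Compute each row chunk independently, then stitch the pieces together.

    Each chunk needs only its own rows plus the shared input vector, so the
    chunks could run on separate devices at the same time.
    """
    partials = [matvec(chunk, vector) for chunk in split_rows(matrix, num_parts)]
    return [y for part in partials for y in part]


# A 16x4 example matrix: each of the eight emulated devices gets two rows.
matrix = [[i + j for j in range(4)] for i in range(16)]
vector = [1, 2, 3, 4]
print(partitioned_matvec(matrix, vector) == matvec(matrix, vector))  # True
```

Because no chunk depends on another's output, the only coordination needed in this sketch is broadcasting the input vector and gathering the results.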

Microsoft appears to be using this technology internally for now, and has deployed it, or will shortly deploy it, in production for the usual things: Bing searches, computer vision, speech processing, and so on. Any external availability has yet to be announced.

"We are working to bring this powerful, real-time AI system to users in Azure, so that our customers can benefit from Project Brainwave directly, complementing the indirect access through our services such as Bing," said Microsoft engineer Doug Burger.

Each of these Brainwave-managed FPGAs has a Redmond-designed microarchitecture that has instructions specialized for machine learning, such as vector operations and non-linear activations.

Microsoft has been mentioning Brainwave here and there for a while, although it is only now revealing some of its technical details. Derek Chiou of Azure's silicon team gave a presentation earlier this year about it. Today, Redmond published a blog post about the technology, attaching its slides from Hot Chips for anyone who wants to peer deeper. ®
