Ignite Microsoft is using Intel Altera field-programmable gate array (FPGA) chips to speed up Azure services, according to an announcement at the Ignite event under way in Atlanta.
FPGA chips aim to combine the performance advantage of hardware with the flexibility of software. They are integrated circuits that can be reconfigured by downloading a new hardware configuration after manufacture, hence "Field Programmable".
Initiated in 2010, Microsoft's Project Catapult is an effort to accelerate cloud computing through a network of FPGAs in the company's datacenters.
Now the company has announced what it says is the world's largest deployment of FPGAs. For three years or so, Microsoft has been including an Altera FPGA in every server it installs. Altera, a specialist FPGA vendor, was acquired by Intel in 2015. This FPGA network will be used to accelerate artificial intelligence services, among other things.
Microsoft says the current rollout is in 15 countries over five continents, though there is no detail available. Newer data centres, such as those recently opened in the UK, are likely to be suitably equipped.
“The really important thing is how we’ve architected the system,” Microsoft Distinguished Engineer Doug Burger told The Reg. “The FPGA sits directly between the server and the network, so all the traffic goes through. The CPU can also talk to it over PCIe, but the FPGAs can talk to one another over the network as well. So in some sense it’s a new kind of computer that’s been inserted into our cloud. That layer can do networking, it can do AI, it can do other things. It is a major architectural change.”
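As a rough illustration of the layout Burger describes, the toy Python model below puts an FPGA in the data path of each server, so every packet in or out crosses it, and lets the FPGA keep some packets for itself without involving the host CPU. It is a sketch under stated assumptions, not Microsoft's design: the `Packet`, `Server` and `Network` classes and the `for_fpga` flag are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    src: str
    dst: str
    payload: str
    # Hypothetical flag: packet is addressed to the FPGA layer, not the host CPU.
    for_fpga: bool = False


@dataclass
class Server:
    name: str
    host_inbox: list = field(default_factory=list)  # packets handed up to the CPU
    fpga_inbox: list = field(default_factory=list)  # packets consumed by the FPGA itself
    fpga_seen: int = 0                              # every packet that crossed the FPGA

    def fpga_ingress(self, pkt: Packet) -> None:
        # All inbound traffic crosses the FPGA before it can reach the host.
        self.fpga_seen += 1
        if pkt.for_fpga:
            self.fpga_inbox.append(pkt)   # FPGA-to-FPGA message, host CPU not involved
        else:
            self.host_inbox.append(pkt)   # ordinary traffic, passed through to the host


class Network:
    def __init__(self) -> None:
        self.servers = {}

    def attach(self, server: Server) -> None:
        self.servers[server.name] = server

    def send(self, pkt: Packet) -> None:
        # Outbound traffic also crosses the sender's FPGA on its way to the network.
        self.servers[pkt.src].fpga_seen += 1
        self.servers[pkt.dst].fpga_ingress(pkt)


net = Network()
a, b = Server("a"), Server("b")
net.attach(a)
net.attach(b)

net.send(Packet("a", "b", "ordinary request"))               # ends up at b's host CPU
net.send(Packet("a", "b", "accelerator data", for_fpga=True))  # stays in the FPGA layer

print(b.fpga_seen, len(b.host_inbox), len(b.fpga_inbox))
```

In this model the FPGA on server `b` sees both packets, but only one reaches the host. The other stays in the FPGA layer, which is the sense in which that layer can act as a computer in its own right.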
The fact that all the network traffic goes through the FPGAs is what enables Microsoft to use the FPGAs somewhat independently of the servers, rather than always going via the host servers. At the same time, this design introduces new risks, since a bug or fault impacts the whole system. That, said Burger, has been the key challenge.
"You are putting an alien technology into a very mature system. All of the network traffic runs through this thing. You screw it up, you can do some real damage. You think about reliability at scale, failure diagnostics, health monitoring, debugging, version management, package management, all of that needs to be built into the platform. No one has gone large scale like this."
The benefit, though, is a huge speed-up for certain specialist tasks. "When you have a successful FPGA deployment, the speed-ups you get tend to range between 10 and 1000 times. Usually it’s in the low 10s," said Burger.
Might Microsoft allow developers to upload their own FPGA images to run on Azure? "That is a potential business," says Burger. "We haven't announced any plans or schedule to do that." ®