What sort of silicon brain do you need for artificial intelligence?

Using CPUs, GPUs, FPGAs and ASICs to make sense of AI


The Raspberry Pi is one of the most exciting developments in hobbyist computing today. Across the world, people are using it to automate beer making, open up the world of robotics and revolutionise STEM education in a world overrun by film students. These are all laudable pursuits. Meanwhile, what is Microsoft doing with it? Creating squirrel-hunting water robots.

Over at the firm’s Machine Learning and Optimization group, a researcher saw squirrels stealing flower bulbs and seeds from his bird feeder. The research team trained a computer vision model to detect squirrels, and then put it onto a Raspberry Pi 3 board. Whenever an adventurous rodent happened by, the Pi would turn on the sprinkler system.
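Microsoft hasn’t published that exact code, but a minimal sketch of the on-device half might look something like this, assuming a TensorFlow Lite model and the tflite_runtime package (the model file, input layout and threshold below are guesses, not the team’s real setup):

```python
# Minimal sketch of on-device image classification on a Raspberry Pi.
# The model file, input layout and score threshold are assumptions;
# this is not Microsoft's actual pipeline.
import numpy as np
from tflite_runtime.interpreter import Interpreter  # pip install tflite-runtime

interpreter = Interpreter(model_path="squirrel_detector.tflite")  # hypothetical model
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def looks_like_squirrel(frame: np.ndarray) -> bool:
    """frame: HxWx3 uint8 image already resized to the model's input shape."""
    interpreter.set_tensor(inp["index"], frame[np.newaxis, ...])
    interpreter.invoke()
    score = float(interpreter.get_tensor(out["index"])[0][0])
    return score > 0.8  # assumed confidence threshold

# On a positive detection, the Pi would switch the sprinkler on, e.g. via a GPIO pin.
```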

Microsoft’s sciurine aversions aren’t the point of that story – its shoehorning of a convolutional neural network onto an ARM CPU is. It shows how far organizations are pushing hardware to support AI algorithms. As AI continues to make the headlines, researchers are working to make it increasingly competent at basic tasks such as image and speech recognition.

As people expect more of the technology, cramming it into self-flying drones and self-driving cars, the hardware challenges are increasing. Companies are producing custom silicon and computing nodes capable of handling them.

Jeff Orr, research director at analyst firm ABI Research, divides advances in AI hardware into three broad areas: cloud services, on‑device, and hybrid. The first focuses on AI processing done online in hyperscale data centre environments like Microsoft’s, Amazon’s and Google’s.

At the other end of the spectrum, he sees more processing happening on devices in the field, where connectivity or latency prohibit sending data back to the cloud.

“It’s using maybe a voice input to allow for hands-free operation of a smartphone or a wearable product like smart glasses,” he says. “That will continue to grow. There’s just not a large number of real-world examples on‑device today.” He views augmented reality as a key driver here. Or there’s always this app, we suppose.

Finally, hybrid efforts marry both platforms to complete AI computations. This is where your phone recognizes what you’re asking it but asks cloud-based AI to answer it, for example.
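A minimal sketch of that division of labour, assuming a made-up cloud endpoint and a stand-in for the on-device model, might look like this:

```python
# Sketch of the hybrid split: a cheap on-device check gates what gets sent
# to a cloud AI service. The endpoint URL and the wake-word stub are made up.
import requests

def on_device_wake_word(audio_chunk: bytes) -> bool:
    """Stand-in for a tiny keyword-spotting model that runs locally on the phone."""
    # A real implementation would run a small neural network; here we just
    # pretend any non-empty chunk contains the wake word.
    return len(audio_chunk) > 0

def handle_audio(audio_chunk: bytes) -> str:
    if not on_device_wake_word(audio_chunk):
        return ""  # nothing leaves the device
    # The heavy lifting happens in the cloud, where the large models live.
    resp = requests.post("https://example.com/assistant/query", data=audio_chunk)
    return resp.json()["answer"]
```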

The cloud: rAIning algorithms

The cloud’s importance stems from the way that AI learns. AI is increasingly built on deep learning, which uses complex neural networks with many layers to create more accurate models.

There are two aspects to using neural networks. The first is training, where the network analyses lots of data to produce a statistical model. This is effectively the “learning” phase. The second is inference, where the neural network interprets new data to generate accurate results. Training these networks chews up vast amounts of computing power, but the training load can be split into many tasks that run concurrently. This is why GPUs, with their huge core counts and high floating-point throughput, are so good at it.
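For the curious, the split between the two phases looks roughly like this in code, sketched here with PyTorch as an example framework (the miniature model and random data are purely illustrative):

```python
# Toy contrast between training and inference, using PyTorch as an example
# framework; the tiny model and random data are purely illustrative.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: repeated forward and backward passes over labelled data,
# gradually fitting the statistical model. This is the expensive,
# highly parallel part that suits GPUs.
x, y = torch.randn(256, 16), torch.randint(0, 2, (256,))
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Inference: a single forward pass over new data, with no gradients needed.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
```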

Nevertheless, neural networks are getting bigger and the challenges are getting greater. Ian Buck, vice president of the Accelerated Computing Group at dominant GPU vendor Nvidia, says that they’re doubling in size each year. The company is creating more computationally intense GPU architectures to cope, but it is also changing the way it handles its maths.

“It can be done with some reduced precision,” he says. Originally, neural network training all happened in 32‑bit floating point, but Nvidia has optimized its newer Volta architecture, announced in May, for 16‑bit inputs with 32‑bit internal mathematics.

Reducing the precision of the calculation to 16 bits has two benefits, according to Buck.

“One is that you can take advantage of faster compute, because processors tend to have more throughput at lower resolution,” he says. Cutting the precision also increases the amount of available bandwidth, because you’re fetching smaller amounts of data for each computation.

“The question is, how low can you go?” asks Buck. “If you go too low, it won’t train. You’ll never achieve the accuracy you need for production, or it will become unstable.”
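In framework terms, the 16-bit-inputs-with-32-bit-accumulation approach Buck describes can be approximated with automatic mixed precision; the sketch below uses PyTorch and assumes a CUDA-capable GPU, with a stand-in model and data:

```python
# Rough sketch of the 16-bit-compute / 32-bit-accumulate recipe using
# PyTorch's automatic mixed precision. Assumes a CUDA-capable GPU; the
# model, data and loss are stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # keeps small FP16 gradients from underflowing

x = torch.randn(64, 1024, device="cuda")
target = torch.randn(64, 1024, device="cuda")

for _ in range(10):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # matrix maths runs in FP16 where it is safe
        loss = F.mse_loss(model(x), target)
    scaler.scale(loss).backward()     # gradients scaled, then unscaled in FP32
    scaler.step(optimizer)
    scaler.update()
```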

Beyond GPUs

While Nvidia refines its architecture, some cloud vendors have been creating their own chips using alternative architectures to GPUs. The first generation of Google’s Tensor Processing Unit (TPU) focused on 8‑bit integers for inference workloads. The newer generation, announced in May, adds floating point support and can be used for training, too. These chips are application-specific integrated circuits (ASICs). Unlike CPUs and GPUs, they are designed for one specific purpose (you’ll often see them used for mining bitcoins these days) and cannot be reprogrammed. Their lack of extraneous logic makes them extremely fast and economical in their power usage, but very expensive.
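As a rough software analogue of that original 8-bit focus, post-training quantization in a mainstream framework swaps 32-bit weights for 8-bit integers at inference time. The sketch below uses PyTorch for illustration and says nothing about how a TPU is actually built:

```python
# Rough analogue of 8-bit integer inference on commodity hardware:
# post-training dynamic quantization in PyTorch converts the weights of the
# chosen layers to int8. Illustrative only; it says nothing about how a TPU
# is built internally.
import torch
import torch.nn as nn

fp32_model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 256)
with torch.no_grad():
    full_precision = fp32_model(x)  # FP32 result
    reduced = int8_model(x)         # close, but computed with int8 weights
```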

Google's scale is large enough that it can swallow the high non-recurring engineering (NRE) costs of designing the ASIC in the first place, because of the savings it achieves in AI‑based data centre operations. It uses TPUs across many workloads, ranging from recognizing Street View text to serving RankBrain search queries, and every time a TPU does the work instead of a GPU, Google saves power.

“It’s going to save them a lot of money,” said Karl Freund, senior analyst for high performance computing and deep learning at Moor Insights and Strategy.

He doesn’t think that’s entirely why Google did it, though. “I think they did it so they would have complete control of the hardware and software stack.” If Google is betting the farm on AI, then it makes sense to control it from endpoint applications such as self-driving cars through to software frameworks and the cloud.

