The state of today's machine learning: Short, wide, deep but not high

A gentle guide to where we're at with AI

Comment Remember that kid in middle school who was deeply into Dungeons & Dragons and hadn't yet seen his growth spurt? Machine learning is a bit like that kid – deep and wide, but still short.

But on the serious side, machine learning today is useful for a wide variety of pattern recognition problems, including the following:

  • Image classification.
  • Speech processing.
  • Handwriting recognition.
  • Text processing.
  • Threat assessment.
  • Fraud detection.
  • Language translation.
  • Self-driving vehicles.
  • Medical diagnosis.
  • Robotics.
  • Sentiment analysis.
  • Stock trading.

Deep learning, a subset of machine learning, has progressed rapidly during the past decade due to:

  • Big data – the increased availability of large data sets for training has driven both progress and the need for deeper nets.
  • Deeper nets – deep neural nets have many layers, and the layers themselves can be wide, with many neurons per layer.
  • Clever training – it was discovered that a large dose of unsupervised learning in the earlier stages of training allowed the net to do its own automated, lower-level feature recognition and extraction, and pass those features on to later stages for higher-level feature recognition.
  • High performance computing – clustered systems, enhanced with accelerator technology, have become essential to training large deep nets.

In deep learning, the key computational kernels involve linear algebra, including matrix and tensor arithmetic. A deep neural net can have millions or even billions of parameters due to its rich connectivity. While depth refers to the number of layers, the layers can also be quite wide – with hundreds to thousands of neurons in a given layer. The weights of these connections must be adjusted iteratively until a solution is reached in a space of very high dimensionality.
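To make the "depth equals matrix multiplies" point concrete, here is a minimal numpy sketch of a forward pass through a small fully connected net. The layer sizes are arbitrary illustrative values, not from the article, and this is a toy – no particular framework is implied:

```python
import numpy as np

# A toy fully connected net: three weight matrices ("depth"), each with
# hundreds of neurons per layer ("width").
rng = np.random.default_rng(0)
layer_sizes = [784, 512, 512, 10]          # input, two hidden layers, output

weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """One forward pass: each layer is a matrix multiply plus a
    nonlinearity -- the dominant linear-algebra kernel in deep learning."""
    for W, b in zip(weights, biases):
        x = np.maximum(x @ W + b, 0.0)     # ReLU activation
    return x

# Even this toy net already has ~670k trainable parameters.
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
out = forward(rng.standard_normal(784))
```

Scale the hidden layers into the thousands and stack tens of them, and the parameter count climbs into the millions and billions the article describes.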

Because of the large number of parameters and the generally modest accuracy required for the final output – is this image a cat? or is this a fraudulent application? – low-precision arithmetic typically suffices. Training can be successful with floating point half precision (16 bits) or with fixed point or integers (as low as 8 bits in some cases). This is the short aspect.
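A quick numpy illustration of why low precision suffices: the example logit values below are made up for demonstration, but they show that casting classifier scores to half precision loses low-order bits without changing the final decision.

```python
import numpy as np

# Made-up scores ("logits") from a hypothetical cat/not-cat classifier.
logits32 = np.array([2.3417, -1.0588], dtype=np.float32)

# Casting to half precision (16-bit float) rounds away low-order bits...
logits16 = logits32.astype(np.float16)

# ...but the answer to "is this a cat?" -- the argmax -- is unchanged.
same_decision = logits32.argmax() == logits16.argmax()

# The relative rounding error is bounded by float16's unit roundoff
# (about 5e-4), far smaller than the gap between the two class scores.
rel_err = np.abs(logits32 - logits16.astype(np.float32)) / np.abs(logits32)
```

The same logic is why training can tolerate 16-bit floats or even 8-bit integers in some cases: the coarse final answer is robust to fine-grained numerical noise.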

Yann LeCun, one of the pioneers of deep learning, has noted: "Getting excellent results on ImageNet is easily achieved with a convolutional net with something like 8- to 16-bit precision on the neuron states and weights."

The dominance of linear algebra kernels plus short precision indicates that accelerator hardware is extremely useful in deep learning. Overall, the class of problems being addressed is that of very high-dimensional optimization over very large input data sets – it is thus natural that deep learning has entered the realm of high-performance computing.

Major requirements are highly scalable performance, high memory bandwidth, low power consumption, and excellent short arithmetic performance. The requisite computational resources are clusters whose nodes are populated with a sufficient number of accelerators. These provide the needed performance while keeping power consumption low. Nvidia GPUs are the most popular acceleration technology in deep learning today.

Since the advent of the Pascal generation of its GPUs and CUDA 7.5, Nvidia has added half-precision support aimed specifically at deep learning. With half precision, or 16-bit floating point, peak performance is double that obtained with 32-bit. In its marketing of the DGX-1 "deep learning supercomputer," Nvidia touts a 170-teraflops peak rate based on FP16 half precision.
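One reason the FP16 peak can be double the FP32 rate is simple arithmetic on memory: halving the element size halves the bytes per value, so twice as many values fit in registers, caches, and each memory transaction. A small numpy sketch (the matrix size is arbitrary):

```python
import numpy as np

# A 1024x1024 weight matrix in single vs half precision.
w32 = np.zeros((1024, 1024), dtype=np.float32)
w16 = w32.astype(np.float16)

bytes_fp32 = w32.nbytes   # 4 bytes per element -> 4 MiB total
bytes_fp16 = w16.nbytes   # 2 bytes per element -> 2 MiB total
```

Hardware with native FP16 units (as in Pascal) can also pack two half-precision operations into one 32-bit lane, which is where the doubled arithmetic throughput comes from on top of the memory savings.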

Alternatives to GPUs are often based on ASICs and FPGAs. From Intel there are Altera FPGAs, Nervana Engines and Movidius VPUs (both acquisitions in progress at the time of writing), as well as Knights Mill (the next-generation 2017 version of Phi). From other companies, solutions include Alphabet's Google TPUs, Wave Computing DPUs, DeePhi Tech DPUs, and IBM's TrueNorth neuromorphic chips. All of these technologies offer enhanced performance for reduced-precision arithmetic.

So like that dweeby middle school kid, machine learning is deep, wide, and short. But for it to grow, it will continue to depend on flexible HPC compute solutions – particularly accelerators – be they GPUs, FPGAs, ASICs or some other brand new chippy solution. ®
