
Why, you're no better than an 8-bit hustler: IBM punts paper on time-saving DNN-training trick

Data centre, edge computing: yep – business applications

IBM has said it is possible to train deep learning models with 8-bit precision rather than 16-bit with no loss in model accuracy for image classification, speech recognition and language translation. The firm claimed today this work will help "hardware to rapidly train and deploy broad AI at the data center and the edge".

Big Blue boffins will present the research paper, "Training Deep Neural Networks with 8-bit Floating Point Numbers", at the 32nd Conference on Neural Information Processing Systems (NeurIPS) expo tomorrow.



The training process can be done either digitally with ASICs or with an analog chip using phase-change memory (PCM).

The firm will demonstrate a PCM chip at the event, using it to classify hand-written digits in real time via the cloud.

A model using 8-bit precision needs a quarter of the memory of a 32-bit precision model to store the same number of values, and thus needs less electrical energy as well.
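As a back-of-the-envelope illustration of that memory saving, the sketch below works out the storage needed for a model's parameters at different bit widths. The parameter count is a hypothetical figure picked for the example, not one from IBM's paper, and the sum ignores activations, gradients and any other training overhead.

```python
def model_memory_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to store num_params values at the given bit width."""
    # Round up so that widths below one byte (e.g. 4-bit) are not undercounted.
    return (num_params * bits_per_param + 7) // 8


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 25_000_000

for bits in (32, 16, 8):
    mib = model_memory_bytes(NUM_PARAMS, bits) / 2**20
    print(f"{bits:>2}-bit parameters: {mib:.1f} MiB")
```

Each halving of the bit width halves the storage, so going from 32-bit to 8-bit cuts parameter memory to a quarter.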

The idea is that deep learning models could use existing hardware better by dropping precision to 8 bits, and that this will yield better models faster than trying to scale up to 32-bit hardware.

Train with less precision at same accuracy

Big Blue referred to its 2015 research paper, "Deep Learning with Limited Numerical Precision" (PDF), which showed that deep neural networks could be trained with 16-bit precision instead of 32-bit precision with little or no degradation in accuracy.

IBM said the new paper shows the precision number can be cut in half again: "Computational building blocks with 16-bit precision engines are typically 4 times smaller than comparable blocks with 32-bit precision."

This is achieved by trading "numerical precision for computational throughput enhancements, provided we also develop algorithmic improvements to preserve model accuracy."

"IBM researchers [are] achieving 8-bit precision for training and 4-bit precision for inference, across a range of deep learning datasets and neural networks."

This is done with 8-bit and 4-bit ASICs and using dataflow architectures.


The progress apparently "opens the realm of energy-efficient broad AI at the edge", so users could transcribe speech to text without using arrays of Nvidia Tesla GPUs – on their smartphones, perhaps?

Analog phase-change memory chips

A second IBM research paper, "8-bit Precision In-Memory Multiplication with Projected Phase-Change Memory", to be presented at the International Electron Devices Meeting (IEDM), will show how analog memory devices can help deep neural network training in the same way as GPUs but with far less electrical energy. Where GPUs need data moved to their compute units, analog phase-change memory devices can do some computation inside the device, with no data movement.

The analog devices measure continuously varying signals and have a problem with precision, being limited thus far to 4 bits or fewer. The research demonstrated 8-bit precision in a scalar multiplication operation that "consumed 33x less energy than a digital architecture of similar precision", IBM said.


Crossbar arrays of non-volatile memories can accelerate the training of fully connected neural networks by performing computation at the location of the data.
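To make "computation at the location of the data" concrete, here is a minimal sketch of the textbook idealisation of a crossbar array, not IBM's implementation. It assumes each cell stores a weight as a conductance, input voltages drive the rows, and each column wire sums the resulting currents, so the array performs a matrix-vector multiply in place. The function name and the numbers are hypothetical, and real devices add noise, drift and non-linearity that this ignores.

```python
from typing import List, Sequence


def crossbar_mvm(conductances: Sequence[Sequence[float]],
                 voltages: Sequence[float]) -> List[float]:
    """Idealised crossbar read-out.

    Column current j is the sum over rows i of G[i][j] * V[i]:
    each cell contributes current = conductance * voltage, and the
    column wire adds those contributions together.
    """
    rows = len(conductances)
    if rows == 0 or len(voltages) != rows:
        raise ValueError("need one input voltage per crossbar row")
    cols = len(conductances[0])
    return [
        sum(conductances[i][j] * voltages[i] for i in range(rows))
        for j in range(cols)
    ]


# Hypothetical 2x2 array: the stored weights never leave the "memory".
G = [[1.0, 2.0],
     [3.0, 4.0]]
V = [0.5, 1.0]
print(crossbar_mvm(G, V))
```

In software this is just a loop, but in the analog array the multiply and the sum happen physically in the wires, which is where the saving on data movement comes from.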

Big Blue's boffins said PCM records synaptic weights in its physical state along a gradient between amorphous and crystalline. The conductance of the material changes along with its physical state and can be modified using electrical pulses. There are eight levels of conductance – and thus eight values to store. With an array of such devices, "in-memory computing may be able to achieve high-performance deep learning in low-power environments, such as IoT and edge applications". ®
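For a feel of what storing a weight in one of eight conductance levels means, the sketch below snaps a real-valued weight to the nearest of eight evenly spaced values. The even spacing, the weight range of -1 to 1 and the function name are assumptions made for illustration; the article does not say how IBM maps weights to device states.

```python
from typing import Tuple


def quantize_to_levels(weight: float,
                       w_min: float = -1.0,
                       w_max: float = 1.0,
                       levels: int = 8) -> Tuple[int, float]:
    """Snap a weight to the nearest of `levels` evenly spaced values.

    Returns (level index, stored value). Weights outside the range
    [w_min, w_max] are clipped to the nearest end first.
    """
    clipped = min(max(weight, w_min), w_max)
    step = (w_max - w_min) / (levels - 1)
    index = round((clipped - w_min) / step)
    return index, w_min + index * step


# Hypothetical weights, including one outside the representable range.
for w in (-1.0, -0.3, 0.1, 0.9, 5.0):
    index, stored = quantize_to_levels(w)
    print(f"weight {w:+.2f} -> level {index}, stored as {stored:+.3f}")
```

The gap between a weight and its stored value is the precision lost to the device, which is why pushing analog hardware past a handful of levels matters.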
