Photonic processor can classify millions of images faster than you can blink

We ask again: Has science gone too far?

Engineers at the University of Pennsylvania say they've developed a photonic deep neural network processor capable of analyzing billions of images every second with high accuracy using the power of light.

It might sound like science fiction or some optical engineer's fever dream, but that's exactly what researchers at the American university's School of Engineering and Applied Science claim to have done in an article published in the journal Nature earlier this month.

The standalone light-driven chip – this isn't another PCIe accelerator or coprocessor – handles data by simulating brain neurons that have been trained to recognize specific patterns. This is useful for a variety of applications including object detection, facial recognition, and audio transcription to name just a few.

Traditionally, this has been achieved by simulating an approximation of neurons on standard silicon chips, such as GPUs and ASICs. The academics said their chip is the first to do so using light signals.

"The low-energy consumption and ultra-low computation time offered by our photonic classifier chip can revolutionize applications such as event-driven and salient-object detection," the article's authors wrote.

In a proof of concept detailed in Nature, the photonics chip was able to categorize an image in under 570 picoseconds with an accuracy of 89.8-93.8 percent. This, the authors claim, puts the chip on par with high-end GPUs for image classification.

To put that in perspective, that works out to just over half a billion images in the time it takes you to blink (1/3 second). And the team posits even faster processing — in the neighborhood of 100 picoseconds per image — is possible using commercial fabrication processes available today.
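For anyone who wants to check the sums, here is a minimal Python sketch of the arithmetic. It assumes, as the comparison above does, that images are classified back to back with no gap between them; the function name `images_in_window` is ours and not something from the paper.

```python
from fractions import Fraction

PICOSECOND = Fraction(1, 10**12)  # one picosecond, in seconds


def images_in_window(latency_ps, window_s):
    """Images classified in window_s seconds at latency_ps per image.

    Assumes back-to-back classification with no idle time between images.
    """
    return window_s / (latency_ps * PICOSECOND)


blink_s = Fraction(1, 3)                      # the article's one-third-second blink
measured = images_in_window(570, blink_s)     # proof-of-concept figure
projected = images_in_window(100, blink_s)    # figure the team says is feasible
per_second = images_in_window(570, 1)

print(f"570 ps per image: {float(measured):.3e} images per blink")
print(f"100 ps per image: {float(projected):.3e} images per blink")
print(f"570 ps per image: {float(per_second):.3e} images per second")
```

At 570 picoseconds the sketch gives roughly 585 million images per blink, or about 1.75 billion per second; at 100 picoseconds the per-blink figure rises to about 3.3 billion.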

According to the article, this offers numerous benefits, including lower power consumption, higher throughput, and fewer bottlenecks compared to existing deep-neural-networking technologies that are either physically separated from the image sensor or tied to a clock frequency.

"Direct clock-less processing of optical data eliminates analogue-to-digital conversion and the requirement for a large memory module, allowing faster and more energy-efficient neural networks for the next generation of deep-learning systems," the article's authors wrote.

What's more, because all of the computation is done on the chip, no separate image sensor is required. In fact, because the processing is done optically, the chip itself is the image sensor.

The test

Before you get too excited, the images used in the proof of concept were positively tiny, measuring just 30 pixels in total. The actual test involved classifying hand-drawn "P" and "d" characters projected onto the chip. Nonetheless, the chip's accuracy came in only slightly below the 96 percent achieved using the popular Keras deep-learning API running in Python.

However, the team notes that resolution isn't the limiting factor here, and there's nothing stopping them from scaling the chip up to support larger resolutions. What's more, they claim the tech could be used to classify any data that can be converted into an optical signal.

If true, the technology has implications for a variety of fields, video object detection being the obvious one, since processing could effectively be done in real time and wouldn't be limited to the frame rate of a traditional digital image sensor.

"The large bandwidth available at optical frequencies as well as low propagation loss of nanophotonic waveguides — serving as interconnects — make photonic integrated circuits a promising platform to implement fast and energy efficient processing units," the article reads.

How it works

The nine-square-millimeter chip is made up of two layers: an optical layer that handles the compute side, and an optoelectronic layer responsible for signal processing.

The optical layer features a 5x6 array of grating couplers that act as input pixels. Light from these pixels is divided into four overlapping 3x4-pixel sub-images and then channeled to nine artificial neurons spread across three layers using nanophotonic waveguides.

The optoelectronic layer then converts the optical signal into a voltage, amplifies it, and passes it to a micro-ring modulator, which converts the signal back into light that can then be interpreted by a digital signal processor.
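To make the shape of the network concrete, here is a rough digital stand-in written in plain Python. It is a sketch under several assumptions and not the researchers' model: the split of the nine neurons into layers of four, three, and two is our guess (chosen so the last layer has one output per character class), the first layer is fully connected to all 30 pixels instead of being wired to sub-images, ReLU stands in for whatever nonlinearity the optoelectronic layer actually provides, and the weights are random, not trained.

```python
import random

N_PIXELS = 5 * 6          # the 5x6 input array described in the article
LAYER_SIZES = (4, 3, 2)   # assumption: nine neurons over three layers


def relu(x):
    """Stand-in nonlinearity (an assumption, not the chip's measured response)."""
    return x if x > 0.0 else 0.0


def layer(inputs, weight_rows, biases):
    """One layer: each neuron takes a weighted sum of its inputs, then relu."""
    return [
        relu(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(weight_rows, biases)
    ]


def build_network(seed=0):
    """Random, untrained weights for a 30 -> 4 -> 3 -> 2 network."""
    rng = random.Random(seed)
    sizes = (N_PIXELS,) + LAYER_SIZES
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def classify(pixels, network, labels=("P", "d")):
    """Push pixel intensities through every layer and pick the larger output."""
    activations = list(pixels)
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    best = max(range(len(activations)), key=activations.__getitem__)
    return labels[best], activations


if __name__ == "__main__":
    image = [random.Random(42).random() for _ in range(N_PIXELS)]
    print(classify(image, build_network()))
```

The point is only the data flow: 30 intensities in, a handful of weighted sums and nonlinearities, two scores out. On the chip those steps happen as light propagates, which is why there is no clock to wait on.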

However, before the chip can render usable results, it has to be trained. Researchers achieved this using a series of training images projected onto a secondary pixel array on the chip.

The output from those images was then fed into a digital neural network that replicates the chip in Keras running on Python to determine the optimal weight vectors. A combination of microcontrollers and digital-to-analogue converters was then used to write those weights back to the chip.
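Writing a trained weight through a digital-to-analogue converter means rounding it to one of a fixed number of levels. The sketch below illustrates that step in Python; the 8-bit resolution, the weight range of -1 to 1, and both function names are assumptions for illustration, since the article does not describe the converters' resolution or the firmware involved.

```python
def weight_to_dac_code(weight, w_min=-1.0, w_max=1.0, bits=8):
    """Map a floating-point weight to the nearest DAC code.

    The bit depth and weight range are hypothetical defaults.
    Weights outside [w_min, w_max] are clipped to the range.
    """
    levels = (1 << bits) - 1
    clipped = min(max(weight, w_min), w_max)
    return round((clipped - w_min) / (w_max - w_min) * levels)


def dac_code_to_weight(code, w_min=-1.0, w_max=1.0, bits=8):
    """Inverse mapping: the weight a given DAC code actually represents."""
    levels = (1 << bits) - 1
    return w_min + code / levels * (w_max - w_min)


if __name__ == "__main__":
    for w in (-1.0, -0.25, 0.3, 1.0):
        code = weight_to_dac_code(w)
        print(w, code, dac_code_to_weight(code))
```

The round trip shows the price of going through a converter: the weight that lands on the chip can differ from the trained value by up to half a quantization step.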

Once trained, all classification is handled in-chip.

According to the researchers, the technology addresses several of the limitations inherent to GPU and ASIC-based deep neural networks today and has the potential to "revolutionize" several applications, including object detection.

The team further claims that by scaling up the chip, higher resolutions or greater numbers of neurons could be achieved, with the only limitations being the bandwidth of the micro-ring modulators and the silicon-germanium photodiodes in the optoelectronic layer.

What's more, the researchers posit that commercial fabrication processes offering monolithic integration of both the electric and photonic components could further accelerate the chip, allowing for bandwidths in the tens of gigahertz range and processing times of less than 100 picoseconds. ®

Biting the hand that feeds IT © 1998–2022