Vid Run a camera fast enough and its images can capture sound from the way nearby objects vibrate, according to boffins from MIT, Adobe and Microsoft.
The experiments, announced by MIT, worked so well that the researchers claim to have recovered sound from the leaves of a plant and from the vibration of a crisp packet.
The latter, as the researchers demonstrate in their video (below), was filmed through soundproof glass as proof that they weren't fudging the experiment using a hidden microphone.
In a paper to be presented at Siggraph, Abe Davis (MIT), Michael Rubinstein (MIT and Microsoft), Neal Wadhwa (MIT), Gautham Mysore (Adobe), Frédo Durand and William Freeman (both MIT) describe their “visual microphone”.
It's obvious that sounds will make objects around them vibrate. Turning that into a system that can recover the original sounds from the vibrations posed several challenges: the video had to be captured at a high enough frame rate; the software the researchers developed had to be able to detect the tiny pixel-scale vibrations; and background noise had to be removed.
Even through soundproof glass, audio can be recovered from vibrating objects
“To recover sound from an object, we film the object using a high-speed video camera. We then extract local motion signals across the dimensions of a complex steerable pyramid built on the recorded video. These local signals are aligned and averaged into a single, 1D motion signal that captures global movement of the object over time, which we further filter and denoise to produce the recovered sound,” the researchers write.
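For readers who want a feel for the idea, here is a much-simplified sketch of "video in, 1D motion signal out". It is not the researchers' method: in place of the complex steerable pyramid it uses a plain gradient-based displacement estimate, and it runs on a synthetic striped texture rather than real footage. The function names, the 440 Hz test tone and the 0.05-pixel vibration amplitude are all assumptions made for illustration.

```python
import numpy as np


def synthetic_video(fps=2000, n_frames=2000, tone_hz=440.0, amplitude_px=0.05):
    """Hypothetical test clip: a striped texture shifted sideways by a tiny tone."""
    t = np.arange(n_frames) / fps
    shift = amplitude_px * np.sin(2 * np.pi * tone_hz * t)   # sub-pixel vibration
    x = np.arange(64)
    k = 2 * np.pi / 16                                       # stripe period: 16 px
    rows = np.sin(k * (x[None, :] - shift[:, None]))         # shape (T, W)
    return np.repeat(rows[:, None, :], 8, axis=1)            # shape (T, 8, W)


def recover_motion_signal(frames):
    """Collapse a (T, H, W) greyscale clip into one displacement value per frame.

    Uses the first-order approximation I(x - d) ~ I(x) - d * dI/dx and solves
    for d by least squares over all pixels. A stand-in for the paper's
    pyramid-based local motion signals, not a reimplementation of them.
    """
    ref = frames[0].astype(float)
    gx = np.gradient(ref, axis=1)                  # horizontal image gradient
    diffs = frames.astype(float) - ref             # change relative to frame 0
    d = -np.tensordot(diffs, gx, axes=([1, 2], [0, 1])) / np.sum(gx * gx)
    return d - d.mean()                            # crude filtering: drop the DC offset


def dominant_frequency(signal, fps):
    """Return the frequency (Hz) of the strongest component in the signal."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return freqs[np.argmax(spectrum)]


if __name__ == "__main__":
    fps = 2000
    recovered = recover_motion_signal(synthetic_video(fps=fps))
    print(f"strongest recovered tone: {dominant_frequency(recovered, fps):.0f} Hz")
```

On this toy clip the strongest recovered component should sit at roughly the 440 Hz tone that was used to shake the texture. Real footage would need the denoising the researchers describe.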
The video frame rate is important, since (for those who remember the Nyquist-Shannon sampling theorem) you need to sample an analogue signal like sound at no less than twice the maximum frequency you want to capture. In the “visual microphone” experiment, the researchers worked at frame rates between 2 kHz and 20 kHz.
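The arithmetic behind that is short enough to sketch. The helper names below are made up for illustration; the only inputs taken from the article are the 2 kHz and 20 kHz frame rates.

```python
def max_recoverable_hz(frame_rate_hz: float) -> float:
    """Nyquist limit: the highest frequency a given sampling rate can represent."""
    return frame_rate_hz / 2.0


def min_frame_rate_hz(max_sound_hz: float) -> float:
    """Lowest frame rate needed to capture sound up to max_sound_hz."""
    return 2.0 * max_sound_hz


if __name__ == "__main__":
    for fps in (2_000, 20_000):
        print(f"{fps} fps -> up to {max_recoverable_hz(fps):.0f} Hz")
    # 2000 fps -> up to 1000 Hz
    # 20000 fps -> up to 10000 Hz
```

In other words, the slowest camera setting quoted tops out at about 1 kHz of audio, and the fastest at about 10 kHz.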
However, they also note that the “rolling shutter” of ordinary digital cameras, in which the sensor records each frame row by row rather than all at once, means such cameras can also be used to capture sound. They used the rolling shutter “to effectively increase the sampling rate of a camera and recover sound frequencies above the camera’s frame rate”, the paper states.
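A rough way to see why this helps: if every sensor row is captured at a slightly different instant, each row is its own time sample. The sketch below assumes a hypothetical 60 fps camera with 1,080 rows and no idle time between frames; those figures, the function names and the `readout_fraction` parameter are illustrative assumptions, not values from the paper.

```python
def row_sample_times(fps, rows, n_frames, readout_fraction=1.0):
    """Timestamp (seconds) at which each row of each frame is captured.

    readout_fraction is the assumed share of the frame period spent reading
    rows; 1.0 means the sensor has no idle gap between frames.
    """
    frame_period = 1.0 / fps
    line_delay = frame_period * readout_fraction / rows
    return [f * frame_period + r * line_delay
            for f in range(n_frames)
            for r in range(rows)]


def row_rate_hz(fps, rows, readout_fraction=1.0):
    """Sampling rate while rows are being read out, in samples per second."""
    return fps * rows / readout_fraction


if __name__ == "__main__":
    # Hypothetical consumer camera: 60 fps, 1080 rows, no gap between frames.
    print(row_rate_hz(60, 1080))   # 64800.0 row samples per second
```

A real sensor does pause between frames, so the row samples arrive in bursts with gaps, and recovering clean audio is harder than this back-of-envelope figure suggests.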
The signal processing of the video wasn't anything like real time: “Processing each video typically took 2 to 3 hours using MATLAB on a machine with two 3.46 GHz processors and 32 GB of RAM,” they write. ®