Tech demo takes brain scan, creates a picture of what you're looking at
Break out the tinfoil hats: Boffins' experimental tech improves computer mind reading
There's a lot of noise and not much signal in the machine-learning world, but this demo is genuinely impressive – or scary, if you are given to recreationally climbing into MRI scanners.
The new research is presented in a paper titled "High-resolution image reconstruction with latent diffusion models from human brain activity," co-authored by Professor Shinji Nishimoto and Assistant Professor Yu Takagi of the Graduate School of Frontier Biosciences (FBS) at Osaka University. What the boffins have done is find a way to feed fMRI brain scans into Stable Diffusion, the open source latent diffusion model created by startup unicorn Stability AI.
The results are startling, to say the least. Presented with the output of an fMRI brain scan – which, to our eyes, looks very close to random noise – the researchers' latent diffusion model can, in their words:
reconstruct high-resolution images with high fidelity in straightforward fashion, without the need for any additional training and fine-tuning of complex deep-learning models.
The preprint paper showcases five recovered images: a teddy bear, complete with its bow-tie; an avenue of trees; a jet airliner landing (or possibly taking off); a snowboarder on the slopes; and a tapering clocktower. The level of match varies, and a sixth image, of a steam locomotive, is less clear, but the overall quality is remarkable, even if the researchers picked the best of their results, as we suspect they were naturally inclined to do.
The boffins say the source code of their model is "coming soon". Their input data came from four of the eight volunteers whose scans make up the University of Minnesota's public Natural Scenes Dataset, or NSD, and the sample images shown in the paper are from a single subject.
Stable Diffusion itself has become famous for taking textual descriptions and generating sometimes very realistic images from just a handful of words – and if those words are chosen carefully enough, the text can evoke the original images used to train the model.
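For a sense of how simple that text-to-image interface is, here's a minimal sketch of the usual workflow, assuming Hugging Face's open source diffusers library and a CUDA GPU; the model ID and prompt are merely illustrative, not anything from the paper.

```python
# Minimal sketch of ordinary Stable Diffusion text-to-image use, via the
# open source diffusers library. Model ID and prompt are examples only.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# A handful of carefully chosen words in, a photorealistic image out.
image = pipe("a teddy bear wearing a bow tie, studio photo").images[0]
image.save("teddy.png")
```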
So while this is not precisely a computer reading someone's mind, it delivers markedly better results than, for example, some earlier efforts in this direction which we reported on in 2021. If we follow the paper correctly, the researchers are using Stable Diffusion to improve the recovered images by incorporating elements from its training database. For comparison, a paper from some 12 years ago [PDF], using Bayesian statistics and modelling, did produce some recognizable images, but of significantly lower quality.
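To make that pipeline concrete, here's a hedged sketch of the core idea as we read it: plain ridge regressions map voxel activity to the two representations Stable Diffusion consumes – a latent image code and a text-conditioning embedding – which the pretrained model would then denoise into a picture. Every shape, dimension, and data array below is illustrative and synthetic, not the authors' actual code.

```python
# Hypothetical sketch: learn linear maps from fMRI voxels to the two
# representations Stable Diffusion consumes, then (not shown) decode.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_trials, n_voxels = 800, 2_000  # scans and voxels per scan (made up)
latent_shape, text_dim = (4, 32, 32), 768  # downsized latent grid; CLIP width
latent_dim = int(np.prod(latent_shape))

# Stand-ins for real training data: voxel responses paired with the latent
# codes and text embeddings of the images the subject was actually viewing.
X = rng.standard_normal((n_trials, n_voxels))
Z = rng.standard_normal((n_trials, latent_dim))
C = rng.standard_normal((n_trials, text_dim))

# Two independent ridge regressions, one per target representation – no
# deep-network training or fine-tuning involved, as the paper emphasizes.
to_latent = Ridge(alpha=1.0).fit(X, Z)
to_text = Ridge(alpha=1.0).fit(X, C)

# At test time: a fresh brain scan in, predicted representations out.
x_new = rng.standard_normal((1, n_voxels))
z_hat = to_latent.predict(x_new).reshape(latent_shape)
c_hat = to_text.predict(x_new)

# z_hat and c_hat would then seed the pretrained diffusion model's denoising
# loop, which fills in photorealistic detail from its training distribution.
print(z_hat.shape, c_hat.shape)
```

The striking design choice, if we've understood it, is how little machinery sits between brain and picture: the regressions are simple, and the heavy lifting of making the result look like a photograph is all done by the pretrained model.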
- This app could block text-to-image AI models from ripping off artists
- It's official! Space travel increases the brain size of astronauts, even when they're back on Mother Earth
- Time for another cuppa then? Tea-drinkers have better brains, say boffins with even better brains
- Don't trust deep-learning algos to touch up medical scans: Boffins warn 'highly unstable' tech leads to bad diagnoses
As we have reported in the past, the claims of fMRI research have long been controversial, but this is the sort of area where machine-learning and neural-network algorithms are at their most useful: finding very faint signals, then correlating and matching them against the models' huge libraries of training images to produce easily recognizable results.
Functional MRI is a subset of magnetic resonance imaging, or nuclear magnetic resonance imaging as it used to be called before the boffins realized that the word "nuclear" scared people off. The scanners involved are extremely large and, as this vulture can attest having been in more than one, extremely loud machines. Nobody is going to point a parabolic dish at your head from across the street and read what you're thinking about. But if you first sign a bunch of waivers, then spend an hour lying with your head clamped still inside a huge donut-shaped magnet, yes, this kind of technique might be able to tell what picture you're looking at.
The two profs have a page about their work, and you can read the abstract or the whole 11-page paper [PDF] on the bioRxiv preprint server. They will be presenting their findings at this year's CVPR conference in Vancouver in June. ®