Put on a virtual reality headset and it's hard to believe that your visual system is being stretched beyond its limit. Individual pixels are still visible and the narrow field of view makes it feel like you're wearing ski goggles.
Yet even now VR bombards our visual system with more information than it can process. Engineers are grappling with how to make headset displays match up with what we can biologically handle. If they fail, VR could hit a ceiling where it requires too much computing power to make virtual worlds look realistic.
One of the key challenges is that most of the pixels currently displayed by headsets are, in some sense, wasted. To understand why, consider how we see. Our vision consists of a high resolution fovea in the centre, surrounded by a much more blurry periphery, which has evolved to be good at detecting motion but far worse at fine detail and colour.
"All of our fine resolution takes place in the central one degree," says Tim Meese, professor of vision science at Aston University. Graphics card maker Nvidia has calculated that around 96 per cent of the pixels in a VR headset are viewed in our periphery, rather than the central fovea.
The technological race is therefore on to develop "foveated rendering": headsets that display a small, high-resolution spot that follows where we are looking, but a steadily lower quality image in our periphery, in order to save on processing power.
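As a rough illustration of the idea, and not any vendor's actual method, the sketch below picks a rendering quality for each pixel from its angular distance to the gaze point. Every name and number in it is an assumption made for the example: the five-degree full-quality zone, the linear falloff, the quality floor and the 10 pixels-per-degree display are hypothetical values, not figures from Nvidia or anyone else quoted here.

```python
import math


def eccentricity_deg(pixel, gaze, pixels_per_degree):
    """Angular distance in degrees between a pixel and the gaze point.

    Treats the display as having a uniform pixels-per-degree figure,
    which is a simplifying assumption.
    """
    dx = pixel[0] - gaze[0]
    dy = pixel[1] - gaze[1]
    return math.hypot(dx, dy) / pixels_per_degree


def shading_scale(ecc_deg, fovea_radius_deg=5.0, falloff_per_deg=0.05, min_scale=0.25):
    """Fraction of full resolution to render at a given eccentricity.

    Full quality inside the assumed foveal zone, then a steady linear
    drop, clamped to a floor so the periphery is never degraded without limit.
    All default values are hypothetical.
    """
    if ecc_deg <= fovea_radius_deg:
        return 1.0
    scale = 1.0 - falloff_per_deg * (ecc_deg - fovea_radius_deg)
    return max(min_scale, scale)


if __name__ == "__main__":
    gaze = (540, 600)  # where the eye tracker says the user is looking (hypothetical)
    for pixel in [(540, 600), (640, 600), (1040, 600)]:
        ecc = eccentricity_deg(pixel, gaze, pixels_per_degree=10.0)
        print(pixel, round(ecc, 1), shading_scale(ecc))
```

A real system would need the gaze point from an eye tracker on every frame, and would most likely use a falloff tuned to measured human acuity rather than a straight line.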
There's a parallel between foveated rendering and how evolution has honed our visual system. Both processes are about "where best should we put the effort" of visualising the world, argues Meese.
By some estimates, given a field of view of 180 degrees, around 74 gigabytes of visual data are available to us each second, but only around 125 megabytes are ultimately processed, he explains, "so a lot less" than we could theoretically take in.
Our peripheral vision can spot movement – useful when there is a lion sneaking up in the bushes, for example – which allows us to then use our fine-detail fovea to see if there really is a predator about to pounce, Meese says. But if our entire visual field had the same acuity as our fovea, we'd need an optic nerve and brain perhaps a hundred times bigger to process all the information.
Yet certain quirks of our visual system make foveated rendering more complex than it might seem. For a start, simply blurring the periphery in a headset has made some users experience tunnel vision. Nvidia shed some light on why: our ability to detect the presence of contrast – the difference between light and dark objects – in our peripheral vision peters out more slowly than our ability to resolve details.
"If you over-blur the periphery, your visual system doesn't have anything to latch on to out there, and it makes you feel nauseated and dizzy," explains Dave Luebke, vice president of graphics research at the GPU giant.
We are also surprisingly good at picking out human faces in our peripheral vision, meaning they might have to be rendered in more detail than other objects. "There is indeed evidence that you can sense the average expression of faces in the crowd without ever looking at them directly," he says.
Foveated rendering also requires headsets that can track where you are looking, something seemingly not lost on the big tech companies. In June last year, reports surfaced that Apple had bought SensoMotoric Instruments, a company based near Berlin which has demonstrated foveated rendering in current-generation headsets.
A headache to perfect
But getting eye-tracking technology good enough "is really hard, and even the best companies out there are sort of only barely good enough," says Luebke, as tracking can be thrown off by things like contact lenses and mascara. Individual differences in our face and eyes mean that "they are not robust enough across the population". To be commercially viable, they need to work across 99.9 per cent of people, he thinks (earlier this year, SMI said that their technology currently worked for 98 per cent of the population).
Why is it so important to get foveated rendering to work? Last year, Michael Abrash, chief scientist at Oculus, warned that making VR sharp enough so that it was "good enough to pass a driver's licence test" required an "order of magnitude" more computing power.
Even now, with individual pixels still visible in the very best headsets, VR gaming is around seven times as computationally expensive as gaming on a PC monitor, Nvidia has calculated. Not only do you need a screen per eye, doubling the number of pixels served, but VR games need to run at a minimum of 90Hz (frames per second).
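To see how a figure of that size can arise, here is a back-of-the-envelope sketch that compares pixels drawn per second. The resolutions are assumptions chosen for illustration and are not given in this article: a 1080p monitor running at 30Hz, and a combined render target of 3,024 by 1,680 pixels for both eyes at 90Hz.

```python
def pixels_per_second(width, height, refresh_hz):
    """Raw pixel throughput: pixels per frame times frames per second."""
    return width * height * refresh_hz


# Assumed workloads, for illustration only.
monitor = pixels_per_second(1920, 1080, 30)  # 62,208,000 pixels per second
headset = pixels_per_second(3024, 1680, 90)  # 457,228,800 pixels per second

ratio = headset / monitor

if __name__ == "__main__":
    print(f"monitor: {monitor:,} px/s")
    print(f"headset: {headset:,} px/s")
    print(f"ratio:   {ratio:.2f}x")
```

Under those assumptions the headset has to draw roughly seven times as many pixels each second. Raw pixel count is only a proxy for rendering cost, so this should be read as a plausibility check and not as Nvidia's own working.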
Both Abrash and Luebke think foveated rendering is so important to the VR industry that the eye-tracking problem will be solved within five years. According to Abrash, the failure to crack foveated rendering is "the greatest single risk factor" to his predictions of dramatic improvements in VR realism over the next five years.
VR is also testing the limits of our visual system when it comes to how quickly the screen refreshes. Video games on a monitor are playable at 30Hz, but VR games tend to run at 90Hz and above. Compared to a monitor, VR envelops far more of our peripheral vision, which, according to Meese, is much more sensitive to flicker than the fovea.
The centre of our vision "doesn't care too much about movement," explains Laurence Harris, a professor of psychology, kinesiology, health sciences and biology at York University in Canada. "Motion detection is most important in the periphery, to tell you about your own movement and to detect animals trying to creep up on you."
Nvidia has been experimenting with displays for VR and AR (augmented reality, which layers 3D objects over the real world using transparent visors such as Microsoft's Hololens) that boast frame rates well beyond even 90Hz. Last year, Luebke demonstrated a 1,700Hz VR display, while another team, also involving company researchers, has demonstrated an even faster AR display, which updates at 16,000Hz.
The purpose of super-fast refresh rates is to cut down the time between us moving our heads and the display in a VR headset changing to reflect that motion – so-called latency. "Above a certain point you're not going to perceive the flicker even in your peripheral vision... but you can still have the benefit from driving the latency down," Luebke says.
With so many more frames per second, it becomes possible to add in very slight changes in perspective almost instantaneously when we move our head in VR, driving latency down and making it feel like our bodies in VR are reacting immediately to our real movements, he says. In VR games, the AI and physics might be calculated at no more than 20 frames per second; the rendering then happens at 90Hz; and finally, a super-fast refresh rate allows us to "bolt something on top of that" to make movement feel even more responsive, he says.
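The gap between those layers is easier to appreciate as time per update. The short sketch below converts each rate mentioned in this article into the interval between successive updates; the labels are just for the example, and the interval is only one component of total latency, which also depends on tracking, rendering and the display hardware.

```python
def frame_interval_ms(rate_hz):
    """Time between successive updates, in milliseconds, at a given rate."""
    return 1000.0 / rate_hz


# Rates quoted in the article; the labels are illustrative.
rates_hz = {
    "AI and physics": 20,
    "rendering": 90,
    "1,700Hz VR display": 1700,
    "16,000Hz AR display": 16000,
}

intervals_ms = {name: frame_interval_ms(hz) for name, hz in rates_hz.items()}

if __name__ == "__main__":
    for name, ms in intervals_ms.items():
        print(f"{name}: {ms:.4f} ms between updates")
```

At 90Hz a new frame arrives about every 11 milliseconds, whereas a 1,700Hz display can show a corrected view of the most recent frame in well under a millisecond.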
But in one respect VR still lags far behind our biological limits: we are nowhere near a pixel density that mimics real life.
In theory, in our fovea, we need about 120 pixels per degree of view to match reality (although Meese says in practice, people generally can't see in finer detail than around 80 pixels per degree). Currently, the best headsets manage about 10 pixels per degree horizontally. Given the need to scale up by a factor of about 10 – on both axes – the increase in resolution required is enormous. "I don't think the technology is there yet for those display pixel densities," says Bryan William Jones, a retinal neuroscientist at the University of Utah.
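Because resolution has to rise on both axes, the total number of pixels grows with the square of the linear factor. A minimal sketch of that arithmetic, using the pixels-per-degree figures quoted above and assuming the field of view stays the same:

```python
def linear_scale(target_ppd, current_ppd):
    """How much finer the display must be along one axis."""
    return target_ppd / current_ppd


def pixel_count_scale(target_ppd, current_ppd):
    """Growth in total pixel count, since both axes scale together."""
    return linear_scale(target_ppd, current_ppd) ** 2


current = 10      # pixels per degree, best current headsets
theoretical = 120  # pixels per degree, theoretical foveal limit
practical = 80     # pixels per degree, Meese's practical figure

if __name__ == "__main__":
    for label, target in [("theoretical", theoretical), ("practical", practical)]:
        print(label, linear_scale(target, current), pixel_count_scale(target, current))
```

Going from 10 to 120 pixels per degree means 12 times the detail on each axis and 144 times as many pixels overall; even the more modest 80 pixels per degree implies 64 times as many.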
And for VR obsessives who will settle for nothing less than a perfect replica of reality, even 120 pixels per degree might not be enough. Put two lines above each other and move one slightly to the left or right, and it turns out we are "extraordinarily sensitive" to even the tiniest differences between them, says Meese, even to movements smaller than the width of a cone in the eye.
To match this sensitivity on a computer monitor would require a pixel density that "beggars belief", he says, and well beyond 120 pixels per degree. The US Air Force has estimated that a computer screen would require 10,300 pixels per inch to simulate these so-called "hyper acuities". This is more than 30 times the pixel density of an iPhone 7 (and 12 times the density boasted by a new Samsung VR display, revealed in June, which packs in around 850 pixels per inch).
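Those multiples can be checked with a couple of divisions. The sketch below uses the 10,300 and 850 pixels-per-inch figures from the text, plus the iPhone 7's published density of 326 pixels per inch, which is not stated in this article.

```python
HYPERACUITY_PPI = 10_300  # US Air Force estimate quoted in the article
IPHONE_7_PPI = 326        # Apple's published figure (not from the article)
SAMSUNG_VR_PPI = 850      # approximate figure quoted in the article


def density_ratio(target_ppi, reference_ppi):
    """How many times denser the target is than the reference display."""
    return target_ppi / reference_ppi


vs_iphone = density_ratio(HYPERACUITY_PPI, IPHONE_7_PPI)
vs_samsung = density_ratio(HYPERACUITY_PPI, SAMSUNG_VR_PPI)

if __name__ == "__main__":
    print(f"vs iPhone 7:   {vs_iphone:.1f}x")
    print(f"vs Samsung VR: {vs_samsung:.1f}x")
```

The results come out at a little under 32 times and a little over 12 times respectively, in line with the rounded figures above.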
Such fine-grain vision is rarely used in the real world, says Meese. "The closest example you can think of is threading the eye of a needle, or something like that," he thinks. But it serves as a reminder that imitating the look of real life in VR remains a technological pipe dream – don't expect to do any VR sewing in the next decade or two. ®