As yet another mobile phone vendor produces a 3G phone with a camera that points only away from the user, is it time to forget about making mobile video calls? This time it was Nokia, with its latest Symbian OS-based phone. If the camera doesn't point at the user, why have one at all?
Even in Japan, where camera phones are widely used, the most common use of transmitted images is "see what I see", not "see me". This is odd, because videophones have long been heralded - from science fiction to science fairs - as the next big thing for telephony. Indeed, the mobile version of the application featured in all the early advertising when Hutchison 3 launched its 3G service in the UK.
Why should users pay for a "see what I see" image? The same reasons they send postcards, and take and share any photo: to prove they were somewhere doing something. Then there are various derivatives of the "yes dear" application. You know how it goes: "I'm thinking of getting this, what do you think?" Click, send. "Yes dear, that looks fine."
On a related subject, is video conferencing as an application currently flawed? Most people I speak to, even in technology providers, are not great fans of video conferencing, so is there any hope for mobile video calls?
Most systems are getting over the jerky limitations of bandwidth, and regular users probably accommodate any remaining image jumps. The problem is one of visual impact, and it is immediately recognisable to any portrait photographer: there's no eye contact, and the person in shot doesn't stand out from the background. So how about addressing the optics rather than just the bandwidth?
Some useful work is being undertaken in the machine learning group at Microsoft Research in Cambridge, which could dramatically improve the visual cues in video conferencing applications. Today it requires the power of a well-specified PC, but with modest gains in processing power it could run at lower resolution on mobile devices.
The research project is called i2i, and uses the stereo vision of two side-by-side cameras. This provides enough information to determine the distance from the camera pair of every feature in view. It means a two-dimensional image of a face can be turned into a three-dimensional object, which can be rotated so that the apparent gaze line is directed at the camera, and therefore at the remote user. This avoids one of the biggest problems of video conferencing visual cues: participants look at images on screen, not directly at the camera, so eye contact is lost.
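The depth recovery behind this is standard stereo geometry: a feature's distance follows from its disparity, the horizontal offset between its positions in the two camera images. A minimal sketch, assuming a rectified side-by-side pair; the focal length and baseline figures are made-up example values, not i2i's:

```python
# Illustrative only, not the i2i implementation: triangulating depth
# from a rectified stereo pair. Depth = focal length x baseline / disparity.

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance (metres) of a feature seen disparity_px apart in two views."""
    if disparity_px <= 0:
        raise ValueError("feature must have positive disparity")
    return focal_length_px * baseline_m / disparity_px

# A facial feature offset by 20 pixels between cameras 6 cm apart,
# with a hypothetical 700-pixel focal length, sits about 2.1 m away:
print(round(depth_from_disparity(20, 700.0, 0.06), 2))
```

Once every pixel of the face has a depth, the flat image becomes a 3D surface that can be re-rendered from a virtual camera on the user's gaze line.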
The relative location of the face is also known. This allows another very valuable visual effect: the background of the image can be pushed out of focus relative to the subject, the face of the other person on the call. This makes the person stand out and have more impact, just as they do in reality when your eyes focus on the subject, not the background. The idea can be pushed even further with virtual backgrounds and additional objects, such as 3D icons in the foreground. But some of these ideas make even the animated paperclip helper look sensible.
However, it's the simpler fixes that will have the most impact: eye contact and background defocusing. Once they're sorted, video conferencing images will have many of the qualities of being there in person. For fixed or perhaps future mobile video calls, it's not the smart use of bandwidth that is holding back usage; it's the smart use of processing power and camera optical effects to bring us closer to reality.