For all the fuss over algorithms and machine learning, computation can't quite compete with people when it comes to lossy image compression, it is claimed.
Comp-sci boffins from Stanford University and student interns from three San Francisco Bay Area high schools in the US devised a system to assess how code instructions interpreted by a computer differ from text instructions interpreted by a person when attempting to compress an image. And "compress" is used loosely here.
The researchers – Ashutosh Bhown, Soham Mukherjee, Sean Yang, Shubham Chandak, Irena Fischer-Hwang, Kedar Tatwawadi, and Tsachy Weissman – focused on lossy image compression techniques (in which data is subtracted, degrading the image, in order to compress it) because existing algorithms don't do enough to consider human perception and tend to produce results that look blurred or unnatural.
They describe their work in a paper titled, "Humans are still the best lossy image compressors."
"Some compression methods, for example, take advantage of the fact that human vision is more susceptible to differences in intensity than in color, and quantize color space more crudely than intensity space in order to achieve better compression performance," their paper explains.
Hoping to understand how image crunching algorithms might be made more deferential to human perception, the boffins set up a system by which a describer – via Skype text chat and links to online resources – directs a reconstructor to reduce the file size of an image using PhotoScape X, an image editing tool.
Or as the eggheads put it:
In this work, we perform compression experiments in which one human describes images to another, using publicly available images and text instructions.
The results of this interaction were then evaluated for visual aesthetics by workers from Amazon Mechanical Turk. The aim is to have one person describe to another how to edit an image to reduce its complexity and size, rather than rely on an algorithm. The reconstructor takes public images and other online resources and glues them together to approximate the original image, following instructions from the person viewing the original.
This, in a crude way, compresses the image – text instructions and URLs to images don't take up much storage space – although it's, well, a relatively close approximation rather than a direct transformation. That's a nice way to say the final result looks odd, though it is a compression of sorts.
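To see why a chat log counts as compression at all, consider the byte counts involved. The following sketch uses made-up numbers and a placeholder transcript and URL, not data from the study; it simply shows that a few hundred bytes of text standing in for a multi-megabyte photo yields a very large compression ratio.

```python
# Back-of-the-envelope "compression ratio" of a text transcript.
# All sizes and strings below are illustrative assumptions, not
# figures or URLs from the researchers' dataset.

transcript = (
    "make the right one bigger than the left "
    "move the right giraffe to the left so that their necks cross "
    "https://example.com/giraffe.jpg"  # placeholder link, not from the study
)

original_image_bytes = 2_000_000  # assume a roughly 2 MB source photo
compressed_bytes = len(transcript.encode("utf-8"))

ratio = original_image_bytes / compressed_bytes
print(f"transcript: {compressed_bytes} bytes, ratio about {ratio:.0f}x")
```

The exact ratio depends entirely on the assumed image size and transcript length, but the point survives any reasonable choice: text instructions are orders of magnitude smaller than the pixels they describe.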
Another aim of the experiment is repeatability. The chat transcripts between the describer and the image reconstructor represent the conversational code by which the image processing can be repeated, at least to the extent that's possible given that the exact values applied to operations in the photo editing app aren't captured.
Here's a sample excerpt:
when you're done with that take a look at these https://public-media.smithsonianmag.com/filer/32/f2/32f24473-b380-43f5-94df-da0e58644439/16301090250_acf80be87f_o.jpg https://img.purch.com/w/192/aHR0cDovL3d3dy5saXZlc2NpZW5jZS5jb20vaW1hZ2VzL2kvMDAwLzA2OC8wOTQvaTMwMC9naXJhZmZlLmpwZz8xNDA1MDA4NDQy sure while you're editing that giraffe its spots are too dark make it look like the other giragge… make the right one bigger than the left make the heads level wait back put the left one where it was before good now move the right giraffe to the left so that their necks cross good move them both to the center make them both taller as well their heads should be above the middle line of shrubs…
The researchers, who have made their code and data available, asked the Mechanical Turk workers to compare the results of this collaboration to images processed with WebP, an image compression algorithm developed by Google. And they found people produced more pleasing results than WebP.
"The human compressor was ranked higher than WebP on 10 out of the 13 images from the dataset," their paper says. "Qualitatively, the human reconstructions seem more natural and sharper to the MTurk workers, as compared to the WebP compressed images while still achieving high compression ratios, ranging from around 100x to 1000x."
However, this system isn't intended as a replacement for automated algorithms. As the boffins acknowledge, human-driven compression is impractical because it is time- and labor-intensive, and conversational instructions are not optimized.
They observe that their human compression scheme makes use of online image resources as points of reference and claim that algorithms could do something similar to achieve a better compression ratio.
"We plan to use the insights obtained from this work to build an image compressor that is both optimized for human perception loss and able to utilize side information in the form of publicly available databases," they conclude.
People may not produce the best results for long, however. As the researchers point out, machine learning practitioners have shown that they can generate visually appealing, highly compressed images using generative adversarial networks (GANs). ®