Updated
A German PhD student has found a flaw in some Xerox WorkCentres that fudges the numbers on some scans thanks to poor data compression.
Last Wednesday, computer science student David Kriesel was scanning in some building plans on a Xerox WorkCentre, and when checking the copies he found some of the dimensions of the plans had been altered, with three room measurements out of sync with the original documents. In particular, the number six was being changed to an eight.
The problem appears to affect documents scanned as PDFs without any character recognition (OCR) enabled, and Kriesel found that he could replicate the dodgy digits. He notes the flaw was most pronounced in 200 DPI scans of text set in seven- or eight-point Arial.
Kriesel found the flaw was present on the WorkCentre 7535 and 7556 models and posted a blog piece about his research. Judging from the wave of interest it generated, the problem is more widespread than he knew, with the flaw reportedly reproduced on six WorkCentre models and two of Xerox's ColorQube range.
As for the cause, Kriesel originally thought there must be something wrong with the copier's data-compression algorithm, which was recognizing low-resolution numbers improperly. Emails he received after publishing suggest it's a problem with the JBIG2 compression system those copiers use.
"This algorithm creates a dictionary of image patches it finds 'similar'," he said. "Those patches then get reused instead of the original image data, as long as the error generated by them is not 'too high'. Makes sense."
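To see how that kind of patch reuse can turn a six into an eight, here is a minimal Python sketch of the idea Kriesel describes. It is not Xerox's firmware or a real JBIG2 codec: the tiny 3x5 glyphs, the names `pixel_diff`, `compress` and `decompress`, and the `max_diff` threshold are all assumptions made up for illustration.

```python
from typing import List, Tuple

# A patch is a tiny bitmap: one string per pixel row, '#' = ink, '.' = paper.
# These 3x5 glyphs are invented for illustration only.
Patch = Tuple[str, ...]

EIGHT: Patch = ("###", "#.#", "###", "#.#", "###")
SIX: Patch = ("###", "#..", "###", "#.#", "###")


def pixel_diff(a: Patch, b: Patch) -> int:
    """Count the pixels that differ between two equally sized patches."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(patches: List[Patch], max_diff: int) -> Tuple[List[Patch], List[int]]:
    """Store each patch once; reuse a stored patch when it is 'similar' enough.

    max_diff is a hypothetical stand-in for the 'error not too high' test.
    """
    dictionary: List[Patch] = []
    refs: List[int] = []
    for patch in patches:
        for index, known in enumerate(dictionary):
            if pixel_diff(patch, known) <= max_diff:
                refs.append(index)  # reuse the stored patch instead of the original
                break
        else:
            dictionary.append(patch)  # nothing close enough, so store a new entry
            refs.append(len(dictionary) - 1)
    return dictionary, refs


def decompress(dictionary: List[Patch], refs: List[int]) -> List[Patch]:
    """Rebuild the page from dictionary references."""
    return [dictionary[i] for i in refs]


page = [EIGHT, SIX, EIGHT]
lossy = decompress(*compress(page, max_diff=1))
exact = decompress(*compress(page, max_diff=0))
```

In this toy case the six and the eight differ by a single pixel, so with `max_diff=1` the six is silently replaced by the eight already in the dictionary, while `max_diff=0` reproduces the page exactly. A real encoder works on scanned bitmaps with a more elaborate similarity test, but the failure has the same shape: a tolerance loose enough to treat two small, blurry digits as the same symbol.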
According to a statement released by Xerox, he's right. The company confirmed that the problem is caused by a combination of compression levels and resolution settings with the JBIG2 system, saying it has "inherent tradeoffs under low resolution and quality settings."
However, Xerox says that the problem is essentially due to the settings users have put into individual copiers. Xerox recommends using the factory default option of higher image quality when setting up the copier, and points out that if the lower quality option is picked, the following disclaimer is displayed:
"The normal quality option produces small file sizes by using advanced compression techniques. Image quality is generally acceptable, however, text quality degradation and character substitution errors may occur with some originals."
Legally, then, Xerox is in the clear, but that's going to be cold comfort if your newly built house extension collapses in the middle of the night. ®
Xerox has been in touch to say that a software patch has been issued for the scanning problem and reiterated that the flaw doesn't affect "standard printing, copying and traditional fax functions".