Little grouse on the prairie: IBM's AI facial-recognition training dataset gets it in trouble... in Illinois

Photo subjects fling class-action sueball

A class-action lawsuit filed late last week has accused IBM of using photos of millions of people in Illinois without informing them to build a facial-recognition dataset.

In the filing (Bloomberg Law has a copy here), lead plaintiff Tim Janecyk alleges that IBM violated the state's Biometric Information Privacy Act (BIPA) by using at least seven of his photographs from Flickr without informing him or the subjects of his photos.

The "Diversity in Faces" dataset comprises millions of images pulled from Flickr, a popular image-sharing site, under Creative Commons licences. This means that the images could legally be shared with third parties – under certain conditions outlined in the various CC licences, several of which preclude their use for commercial purposes.

IBM has always claimed the dataset, which is used to train other facial-recognition systems to be less biased, is meant as an academic resource. The dataset is not publicly available and users need to be granted permission before they can access it.

In the vast majority of cases, this has meant the images were above board. Except, or so the plaintiffs will be hoping, in the Prairie State of Illinois, which has its own specific state legislation on the matter.

Illinois's BIPA is a 2008 US state law that requires businesses that collect or otherwise obtain biometric information – such as fingerprints, retina scans, or Flickr photos – to obtain written consent from the individuals involved.

Janecyk and the other members of the class have asked IBM to cough up $5,000 for every photo it used without consent.

Several states have similar laws, including Washington and Texas, but only Illinois allows individuals, rather than companies, to file for damages as a result of a violation.

IBM said in a statement: "We believe the allegations from the plaintiff's complaint are meritless, and we intend to defend vigorously against them."

This is not the first time that IBM's facial-recognition work has come under fire. Last year, a number of Flickr users spoke out when they were told their photos were being used to train facial-recognition algorithms. The image owners found it difficult to get their images removed from IBM's dataset, and it was impossible to delete them from copies that had already been given to researchers.

The Creative Commons group responded to the reports, saying that "fair use allows all types of content to be used freely".

Flickr and the plaintiff did not respond to The Register's requests for comment. ®
