
It is possible to extract copies of images used to train generative AI models

Tools like DALL-E, Stable Diffusion, and Midjourney have memories after all

Generative AI models can memorize images from their training data, possibly allowing users to extract private or copyrighted data, according to research.

Tools like DALL-E, Stable Diffusion, and Midjourney are trained on billions of images scraped from the internet, including data protected by copyright like artwork and logos. They learn to map visual representations of objects and styles to natural language. When they're given a text description as input, they generate an image matching the caption as output.
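
For readers who want a concrete picture of how such a pipeline is driven, the sketch below uses Hugging Face's open source diffusers library to turn a caption into an image. The checkpoint name and prompt are chosen purely for illustration, and exact API details vary between library versions.

```python
# Minimal text-to-image sketch using the diffusers library.
# The checkpoint and prompt are illustrative; API details vary by version.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a publicly released Stable Diffusion checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# The caption is encoded, denoised into a latent image, and decoded to pixels.
image = pipe("a lighthouse on a rocky coast at sunset").images[0]
image.save("lighthouse.png")
```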

The new technology has sparked a fresh legal debate over copyright: do these tools violate intellectual property rights since they ingested copyrighted images without permission?

Lawsuits have been filed against makers of the most popular generative AI tools for infringing copyright. Companies building text-to-image models argue that since their software generates unique images, their use of copyrighted data is fair use. But artists who have seen their styles and work imitated by these tools believe they've been ripped off.

Now research conducted by a team at Google, DeepMind, the University of California, Berkeley, ETH Zurich, and Princeton University demonstrates that images used to train these models can be extracted. The models memorize some of their training images and can generate precise copies of them, raising new copyright and privacy concerns.


Some examples of images the researchers managed to extract from Stable Diffusion

"In a real attack, where an adversary wants to extract private information, they would guess the label or caption that was used for an image," co-authors of the study told The Register.

"Fortunately for the attacker, our method can sometimes work even if the guess is not perfect. For example, we can extract the portrait of Ann Graham Lotz by just prompting Stable Diffusion with her name, instead of the full caption from the training set ("Living in the light with Ann Graham Lotz").


Only images memorized by the model can be extracted, and how much data a model memorizes varies depending on factors such as its training data and its size. Images duplicated in the training set are more likely to be memorized, and models with more parameters are more likely to be able to remember images too.

The team was able to extract 94 images from 350,000 examples used to train Stable Diffusion, and 23 images from 1,000 examples used to train Google's Imagen model. For comparison, Stable Diffusion has 890 million parameters and was trained on 160 million images, while Imagen has two billion parameters – it's not clear exactly how many images were used to train it.
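
Taken at face value, those figures work out to very small hit rates relative to the number of examples tested, as this back-of-the-envelope calculation shows.

```python
# Back-of-the-envelope extraction rates based on the figures quoted above.
stable_diffusion_rate = 94 / 350_000   # roughly 0.027% of the examples tested
imagen_rate = 23 / 1_000               # 2.3% of the examples tested

print(f"Stable Diffusion: {stable_diffusion_rate:.4%}")
print(f"Imagen:           {imagen_rate:.4%}")
```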

"For Stable Diffusion, we find that most memorized images were duplicated 100 times or more in the training set, but some as few as 10 times," the researchers said. "For Google's Imagen model, which is a larger model than Stable Diffusion and trained on a smaller dataset, memorization appears to be much more frequent. Here we find some outlier images that are present just a single time in the entire training set, yet are still extractable."

They aren't quite sure why larger models tend to memorize more images, but believe it may have something to do with their capacity to store more of their training data in their parameters.

Memorization rates for these models are pretty low, and in reality extracting images would be tedious and tricky. Attackers would have to guess and try numerous prompts to lead a model into generating memorized data. Still, the team is warning developers to refrain from training generative AI models on private, sensitive data.

"How bad memorization is depends on the application of the generative models. In highly private applications, such as in the medical domain (e.g. training on chest X-rays or medical records), memorization is highly undesirable, even if it only affects a very small fraction of users. Furthermore, the training sets used in privacy sensitive applications are usually smaller than the ones used to train current generative art models. Therefore, we might see a lot more memorization, including images that are not duplicated," they told us.

One way to prevent data extraction is to decrease the likelihood of memorization in models. Getting rid of duplicates in the training dataset, for example, would minimize the chances of images being memorized and extracted. Stability AI, the maker of Stable Diffusion, reportedly trained its newest model on a dataset containing fewer duplicates, independently of the researchers' findings.
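
One common way to thin out duplicates in a large image dataset is perceptual hashing, where visually near-identical files hash to similar values and can be dropped before training. The sketch below uses the imagehash library purely as an illustration of that general technique; it is not a description of Stability AI's actual pipeline.

```python
# Illustrative dedup pass using perceptual hashing (not Stability AI's actual
# pipeline): images whose hashes nearly collide are treated as duplicates.
from PIL import Image
import imagehash

def deduplicate(paths, max_distance=4):
    seen = []   # hashes of images we have decided to keep
    kept = []
    for path in paths:
        h = imagehash.phash(Image.open(path))
        # Hamming distance between perceptual hashes approximates visual similarity.
        if all(h - prev > max_distance for prev in seen):
            seen.append(h)
            kept.append(path)
    return kept
```

The principle is the one the researchers point to: the fewer times an image appears in the training set, the less likely it is to be memorized.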

Now that it's been shown that text-to-image models can generate exact copies of images they were trained on, it's not clear how this could affect copyright cases.

"A common argument we had seen people make online was some variant of 'these models never memorize training data'. We now know that this is clearly false. But whether this actually matters or not in the legal debate is also up for debate," the researchers concluded.

"At least now, both sides in these lawsuits have some more tangible facts they can rely on: yes, memorization happens; but it is very rare; and it mainly seems to happen for highly duplicated images." ®
