What is Model Collapse and how to avoid it

We chat to AI expert Ilia Shumailov about the pitfalls of using machines to train machines

Feature What happens to machine learning models when they feed on themselves, when the data they ingest comes more and more from other generative models rather than human authors?

This is already happening as the output of text models like ChatGPT and Bard, and of text-to-image models like Stable Diffusion, shows up on websites, gets scraped, and becomes fodder for further model training.

Last year, a group of researchers affiliated with universities in the UK and Canada asked this question, and the answer they found suggests that data gathering and training practices need to account for this phenomenon.

The researchers – Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, and Ross Anderson – found that models fed on their own output stop working well, particularly in the model's tail – low-probability events for which there's not a lot of data. They call this phenomenon "Model Collapse," which they describe in their paper, "The Curse of Recursion: Training on Generated Data Makes Models Forget."

"Model Collapse is a degenerative process affecting generations of learned generative models, where generated data end up polluting the training set of the next generation of models; being trained on polluted data, they then misperceive reality," they explain.

Ilia Shumailov, lead author of the paper and a junior fellow at the University of Oxford at the time this research was done, spoke with The Register about the research findings.

The Register:

Is the phenomenon of audio feedback – where a mic captures and recaptures its own sound output from a loudspeaker – an appropriate analogy to understand Model Collapse?

Shumailov:

A deep answer is "It depends." A more high-level answer is, "Yeah, kinda."

If you ask me as a technology person, I would probably say no, because most of our distortions are the same. And thus, by basically replaying it, you're probably going to have a constant amount of distortion. And it's probably not even going to be noticeable.

Whereas the feedback loops in ML [machine learning] are a lot more intricate, in that there are a lot of biases that are inhaled from either learning procedures, or for example, from the architectures we end up using, because there is no science behind what architectures are better. And those biases, they don't just replace one another.

In many ways, they're biased in the same direction. And if you take something that is already biased, and you put additional bias, they end up amplifying the biases and, at some point, basically overtaking the signal overall.
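A toy calculation (an illustration, not something taken from the interview) shows how a small bias that always points the same way compounds. Assume, purely for the sake of argument, that each generation fits a Gaussian to n samples drawn from the previous generation's fit, and uses the maximum-likelihood variance estimate, which runs slightly low on average. The symbols here are assumptions introduced for this sketch: \hat{\sigma}_k^2 is the variance fitted at generation k, and \sigma_0^2 is the variance of the original human data.

```latex
\[
\hat{\sigma}_{k+1}^{2} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2},
\qquad x_i \sim \mathcal{N}\!\left(\hat{\mu}_k,\ \hat{\sigma}_k^{2}\right)
\]
\[
\mathbb{E}\!\left[\hat{\sigma}_{k+1}^{2} \,\middle|\, \hat{\sigma}_k^{2}\right]
  = \frac{n-1}{n}\,\hat{\sigma}_k^{2}
\quad\Longrightarrow\quad
\mathbb{E}\!\left[\hat{\sigma}_{k}^{2}\right]
  = \left(\frac{n-1}{n}\right)^{k}\sigma_0^{2}
\]
```

With n = 100, each generation underestimates the spread by only 1 percent on average, yet after 100 generations the expected variance is down to roughly 0.37 of the original, because (0.99)^100 is about 0.366. The bias never cancels out; it multiplies.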

You know, when people talk about hallucinations in LLMs and say this is a problem? This is not really a problem because hallucinations are generalizations over areas. And so it happens that those generalizations are wildly inappropriate for the world we live in.

But if you think about this, in many cases, those hallucinations could have happened, right? And this is just a given model that has imagined the world from all the data observed where those effects are true. If I say something like, "Oh, Trump went to the moon." Then you can imagine the world in which Trump went to the moon.

But then if you write this down, and you write essays about it, and some ML model takes this data, and it's like, "I was also thinking in presence of all the other data that you know, he's pals with Elon Musk, and together they go to the moon." And then it starts rambling on, creating new hypotheticals that don't exist in the real world.

So what our paper talks about is that, as of right now, if you take all the content that humans have produced, overall, all this content together, it forms this underlying distribution of things that humans are capable of producing.

Now, if you then take all of this and you train a model on top of this – all of this distribution of data that exists out there that humans have produced and thus, they are valid human-produced things, including facts as themselves – and then you ask a model to model the whole thing and start generating data – which is statistically indistinguishable from this distribution of data – the model inherently is going to make mistakes.

And it always will make mistakes. It's infeasible to assume that in some hypothetical future, we'll build perfect models. It's impossible. And we can bring a lot of philosophical arguments why it's impossible. That means any data that it will produce, with a relatively high probability, it's going to have a lot of errors.

But more nuanced, it's also going to have a lot of biases, in places where we don't even think about biases. And those biases are then getting inhaled by other third-party models that in turn observe those biases. And their perception of the underlying distribution – this thing that all humans have produced – kind of gets shifted.

The biases end up counteracting each other and amplifying each other. And overall, by [the nth generation of the model], you observe that suddenly the perception of the real world, of this distribution of all human data that the model has, has nothing to do with reality whatsoever.
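The loop described above can be mimicked with a deliberately crude simulation. This is an illustrative sketch, not code from the paper: the "model" is just a table of token frequencies, each generation is "trained" by counting a finite sample drawn from the previous generation, and the function names, the Zipf-like starting distribution, and the default sizes are all assumptions made for the example.

```python
import random
from collections import Counter


def next_generation(probs, n_samples, rng):
    """Sample n_samples tokens from probs, then refit by counting frequencies."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    sample = rng.choices(tokens, weights=weights, k=n_samples)
    counts = Counter(sample)
    # Tokens that were never sampled get no entry, so they are gone for good.
    return {token: count / n_samples for token, count in counts.items()}


def simulate(n_tokens=50, n_samples=200, generations=30, seed=0):
    """Return the number of distinct tokens the model still knows per generation."""
    rng = random.Random(seed)
    # Assumed "human" distribution: a few common tokens and a long tail of rare ones.
    raw = [1.0 / (rank + 1) for rank in range(n_tokens)]
    total = sum(raw)
    probs = {rank: weight / total for rank, weight in enumerate(raw)}

    support_sizes = [len(probs)]
    for _ in range(generations):
        probs = next_generation(probs, n_samples, rng)
        support_sizes.append(len(probs))
    return support_sizes


if __name__ == "__main__":
    print(simulate())
```

In this toy setup a token that draws zero samples in one generation has zero probability in every later one, so the count of known tokens can only fall or stay flat. The rare tokens are the first to go, which is the tail loss the researchers describe, in miniature.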

The Register:

Have you observed this with models in the wild?

Shumailov:

Since we released the paper, there have been a couple of other papers noting that's exactly what they observed. As a matter of fact, this is now a very active field of basically training regimes in which you end up inhaling synthetic data and you want to account for distortions that get introduced.

You'll find plenty of those papers. Every single paper that comes out nowadays that claims that they can do this self-supervisory loop, they are assuming that they're capable of filtering this data or they have an external guide or a reward function that basically allows them to say, "Okay, this looks like bias with a certain amount of probability. So I should probably not include this into my training." So it does happen.
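In code, the filtering step described above amounts to scoring each synthetic sample and dropping anything over a threshold. The sketch below is a minimal illustration under stated assumptions: `filter_synthetic`, `toy_bias_probability`, and the word list are hypothetical names invented here, and the scorer is a trivial stand-in for whatever external guide or reward model a real pipeline would use.

```python
from typing import Callable, Iterable, List


def filter_synthetic(
    samples: Iterable[str],
    bias_probability: Callable[[str], float],
    max_bias: float = 0.5,
) -> List[str]:
    """Keep only samples whose estimated bias probability is at most max_bias."""
    return [s for s in samples if bias_probability(s) <= max_bias]


def toy_bias_probability(text: str) -> float:
    """Hypothetical scorer: the fraction of words that are sweeping absolutes.

    A real system would call a trained classifier or reward model here.
    """
    flagged = {"always", "never", "everyone"}
    words = [w.strip(".,!?") for w in text.lower().split()]
    if not words:
        return 0.0
    return sum(w in flagged for w in words) / len(words)


if __name__ == "__main__":
    generated = ["Cats always never sleep", "Cats sleep a lot"]
    print(filter_synthetic(generated, toy_bias_probability, max_bias=0.4))
```

The quality of the whole loop then rests on the scorer: a filter that shares the generator's blind spots will wave the same distortions through.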

The only problem with that is as an outsider and as a consumer, you're very unlikely to ever encounter this on a day-to-day basis, because even if you assume that there exists a [model] generation x, which was good, and then x plus one is suddenly experiencing some sort of collapsing behavior for some sort of fairness metric – it becomes more racist because it observed more racist data – then more likely than not, people who run massive evaluation suites or behaviors of those models are going to actually notice this. And they will basically make sure that a model like this is never gonna see the real world.

Or they also run additional training with the data they have to accommodate the sorts of distortions that have been introduced.

So as a consumer, I'm pretty sure we will probably not see such effects. It's more likely that it's just ever-changing business models because people can decide what they want these models to do, and what the consumer is expected to pay, rather than them just not capturing degradation of their models. But generally speaking, 100 percent this happens.

The Register:

How serious do you consider Model Collapse to be in light of the other issues facing the machine learning community?

Shumailov:

I think it's not going to be that much of a problem for rich companies. It's going to be a big problem for poor companies. Take an arbitrarily big company. They have enough cash to get people to label more data. And we know for a fact that this happens already. They pay – the amount of human evaluations big companies do and the amount of annotations that they harvest for, in very specific domains, is massive.

And the only thing that it will mean is that perhaps tomorrow data for smaller companies is going to cost more than for bigger companies.

The Register:

In your paper, you suggest that community coordination on data provenance is one approach for dealing with Model Collapse. Has there been any support for that idea?

Shumailov:

The answer is yes and no. If you look at The White House commitments, I'd say the answer is yes. They are betting quite a lot on provenance. How well this will work is a very good question. Because for many of the problems we talk about, the solutions are either not bulletproof – they work only some of the time – or we are not really modeling the phenomenon we're talking about precisely enough.

Imagine you're capable of actually telling that a given piece of content has been artificially produced and thus you would not involve it in training – using whatever method, right? So what happens tomorrow when humans start repeating after ML models, which is totally normal?

We observe a piece of text and we repeat it like parrots, especially if it's nicely written and those models are very good. So then I'm not sure at what point this idea that something is artificial is even going to mean anything.

Or imagine the world of tomorrow where everyone has a personalized news assistant or [some company like] The New York Times or whatever writes a set of facts. And then those facts are actually presented to you in a personalized way where a model literally knows what you're thinking about. It knows all the things you know about so it connects to personal stuff. And then presumably the quality of such content is going to be much better than writing something for the general audience.

The sort of attention that an individual is going to express to this piece of news is going to be better. So in this regard, I would probably argue that artificial content is probably going to be richer than human content.

So there are a lot of questions like this, but fundamentally on a more technical mathematical level, we already know for a fact that Model Collapse will happen. And what happens tomorrow [to the vast sea of human-generated content once machines have a say]? It's a good question. What's going to happen once a machine learning model starts basically dictating what appears in this vast sea? Sometimes [these models] are definitely going to be amplifying biases. … The world is going to change. But to what extent technical solutions will be used to solve other technical problems is unclear.

The Register:

Have you heard of any contrary examples, where synthetic data makes models better? I was speaking with Simon Willison, who said that he'd be interested to sort of hear more about your paper when I mentioned it. He said he'd heard the opposite, that some people who are working with LLaMA and feeding in LLaMA-generated content had been getting good results.

Shumailov:

There are cases where people have reported that they observe improvement in performance. And it's quite clear that this is going to happen. There exist cases where this self-improvement loop works, and I can give you plenty of examples of this. Imagine you have a model that is capable of doing a summation and the minus operation. It's totally plausible that you can ask this model to sum something n times and call this operation multiplication and the model suddenly realizes that it is capable of multiplication. So in this case, it suddenly realizes that it's capable of producing a lot more than originally it was ever taught.
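The arithmetic example can be written down directly. This is a toy sketch with hypothetical function names: an ordinary Python function stands in for the one skill the model was taught, and the rest shows how new, correct training examples fall out of composing it.

```python
from typing import List, Tuple


def add(a: int, b: int) -> int:
    """Stand-in for the skill the model was actually taught."""
    return a + b


def multiply_by_repeated_addition(a: int, n: int) -> int:
    """Derive multiplication by summing a with itself n times (n >= 0)."""
    if n < 0:
        raise ValueError("n must be non-negative in this sketch")
    total = 0
    for _ in range(n):
        total = add(total, a)
    return total


def synthetic_multiplication_examples(
    max_operand: int,
) -> List[Tuple[Tuple[int, int], int]]:
    """Generate ((a, n), a * n) training pairs using only the addition skill."""
    return [
        ((a, n), multiply_by_repeated_addition(a, n))
        for a in range(max_operand + 1)
        for n in range(max_operand + 1)
    ]


if __name__ == "__main__":
    print(multiply_by_repeated_addition(7, 6))
    print(synthetic_multiplication_examples(2))
```

The generated pairs are exact because every step is exact, so in this toy case the synthetic data carries no estimation error to pass along.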

Model Collapse is not talking about this. Model Collapse is talking about more fundamental shifts in the underlying distribution related to biases from algorithms, architectures, and sampling.

The Register:

What steps should the machine learning community take to address your findings?

Shumailov:

I think that there is really only one immediate thing we should talk about and that is understanding what we care about inside of our models.

That's because the first shifts that we see are shifts in [sparsely represented] data. So basically things that are badly represented in data and are badly understood by the models, they experience most of the immediate degradation in performance.
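One way to make that kind of degradation visible is to score a model separately on rare and common slices of the data instead of relying on a single average. The sketch below is illustrative only: the function names, the record format, and the group labels are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_group(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute accuracy per group from (group, prediction_was_correct) records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        correct[group] += int(is_correct)
    return {group: correct[group] / totals[group] for group in totals}


def summarize(records: Iterable[Tuple[str, bool]]) -> dict:
    """Report overall accuracy next to per-group accuracy and the worst group."""
    records = list(records)
    per_group = accuracy_by_group(records)
    overall = sum(ok for _, ok in records) / len(records)
    worst_group = min(per_group, key=per_group.get)
    return {"overall": overall, "per_group": per_group, "worst_group": worst_group}


if __name__ == "__main__":
    # Made-up results: 95 common cases all correct, 5 rare cases mostly wrong.
    results = [("common", True)] * 95 + [("rare", True)] + [("rare", False)] * 4
    print(summarize(results))
```

In the made-up data, the overall score is 96 percent while the rare group sits at 20 percent, a gap the headline number hides completely.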

Basically, we need very good evaluation metrics for ML models. We need to be able to model those low-probability events very well if we want to make sure that our models work for minority groups – where minority groups are defined as data that does not appear very often inside of the underlying data set. ®
