AI models collapse when trained on recursively generated data
(www.nature.com)
You realize that those "billions of dollars" have actually produced a solution to this? "Model collapse" has been known about for a long time, and follow-up research figured out how to avoid it. Modern LLMs actually turn out better when they're trained on well-crafted, well-curated synthetic data.
Honestly, everyone seems to assume that machine learning researchers are simpletons who've never used a photocopier before.
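The standard recipe the comment alludes to is simple: keep a grounding share of real data in every training mix and only admit synthetic examples that pass a quality filter, rather than recursively training on raw model output. A minimal sketch of that idea (the `score_fn`, `threshold`, and `synth_fraction` names here are hypothetical placeholders for whatever curation signal and mixing ratio you actually use, not any particular lab's pipeline):

```python
import random

def curate_and_mix(real_data, synthetic_data, score_fn,
                   threshold=0.5, synth_fraction=0.3, seed=0):
    """Build a training mix that guards against model collapse.

    Two ideas, both assumptions for illustration:
      1. Filter: keep only synthetic examples whose quality score
         (from score_fn, e.g. a reward model or heuristic) clears
         a threshold.
      2. Anchor: cap the synthetic share of the final mix at
         synth_fraction, so real data always dominates.
    """
    # 1. Quality-filter the synthetic pool.
    kept = [x for x in synthetic_data if score_fn(x) >= threshold]

    # 2. Cap synthetic examples so they make up at most
    #    synth_fraction of the combined dataset.
    n_synth = int(len(real_data) * synth_fraction / (1 - synth_fraction))
    rng = random.Random(seed)
    rng.shuffle(kept)

    return real_data + kept[:n_synth]
```

With 10 real examples and `synth_fraction=0.3`, at most 4 filtered synthetic examples are admitted, so real data stays the majority of the mix no matter how much synthetic output is available.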