Education - It's about to get wild

balderdash@lemmy.zip · 11 months ago

jacksilver@lemmy.world · 11 months ago

It’s interesting, because people say these models can only get better, but I’m not sure that’s true. What happens when most new text data is generated by LLMs, or when we accidentally start labeling diffusion-generated images as real? It seems like there is potential for these models to implode.

FierySpectre@lemmy.world · 11 months ago

They actually tested that: a model was trained using only the outputs of the previous generation of the model. It takes fewer iterations of that to completely lose quality than you’d think.
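A toy sketch of the general idea, not the experiment mentioned above: instead of a language model, it fits a Gaussian to a small sample, draws the next "generation" of data only from that fit, and repeats. The function name `simulate_recursive_fitting` and all parameter values are assumptions chosen for illustration. With small samples, the fitted spread tends to shrink over many generations, which is a simplified analogue of a model losing diversity when trained on its own outputs.

```python
import random
import statistics


def simulate_recursive_fitting(generations=1000, sample_size=10, seed=0):
    """Return the fitted standard deviation at each generation.

    Hypothetical toy setup: each "model" is just a Gaussian fitted to
    samples produced by the previous generation's Gaussian.
    """
    rng = random.Random(seed)
    # Generation 0: "real" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    stds = []
    for _ in range(generations):
        # "Train" the model: estimate mean and spread from the current data.
        mu = statistics.mean(data)
        sigma = statistics.stdev(data)
        stds.append(sigma)
        # The next generation sees only synthetic data from this fit.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return stds


if __name__ == "__main__":
    stds = simulate_recursive_fitting()
    for g in (0, 10, 100, 999):
        print(f"generation {g}: fitted std = {stds[g]:.3g}")
```

Increasing `sample_size` slows the shrinkage, which loosely mirrors the intuition that mixing in enough fresh real data delays the degradation.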

jacksilver@lemmy.world · 11 months ago

Do you have any links on that? It was something I had wanted to explore, but I never had the time or money.

WarmSoda@lemm.ee · 11 months ago

They go insane pretty quickly, don’t they? As in, it all just becomes a jumble.

Ilovethebomb@lemm.ee · 11 months ago

Given that people quite frequently try to present AI-generated content as real, I’d say this will be a huge problem in the future.

danielbln@lemmy.world · 11 months ago

Microsoft has shown with Phi-2 (https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/) that synthetic data generation can be a great source of training data.