In this post:
- AI models trained using AI-generated data lack substance and nuance, study finds.
- The findings present a new challenge for AI developers.
- Researchers urge caution in the data used to train AI.
Large language models (LLMs) trained on previous iterations of AI-generated material produce outputs that lack substance and nuance, a new study has found. The findings present a new challenge for AI developers, who rely on a limited pool of human-generated data to train their models.
Also read: AI deepfakes are making it hard for US authorities to protect children – report
Artificial intelligence researchers from the University of Cambridge and Oxford University in the United Kingdom ran prompts through models trained on datasets comprising only AI-generated content. The outcome was not ideal: the models produced incomprehensible responses.
AI still needs humans to make sense
One of the paper’s authors, Zakhar Shumaylov of the University of Cambridge, said there is a need for quality control in the data that feeds LLMs, the technology behind generative AI chatbots like ChatGPT and Google’s Gemini. Shumaylov said:
“The message is we have to be very careful about what ends up in our training data. [Otherwise,] things will always, provably, go wrong”.
The phenomenon is known as “model collapse,” Shumaylov explained. It has been shown to affect all kinds of artificial intelligence models, including those that specialize in generating images from text prompts.
According to the study, repeatedly retraining a model on AI-generated data eventually produced gibberish. For example, the researchers found that one system tested with text about the UK’s medieval church towers produced a repetitive list of jackrabbits after only nine generations.
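To see the mechanism behind this, consider a minimal toy sketch (our own illustration with assumed parameters, not the study’s code): treat a “language model” as a simple categorical distribution over a vocabulary, and “train” each generation by re-estimating token frequencies from a finite sample drawn from the previous generation’s model.

```python
import numpy as np

# Toy sketch of model collapse (illustrative only; not the study's code).
# A "language model" here is just a categorical distribution over a
# vocabulary; each generation is "trained" by re-estimating token
# frequencies from a finite sample drawn from the previous generation.
rng = np.random.default_rng(42)
vocab_size, corpus = 50, 200  # assumed sizes, chosen for illustration

probs = rng.dirichlet(np.ones(vocab_size))  # generation 0: "human" data


def entropy_bits(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


for gen in range(1, 10):
    sample = rng.choice(vocab_size, size=corpus, p=probs)  # synthetic corpus
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()  # "train" the next generation on it
    print(f"gen {gen}: {(probs > 0).sum()}/{vocab_size} tokens survive, "
          f"entropy={entropy_bits(probs):.2f} bits")

# Once a rare token misses a sample, its probability is zero forever, so
# vocabulary size and entropy tend to ratchet downward every generation.
```

In this sketch, rare tokens that happen to miss one generation’s sample can never reappear, so diversity only shrinks, which mirrors the loss of nuance and the repetitive outputs the researchers describe.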
Commenting on the outputs, University of California computer scientist Hany Farid likened model collapse to the genetic problems caused by animal inbreeding.
“If a species inbreeds with their own offspring and doesn’t diversify their gene pool, it can lead to a collapse of the species,” Farid said.
When the researchers mixed human-generated data into the AI-generated training data, the collapse happened more slowly than when the models ran on purely AI-generated content.
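The same toy simulation can illustrate this slower decay (again, our own sketch with an assumed mixing fraction, not the study’s setup): keep a fixed share of fresh samples from the original “human” distribution in every generation’s training data.

```python
import numpy as np

# Variation on the sketch above (illustrative only): each generation's
# training data mixes synthetic tokens with fresh "human" tokens.
rng = np.random.default_rng(42)
vocab_size, corpus, mix = 50, 200, 0.3  # mix = assumed human-data fraction

human_probs = rng.dirichlet(np.ones(vocab_size))  # fixed "human" distribution
probs = human_probs.copy()

for gen in range(1, 10):
    n_human = int(corpus * mix)
    sample = np.concatenate([
        rng.choice(vocab_size, size=corpus - n_human, p=probs),  # synthetic
        rng.choice(vocab_size, size=n_human, p=human_probs),     # fresh human
    ])
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()  # "train" the next generation
    print(f"gen {gen}: {(probs > 0).sum()}/{vocab_size} tokens survive")

# Human samples keep re-seeding rare tokens, so vocabulary loss stalls
# instead of ratcheting toward collapse as in the pure-AI loop.
```

Because the human samples can re-seed tokens that the synthetic loop has dropped, the vocabulary no longer shrinks irreversibly, consistent with the study’s finding that human data slows the collapse.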