MIT researchers make language models scalable self-learners (news.mit.edu)

submitted 1 year ago* (last edited 1 year ago) by Hopps@lemmy.world to c/machinelearning@lemmy.world


TLDR Summary:

MIT researchers developed a 350-million-parameter self-training entailment model that, without human-generated labels, outperforms much larger language models with 137 to 175 billion parameters.
The researchers improved the model's performance with 'self-training,' in which the model learns from its own predictions. This reduces the need for human supervision, and the resulting model outperformed Google's LaMDA and FLAN as well as GPT models.
They developed an algorithm called 'SimPLE' to review and correct noisy or incorrect labels generated during self-training, improving the quality of self-generated labels and model robustness.
This approach addresses the inefficiency and privacy issues of larger AI models while retaining high performance. The models are trained on 'textual entailment,' which improves their adaptability to different tasks without additional training.
By reformulating natural language understanding tasks such as sentiment analysis and news classification as entailment tasks, the researchers expanded the model's range of applications.
While the model showed limitations in multi-class classification tasks, the research still presents an efficient method for training large language models, potentially reshaping AI and machine learning.
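
To make the entailment reformulation and the self-training step above a bit more concrete, here is a minimal Python sketch. It is not the researchers' code: the hypothesis templates, the function names, and the 0.8 threshold are all assumptions for illustration, `toy_entail_score` is a keyword heuristic standing in for a real entailment model, and the plain confidence filter only stands in for SimPLE, whose details the summary does not give.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical templates: one hypothesis sentence per label (wording is an assumption).
SENTIMENT_HYPOTHESES: Dict[str, str] = {
    "positive": "The sentiment of this review is positive.",
    "negative": "The sentiment of this review is negative.",
}

EntailScorer = Callable[[str, str], float]  # (premise, hypothesis) -> score in [0, 1]


def to_entailment_pairs(text: str, hypotheses: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Turn one classification input into (label, premise, hypothesis) triples."""
    return [(label, text, hyp) for label, hyp in hypotheses.items()]


def classify(text: str, hypotheses: Dict[str, str], entail_score: EntailScorer) -> Tuple[str, float]:
    """Pick the label whose hypothesis the premise most strongly entails."""
    scored = [
        (entail_score(premise, hyp), label)
        for label, premise, hyp in to_entailment_pairs(text, hypotheses)
    ]
    score, label = max(scored)
    return label, score


def pseudo_label(
    texts: List[str],
    hypotheses: Dict[str, str],
    entail_score: EntailScorer,
    threshold: float = 0.8,  # assumed cutoff, not taken from the article
) -> List[Tuple[str, str]]:
    """Keep only confident predictions as self-generated training labels."""
    kept = []
    for text in texts:
        label, score = classify(text, hypotheses, entail_score)
        if score >= threshold:
            kept.append((text, label))
    return kept


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (keyword heuristic); a real entailment model would go here."""
    positive_words = {"great", "love", "excellent"}
    negative_words = {"awful", "hate", "terrible"}
    words = set(premise.lower().replace(".", " ").replace(",", " ").split())
    pos = len(words & positive_words)
    neg = len(words & negative_words)
    if pos == neg:
        return 0.5  # no evidence either way
    wants_positive = "positive" in hypothesis
    return 0.9 if (pos > neg) == wants_positive else 0.1


if __name__ == "__main__":
    unlabeled = [
        "I love this phone, the camera is excellent.",
        "It was awful and I hate it.",
        "The package arrived on Tuesday.",
    ]
    # The neutral third sentence falls below the threshold and is dropped.
    print(pseudo_label(unlabeled, SENTIMENT_HYPOTHESES, toy_entail_score))
```

In a real self-training loop the kept pairs would be used to fine-tune the entailment model and the process repeated. Switching to a different task, such as news classification, only requires a different set of hypothesis sentences, which is what lets one entailment model cover several tasks without task-specific training.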


this post was submitted on 21 Jun 2023
