Meta introduces AI model that can review work of other models
Meta Platforms (NASDAQ:META) said on Friday it was releasing several AI models from its research division, including a “Self-Taught Evaluator,” which could provide a path to less human involvement in the AI development process.
The Facebook owner said the “Self-Taught Evaluator” approach generates contrasting model outputs and trains a large language model as a judge (“LLM-as-a-Judge”) to produce reasoning traces for evaluation and final judgments, using an iterative self-improvement scheme.
The Self-Taught Evaluator is a new method for generating synthetic preference data to train reward models without relying on human annotations.
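The loop described above can be sketched in miniature. This is a toy illustration only, assuming stand-in functions in place of real LLM calls; all function names here are hypothetical and not from Meta's release:

```python
# Toy sketch of the iterative self-improvement scheme described in the article.
# A real implementation would call an LLM; here stubs stand in for every model.

def generate_contrasting_pair(prompt):
    """Synthesize a preferred and a deliberately degraded response,
    giving a preference label with no human annotation."""
    good = f"helpful answer to: {prompt}"
    bad = f"off-topic answer to: {prompt[::-1]}"
    return good, bad  # (preferred, rejected) by construction

def judge(response_a, response_b):
    """LLM-as-a-Judge stand-in: emit a reasoning trace and a verdict."""
    trace = f"A addresses the prompt directly: {'helpful' in response_a}"
    verdict = "A" if "helpful" in response_a else "B"
    return trace, verdict

def self_taught_iteration(prompts, judge_skill):
    """One round: keep reasoning traces whose verdict matches the synthetic
    preference label, then 'fine-tune' the judge on them (here modeled as
    a toy skill counter growing with the kept training data)."""
    kept_traces = []
    for p in prompts:
        good, bad = generate_contrasting_pair(p)
        trace, verdict = judge(good, bad)
        if verdict == "A":  # verdict agrees with the known preference
            kept_traces.append(trace)
    return judge_skill + len(kept_traces)

prompts = ["summarize this report", "explain gradient descent"]
judge_skill = 0
for _ in range(3):  # iterative self-improvement: repeat judge -> filter -> train
    judge_skill = self_taught_iteration(prompts, judge_skill)
print(judge_skill)
```

The key idea the sketch tries to capture is that the training signal is generated and filtered by the model itself: contrasting outputs supply the preference labels, and only judgments consistent with those labels are retained as training data for the next round.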
The release follows Meta’s introduction of the tool in a paper in August, which described how it relies on the same “chain of thought” technique used by OpenAI’s o1 models to make reliable judgments about model responses, according to a report from Reuters.
Last month, Microsoft-backed (MSFT) OpenAI released AI models called o1 and o1-mini, which can reason through complex tasks and solve harder problems than previous models in science, coding, and math.
The ability to use AI to review AI reliably offers a glimpse at a potential pathway to building autonomous AI agents which can learn from their own mistakes, the report added, citing two Meta researchers behind the project.
Self-improving models could remove the need for Reinforcement Learning from Human Feedback, an often expensive and inefficient process that requires human annotators with specialized expertise to label data accurately and verify that answers to complex math and writing questions are correct, the report added.
“The idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of super-human level of AI,” said Jason Weston, one of the researchers, the report noted.
The social media giant also released Meta Segment Anything 2.1 (SAM 2.1), an update to its Segment Anything Model 2 for images and videos. SAM 2.1 includes a new developer suite with the code for model training and the web demo.