@www.verdict.co.uk
//
OpenAI is shifting its strategy: rather than releasing o3 as a standalone AI model, it will integrate the technology into GPT-5. CEO Sam Altman announced the change, stating that GPT-5 will be a comprehensive system incorporating o3, with the aim of simplifying OpenAI's product offerings. The decision follows the testing of the advanced reasoning models o3 and o3 mini, which were designed to tackle more complex tasks.
Altman emphasized the desire to make AI "just work" for users and acknowledged that the current model selection process is too complex. He expressed dissatisfaction with the "model picker" feature and said he wants to return to "magic unified intelligence". The company plans to unify its AI models so that users no longer need to choose manually which GPT model to use. Before that, OpenAI plans to release GPT-4.5, which Altman describes as its last non-chain-of-thought model. A key goal is to create AI systems that can use all available tools and adapt their reasoning time to the task at hand. GPT-5 will be accessible on the free tier of ChatGPT at a standard intelligence level, while paid subscriptions will offer a higher level of intelligence incorporating voice, search, and deep research capabilities.
@the-decoder.com
//
OpenAI's o3 model is facing scrutiny after achieving record-breaking results on FrontierMath, an AI math benchmark developed by Epoch AI. It has emerged that OpenAI quietly funded the development of FrontierMath and had prior access to the benchmark's datasets. The company's involvement was not disclosed until the announcement of o3's performance, in which the model achieved a 25.2% accuracy rate, a significant jump from the 2% scores of previous models. This lack of transparency has drawn comparisons to the Theranos scandal and raised concerns about potential data manipulation and biased results. Epoch AI's associate director has admitted that the lack of transparency was a mistake.
The controversy has sparked debate within the AI community about the legitimacy of o3's performance. OpenAI claims the data wasn't used for model training, but concerns linger: six mathematicians who contributed to the benchmark said they were not aware of OpenAI's involvement or of the company having exclusive access. They also indicated that, had they known, they might not have contributed to the project. Epoch AI has said that an "unseen-by-OpenAI hold-out set" was used to verify the model's capabilities. Epoch AI is now developing new hold-out questions to retest o3's performance while ensuring OpenAI does not have prior access.