News from the AI & ML world

DeeperML - #reasoningmodel

@www.analyticsvidhya.com //
MiniMaxAI, a Chinese AI company, has launched MiniMax-M1, a large-scale open-source reasoning model, marking a significant step in the open-source AI landscape. Released on the first day of the "MiniMaxWeek" event, MiniMax-M1 is designed to compete with leading models like OpenAI's o3, Claude 4, and DeepSeek-R1. Alongside the model, MiniMax has released a beta version of an agent capable of running code, building applications, and creating presentations. MiniMax-M1 presents a flexible option for organizations looking to experiment with or scale up advanced AI capabilities while managing costs.

MiniMax-M1 boasts a 1 million token context window and utilizes a new, highly efficient reinforcement learning technique. The model comes in two variants, MiniMax-M1-40k and MiniMax-M1-80k. Built on a Mixture-of-Experts (MoE) architecture, the model has 456 billion parameters. MiniMax has introduced Lightning Attention for its M1 model, dramatically reducing inference costs: at a generation length of 100,000 tokens, M1 consumes only 25% of the floating point operations (FLOPs) required by DeepSeek R1.

Available on AI code-sharing communities like Hugging Face and GitHub, MiniMax-M1 is released under the Apache 2.0 license, enabling businesses to freely use, modify, and implement it for commercial applications without restrictions or payment. MiniMax-M1 features a web search functionality and can handle multimodal input such as text, images, and presentations. The expansive context window allows the model to take in an amount of information equivalent to a small book series, far exceeding OpenAI's GPT-4o, which has a context window of 128,000 tokens.
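For organizations that want a low-cost first look before committing serving hardware, a reasonable starting point is to pull just the model's configuration and tokenizer from Hugging Face and confirm the advertised context window. Below is a minimal sketch assuming the repository id MiniMaxAI/MiniMax-M1-80k and the transformers library; the repo id and config attribute names should be verified against the model card before use.

```python
# Minimal sketch: inspect MiniMax-M1 metadata without downloading the full
# 456B-parameter checkpoint. The repo id below is an assumption -- confirm it
# on the Hugging Face model card first.
from transformers import AutoConfig, AutoTokenizer

REPO_ID = "MiniMaxAI/MiniMax-M1-80k"  # assumed repository id

config = AutoConfig.from_pretrained(REPO_ID, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(REPO_ID, trust_remote_code=True)

# The config should expose the ~1M-token context window and the MoE layout;
# exact attribute names depend on the model's custom configuration class.
print(getattr(config, "max_position_embeddings", "not exposed under this name"))
print(len(tokenizer("How long is this prompt in tokens?")["input_ids"]))
```

Actually serving the 456-billion-parameter model is a separate undertaking that typically calls for a multi-GPU deployment behind a dedicated inference server, so this metadata check is only a first step.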

References:
  • AI News | VentureBeat: MiniMax-M1 presents a flexible option for organizations looking to experiment with or scale up advanced AI capabilities while managing costs.
  • Analytics Vidhya: The Chinese AI company MiniMaxAI has just launched a large-scale open-source reasoning model named MiniMax-M1. The model, released on Day 1 of the 5-day MiniMaxWeek event, appears to compete well with OpenAI o3, Claude 4, DeepSeek-R1, and other contemporaries.
  • The Rundown AI: PLUS: MiniMax’s new open-source reasoner with 1M token context
  • www.marktechpost.com: MiniMax AI Releases MiniMax-M1: A 456B Parameter Hybrid Model for Long-Context and Reinforcement Learning RL Tasks

Mark Tyson@tomshardware.com //
OpenAI has launched o3-pro for ChatGPT, marking a significant advancement in both performance and cost-efficiency for its reasoning models. The new model is accessible through the OpenAI API and the Pro plan, which costs $200 per month. The company highlights substantial improvements with o3-pro and has also dropped the price of its o3 model by 80%. This strategic move aims to provide users with more powerful and affordable AI capabilities, challenging competitors in the AI model market and expanding the boundaries of reasoning.

The o3-pro model is set to offer enhanced raw reasoning capabilities, but early reviews suggest mixed results when compared to competing models like Claude 4 Opus and Gemini 2.5 Pro. While some tests indicate that Claude 4 Opus currently excels in prompt following, output quality, and understanding user intentions, Gemini 2.5 Pro is considered the most economical option with a superior price-to-performance ratio. Initial assessments suggest that o3-pro might not be worth the higher cost unless the user's primary interest lies in research applications.

The launch of o3-pro coincides with other strategic moves by OpenAI, among them consolidating its public-sector AI products, including ChatGPT Gov, under the "OpenAI for Government" banner. OpenAI has also secured a $200 million contract with the U.S. Department of Defense to explore AI applications in administration and security. Despite these advancements, OpenAI is also navigating challenges, such as the planned deprecation of GPT-4.5 Preview in the API, which has caused frustration among developers who relied on the model for their applications and workflows.

References:
  • Last Week in AI: OpenAI introduces O3 PRO for ChatGPT, highlighting significant improvements in performance and cost-efficiency.
  • Last Week in AI: OpenAI adds o3 Pro to ChatGPT and drops o3 price by 80 per cent, ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries
  • composio.dev: OpenAI finally unveiled their more expensive o3 variant, o3-pro. The model is available in their API and the Pro plan, which costs $200 per month.

Carl Franzen@AI News | VentureBeat //
Mistral AI has launched its first reasoning model, Magistral, signaling a commitment to open-source AI development. The Magistral family features two models: Magistral Small, a 24-billion parameter model available with open weights under the Apache 2.0 license, and Magistral Medium, a proprietary model accessible through an API. This dual release strategy aims to cater to both enterprise clients seeking advanced reasoning capabilities and the broader AI community interested in open-source innovation.

Mistral's decision to release Magistral Small under the permissive Apache 2.0 license marks a significant return to its open-source roots. The license allows for the free use, modification, and distribution of the model's source code, even for commercial purposes. This empowers startups and established companies to build and deploy their own applications on top of Mistral’s latest reasoning architecture, without the burdens of licensing fees or vendor lock-in. The release serves as a counter-narrative to the perception that Mistral had been drifting away from open source, reaffirming its dedication to arming the open community with cutting-edge tools.
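Because the Small variant ships as open weights, the cheapest way to evaluate it is locally; one of the references below notes running it with Ollama. Here is a minimal sketch using the ollama Python client, assuming the model has already been pulled and is published under the tag magistral (substitute whatever tag your local registry uses):

```python
# Minimal sketch: query a locally served Magistral Small through Ollama.
# Assumptions: the Ollama daemon is running, `ollama pull magistral` has been
# done, and "magistral" is the correct library tag -- adjust if yours differs.
import ollama

response = ollama.chat(
    model="magistral",  # assumed tag for Magistral Small
    messages=[
        {
            "role": "user",
            "content": "A train leaves at 09:17 and arrives at 11:02. How long is the trip?",
        }
    ],
)
print(response["message"]["content"])
```

Magistral Medium, by contrast, is API-only, so evaluating it means going through Mistral's hosted endpoints rather than a local runtime.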

Magistral Medium demonstrates competitive performance in the reasoning arena, according to internal benchmarks released by Mistral. The model was tested against its predecessor, Mistral Medium 3, and against models from DeepSeek. Furthermore, the Handoffs feature of Mistral's Agents API facilitates smart, multi-agent workflows, allowing different agents to collaborate on complex tasks. This enables modular and efficient problem-solving, as demonstrated in systems where agents collaborate to answer inflation-related questions.

References:
  • Simon Willison's Weblog: Mistral's first reasoning model is out today, in two sizes. There's a 24B Apache 2 licensed open-weights model called Magistral Small (actually Magistral-Small-2506), and a larger API-only model called Magistral Medium.
  • THE DECODER: Mistral launches Europe's first reasoning model Magistral but lags behind competitors
  • AI News | VentureBeat: The company is signaling that the future of reasoning AI will be both powerful and, in a meaningful way, open to all.
  • www.marktechpost.com: How to Create Smart Multi-Agent Workflows Using the Mistral Agents API’s Handoffs Feature
  • TestingCatalog: Mistral AI debuts Magistral models focused on advanced reasoning
  • www.artificialintelligence-news.com: Mistral AI has pulled back the curtain on Magistral, their first model specifically built for reasoning tasks.
  • www.infoworld.com: Mistral AI unveils Magistral reasoning model
  • the-decoder.com: The French start-up Mistral is launching its first reasoning model on the market with Magistral. It is designed to enable logical thinking in European languages.
  • Simon Willison: Mistral's first reasoning LLM - Magistral - was released today and is available in two sizes, an open weights (Apache 2) 24B model called Magistral Small and an API/hosted only model called Magistral Medium. My notes here, including running Small locally with Ollama and accessing Medium via my llm-mistral plugin
  • SiliconANGLE: Mistral AI debuts new Magistral series of reasoning LLMs.
  • MarkTechPost: Mistral AI Releases Magistral Series: Advanced Chain-of-Thought LLMs for Enterprise and Open-Source Applications
  • WhatIs: What differentiates Mistral AI reasoning model Magistral
  • AlternativeTo: Mistral AI debuts Magistral: a transparent, multilingual reasoning model family, including open-source Magistral Small available on Hugging Face and enterprise-focused Magistral Medium available on various platforms.

Mark Tyson@tomshardware.com //
OpenAI has launched o3-pro, a new and improved version of its AI model designed to provide more reliable and thoughtful responses, especially for complex tasks. Replacing the o1-pro model, o3-pro is accessible to Pro and Team users within ChatGPT and through the API, as part of OpenAI's ongoing effort to refine its AI technology. The focus of this upgrade is to enhance the model’s reasoning capabilities and maintain consistency in generating responses, directly addressing shortcomings found in previous models.

The o3-pro model is designed to handle tasks requiring deep analytical thinking and advanced reasoning. While built upon the same transformer architecture and deep learning techniques as other OpenAI chatbots, o3-pro distinguishes itself with an improved ability to understand context. Some users have noted that o3-pro feels like o3, but is only modestly better in exchange for being slower.

Comparisons with other leading models such as Claude 4 Opus and Gemini 2.5 Pro reveal interesting insights. While Claude 4 Opus has been praised for prompt following and understanding user intentions, Gemini 2.5 Pro stands out for its price-to-performance ratio. Early user experiences suggest o3-pro might not always be worth the expense given its slower speed, except for research purposes. Some users have suggested that o3-pro hallucinates modestly less, though this is still being debated.
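The speed-versus-quality trade-off is easy to measure directly: time a single request and note the token usage. Below is a minimal sketch with the official openai Python SDK, assuming o3-pro is exposed to the account through the Responses API under the model name o3-pro; verify the model name and pricing before running anything at scale.

```python
# Minimal sketch: time one o3-pro request via the OpenAI Responses API.
# Assumptions: OPENAI_API_KEY is set and the model name "o3-pro" is available
# to this account -- both should be verified first.
import time

from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
response = client.responses.create(
    model="o3-pro",  # assumed model identifier
    input="Prove that the sum of two odd integers is even, step by step.",
)
elapsed = time.perf_counter() - start

print(response.output_text)
print(f"latency: {elapsed:.1f}s, total tokens: {response.usage.total_tokens}")
```

Repeating the same prompt against o3 or a competing model gives a concrete, if anecdotal, view of whether the extra latency and cost buy a visibly better answer.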

References:
  • bsky.app: the OpenAI API is back to running at 100% again, plus we dropped o3 prices by 80% and launched o3-pro - enjoy!
  • the-decoder.com: OpenAI cuts o3 model prices by 80% and launches o3-pro today
  • AI News | VentureBeat: OpenAI launches o3-pro AI model, offering increased reliability and tool use for enterprises — while sacrificing speed
  • Maginative: OpenAI’s new o3-pro model is now available in ChatGPT and the API, offering top-tier performance in math, science, and coding—at a dramatically lower price.
  • THE DECODER: OpenAI has lowered the price of its o3 language model by 80 percent, CEO Sam Altman said. The article appeared first on The Decoder.
  • AI News | VentureBeat: OpenAI announces 80% price drop for o3, its most powerful reasoning model
  • www.cnbc.com: The figure includes sales from the company’s consumer products, ChatGPT business products and its application programming interface, or API.
  • Latent.Space: OpenAI dropped o3 pricing 80% today and launched o3-pro. Ben Hylak of Raindrop.ai returns with the world's first early review.
  • siliconangle.com: OpenAI’s newest reasoning model o3-pro surpasses rivals on multiple benchmarks, but it’s not very fast
  • SiliconANGLE: Silicon Angle reports on OpenAI’s newest reasoning model o3-pro surpassing rivals.
  • bsky.app: OpenAI has launched o3-pro. The new model is available to ChatGPT Pro and Team subscribers and in OpenAI’s API now, while Enterprise and Edu subscribers will get access next week. If you use reasoning models like o1 or o3, try o3-pro, which is much smarter and better at using external tools.
  • The Algorithmic Bridge: OpenAI o3-Pro Is So Good That I Can’t Tell How Good It Is
  • Windows Report: OpenAI rolls out new o3-pro AI model to ChatGPT Pro subscribers
  • www.indiatoday.in: OpenAI introduces O3 PRO for ChatGPT, highlighting significant improvements in performance and cost-efficiency.
  • www.marketingaiinstitute.com: [The AI Show Episode 153]: OpenAI Releases o3-Pro, Disney Sues Midjourney, Altman: “Gentle Singularity” Is Here, AI and Jobs & News Sites Getting Crushed by AI Search
  • datafloq.com: OpenAI is now considered another name for innovation in the field of AI, and with the launch of OpenAI o3, this claim is becoming stronger.
  • composio.dev: OpenAI finally unveiled their more expensive o3 variant, o3-pro. The model is available in their API and the Pro plan, which costs $200 per month.

@www.marktechpost.com //
DeepSeek, a Chinese AI startup, has launched an updated version of its R1 reasoning AI model, named DeepSeek-R1-0528. This new iteration brings the open-source model near parity with proprietary paid models like OpenAI’s o3 and Google’s Gemini 2.5 Pro in terms of reasoning capabilities. The model is released under the permissive MIT License, enabling commercial use and customization, marking a commitment to open-source AI development. The model's weights and documentation are available on Hugging Face, facilitating local deployment and API integration.

The DeepSeek-R1-0528 update introduces substantial enhancements in the model's ability to handle complex reasoning tasks across various domains, including mathematics, science, business, and programming. DeepSeek attributes these improvements to leveraging increased computational resources and applying algorithmic optimizations in post-training. Notably, the accuracy on the AIME 2025 test has surged from 70% to 87.5%, demonstrating deeper reasoning processes with an average of 23,000 tokens per question, compared to the previous version's 12,000 tokens.

Alongside enhanced reasoning, the updated R1 model boasts a reduced hallucination rate, which contributes to more reliable and consistent output. Code generation performance has also seen a boost, positioning it as a strong contender in the open-source AI landscape. DeepSeek provides instructions on its GitHub repository for those interested in running the model locally and encourages community feedback and questions. The company aims to provide accessible AI solutions, underscored by the availability of a distilled version of R1-0528, DeepSeek-R1-0528-Qwen3-8B, designed for efficient single-GPU operation.
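The distilled DeepSeek-R1-0528-Qwen3-8B variant is the most approachable way to try the release locally, since an 8-billion-parameter model fits on a single modern GPU in bfloat16. Here is a minimal sketch with transformers, assuming the Hugging Face repo id deepseek-ai/DeepSeek-R1-0528-Qwen3-8B (confirm the exact id on DeepSeek's Hugging Face page) and roughly 20 GB of GPU memory:

```python
# Minimal sketch: run the distilled R1-0528 model on a single GPU.
# Assumption: the repo id "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B" -- confirm it
# on Hugging Face before running.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the integral of x * e^x dx?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit long chains of thought, so leave generous headroom.
output = model.generate(input_ids, max_new_tokens=2048)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```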

References:
  • pub.towardsai.net: DeepSeek R1 : Is It Right For You? (A Practical Self‑Assessment for Businesses and Individuals)
  • AI News | VentureBeat: DeepSeek R1-0528 arrives in powerful open source challenge to OpenAI o3 and Google Gemini 2.5 Pro
  • Analytics Vidhya: New Deepseek R1-0528 Update is INSANE
  • Kyle Wiggers: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
  • MacStories: Details about DeepSeek's R1-0528 model and its improved performance.
  • MarkTechPost: Information about DeepSeek's R1-0528 model and its enhancements in math and code performance.
  • www.marktechpost.com: DeepSeek, the Chinese AI Unicorn, has released an updated version of its R1 reasoning model, named DeepSeek-R1-0528. This release enhances the model’s capabilities in mathematics, programming, and general logical reasoning, positioning it as a formidable open-source alternative to leading models like OpenAI’s o3 and Google’s Gemini 2.5 Pro. Technical enhancements: the R1-0528 update introduces significant […]
  • www.analyticsvidhya.com: When DeepSeek R1 launched in January, it instantly became one of the most talked-about open-source models on the scene, gaining popularity for its sharp reasoning and impressive performance. Fast-forward to today, and DeepSeek is back with a so-called “minor trial upgrade”, but don’t let the modest name fool you. DeepSeek-R1-0528 delivers major leaps in reasoning, […]
  • : The 'Minor Upgrade' That's Anything But: DeepSeek R1-0528 Deep Dive
  • Simon Willison: Some notes on the new DeepSeek-R1-0528 - a completely different model from the R1 they released in January, despite having a very similar name. Terrible LLM naming has managed to infect the Chinese AI labs too.
  • TheSequence: The Sequence Radar #554 : The New DeepSeek R1-0528 is Very Impressive
  • Fello AI: In late May 2025, Chinese startup DeepSeek quietly rolled out R1-0528, a beefed-up version of its open-source R1 reasoning model.
  • felloai.com: Latest DeepSeek Update Called R1-0528 Is Matching OpenAI’s o3 & Gemini 2.5 Pro

@www.marktechpost.com //
DeepSeek has released a major update to its R1 reasoning model, dubbed DeepSeek-R1-0528, marking a significant step forward in open-source AI. The update boasts enhanced performance in complex reasoning, mathematics, and coding, positioning it as a strong competitor to leading commercial models like OpenAI's o3 and Google's Gemini 2.5 Pro. The model's weights, training recipes, and comprehensive documentation are openly available under the MIT license, fostering transparency and community-driven innovation. This release allows researchers, developers, and businesses to access cutting-edge AI capabilities without the constraints of closed ecosystems or expensive subscriptions.

The DeepSeek-R1-0528 update brings several core improvements. The model's parameter count has increased from 671 billion to 685 billion, enabling it to process and store more intricate patterns. Enhanced chain-of-thought layers deepen the model's reasoning capabilities, making it more reliable in handling multi-step logic problems. Post-training optimizations have also been applied to reduce hallucinations and improve output stability. In practical terms, the update introduces JSON outputs, native function calling, and simplified system prompts, all designed to streamline real-world deployment and enhance the developer experience.
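The JSON output and function-calling additions matter mainly for integration work, since they let downstream code parse responses without brittle string handling. Below is a minimal sketch of the JSON output mode through DeepSeek's OpenAI-compatible API, assuming the base URL https://api.deepseek.com and the model name deepseek-reasoner; both, along with whether R1-0528 is the model currently served under that name, should be checked against DeepSeek's API documentation.

```python
# Minimal sketch: request structured JSON from the updated R1 model through the
# OpenAI-compatible API. Assumptions: base URL https://api.deepseek.com, model
# name "deepseek-reasoner", and DEEPSEEK_API_KEY set -- verify all three.
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

completion = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed model identifier
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "Reply with a JSON object with keys 'answer' and 'confidence'.",
        },
        {"role": "user", "content": "Is 2027 a prime number?"},
    ],
)

result = json.loads(completion.choices[0].message.content)
print(result["answer"], result["confidence"])
```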

Specifically, DeepSeek R1-0528 demonstrates a remarkable leap in mathematical reasoning. On the AIME 2025 test, its accuracy improved from 70% to an impressive 87.5%, rivaling OpenAI's o3. This improvement is attributed to "enhanced thinking depth," with the model now utilizing significantly more tokens per question, indicating more thorough and systematic logical analysis. The open-source nature of DeepSeek-R1-0528 empowers users to fine-tune and adapt the model to their specific needs, fostering further innovation and advancements within the AI community.

References:
  • Kyle Wiggers: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
  • AI News | VentureBeat: VentureBeat article on DeepSeek R1-0528.
  • Analytics Vidhya: New Deepseek R1-0528 Update is INSANE
  • MacStories: Testing DeepSeek R1-0528 on the M3 Ultra Mac Studio and Installing Local GGUF Models with Ollama on macOS
  • www.marktechpost.com: DeepSeek Releases R1-0528: An Open-Source Reasoning AI Model Delivering Enhanced Math and Code Performance with Single-GPU Efficiency
  • NextBigFuture.com: DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training.
  • pandaily.com: In the early hours of May 29, Chinese AI startup DeepSeek quietly open-sourced the latest iteration of its R1 large language model, DeepSeek-R1-0528, on the Hugging Face platform.
  • www.computerworld.com: Reports that DeepSeek releases a new version of its R1 reasoning AI model.
  • techcrunch.com: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
  • the-decoder.com: Deepseek's R1 model closes the gap with OpenAI and Google after major update
  • Simon Willison: Some notes on the new DeepSeek-R1-0528 - a completely different model from the R1 they released in January, despite having a very similar name. Terrible LLM naming has managed to infect the Chinese AI labs too.
  • Analytics India Magazine: The new DeepSeek-R1 Is as good as OpenAI o3 and Gemini 2.5 Pro
  • : The 'Minor Upgrade' That's Anything But: DeepSeek R1-0528 Deep Dive
  • TheSequence: This article provides an overview of the new DeepSeek R1-0528 model and notes its improvements over the prior model released in January.
  • Kyle Wiggers: News about the release of DeepSeek's updated R1 AI model, emphasizing its increased censorship.
  • Fello AI: Reports that the R1-0528 model from DeepSeek is matching the capabilities of OpenAI's o3 and Google's Gemini 2.5 Pro.
  • felloai.com: Latest DeepSeek Update Called R1-0528 Is Matching OpenAI’s o3 & Gemini 2.5 Pro
  • www.tomsguide.com: DeepSeek’s latest update is a serious threat to ChatGPT and Google — here’s why

@www.artificialintelligence-news.com //
ServiceNow is making significant strides in the realm of artificial intelligence with the unveiling of Apriel-Nemotron-15b-Thinker, a new reasoning model optimized for enterprise-scale deployment and efficiency. The model, consisting of 15 billion parameters, is designed to handle complex tasks such as solving mathematical problems, interpreting logical statements, and assisting with enterprise decision-making. This release addresses the growing need for AI models that combine strong performance with efficient memory and token usage, making them viable for deployment in practical hardware environments.
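For teams sizing hardware for a 15-billion-parameter reasoning model, a quick memory estimate plus a standard transformers load is enough to start experimenting. Below is a minimal sketch, assuming the weights are published on Hugging Face under a repo id like ServiceNow-AI/Apriel-Nemotron-15b-Thinker; the id is illustrative and should be confirmed against the official model card.

```python
# Minimal sketch: rough memory estimate and a standard load of a 15B model.
# Assumption: the repo id "ServiceNow-AI/Apriel-Nemotron-15b-Thinker" is
# illustrative -- confirm the exact id before running.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "ServiceNow-AI/Apriel-Nemotron-15b-Thinker"  # assumed repo id

params_billions = 15
print(f"~{params_billions * 2} GB of weights in bfloat16 (2 bytes per parameter), before activations")

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": "A supplier invoice is 3% over the PO amount of $12,400. What is the overage?",
    }
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```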

ServiceNow is betting on unified AI to untangle enterprise complexity, providing businesses with a single, coherent way to integrate various AI tools and intelligent agents across the entire company. This ambition was unveiled at Knowledge 2025, where the company showcased its new AI platform and deepened relationships with tech giants like NVIDIA, Microsoft, Google, and Oracle. The aim is to help businesses orchestrate their operations with genuine intelligence, as evidenced by the adoption from industry leaders like Adobe, Aptiv, the NHL, Visa, and Wells Fargo.

To further broaden its reach, ServiceNow has introduced the Core Business Suite, an AI-driven solution aimed at the mid-market. This suite connects employees, suppliers, systems, and data in one place, enabling organizations of all sizes to work faster and more efficiently across critical business processes such as HR, procurement, finance, facilities, and legal affairs. ServiceNow aims for rapid implementation, suggesting deployment within a few weeks, and integrates functionalities from different divisions into a single, uniform experience.

References:
  • siliconangle.com: ServiceNow debuts AI agents for security and risk to support autonomous enterprise defense
  • www.artificialintelligence-news.com: ServiceNow bets on unified AI to untangle enterprise complexity
  • AI News: ServiceNow bets on unified AI to untangle enterprise complexity
  • www.marktechpost.com: ServiceNow AI Released Apriel-Nemotron-15b-Thinker: A Compact Yet Powerful Reasoning Model Optimized for Enterprise-Scale Deployment and Efficiency