News from the AI & ML world

DeeperML - #professionals

Carl Franzen@AI News | VentureBeat //
Mistral AI has launched its first reasoning model, Magistral, signaling a commitment to open-source AI development. The Magistral family features two models: Magistral Small, a 24-billion parameter model available with open weights under the Apache 2.0 license, and Magistral Medium, a proprietary model accessible through an API. This dual release strategy aims to cater to both enterprise clients seeking advanced reasoning capabilities and the broader AI community interested in open-source innovation.
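
For teams evaluating the hosted side of this dual release, a minimal sketch of querying Magistral Medium through Mistral's chat completions endpoint could look like the following. The model identifier "magistral-medium-latest" and the exact response shape are assumptions based on Mistral's existing API conventions; check the official documentation before relying on them.

```python
# Minimal sketch: calling the hosted Magistral Medium model via Mistral's
# chat completions endpoint. The model name below is an assumption based on
# Mistral's usual naming scheme, not a confirmed identifier.
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"

def ask_magistral(question: str) -> str:
    headers = {
        "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "magistral-medium-latest",  # assumed model identifier
        "messages": [{"role": "user", "content": question}],
    }
    response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
    response.raise_for_status()
    # Mistral's API follows the familiar choices/message/content response shape.
    return response.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_magistral("A train leaves at 9:40 and arrives at 11:05. How long is the trip?"))
```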

Mistral's decision to release Magistral Small under the permissive Apache 2.0 license marks a significant return to its open-source roots. The license allows free use, modification, and distribution of the model's weights, even for commercial purposes, so startups and established companies alike can build and deploy their own applications on top of Mistral's latest reasoning architecture without licensing fees or vendor lock-in. The release also counters the perception that Mistral had drifted away from open releases, reaffirming its dedication to arming the open community with cutting-edge tools.

Magistral Medium demonstrates competitive performance in the reasoning arena, according to internal benchmarks released by Mistral. The model was tested against its predecessor, Mistral Medium 3, and models from DeepSeek. Separately, the Handoffs feature of Mistral's Agents API enables smart multi-agent workflows in which different agents collaborate on complex tasks, allowing modular and efficient problem-solving, as in the published example where agents work together to answer inflation-related questions.
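
The handoff pattern itself is easy to reason about even outside Mistral's stack. The sketch below is a generic, framework-free illustration of the idea rather than the actual Agents API: a coordinator agent inspects a request, hands it off to a specialist, and the chain of handoffs is recorded so the workflow stays traceable.

```python
# Generic illustration of the agent "handoff" pattern (not the Mistral Agents
# API itself): a coordinator delegates a request to a specialist agent, and
# the chain of handoffs is recorded for debugging.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]                                 # produces an answer
    handoffs: Dict[str, "Agent"] = field(default_factory=dict)   # topics it may delegate

def run(agent: Agent, request: str, trace: List[str]) -> str:
    trace.append(agent.name)
    # Hand off to a specialist if the request matches one of its topics.
    for topic, specialist in agent.handoffs.items():
        if topic in request.lower():
            return run(specialist, request, trace)
    return agent.handle(request)

econ = Agent("economics-agent", lambda q: f"[economics answer to: {q}]")
web = Agent("web-search-agent", lambda q: f"[search results for: {q}]")
coordinator = Agent("coordinator", lambda q: "I can't route this request.",
                    handoffs={"inflation": econ, "search": web})

trace: List[str] = []
print(run(coordinator, "What drove inflation in 2022?", trace))
print("handoff chain:", " -> ".join(trace))
```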

Recommended read:
References :
  • Simon Willison's Weblog: Mistral's first reasoning model is out today, in two sizes. There's a 24B Apache 2 licensed open-weights model called Magistral Small (actually Magistral-Small-2506), and a larger API-only model called Magistral Medium.
  • THE DECODER: Mistral launches Europe's first reasoning model Magistral but lags behind competitors
  • AI News | VentureBeat: The company is signaling that the future of reasoning AI will be both powerful and, in a meaningful way, open to all.
  • www.marktechpost.com: How to Create Smart Multi-Agent Workflows Using the Mistral Agents API’s Handoffs Feature
  • TestingCatalog: Mistral AI debuts Magistral models focused on advanced reasoning
  • www.artificialintelligence-news.com: Mistral AI has pulled back the curtain on Magistral, their first model specifically built for reasoning tasks.
  • www.infoworld.com: Mistral AI unveils Magistral reasoning model
  • the-decoder.com: The French start-up Mistral is launching its first reasoning model on the market with Magistral. It is designed to enable logical thinking in European languages.
  • Simon Willison: Mistral's first reasoning LLM - Magistral - was released today and is available in two sizes, an open weights (Apache 2) 24B model called Magistral Small and an API/hosted only model called Magistral Medium. My notes here, including running Small locally with Ollama and accessing Medium via my llm-mistral plugin
  • SiliconANGLE: Mistral AI debuts new Magistral series of reasoning LLMs.
  • MarkTechPost: Mistral AI Releases Magistral Series: Advanced Chain-of-Thought LLMs for Enterprise and Open-Source Applications
  • WhatIs: What differentiates Mistral AI reasoning model Magistral
  • AlternativeTo: Mistral AI debuts Magistral: a transparent, multilingual reasoning model family, including open-source Magistral Small available on Hugging Face and enterprise-focused Magistral Medium available on various platforms.

Justin Westcott@AI News | VentureBeat //
AI agents are poised to revolutionize how we interact with the internet, moving beyond passive assistants to active participants authorized to act on our behalf. This shift necessitates a redesign of the web, transforming it from a human-centric interface to a machine-native environment optimized for speed, efficiency, and transactional capabilities. The current internet, designed for human eyes and fingers, is proving inefficient for AI, which requires structured data, clear intent, and exposed capabilities to navigate, decide, and transact effectively. This evolution will lead to a web where APIs become the new storefronts, prioritizing verifiable sources and trust over traditional user experience elements.

The development and deployment of AI agents face significant challenges, particularly in ensuring reliability and consistency within defined business processes. Existing agentic frameworks often fall short because they lack state management, leading to unpredictable behavior and poor adherence to workflows. A recent survey found that only 25% of AI initiatives are live in production, with hallucinations and prompt management cited as major obstacles. Because traditional software QA methods do not fully apply to AI applications, the survey points to robust evaluation processes and automated testing pipelines as prerequisites; without them, agents may never reach production or prove sustainable over the long term.
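
A minimal example of the kind of automated check such a pipeline might run is sketched below. Here `call_agent` is a hypothetical stand-in for whatever model or agent is under test, and the grading rule (a substring match against known facts) is deliberately simple; real pipelines typically use richer graders such as LLM-as-judge or citation checks.

```python
# Sketch of an automated evaluation step for an LLM-backed agent.
# `call_agent` is a hypothetical placeholder for the system under test.
from typing import Callable

EVAL_CASES = [
    {"prompt": "What license is Magistral Small released under?", "must_contain": "apache"},
    {"prompt": "How many billion parameters does Magistral Small have?", "must_contain": "24"},
]

def run_evals(call_agent: Callable[[str], str]) -> float:
    passed = 0
    for case in EVAL_CASES:
        answer = call_agent(case["prompt"]).lower()
        if case["must_contain"] in answer:
            passed += 1
        else:
            print(f"FAIL: {case['prompt']!r} -> {answer[:80]!r}")
    score = passed / len(EVAL_CASES)
    print(f"pass rate: {score:.0%}")
    return score

# Gate a deployment on the pass rate, e.g. in CI:
# assert run_evals(my_agent) >= 0.9
```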

An alternative approach, known as process calling, aims to create reliable, process-aware, and easily debuggable conversational agents. This method addresses the limitations of tool calling by incorporating state tracking and structured workflows. Companies achieving success with LLMs are prioritizing robust evaluation and moving beyond simple tool-based interactions. As AI agents become more prevalent, the internet will likely bifurcate into two webs: one designed for humans and another designed for machines. This machine-native web will feature faster protocols, cleaner metadata, and a focus on verifiable sources, ultimately reshaping the architecture of the internet to accommodate AI's unique requirements.
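
A minimal way to picture the difference is shown below: instead of letting the model call tools in any order, the agent carries an explicit state machine, and only the actions valid in the current state are exposed to it. This is a generic sketch of the idea described above, not any particular vendor's framework.

```python
# Sketch of a "process-aware" agent: an explicit state machine constrains
# which actions the model may take next, making behavior predictable and
# every transition easy to log and debug.
from enum import Enum, auto

class State(Enum):
    COLLECT_DETAILS = auto()
    VERIFY_IDENTITY = auto()
    RESOLVE = auto()
    DONE = auto()

# Allowed actions per state and the state each action leads to.
WORKFLOW = {
    State.COLLECT_DETAILS: {"ask_question": State.COLLECT_DETAILS,
                            "details_complete": State.VERIFY_IDENTITY},
    State.VERIFY_IDENTITY: {"check_id": State.RESOLVE},
    State.RESOLVE: {"issue_refund": State.DONE, "escalate": State.DONE},
}

class ProcessAgent:
    def __init__(self) -> None:
        self.state = State.COLLECT_DETAILS
        self.history = []

    def available_actions(self):
        # Only these would be offered to the LLM as callable tools.
        return list(WORKFLOW.get(self.state, {}))

    def step(self, action: str) -> None:
        allowed = WORKFLOW.get(self.state, {})
        if action not in allowed:
            raise ValueError(f"{action!r} is not valid in state {self.state.name}")
        self.history.append((self.state.name, action))
        self.state = allowed[action]

agent = ProcessAgent()
agent.step("details_complete")
agent.step("check_id")
agent.step("issue_refund")
print(agent.history, agent.state.name)
```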

Alexey Shabanov@TestingCatalog //
Anthropic's Claude is set to receive significant enhancements, primarily benefiting Claude Max subscribers. A key development is the merging of the "research" mode with Model Context Protocol (MCP) integrations. This combination aims to provide deeper answers and more sources by connecting Claude to various external tools and data sources. The introduction of remote MCPs allows users to connect Claude to almost any service, potentially unlocking workflows such as posting to Discord or reading from a Notion database, thereby transforming how businesses leverage AI.

This integration allows users to plug in platforms like Zapier, unlocking a broad range of workflows, including automated research, task execution, and access to internal company systems. The upgraded Claude Max subscription promises to deliver more value by enabling more extensive reasoning and access to an array of integrated tools. The move signals Anthropic's push toward enterprise AI assistants capable of handling extensive context and automating complex tasks.
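
To make the idea concrete, the sketch below shows a minimal MCP server exposing one internal lookup as a tool, using the FastMCP helper from the MCP Python SDK as shown in its quickstart; exact APIs may differ by SDK version, and the order data is purely hypothetical. A remote MCP connection would let Claude discover and call this tool directly.

```python
# Sketch of a minimal MCP server exposing one tool via the MCP Python SDK's
# FastMCP helper (API details may vary by SDK version).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("order-lookup")

# Hypothetical in-house data; a real server would query an internal system.
ORDERS = {"A-1001": "shipped", "A-1002": "processing"}

@mcp.tool()
def lookup_order(order_id: str) -> str:
    """Return the fulfillment status for an internal order ID."""
    return ORDERS.get(order_id, "unknown order")

if __name__ == "__main__":
    mcp.run()  # serve the tool over the Model Context Protocol
```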

In addition to these enhancements, Anthropic is also focusing on improving Claude's coding capabilities. Claude Code, now generally available, integrates directly into a programmer's workspace, helping them "code faster through natural language commands". It works with Amazon Bedrock and Google Vertex AI, two popular enterprise cloud AI platforms. Anthropic says the new version of Claude Code on the Pro Plan is "great for shorter coding stints (1-2 hours) in smaller codebases."

@www.microsoft.com //
Microsoft is aggressively integrating AI into its services to boost productivity and user experience. A key development is the rollout of Microsoft Copilot for Judges in UK courts, alongside updated guidelines for GenAI usage. These efforts are part of a broader strategy to embrace human-agent collaboration, as seen in the latest Microsoft 365 Copilot upgrades. The goal is to harness AI's capabilities while ensuring responsible and secure implementation across various sectors.

The UK Courts and Tribunals Judiciary are encouraging judges to utilize Microsoft’s ‘Copilot Chat’ genAI capability through their eJudiciary platform. Updated guidance emphasizes that while useful, judges must use genAI cautiously and understand its limitations, particularly concerning the accuracy and sources of information. Judges are warned that public AI chatbots do not provide answers from authoritative databases and are not necessarily the most accurate source. Microsoft has assured that ‘Copilot Chat’ offers enterprise data protection and operates within Microsoft 365's security frameworks when accessed via the eJudiciary account.

Microsoft is also working to enhance safety and security in AI agent systems. The Microsoft AI Red Team has released a whitepaper outlining a taxonomy of failure modes in AI agents, intended to help security professionals and machine learning engineers understand potential risks. The taxonomy was built by cataloging failures from internal red teaming, collaborating with teams across Microsoft, and interviewing external practitioners; it organizes failure modes under security and safety pillars, covering issues from data exfiltration to biased service delivery.
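
For teams that want to track such findings internally, the whitepaper's framing lends itself to a simple structured record. The sketch below is only an illustrative data model built from the pillars and examples mentioned above; it is not Microsoft's actual schema.

```python
# Illustrative record for cataloging agent failure modes under the security
# and safety pillars described in the text. A sketch for internal tracking,
# not Microsoft's actual taxonomy schema.
from dataclasses import dataclass
from enum import Enum

class Pillar(Enum):
    SECURITY = "security"
    SAFETY = "safety"

@dataclass
class FailureMode:
    name: str
    pillar: Pillar
    description: str
    observed_in_red_team: bool = False

CATALOG = [
    FailureMode("data_exfiltration", Pillar.SECURITY,
                "Agent leaks sensitive data to an external channel", True),
    FailureMode("biased_service_delivery", Pillar.SAFETY,
                "Agent provides lower-quality service to certain user groups"),
]

for fm in CATALOG:
    print(f"[{fm.pillar.value}] {fm.name}: {fm.description}")
```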

Recommended read:
References :
  • THE DECODER: Microsoft adds reasoning agents and company search to 365 Copilot
  • Ken Yeung: Microsoft Pushes Into Human-Agent Collaboration Era with Latest M365 Copilot Upgrades
  • venturebeat.com: Microsoft just launched powerful AI ‘agents’ that could completely transform your workday — and challenge Google’s workplace dominance
  • Maginative: Microsoft 365 Copilot Redesign: The New Face of Human-Agent Collaboration
  • The Register - Software: The latest update to Microsoft 365 Copilot brings AI-powered search, so-called reasoning agents, and a new Agent Store. Some users already have access to certain features, while others may have to wait through May.
  • www.artificiallawyer.com: UK Courts Roll Out Microsoft Copilot For Judges, Update GenAI Rules
  • www.microsoft.com: Experience the future of customer service with AI agents
  • Ken Yeung: Microsoft: Companies Are Going AI-First, Turning to Digital Labor to Scale
  • thetechbasic.com: Microsoft has a new vision for the future of work.
  • blogs.microsoft.com: The 2025 Annual Work Trend Index: The Frontier Firm is born
  • Towards AI: The Future of Work: How Microsoft 365 Office Solutions Are Evolving with AI Integration
  • www.marktechpost.com: Microsoft Releases a Comprehensive Guide to Failure Modes in Agentic AI Systems
  • Source Asia: Microsoft blog post on agentic AI driving AI-first business transformation.

Alexey Shabanov@TestingCatalog //
Microsoft is significantly enhancing its AI integration within knowledge work tools, focusing on boosting critical thinking, decision-making, and creativity. This initiative is underscored by findings from the Semantic Telemetry Project, revealing that professionals who incorporate AI into their routines are more likely to continue using these tools and increase their usage. Microsoft Research is actively exploring AI systems as "Tools for Thought," aiming to reimagine AI's role in augmenting human intellect. The company presented four new research papers and cohosted a workshop at the CHI 2025 conference, diving deep into the intersection of AI and human cognition.

Microsoft's Copilot AI is expanding with the early testing of a new podcast-generation feature. This tool can transform nearly any topic or user-provided material into a concise, conversational show. Copilot analyzes the content, develops a script for two synthetic hosts, and uses neural text-to-speech technology to create a dynamic summary. Users can interact with the podcast in real-time, pausing to ask questions that Copilot will adapt to, streamlining multitasking and improving the user experience. The Microsoft Copilot AI Podcast feature was outlined during Microsoft's 50th-anniversary Copilot event on April 4, 2025, and in a follow-up blog post.
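
The described flow maps naturally onto a three-stage pipeline: analyze the source material, script a two-host dialogue, then synthesize the audio. The sketch below is a hypothetical outline of such a pipeline with placeholder implementations; the helper functions stand in for whatever models Copilot actually uses.

```python
# Hypothetical outline of a podcast-generation pipeline like the one described.
# The helper functions are placeholders, not Copilot's actual components.
from dataclasses import dataclass
from typing import List

@dataclass
class DialogueTurn:
    host: str   # "Host A" or "Host B"
    text: str

def summarize(source_material: str) -> str:
    # Placeholder for stage 1: a real pipeline would call an LLM to condense
    # the user's topic or documents into key points.
    return source_material[:500]

def write_dialogue(summary: str) -> List[DialogueTurn]:
    # Placeholder for stage 2: alternate the summary's sentences between two
    # synthetic hosts; a real pipeline would prompt an LLM to write the script.
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    return [DialogueTurn(f"Host {'A' if i % 2 == 0 else 'B'}", s + ".")
            for i, s in enumerate(sentences)]

def render_audio(script: List[DialogueTurn]) -> str:
    # Placeholder for stage 3: a real pipeline would send each turn to a
    # neural text-to-speech service with a distinct voice per host.
    return "\n".join(f"{turn.host}: {turn.text}" for turn in script)

def generate_podcast(source_material: str) -> str:
    return render_audio(write_dialogue(summarize(source_material)))

print(generate_podcast("Magistral is Mistral's first reasoning model. It ships in two sizes."))
```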

Microsoft is also focusing on user-friendly automation with a new Copilot Studio feature. Microsoft 365 Copilot Chat is being introduced as an AI-powered conversational assistant that integrates with tools like Word, Excel, Teams, and Outlook. It can interpret context, offer suggestions, and draft responses, leveraging OpenAI models to make work smarter and faster. Copilot Chat provides context-aware assistance, drafts emails and summarizes threads to streamline communication, surfaces data insights in Excel, and supports real-time collaboration, bridging the gap between documents and communication tools.
