News from the AI & ML world

DeeperML - #aitools

@Talkback Resources //
Microsoft is making strategic shifts to bolster its AI capabilities, while addressing the financial demands of AI infrastructure. In a move to offset the high costs of running AI data centers, the company is implementing layoffs. This decision, viewed as a "double whammy" for tech workers, comes as Microsoft doubles down on its AI investments, suggesting further workforce adjustments may be on the horizon as AI technologies mature and become more efficient. The company is concurrently rolling out tools and features designed to streamline data interaction and enhance AI application development.

Microsoft has previewed the MCP (Model Context Protocol) tool for SQL Server, aiming to simplify data access for AI agents. Implemented in both Node.js and .NET, this open-source tool allows AI agents like GitHub Copilot or Claude Code to interact with databases using natural language, potentially revolutionizing how developers work with data. The MCP server, once set up locally, offers commands such as listing tables, describing tables, and creating or dropping tables. However, initial user experiences have been mixed, with some finding the tool limited and sometimes frustrating, citing slow operation speeds and the need for further refinement.
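
The command set described above can be sketched with a small stand-in. The sketch below uses Python and an in-memory SQLite database rather than the tool's actual Node.js/.NET implementation against SQL Server, and the function names are illustrative only, not the MCP server's real API:

```python
import sqlite3

# Hypothetical stand-in for the MCP server's table commands, using SQLite
# instead of SQL Server so the sketch is self-contained.

def create_table(conn, name, columns):
    cols = ", ".join(f"{c} {t}" for c, t in columns.items())
    conn.execute(f"CREATE TABLE {name} ({cols})")

def list_tables(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return [r[0] for r in rows]

def describe_table(conn, name):
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
    return [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({name})")]

def drop_table(conn, name):
    conn.execute(f"DROP TABLE {name}")

conn = sqlite3.connect(":memory:")
create_table(conn, "orders", {"id": "INTEGER", "total": "REAL"})
print(list_tables(conn))               # ['orders']
print(describe_table(conn, "orders"))  # [('id', 'INTEGER'), ('total', 'REAL')]
```

In the real tool, an AI agent such as GitHub Copilot would invoke commands like these in response to a natural-language request rather than the developer calling them directly.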

In addition to database enhancements, Microsoft is also focused on leveraging AI to improve accessibility and documentation. The MCP server for Microsoft Learn is now in public preview, giving AI agents real-time access to Microsoft's vast documentation library. Furthermore, a new C# script leveraging .NET 10 and local AI models generates AltText for images, making online content more accessible to visually impaired users.

Microsoft is also unifying security operations by transitioning Microsoft Sentinel into the Microsoft Defender portal. This consolidation offers a single, comprehensive view of incidents, streamlines response, and integrates AI-driven features like Security Copilot and exposure management to enhance security posture. The Azure portal for Microsoft Sentinel is slated for retirement by July 1, 2026, and customers are encouraged to transition to the unified Defender portal for an improved security operations experience.
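
The AltText idea mentioned above, feeding an image to a local model and emitting an accessible tag, can be sketched as follows. This is Python rather than the article's C#, and the captioning model is stubbed out, so the names here are purely illustrative:

```python
from html import escape

# Stub standing in for a local captioning model; the real script invokes a
# local AI model from a .NET 10 C# script.
def caption_image(path: str) -> str:
    return "A placeholder caption for " + path

def img_tag(path: str) -> str:
    """Wrap a generated caption into an accessible <img> tag."""
    alt = escape(caption_image(path), quote=True)
    return f'<img src="{escape(path, quote=True)}" alt="{alt}">'

print(img_tag("photo.jpg"))
# <img src="photo.jpg" alt="A placeholder caption for photo.jpg">
```

The escaping step matters: a caption containing quotes or angle brackets would otherwise break the attribute it is embedded in.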

Recommended read:
References :
  • .NET Blog: Local AI + .NET = AltText Magic in One C# Script
  • office365itpros.com: Microsoft Launches New Way to Consume Documentation
  • DEVCLASS: Microsoft SQL Server MCP tool: "leap in data interaction" or limited and frustrating?
  • Talkback Resources: Planning your move to Microsoft Defender portal for all Microsoft Sentinel customers

Alexey Shabanov@TestingCatalog //
References: Data Phoenix, Maginative, TestingCatalog ...
Perplexity AI is rapidly expanding its presence in the AI market through strategic integrations and innovative features. The company has launched Perplexity Labs, a new tool for Pro subscribers designed to automate tasks such as creating reports, spreadsheets, and mini web apps. This feature leverages AI research, code execution, and content generation, positioning Perplexity as a versatile platform for both information retrieval and content creation. Labs can generate and execute code for data structuring, create interactive web apps, and produce various file types, making it well-suited for diverse projects from marketing campaigns to business analysis.

The startup is also making strides in device integration. Samsung is reportedly nearing a wide-ranging deal with Perplexity that includes investment and deep integration into devices, the Bixby assistant, and the web browser. This partnership could see Perplexity pre-installed on upcoming Galaxy S26 series phones, potentially replacing Google Gemini as the default AI assistant. The integration might also extend to Samsung Internet, offering users more advanced and personalized AI experiences directly within their web browsing.

Furthermore, Perplexity is enhancing its AI-driven search capabilities within the Comet Browser. Users can now observe Perplexity AI controlling pages in the Comet Browser, with visual indicators showing actions like clicking and filling forms. This new feature allows for more interactive and transparent AI-driven automation, benefiting users who automate repetitive workflows such as data entry and testing. This positions Perplexity as a pioneer in bringing interactive and transparent AI-driven automation to the browser.

Recommended read:
References :
  • Data Phoenix: Perplexity launches Labs, an AI tool that helps users create reports, dashboards, and web apps
  • Maginative: Perplexity's new Labs feature for Pro subscribers automates time-consuming tasks like creating reports, spreadsheets, and mini web apps using AI research and code execution.
  • www.techradar.com: The Samsung Galaxy S26 series could have Perplexity AI baked in
  • TestingCatalog: Users can now watch Perplexity AI control pages in Comet Browser
  • Mark Gurman: NEW: Samsung is nearing wide-ranging deal with Perplexity on an investment and deep integration into devices, Bixby assistant and web browser, I’m told.
  • Dataconomy: Samsung may invest in Perplexity and integrate it into Galaxy phones
  • PCMag Middle East ai: Samsung's Galaxy S26 May Drop Google Gemini as Its Default AI Chatbot
  • www.zdnet.com: If Perplexity's app and assistant get preloaded on upcoming Galaxies, what happens to Google Gemini integration?
  • www.lifewire.com: Samsung + Perplexity Might Be the AI Power Couple That Could Redefine Your Phone

James Peckham@PCMag Middle East ai //
References: Data Phoenix, Mark Gurman
Samsung is reportedly in the final stages of negotiating a wide-ranging deal with Perplexity to deeply integrate the AI company's technology into its devices. This move could potentially see Perplexity AI becoming the default AI chatbot on the Galaxy S26 series, possibly replacing Google Gemini. The integration may also extend to Samsung's Bixby assistant and its web browser, aiming to enhance these features with more powerful AI-driven capabilities. This strategic shift indicates Samsung's interest in exploring alternatives to Google's AI offerings and potentially supercharging its own services with Perplexity's search functionality.

Perplexity has recently launched Labs, a new AI tool designed to help users create reports, dashboards, and web applications. This innovative tool is available for Pro subscribers and automates time-consuming tasks using AI research and code execution. Labs is equipped with capabilities such as web browsing, code execution, and chart and image creation, enabling it to handle diverse projects and transform ideas into finished deliverables. The new Labs feature includes a suite of tools specifically designed to turn ideas and to-dos into completed work, with the ability to perform sustained automated research for 10 minutes or more, accomplishing tasks that previously took days.

Perplexity Labs stands out for its ability to execute code to structure datasets, apply formulas, and create various file types, including charts, images, and spreadsheets. It can also build and deploy simple interactive websites directly within the interface. The mini web apps feature is particularly ambitious, offering users the ability to create basic dashboards, slideshows, or data visualization tools without needing coding skills. The tool is designed to be self-supervised, working in the background to perform tasks ranging from marketing campaigns to business analysis.
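
The structure-a-dataset-and-apply-a-formula step described above is the kind of code Labs generates and executes behind the scenes. A minimal hand-written sketch of such a step, with invented data, might look like this:

```python
import csv
import io

# Invented sample data: (item, quantity, unit price). A Labs run would
# derive rows like these from its own research, then emit a spreadsheet.
raw = [("widget", 3, 19.99), ("gadget", 5, 4.50)]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["item", "qty", "unit_price", "total"])
for item, qty, price in raw:
    # Apply a formula column (line total) while structuring the dataset.
    writer.writerow([item, qty, price, round(qty * price, 2)])

print(buf.getvalue())
```

The difference with Labs is that the user never writes or sees this code unless they want to; they describe the deliverable and the tool produces the file.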

Recommended read:
References :
  • Data Phoenix: Perplexity launches Labs, an AI tool that helps users create reports, dashboards, and web apps
  • Mark Gurman: Samsung is nearing wide-ranging deal with Perplexity on an investment and deep integration into devices, Bixby assistant and web browser, I’m told.
  • PCMag Middle East ai: Samsung's Galaxy S26 May Drop Google Gemini as Its Default AI Chatbot

Alexey Shabanov@TestingCatalog //
References: TechCrunch, TestingCatalog, Data Phoenix ...
Perplexity has unveiled Perplexity Labs, a new AI-powered tool designed for Pro subscribers, aiming to revolutionize the creation of work deliverables. Labs automates tasks like generating reports, spreadsheets, dashboards, and even mini web apps, leveraging AI research and code execution to bring projects from ideation to completion. It functions as an AI-driven team, providing users with a comprehensive suite of tools to transform their ideas into tangible results, marking a significant move beyond traditional search functionalities.

Labs stands out by investing a minimum of 10 minutes in self-supervised work, conducting web browsing, writing and executing code, and organizing data to achieve its objectives. This extended timeframe allows the AI to crunch numbers, apply formulas, generate visuals, and construct interactive web apps, all without requiring the user to lift a finger. The technology combines various AI capabilities Perplexity has developed, packaging the output into an "Assets" tab for easy access and download.

Available on web, iOS, and Android, with Mac and Windows apps on the horizon, Perplexity Labs is accessible for Pro subscribers at $20 per month. With the mini web apps feature being particularly ambitious, Labs can build and deploy simple interactive websites directly within the interface, such as dashboards, slideshows, or data visualization tools, without the user needing any coding knowledge. This move aims to shift Perplexity's positioning from a "better Google" to a "personal research assistant and worker," providing training wheels for building AI agents and automating time-consuming tasks.

Recommended read:
References :
  • TechCrunch: Perplexity’s new tool can generate spreadsheets, dashboards, and more
  • TestingCatalog: Perplexity AI rolled out Perplexity Labs for Pro subscribers
  • www.zdnet.com: 5 projects Perplexity's new Labs AI tool can whip up for you now - in minutes
  • Data Phoenix: Article discussing Perplexity new labs feature.
  • Maginative: Reports on Perplexity Launches New Labs Feature
  • www.itpro.com: Sick and tired of spreadsheets? Perplexity’s new tools can help with that
  • www.analyticsvidhya.com: I Tried Perplexity Labs and Here’s What I Found
  • Analytics Vidhya: I Tried Perplexity Labs and Here’s What I Found

Matthias Bastian@THE DECODER //
Black Forest Labs, known for its contributions to the popular Stable Diffusion model, has recently launched FLUX.1 Kontext and a Playground API. This new image editing model lets users combine text and images as prompts to edit existing images, generate new scenes in the style of a reference image, or maintain character consistency across different outputs. The company also announced the BFL Playground, where users can test and explore the models before integrating them into enterprise applications. The release includes two versions of the model: FLUX.1 Kontext [pro] and the experimental FLUX.1 Kontext [max], with a third version, FLUX.1 Kontext [dev], entering private beta soon.

FLUX.1 Kontext is unique because it merges text-to-image generation with step-by-step image editing capabilities. It understands both text and images as input, enabling true in-context generation and editing, and allows for local editing that targets specific parts of an image without affecting the rest. According to Black Forest Labs, the Kontext [pro] model operates "up to an order of magnitude faster than previous state-of-the-art models." This speed allows enterprise creative teams and other developers to edit images with precision and at a faster pace.

The pro version allows users to generate an image and refine it through multiple "turns," all while preserving the characters and styles in the images, so enterprises can use it for fast, iterative editing. The company claims Kontext [pro] led the field in internal tests using an in-house benchmark called KontextBench, showing strong performance in text editing and character retention, and outperforming competitors in speed and adherence to user prompts. The models are now available on platforms such as KreaAI, Freepik, Lightricks, OpenArt and LeonardoAI.
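
The multi-turn flow described above can be sketched abstractly: each request carries the previous output as its reference, which is what keeps characters and styles consistent across edits. None of the names below correspond to Black Forest Labs' actual API; the "image" is just a string tracking the chain of edits:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of in-context, multi-turn editing. A real client would
# send (reference image, prompt) to the model and receive pixels back.
@dataclass
class EditSession:
    reference: str = "base_image"
    history: list = field(default_factory=list)

    def edit(self, prompt: str) -> str:
        self.history.append(prompt)
        # The new output becomes the reference for the next turn, so each
        # edit builds on the last instead of regenerating from scratch.
        self.reference = f"{self.reference}+[{prompt}]"
        return self.reference

s = EditSession()
s.edit("add a red scarf")
print(s.edit("place the character in a snowy street"))
# base_image+[add a red scarf]+[place the character in a snowy street]
```

The design point this illustrates is statefulness: because the reference is threaded through every turn, the second edit cannot "forget" the scarf added in the first.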

Recommended read:
References :
  • Replicate's blog: Use FLUX.1 Kontext to edit images with words
  • AI News | VentureBeat: FLUX.1 Kontext from Black Forest Labs aims to let users edit images multiple times through both text and reference images without losing speed.
  • TestingCatalog: Discover FLUX 1 Kontext by Black Forest Labs, featuring advanced text-and-image prompting for seamless edits and new scenes.
  • THE DECODER: With FLUX.1 Context, Black Forest Labs extends text-to-image systems to support both image generation and editing. The model enables fast, context-aware manipulation using a mix of text and image prompts, while preserving consistent styles and characters across multiple images.
  • TechCrunch: Black Forest Labs’ Kontext AI models can edit pics as well as generate them
  • the-decoder.com: Black Forest Labs' FLUX.1 merges text-to-image generation with image editing in one model

@cyberalerts.io //
A new malware campaign is exploiting the hype surrounding artificial intelligence to distribute the Noodlophile Stealer, an information-stealing malware. Morphisec researcher Shmuel Uzan discovered that attackers are enticing victims with fake AI video generation tools advertised on social media platforms, particularly Facebook. These platforms masquerade as legitimate AI services for creating videos, logos, images, and even websites, attracting users eager to leverage AI for content creation.

Posts promoting these fake AI tools have garnered significant attention, with some reaching over 62,000 views. Users who click on the advertised links are directed to bogus websites, such as one impersonating CapCut AI, where they are prompted to upload images or videos. Instead of receiving the promised AI-generated content, users are tricked into downloading a malicious ZIP archive named "VideoDreamAI.zip," which contains an executable file designed to initiate the infection chain.

The "Video Dream MachineAI.mp4.exe" file within the archive launches a legitimate binary associated with ByteDance's CapCut video editor, which is then used to execute a .NET-based loader. This loader, in turn, retrieves a Python payload from a remote server, ultimately leading to the deployment of the Noodlophile Stealer. This malware is capable of harvesting browser credentials, cryptocurrency wallet information, and other sensitive data. In some instances, the stealer is bundled with a remote access trojan like XWorm, enabling attackers to gain entrenched access to infected systems.


Carl Franzen@AI News | VentureBeat //
OpenAI is reportedly finalizing an agreement to acquire Windsurf, an AI-powered developer platform formerly known as Codeium, for approximately $3 billion. This marks OpenAI's largest acquisition to date, signaling a significant move to strengthen its position in the competitive AI tools market for software developers. The deal, which has been rumored for weeks, is anticipated to enhance OpenAI's coding AI capabilities and reflects the increasing importance of AI-powered tools in the software development industry. Windsurf's CEO Varun Mohan hinted at the deal on X, stating, "Big announcement tomorrow!".

This acquisition allows OpenAI to better understand how developers utilize various AI models, including those from competitors such as Meta and Anthropic. By gaining insights into developer preferences and the types of AI models used for coding tasks, OpenAI can refine its own offerings and better cater to the developer community's needs. Windsurf, founded in 2021 by MIT graduates Varun Mohan and Douglas Chen, launched the Windsurf Integrated Development Environment (IDE) in November 2024. The IDE, based on Microsoft’s Visual Studio Code, has attracted over 800,000 developer users and 1,000 enterprise customers.

The acquisition highlights OpenAI's ambition to dominate the AI coding space, pitting it against competitors such as Microsoft's GitHub Copilot and Anthropic's Claude Code. While Windsurf supports multiple large language models (LLMs), including its own custom model based on Meta’s Llama 3, questions arise regarding the future of this model-agnostic approach under OpenAI's ownership. The deal comes shortly after OpenAI announced it would maintain its non-profit-backed structure instead of switching to a traditional for-profit model, further emphasizing its commitment to its core mission of broadly benefiting humanity.

Recommended read:
References :
  • Analytics India Magazine: OpenAI to Acquire Windsurf for $3 Billion to Dominate AI Coding Space
  • THE DECODER: OpenAI's $3 billion Windsurf deal would boost its coding AI efforts
  • AI News | VentureBeat: Report: OpenAI is buying AI-powered developer platform Windsurf — what happens to its support for rival LLMs?
  • John Werner: OpenAI Strikes $3 Billion Deal To Buy Windsurf: Reports
  • Verdict: OpenAI to acquire Windsurf for $3bn
  • the-decoder.com: OpenAI's $3 billion Windsurf deal would boost its coding AI efforts
  • www.verdict.co.uk: OpenAI to acquire Windsurf for $3bn
  • Cautious Optimism: OpenAI solves its internal crisis, snaps up Windsurf
  • Latest from ITPro in News: OpenAI is closing in on its biggest acquisition to date – and it could be a game changer for software developers and ‘vibe coding’ fanatics
  • Techmeme: Sources: OpenAI reaches an agreement to buy Windsurf, an AI coding tool formerly known as Codeium, for about $3B; the deal has not yet closed (Bloomberg)
  • www.computerworld.com: OpenAI to acquire AI coding tool Windsurf for $3B
  • Techzine Global: OpenAI acquires Windsurf for $3 billion

Alexey Shabanov@TestingCatalog //
References: TestingCatalog , Maginative , THE DECODER ...
OpenAI is now providing access to its Deep Research tool to all ChatGPT users, including those with free accounts. The company is introducing a "lightweight" version of Deep Research, powered by the o4-mini model, designed to be nearly as intelligent as the original while significantly cheaper to serve. This move aims to democratize access to sophisticated AI reasoning capabilities, allowing a broader audience to benefit from the tool's in-depth analytical capabilities.

The Deep Research feature offers users detailed insights on various topics, from consumer decision-making to educational guidance. The lightweight version available to free users enables in-depth, topic-specific breakdowns without requiring a premium subscription. This expansion means free ChatGPT users will have access to Deep Research, albeit with a limitation of five tasks per month. The tool allows ChatGPT to autonomously browse the web, read, synthesize, and output structured reports, similar to tasks conducted by policy analysts and researchers.

Existing ChatGPT Plus, Team, and Pro users will also see changes. While still having access to the more advanced version of Deep Research, they will now switch to the lightweight version after reaching their initial usage limits. This approach effectively increases monthly usage for paid users by offering additional tasks via the o4-mini-powered tool. The lightweight version preserves core functionalities like multi-step reasoning, real-time browsing, and document parsing, though responses may be slightly shorter while retaining citations and structured logic.
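
The tiering described above amounts to a simple fallback policy: route requests to the full model until a per-period limit is hit, then switch to the lightweight o4-mini-backed version. A sketch, with invented limits and model names rather than OpenAI's actual internals:

```python
# Hypothetical sketch of the quota fallback; limits and names are invented.
class ResearchRouter:
    def __init__(self, full_limit: int):
        self.full_limit = full_limit  # full-model tasks allowed per period
        self.used = 0

    def pick_model(self) -> str:
        self.used += 1
        if self.used <= self.full_limit:
            return "deep-research-full"
        return "deep-research-light"  # o4-mini-backed fallback

router = ResearchRouter(full_limit=2)
print([router.pick_model() for _ in range(4)])
# ['deep-research-full', 'deep-research-full',
#  'deep-research-light', 'deep-research-light']
```

The effect for paid users is additive capacity: exhausting the full-model quota degrades to a cheaper model rather than blocking further tasks outright.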

Recommended read:
References :
  • TestingCatalog: OpenAI tests Deep Research Mini tool for free ChatGPT users
  • Maginative: OpenAI's Deep Research Is Now Available to All ChatGPT Users
  • www.tomsguide.com: Reports on OpenAI supercharging ChatGPT with Deep Research mode for free users.
  • THE DECODER: OpenAI has made the Deep Research tool in ChatGPT available to free-tier users. Access is limited to five uses per month, using a lightweight version based on the o4-mini-model.
  • TestingCatalog: OpenAI may have increased the o3 model's quota to 50 messages/day and added task-scheduling to o3 and o4 Mini. An "o3 Pro" tier might be on the horizon.
  • www.techradar.com: Discusses that Free ChatGPT users are finally getting Deep Research access
  • the-decoder.com: Reports that the Deep Research feature is now available to free ChatGPT users.
  • thetechbasic.com: OpenAI has made its smart research tool cheaper and more accessible. The tool, called Deep Research, helps ChatGPT search the web and give detailed answers. Now, a lighter version is available for free users, while paid plans offer more features. This move lets more people try advanced AI without paying upfront. What the Lightweight Tool Can
  • Shelly Palmer: The Washington Post partners with OpenAI to integrate its content into ChatGPT search results.
  • MarkTechPost: OpenAI has officially announced the release of its image generation API, powered by the gpt-image-1 model. This launch brings the multimodal capabilities of ChatGPT into the hands of developers, enabling programmatic access to image generation—an essential step for building intelligent design tools, creative applications, and multimodal agent systems.
  • PCMag Middle East ai: ChatGPT Free Users Can Now Run 'Deep Research' Five Times a Month
  • The Tech Basic: OpenAI has made its smart research tool cheaper and more accessible. The tool, called Deep Research, helps ChatGPT search the web and give detailed answers.
  • eWEEK: OpenAI has updated its ChatGPT models by offering free users a lightweight version of the "Deep Research" tool based on the o4-mini model.
  • techcrunch.com: OpenAI expands deep research usage for Plus, Pro, and Team users with an o4-mini-powered lightweight version, which also rolls out to Free users today.
  • THE DECODER: ChatGPT gets an update: OpenAI promises a more intuitive GPT-4o
  • aigptjournal.com: OpenAI Broadens Access: Lightweight Deep Research Empowers Every ChatGPT User
  • techstrong.ai: OpenAI Debuts ‘Lightweight’ Model for ChatGPT’s Deep Research Tool
  • AI GPT Journal: OpenAI Broadens Access: Lightweight Deep Research Empowers Every ChatGPT User

Chris McKay@Maginative //
OpenAI has released its latest AI models, o3 and o4-mini, designed to enhance reasoning and tool use within ChatGPT. These models aim to provide users with smarter and faster AI experiences by leveraging web search, Python programming, visual analysis, and image generation. The models are designed to solve complex problems and perform tasks more efficiently, positioning OpenAI competitively in the rapidly evolving AI landscape. Greg Brockman from OpenAI noted the models "feel incredibly smart" and have the potential to positively impact daily life and solve challenging problems.

The o3 model stands out due to its ability to use tools independently, which enables more practical applications. The model determines when and how to utilize tools such as web search, file analysis, and image generation, thus reducing the need for users to specify tool usage with each query. The o3 model sets new standards for reasoning, particularly in coding, mathematics, and visual perception, and has achieved state-of-the-art performance on several competition benchmarks. The model excels in programming, business, consulting, and creative ideation.
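
The decide-then-act loop described above can be sketched with a stub policy. In the real models the tool-use decision is learned, not keyword-based, and the tool names below are invented for illustration:

```python
# Illustrative sketch of agentic tool selection. The stub policy stands in
# for the model's learned decision about when and how to use a tool.
TOOLS = {
    "web_search": lambda q: f"search results for {q!r}",
    "python": lambda code: f"executed: {code}",
}

def decide(query: str):
    """Toy stand-in for the model's tool-use decision."""
    if "latest" in query:
        return ("web_search", query)
    if "compute" in query:
        return ("python", query)
    return (None, query)

def answer(query: str) -> str:
    tool, arg = decide(query)
    if tool is None:
        return f"direct answer to {arg!r}"
    return TOOLS[tool](arg)

print(answer("latest GPU prices"))  # routed to web_search
print(answer("compute 2**10"))      # routed to the python tool
```

The point the sketch makes is that the user issues one plain query; choosing and chaining tools happens inside the loop, which is what distinguishes o3's agentic behavior from earlier models that needed explicit tool instructions.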

Usage limits for these models vary: for Plus users, o3 allows 50 queries per week, o4-mini 150 queries per day, and o4-mini-high 50 queries per day, alongside 10 Deep Research queries per month. The o3 model is available to ChatGPT Pro and Team subscribers, while the o4-mini models are used across ChatGPT Plus. OpenAI says o3 is also beneficial in generating and critically evaluating novel hypotheses, especially in biology, mathematics, and engineering contexts.

Recommended read:
References :
  • Simon Willison's Weblog: OpenAI are really emphasizing tool use with these: For the first time, our reasoning models can agentically use and combine every tool within ChatGPT—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. Critically, these models are trained to reason about when and how to use tools to produce detailed and thoughtful answers in the right output formats, typically in under a minute, to solve more complex problems.
  • the-decoder.com: OpenAI’s new o3 and o4-mini models reason with images and tools
  • venturebeat.com: OpenAI launches o3 and o4-mini, AI models that ‘think with images’ and use tools autonomously
  • www.analyticsvidhya.com: o3 and o4-mini: OpenAI’s Most Advanced Reasoning Models
  • www.tomsguide.com: OpenAI's o3 and o4-mini models
  • Maginative: OpenAI’s latest models—o3 and o4-mini—introduce agentic reasoning, full tool integration, and multimodal thinking, setting a new bar for AI performance in both speed and sophistication.
  • THE DECODER: OpenAI’s new o3 and o4-mini models reason with images and tools
  • Analytics Vidhya: o3 and o4-mini: OpenAI’s Most Advanced Reasoning Models
  • www.zdnet.com: These new models are the first to independently use all ChatGPT tools.
  • The Tech Basic: OpenAI recently released its new AI models, o3 and o4-mini, to the public. Smart tools employ pictures to address problems through pictures, including sketch interpretation and photo restoration.
  • thetechbasic.com: OpenAI's new AI Can "See" and Solve Problems with Pictures
  • www.marktechpost.com: OpenAI Introduces o3 and o4-mini: Progressing Towards Agentic AI with Enhanced Multimodal Reasoning
  • MarkTechPost: OpenAI Introduces o3 and o4-mini: Progressing Towards Agentic AI with Enhanced Multimodal Reasoning
  • analyticsindiamag.com: Access to o3 and o4-mini is rolling out today for ChatGPT Plus, Pro, and Team users.
  • THE DECODER: OpenAI is expanding its o-series with two new language models featuring improved tool usage and strong performance on complex tasks.
  • gHacks Technology News: OpenAI released its latest models, o3 and o4-mini, to enhance the performance and speed of ChatGPT in reasoning tasks.
  • www.ghacks.net: OpenAI Launches o3 and o4-Mini models to improve ChatGPT's reasoning abilities
  • Data Phoenix: OpenAI releases new reasoning models o3 and o4-mini amid intense competition. OpenAI has launched o3 and o4-mini, which combine sophisticated reasoning capabilities with comprehensive tool integration.
  • Shelly Palmer: OpenAI Quietly Reshapes the Landscape with o3 and o4-mini. OpenAI just rolled out a major update to ChatGPT, quietly releasing three new models (o3, o4-mini, and o4-mini-high) that offer the most advanced reasoning capabilities the company has ever shipped.
  • THE DECODER: Safety assessments show that OpenAI's o3 is probably the company's riskiest AI model to date
  • shellypalmer.com: OpenAI Quietly Reshapes the Landscape with o3 and o4-mini
  • BleepingComputer: OpenAI details ChatGPT-o3, o4-mini, o4-mini-high usage limits
  • TestingCatalog: OpenAI’s o3 and o4‑mini bring smarter tools and faster reasoning to ChatGPT
  • simonwillison.net: Introducing OpenAI o3 and o4-mini
  • bdtechtalks.com: What to know about o3 and o4-mini, OpenAI’s new reasoning models
  • thezvi.wordpress.com: OpenAI has finally introduced us to the full o3 along with o4-mini. Greg Brockman (OpenAI): Just released o3 and o4-mini! These models feel incredibly smart. We’ve heard from top scientists that they produce useful novel ideas. Excited to see their …
  • thezvi.wordpress.com: OpenAI has upgraded its entire suite of models. By all reports, they are back in the game for more than images. GPT-4.1 and especially GPT-4.1-mini are their new API non-reasoning models.
  • felloai.com: OpenAI has just launched a brand-new series of GPT models—GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano—that promise major advances in coding, instruction following, and the ability to handle incredibly long contexts.
  • Interconnects: OpenAI's o3: Over-optimization is back and weirder than ever
  • www.ishir.com: OpenAI has released o3 and o4-mini, adding significant reasoning capabilities to its existing models. These advancements will likely transform the way users interact with AI-powered tools, making them more effective and versatile in tackling complex problems.
  • www.bigdatawire.com: OpenAI released the models o3 and o4-mini that offer advanced reasoning capabilities, integrated with tool use, like web searches and code execution.
  • Drew Breunig: OpenAI's o3 and o4-mini models offer enhanced reasoning capabilities in mathematical and coding tasks.
  • www.techradar.com: ChatGPT model matchup - I pitted OpenAI's o3, o4-mini, GPT-4o, and GPT-4.5 AI models against each other and the results surprised me
  • www.techrepublic.com: OpenAI’s o3 and o4-mini models are available now to ChatGPT Plus, Pro, and Team users. Enterprise and education users will get access next week.
  • Last Week in AI: OpenAI’s new GPT-4.1 AI models focus on coding, OpenAI launches a pair of AI reasoning models, o3 and o4-mini, Google’s newest Gemini AI model focuses on efficiency, and more!
  • techcrunch.com: OpenAI’s new reasoning AI models hallucinate more.
  • computational-intelligence.blogspot.com: OpenAI's new reasoning models, o3 and o4-mini, are a step up in certain capabilities compared to prior models, but their accuracy is being questioned due to increased instances of hallucinations.
  • www.unite.ai: unite.ai article discussing OpenAI's o3 and o4-mini new possibilities through multimodal reasoning and integrated toolsets.
  • : On April 16, 2025, OpenAI released upgraded versions of its advanced reasoning models.
  • Digital Information World: OpenAI’s Latest o3 and o4-mini AI Models Disappoint Due to More Hallucinations than Older Models
  • techcrunch.com: TechCrunch reports on OpenAI's GPT-4.1 models focusing on coding.
  • Analytics Vidhya: o3 vs o4-mini vs Gemini 2.5 pro: The Ultimate Reasoning Battle
  • THE DECODER: OpenAI's o3 achieves near-perfect performance on long context benchmark.
  • the-decoder.com: OpenAI's o3 achieves near-perfect performance on long context benchmark
  • www.analyticsvidhya.com: AI models keep getting smarter, but which one truly reasons under pressure? In this blog, we put o3, o4-mini, and Gemini 2.5 Pro through a series of intense challenges: physics puzzles, math problems, coding tasks, and real-world IQ tests.
  • Simon Willison's Weblog: This post explores the use of OpenAI's o3 and o4-mini models for conversational AI, highlighting their ability to use tools in their reasoning process. It also discusses the concept of
  • Simon Willison's Weblog: The benchmark score on OpenAI's internal PersonQA benchmark (as far as I can tell no further details of that evaluation have been shared) going from 0.16 for o1 to 0.33 for o3 is interesting, but I don't know if it's interesting enough to produce dozens of headlines along the lines of "OpenAI's o3 and o4-mini hallucinate way higher than previous models"
  • techstrong.ai: Techstrong.ai reports OpenAI o3, o4 Reasoning Models Have Some Kinks.
  • www.marktechpost.com: OpenAI Releases a Practical Guide to Identifying and Scaling AI Use Cases in Enterprise Workflows
  • Towards AI: OpenAI's o3 and o4-mini models have demonstrated promising improvements in reasoning tasks, particularly their use of tools in complex thought processes and enhanced reasoning capabilities.
  • Analytics Vidhya: In this article, we explore how OpenAI's o3 reasoning model stands out in tasks demanding analytical thinking and multi-step problem solving, showcasing its capability in accessing and processing information through tools.
  • pub.towardsai.net: TAI#149: OpenAI’s Agentic o3; New Open Weights Inference Optimized Models (DeepMind Gemma, Nvidia…
  • composio.dev: OpenAI o3 vs. Gemini 2.5 Pro vs. o4-mini
  • : OpenAI o3 and o4-mini are out. They are two reasoning state-of-the-art models. They’re expensive, multimodal, and super efficient at tool use.