News from the AI & ML world

DeeperML - #developers

@medium.com //
DeepSeek, a Chinese AI unicorn, has released DeepSeek-R1-0528, a significant update to its R1 reasoning model. This new release aims to enhance the model's capabilities in mathematics, programming, and general logical reasoning, positioning it as a formidable open-source alternative to leading proprietary models like OpenAI's o3 and Google's Gemini 2.5 Pro. The updated model is available on Hugging Face under the MIT license, promoting transparency and accessibility in AI development.

The R1-0528 update showcases improved reasoning depth and inference accuracy. Its performance on the AIME 2025 math benchmark has increased significantly, jumping from 70% to 87.5%. This indicates a deeper reasoning process, averaging 23,000 tokens per question, up from 12,000 in the previous version. These enhancements are attributed to increased computational resources and algorithmic optimizations during post-training. Additionally, the model exhibits improved performance in code generation tasks, ranking just below OpenAI's o4-mini and o3 models on the LiveCodeBench benchmark, and outperforming xAI's Grok 3 mini and Alibaba's Qwen 3.

DeepSeek has also released a distilled version of R1-0528, named DeepSeek-R1-0528-Qwen3-8B. This lightweight model, fine-tuned from Alibaba’s Qwen3-8B, achieves state-of-the-art performance among open-source models on the AIME 2024 benchmark and is designed for efficient operation on a single GPU. DeepSeek’s API currently costs $0.14 per 1 million input tokens during regular hours (8:30 pm to 12:30 pm), dropping to $0.035 during discount hours; output is priced at a flat $2.19 per 1 million tokens.
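
Using the rates quoted above, estimating a workload's cost is simple arithmetic. The sketch below is a back-of-the-envelope calculator with a hypothetical workload; the prices are those listed in this article and may change.

```python
# Back-of-the-envelope DeepSeek API cost estimate (prices as quoted above;
# rates are per 1M tokens and may change -- check DeepSeek's pricing page).
INPUT_PER_M_REGULAR = 0.14    # USD per 1M input tokens, regular hours
INPUT_PER_M_DISCOUNT = 0.035  # USD per 1M input tokens, discount hours
OUTPUT_PER_M = 2.19           # USD per 1M output tokens, flat rate

def estimate_cost(input_tokens: int, output_tokens: int, discount: bool = False) -> float:
    """Return the estimated USD cost for a batch of requests."""
    input_rate = INPUT_PER_M_DISCOUNT if discount else INPUT_PER_M_REGULAR
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * OUTPUT_PER_M

# Hypothetical workload: 5M input tokens, 1M output tokens
print(f"regular:  ${estimate_cost(5_000_000, 1_000_000):.2f}")        # ~ $2.89
print(f"discount: ${estimate_cost(5_000_000, 1_000_000, True):.2f}")  # ~ $2.37
```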

Recommended read:
References :
  • pub.towardsai.net: DeepSeek R1 : Is It Right For You? (A Practical Self‑Assessment for Businesses and Individuals)
  • AI News | VentureBeat: DeepSeek R1-0528 arrives in powerful open source challenge to OpenAI o3 and Google Gemini 2.5 Pro
  • Analytics Vidhya: New Deepseek R1-0528 Update is INSANE
  • Kyle Wiggers: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
  • MacStories: Testing DeepSeek R1-0528 on the M3 Ultra Mac Studio and Installing Local GGUF Models with Ollama on macOS
  • Kyle Wiggers: DeepSeek’s updated R1 AI model is more censored, test finds
  • www.marktechpost.com: DeepSeek Releases R1-0528: An Open-Source Reasoning AI Model Delivering Enhanced Math and Code Performance with Single-GPU Efficiency
  • NextBigFuture.com: DeepSeek New Deepseek-R1 Model is Competitive With OpenAI O3 and Gemini 2.5 Pro

@www.microsoft.com //
Microsoft recently held its Build 2025 developer conference, showcasing a range of new AI-powered tools and providing a sneak peek into experimental projects. One of the overarching themes of the event was the company's heavy investment in Artificial Intelligence, with nearly every major announcement being related to Generative AI. Microsoft is also focused on AI agents designed to augment and amplify the capabilities of organizations. For instance, marketing agents could propose and execute digital marketing campaign plans, while engineering agents could autonomously create specifications for new features and begin testing them.

At Build, Microsoft highlighted its commitment to "dogfooding" its own AI dev tools. This involved using Copilot within its complex .NET codebase, allowing developers to witness firsthand the agent's stumbles and successes. While this approach might appear risky, it demonstrates Microsoft's commitment to transparency and continuous improvement, differentiating it from other AI development tool vendors. Microsoft's goal is to solidify its position as the go-to platform for developers through GitHub and Azure, while simultaneously fostering an ecosystem where other startups can build upon this foundation.

One particularly intriguing experimental project unveiled at Build was Project Amelie. This AI agent is designed to build machine learning pipelines from a single prompt. Amelie ingests available data, trains models, and produces a deployable solution, essentially acting as a "mini data scientist in a box." In early testing, Microsoft claims Project Amelie has outperformed current benchmarks on MLE-Bench, a framework for evaluating machine learning agents. While Project Amelie is still in its early stages, it exemplifies Microsoft's vision for AI agents that can autonomously carry out complex AI-related tasks.

Recommended read:
References :
  • The Pragmatic Engineer: Microsoft is dogfooding AI dev tools’ future
  • The Rundown AI: Microsoft's top 5 AI releases from Build 2025
  • Ken Yeung: Microsoft Build featured a flurry of product announcements. But beyond the official launches, the company also offered a glimpse into its frontier projects—experimental inventions that may never be released to the public but serve to showcase the ideas Microsoft is exploring.

Ken Yeung@Ken Yeung //
Microsoft is exploring the frontier of AI-driven development with its experimental project, Project Amelie. Unveiled at Build 2025, Amelie is an AI agent designed to autonomously construct machine learning pipelines from a single prompt. This project showcases Microsoft's ambition to create AI that can develop AI, potentially revolutionizing how machine learning engineering tasks are performed. Powered by Microsoft Research's RD-Agent, Amelie aims to automate and optimize research and development processes in machine learning, eliminating the manual setup work typically handled by data scientists.

Early testing results are promising, with Microsoft reporting that Project Amelie has outperformed current benchmarks on MLE-Bench, a framework for evaluating machine learning agents' effectiveness in real-world tasks. During a live demo at Microsoft Build, Seth Juarez, Principal Program Manager for Microsoft's AI Platform, illustrated how Amelie could function as a "mini data scientist in a box," capable of processing and analyzing data that would typically take human scientists a day and a half to complete. This project has potential for applications in other scenarios where users want AI to carry out complex AI-related tasks.

Should Project Amelie become commercialized, it could significantly advance Microsoft's goals for human-agent collaboration. While Microsoft is not alone in this endeavor, with companies like Google's DeepMind and OpenAI also exploring similar technologies, the project highlights a shift towards AI agents handling complex AI-related tasks independently. Developers interested in exploring the capabilities of Project Amelie can sign up to participate in its private preview, offering a glimpse into the future of AI-driven machine learning pipeline development.

Recommended read:
References :
  • PCMag Middle East ai: Microsoft Adds Gen AI Features to Paint, Snipping Tool, and Notepad
  • Ken Yeung: Microsoft’s Project Amelie Is an Experiment in ‘AI Developing AI’

@gradientflow.com //
Apple is ramping up its efforts in the artificial intelligence space, focusing on efficiency, privacy, and seamless integration across its hardware and software. The tech giant is reportedly accelerating the development of its first AI-powered smart glasses, with a target release date of late 2026. These glasses, described as similar to Meta's Ray-Ban smart glasses but "better made," will feature built-in cameras, microphones, and speakers, enabling them to analyze the external world and respond to requests via Siri. This move positions Apple to compete directly with Meta, Google, and the emerging OpenAI/Jony Ive partnership in the burgeoning AI device market.

Apple also plans to open its on-device AI models to developers at WWDC 2025. This initiative aims to empower developers to create innovative AI-driven applications that leverage Apple's hardware capabilities while prioritizing user privacy. By providing developers with access to its AI models, Apple hopes to foster a vibrant ecosystem of AI-enhanced experiences across its product line. The company's strategy reflects a desire to integrate sophisticated intelligence deeply into its products without compromising its core values of user privacy and trust, distinguishing it from competitors who may have rapidly deployed high-profile AI models.

While Apple is pushing forward with its smart glasses, it has reportedly shelved plans for an Apple Watch with a built-in camera. This decision suggests a strategic shift in focus, with the company prioritizing the development of AI-powered wearables that align with its vision of seamless integration and user privacy. The abandonment of the camera-equipped watch may also reflect concerns about privacy implications or technical challenges associated with incorporating such features into a smaller wearable device. Ultimately, Apple's success in the AI arena will depend on its ability to deliver genuinely useful, seamlessly embedded AI experiences.

Recommended read:
References :
  • eWEEK: Indicates that Apple is speeding up development on its first pair of AI-powered smart glasses for a late 2026 release.
  • Gradient Flow: Discusses Apple’s AI focus on efficiency, privacy, and seamless integration.
  • gradientflow.com: Apple’s success has been built upon a meticulous fusion of hardware, software, and services, consistently shaping how people interact with technology while championing user privacy.

Ken Yeung@Ken Yeung //
Microsoft is significantly expanding its AI capabilities to the edge, empowering developers with tools to create innovative AI agents. This strategic move, unveiled at Build 2025, focuses on enabling smarter and faster experiences across various devices. Unlike previous strategies centered on single-use AI assistants, Microsoft is now emphasizing dynamic agents that seamlessly integrate with third-party systems through the Model Context Protocol (MCP). This shift aims to create broader, integrated ecosystems where agents can operate across diverse use cases and integrate with any digital infrastructure.

Microsoft is empowering developers by offering the OpenAI Responses API, which allows the combination of MCP servers, code interpreters, reasoning, web search, and RAG within a single API call. This capability enables the development of next-generation AI agents. Among the announcements at Build 2025 were a platform to build on-device agents, the ability to bring AI to web apps on the Edge browser, and developer capabilities to deploy bots directly on Windows. The company hopes the developments will lead to broader use of AI technologies and a significant increase in the number of daily active users.
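
As a rough illustration of what "a single API call" can look like, here is a minimal sketch using the OpenAI Python SDK's Responses API, combining web search with a remote MCP server. The model name, MCP server URL, and label are hypothetical placeholders, and the exact tool identifiers follow OpenAI's docs at the time of writing, so treat them as assumptions rather than a definitive recipe.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One Responses API call that mixes built-in web search with a remote
# MCP server. The server URL and label are hypothetical placeholders.
response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {"type": "web_search_preview"},
        {
            "type": "mcp",
            "server_label": "internal_docs",
            "server_url": "https://mcp.example.com/sse",
            "require_approval": "never",
        },
    ],
    input="Summarize our open incidents and cite any recent public coverage.",
)
print(response.output_text)
```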

Microsoft is already demonstrating the impact of its agentic AI platform, Azure AI Foundry, in healthcare. The company has introduced a new AI-powered orchestration system that streamlines the complex process of cancer care planning. This orchestration system, available through the Azure AI Foundry Agent Catalog, brings together specialized AI agents to assist clinicians with the analysis of multimodal medical data, from imaging and genomics to clinical notes and pathology. Early adopters include Stanford Health Care, Johns Hopkins, Providence Genomics, and UW Health.

Recommended read:
References :
  • Ken Yeung: IN THIS ISSUE: Microsoft pushes AI innovation to the edge. Will OpenAI crack the AI hardware market, a space where many have stumbled, after acquiring Sir Jony Ive’s AI startup for nearly $6.5 billion? Plus, catch up on this week’s key headlines you might have missed, including what was announced at Google I/O and the […]
  • AIwire: Microsoft has introduced a new AI-powered orchestration system designed to streamline the complex process of cancer care planning.

@the-decoder.com //
Google has launched Jules, a coding agent designed to automate tasks such as bug fixing, documentation, and testing. This new tool enters public beta and is available globally, giving developers the chance to have AI file pull requests on their behalf. Jules leverages Google's Gemini 2.5 Pro model and offers a starter tier with five free tasks per day, positioning it as a direct competitor to GitHub Copilot's coding agent and OpenAI's Codex.

Jules differentiates itself by spinning up a disposable Cloud VM, cloning the target repository, and creating a multi-step plan before making changes to any files. The agent can handle tasks like bumping dependencies, refactoring code, adding documentation, writing tests, and addressing open issues. Each change is presented as a standard GitHub pull request for human review. Google emphasizes that Jules "understands your codebase" due to the multimodal Gemini model, which allows it to reason over large file graphs and project history.

The release of Jules in beta signifies a broader shift from code-completion tools to full agentic development. Jules is available to anyone with a Google account and a linked GitHub account, and tasks can be assigned directly from an issue using the assign-to-jules label. This move reflects the increasing trend of AI-assisted programming and automated agents in software development, with both Google and Microsoft vying for dominance in this growing market.
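
For example, adding that label programmatically is a one-call operation against the GitHub REST API. The sketch below is a hypothetical illustration: the repo, issue number, and token handling are placeholders, while the label name comes from the article.

```python
import os
import requests

# Hand an issue to Jules by adding the "assign-to-jules" label.
# Owner, repo, and issue number are placeholders; a GitHub token
# with repo scope is assumed in the GITHUB_TOKEN environment variable.
owner, repo, issue = "your-org", "your-repo", 42
resp = requests.post(
    f"https://api.github.com/repos/{owner}/{repo}/issues/{issue}/labels",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={"labels": ["assign-to-jules"]},
)
resp.raise_for_status()
```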

@zdnet.com //
Microsoft is intensifying its efforts to enhance the security and trustworthiness of AI agents, announcing significant advancements at Build 2025. These moves are designed to empower businesses and individuals to create custom-made AI systems with improved safeguards. A key component of this initiative is the extension of Zero Trust principles to secure the agentic workforce, ensuring that AI agents operate within a secure and controlled environment.

Windows 11 is set to receive native Model Context Protocol (MCP) support, complete with new MCP Registry and MCP Server functionalities. This enhancement aims to streamline the development process for agentic AI experiences, making it easier for developers to build Windows applications with robust AI capabilities. The MCP, an open standard, facilitates seamless interaction between AI models and data residing outside specific applications, enabling apps to share contextual information that AI tools and agents can utilize effectively. Microsoft is introducing the MCP Registry as a secure and trustworthy source for AI agents to discover accessible MCP servers on Windows devices.
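
For orientation, MCP itself is a JSON-RPC 2.0 protocol: once a client has discovered a server (here, via the new MCP Registry), invoking one of its tools is a message shaped roughly like the sketch below. The tool name and arguments are hypothetical examples, not part of any shipping Windows server.

```python
import json

# Minimal shape of an MCP "tools/call" request (JSON-RPC 2.0).
# The tool name and arguments are hypothetical examples.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",
        "arguments": {"path": "C:\\Users\\me\\notes.txt"},
    },
}
print(json.dumps(request, indent=2))
```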

In related news, GitHub and Microsoft are collaborating with Anthropic to advance the MCP standard. This partnership will see both companies adding first-party support across Azure and Windows, assisting developers in exposing app features as MCP servers. Further improvements will focus on bolstering security and establishing a registry to list trusted MCP servers. Microsoft Entra Agent ID, an extension of industry-leading identity management and access capabilities, will also be introduced to provide enhanced security for AI agents. These strategic steps underscore Microsoft's commitment to securing the agentic workforce and facilitating the responsible development and deployment of AI technologies.

Recommended read:
References :
  • www.windowscentral.com: Microsoft takes big step towards agentic Windows AI experiences with native Model Context Protocol support
  • www.zdnet.com: Trusting AI agents to deal with your data is hard, and these features seek to make it easier.
  • www.eweek.com: Microsoft’s Big Bet on AI Agents: Model Context Protocol in Windows 11

Matthias Bastian@THE DECODER //
OpenAI has announced the integration of GPT-4.1 and GPT-4.1 mini models into ChatGPT, aimed at enhancing coding and web development capabilities. The GPT-4.1 model, designed as a specialized model excelling at coding tasks and instruction following, is now available to ChatGPT Plus, Pro, and Team users. According to OpenAI, GPT-4.1 is faster and a great alternative to OpenAI o3 & o4-mini for everyday coding needs, providing more help to developers creating applications.

OpenAI is also rolling out GPT-4.1 mini, which will be available to all ChatGPT users, including those on the free tier, replacing the previous GPT-4o mini model. This model serves as the fallback option once GPT-4o usage limits are reached. The release notes confirm that GPT-4.1 mini offers various improvements over GPT-4o mini, including instruction-following, coding, and overall intelligence. This initiative is part of OpenAI's effort to make advanced AI tools more accessible and useful for a broader audience, particularly those engaged in programming and web development.

Johannes Heidecke, Head of Systems at OpenAI, has emphasized that the new models build upon the safety measures established for GPT-4o, ensuring parity in safety performance. According to Heidecke, no new safety risks have been introduced, as GPT-4.1 doesn’t introduce new modalities or ways of interacting with the AI, and that it doesn’t surpass o3 in intelligence. The rollout marks another step in OpenAI's increasingly rapid model release cadence, significantly expanding access to specialized capabilities in web development and coding.

Recommended read:
References :
  • twitter.com: GPT-4.1 is a specialized model that excels at coding tasks & instruction following. Because it’s faster, it’s a great alternative to OpenAI o3 & o4-mini for everyday coding needs.
  • www.computerworld.com: OpenAI adds GPT-4.1 models to ChatGPT
  • gHacks Technology News: OpenAI releases GPT-4.1 and GPT-4.1 mini AI models for ChatGPT
  • Maginative: OpenAI Brings GPT-4.1 to ChatGPT
  • www.windowscentral.com: “Am I crazy or is GPT-4.1 the best model for coding?” ChatGPT gets new models with exemplary web development capabilities — but OpenAI is under fire for allegedly skimming through safety processes
  • the-decoder.com: OpenAI brings its new GPT-4.1 model to ChatGPT users
  • www.ghacks.net: OpenAI releases GPT-4.1 and GPT-4.1 mini AI models for ChatGPT
  • AI News | VentureBeat: OpenAI is rolling out GPT-4.1, its new non-reasoning large language model (LLM) that balances high performance with lower cost, to users of ChatGPT.
  • www.techradar.com: OpenAI just gave ChatGPT users a huge free upgrade – 4.1 mini is available today
  • www.marktechpost.com: OpenAI has introduced Codex, a cloud-native software engineering agent integrated into ChatGPT, signaling a new era in AI-assisted software development.

@Dataconomy //
Databricks has announced its acquisition of Neon, an open-source database startup specializing in serverless Postgres, in a deal reportedly valued at $1 billion. This strategic move is aimed at enhancing Databricks' AI infrastructure, specifically addressing the database bottleneck that often hampers the performance of AI agents. Neon's technology allows for the rapid creation and deployment of database instances, spinning up new databases in milliseconds, which is critical for the speed and scalability required by AI-driven applications. The integration of Neon's serverless Postgres architecture will enable Databricks to provide a more streamlined and efficient environment for building and running AI agents.

Databricks plans to incorporate Neon's scalable Postgres offering into its existing big data platform, eliminating the need to scale separate server and storage components in tandem when responding to AI workload spikes. This resolves a common issue in modern cloud architectures where users are forced to over-provision either compute or storage to meet the demands of the other. With Neon's serverless architecture, Databricks aims to provide instant provisioning, separation of compute and storage, and API-first management, enabling a more flexible and cost-effective solution for managing AI workloads. According to Databricks, Neon reports that 80% of its database instances are provisioned by software rather than humans.
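
To make "provisioned by software" concrete, here is a hedged sketch of creating a Postgres instance through Neon's public HTTP API. The endpoint and payload shape are assumptions based on Neon's docs at the time of writing; the project name and region are placeholders.

```python
import os
import requests

# Provision a new serverless Postgres project via Neon's API.
# Endpoint and payload shape are assumptions from Neon's public docs;
# an API key is assumed in the NEON_API_KEY environment variable.
resp = requests.post(
    "https://console.neon.tech/api/v2/projects",
    headers={"Authorization": f"Bearer {os.environ['NEON_API_KEY']}"},
    json={"project": {"name": "agent-scratch-db", "region_id": "aws-us-east-2"}},
)
resp.raise_for_status()
project = resp.json()["project"]
print(project["id"])  # instance spins up in milliseconds, per Neon's claims
```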

The acquisition of Neon is expected to give Databricks a competitive edge, particularly against competitors like Snowflake. While Snowflake currently lacks similar AI-driven database provisioning capabilities, Databricks' integration of Neon's technology positions it as a leader in the next generation of AI application building. The combination of Databricks' existing data intelligence platform with Neon's serverless Postgres database will allow for the programmatic provisioning of databases in response to the needs of AI agents, overcoming the limitations of traditional, manually provisioned databases.

Recommended read:
References :
  • Databricks: Today, we are excited to announce that we have agreed to acquire Neon, a developer-first, serverless Postgres company.
  • www.infoworld.com: Databricks to acquire open-source database startup Neon to build the next wave of AI agents
  • www.bigdatawire.com: Databricks Nabs Neon to Solve AI Database Bottleneck
  • Dataconomy: Databricks has agreed to acquire Neon, an open-source database startup, for approximately $1 billion.
  • BigDATAwire: Databricks today announced its intent to buy Neon, a database startup founded by Nikita Shamgunov that develops a serverless and infinitely scalable version of the open source Postgres database.
  • Techzine Global: Neon’s technology can spin up a Postgres instance in less than 500 milliseconds, which is crucial for AI agents’ fast working methods.
  • AI News | VentureBeat: The $1 Billion database bet: What Databricks’ Neon acquisition means for your AI strategy
  • analyticsindiamag.com: Databricks to Acquire Database Startup Neon for $1 Billion

@Google DeepMind Blog //
Google DeepMind has introduced AlphaEvolve, a revolutionary AI coding agent designed to autonomously discover innovative algorithms and scientific solutions. This groundbreaking research, detailed in the paper "AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery," represents a significant step towards achieving Artificial General Intelligence (AGI) and potentially even Artificial Superintelligence (ASI). AlphaEvolve distinguishes itself through its evolutionary approach, where it autonomously generates, evaluates, and refines code across generations, rather than relying on static fine-tuning or human-labeled datasets. Under the hood, AlphaEvolve combines Google’s Gemini Flash and Gemini Pro models with automated evaluation metrics.

AlphaEvolve operates using an evolutionary pipeline powered by large language models (LLMs). This pipeline doesn't just generate outputs—it mutates, evaluates, selects, and improves code across generations. The system begins with an initial program and iteratively refines it by introducing carefully structured changes. These changes take the form of LLM-generated diffs—code modifications suggested by a language model based on prior examples and explicit instructions. A diff in software engineering refers to the difference between two versions of a file, typically highlighting lines to be removed or replaced.
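
That pipeline can be caricatured in a few lines of code. The sketch below is a toy reconstruction of the mutate-evaluate-select loop, not DeepMind's implementation; `llm_propose_diff`, `apply_diff`, and `evaluate` are hypothetical stand-ins for the LLM diff generator, the patch applier, and the automated scoring metric.

```python
import random

def evolve(seed_program: str, generations: int = 100, population: int = 20) -> str:
    """Toy evolutionary loop in the spirit of AlphaEvolve (not DeepMind's code).

    llm_propose_diff, apply_diff, and evaluate are hypothetical stand-ins:
    the LLM suggests a code diff given strong prior examples, the diff is
    applied, and an automated metric scores the resulting program.
    """
    pool = [(seed_program, evaluate(seed_program))]
    for _ in range(generations):
        # Tournament selection: pick a parent biased toward higher scores.
        parent, _ = max(random.sample(pool, min(3, len(pool))), key=lambda p: p[1])
        diff = llm_propose_diff(parent, examples=pool[:5])  # LLM-generated diff
        child = apply_diff(parent, diff)
        pool.append((child, evaluate(child)))               # automated scoring
        # Keep only the fittest programs for the next generation.
        pool = sorted(pool, key=lambda p: p[1], reverse=True)[:population]
    return pool[0][0]  # best program found
```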

Google's AlphaEvolve is not merely another code generator, but a system that generates and evolves code, allowing it to discover new algorithms. This innovation has already demonstrated its potential by shattering a 56-year-old record in matrix multiplication, a core component of many machine learning workloads. Additionally, AlphaEvolve has reclaimed 0.7% of compute capacity across Google's global data centers, showcasing its efficiency and cost-effectiveness. AlphaEvolve can be thought of as a genetic algorithm coupled to a large language model.

Recommended read:
References :
  • LearnAI: Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer
  • The Next Web: Article on The Next Web describing feats of DeepMind’s AI coding agent AlphaEvolve.
  • Towards Data Science: A blend of LLMs' creative generation capabilities with genetic algorithms
  • www.unite.ai: Google DeepMind has unveiled AlphaEvolve, an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions. Presented in the paper titled “AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery,” this research represents a foundational step toward Artificial General Intelligence (AGI) and even Artificial Superintelligence (ASI).
  • learn.aisingapore.org: AlphaEvolve imagined as a genetic algorithm coupled to a large language model. Models have undeniably revolutionized how many of us approach coding, but they’re often more like a super-powered intern than a seasoned architect.
  • AI News | VentureBeat: Google's AlphaEvolve is the epitome of a best-practice AI agent orchestration. It offers a lesson in production-grade agent engineering. Discover its architecture & essential takeaways for your enterprise AI strategy.
  • Unite.AI: Google DeepMind has unveiled AlphaEvolve, an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions.
  • Last Week in AI: DeepMind introduced Alpha Evolve, a new coding agent designed for scientific and algorithmic discovery, showing improvements in automated code generation and efficiency.
  • venturebeat.com: VentureBeat article about Google DeepMind's AlphaEvolve system.

@the-decoder.com //
Google has announced implicit caching in Gemini 2.5, a new feature that can cut developer costs by as much as 75 percent by automatically applying the cached-token discount. Previously, developers had to configure caching manually. Implicit caching automatically detects and stores recurring content, ensuring that repeated prompts are only processed once, which can lead to substantial cost savings.

The new feature is particularly beneficial for applications that run prompts against the same long context or continue existing conversations. Google recommends placing the stable part of a prompt, such as system instructions, at the start and adding user-specific input, like questions, afterwards to maximize benefits. Implicit caching kicks in for Gemini 2.5 Flash starting at 1,024 tokens, and for Pro versions from 2,048 tokens onwards. This functionality is now live, and developers can find more details and best practices in the Gemini API documentation.
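
In practice, that ordering looks like the sketch below, using the google-genai Python SDK. The model name and document are placeholders, and cache hits require the shared prefix to exceed the token minimums above; treat this as a minimal illustration rather than Google's reference code.

```python
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

LONG_CONTEXT = open("design_doc.md").read()  # stable prefix: cache-friendly

def ask(question: str) -> str:
    # Stable content first, user-specific input last, so repeated calls
    # share a common prefix that implicit caching can detect and reuse.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[LONG_CONTEXT, question],
    )
    return response.text

print(ask("Summarize the open questions."))
print(ask("Which sections mention latency?"))  # same prefix, discounted tokens
```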

This development builds on the overwhelmingly positive feedback to Gemini 2.5 Pro’s coding and multimodal reasoning capabilities. Beyond UI-focused development, these improvements extend to other coding tasks such as code transformation, code editing and developing complex agentic workflows. Simon Willison notes that Gemini 2.5 now applies the 75% cached token discount automatically, which he considers a potentially big cost saving for applications that run prompts against the same long context or continue existing conversations.

Recommended read:
References :
  • bsky.app: Gemini 2.5 now applies the 75% cached token discount automatically - previously you had to manually configure it Potentially big cost savings here for applications that run prompts against the same long context, or continue existing conversations
  • the-decoder.com: Google introduces implicit caching in Gemini 2.5, aiming to cut developer costs by as much as 75 percent.
  • Simon Willison: Gemini 2.5 now applies the 75% cached token discount automatically - previously you had to manually configure it Potentially big cost savings here for applications that run prompts against the same long context, or continue existing conversations
  • simonwillison.net: This article discusses the new implicit caching feature in Gemini 2.5 Pro, which automatically caches previous results to reduce costs by up to 75%.
  • thetechbasic.com: This article talks about Google's new tool called implicit caching that helps developers save money on repeated prompts.

@www.marktechpost.com //
OpenAI has announced the release of Reinforcement Fine-Tuning (RFT) for its o4-mini reasoning model, alongside supervised fine-tuning (SFT) for the GPT-4.1 nano model. RFT enables developers to customize a private version of the o4-mini model based on their enterprise's unique products, internal terminology, and goals. This allows for a more tailored AI experience, where the model can generate communications, answer specific questions about company knowledge, and pull up private, proprietary company knowledge with greater accuracy. RFT represents a move beyond traditional supervised fine-tuning, offering more flexible control for complex, domain-specific tasks.

The process involves applying a feedback loop during training, where developers can initiate training sessions, upload datasets, and set up assessment logic through OpenAI’s online developer platform. Instead of relying on fixed question-answer pairs, RFT uses a grader model to score multiple candidate responses per prompt, adjusting the model weights to favor high-scoring outputs. This approach allows for fine-tuning to subtle requirements, such as a specific communication style, policy guidelines, or domain-specific expertise. Organizations with clearly defined problems and verifiable answers can benefit significantly from RFT, aligning models with nuanced objectives.
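
Conceptually, one training step looks something like the toy sketch below. This is a caricature of RFT, not OpenAI's training code; `sample_responses`, `grader`, and `update_policy` are hypothetical stand-ins for the sampler, the developer-supplied grader model, and the weight update.

```python
def rft_step(model, prompt, grader, k: int = 8):
    """One toy reinforcement fine-tuning step (not OpenAI's implementation).

    Instead of a fixed reference answer, a grader scores k sampled
    candidates and the policy is nudged toward the higher-scoring ones.
    sample_responses, grader, and update_policy are hypothetical stand-ins.
    """
    candidates = sample_responses(model, prompt, n=k)  # k candidate answers
    scores = [grader(prompt, c) for c in candidates]   # custom reward signal
    baseline = sum(scores) / len(scores)
    # Advantage-weighted update: favor candidates that beat the average.
    for candidate, score in zip(candidates, scores):
        update_policy(model, prompt, candidate, advantage=score - baseline)
```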

Several organizations have already leveraged RFT in closed previews, demonstrating its versatility across industries. Accordance AI improved the performance of a tax analysis model, while Ambience Healthcare increased the accuracy of medical coding. Other use cases include legal document analysis by Harvey, Stripe API code generation by Runloop, and content moderation by SafetyKit. OpenAI also announced that supervised fine-tuning is now supported for its GPT-4.1 nano model, the company’s most affordable and fastest offering to date, opening customization to all paid API tiers. The cost model for RFT is more transparent, based on active training time rather than per-token processing.

Recommended read:
References :
  • AI News | VentureBeat: You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning
  • Maginative: OpenAI Brings Reinforcement Fine-Tuning and GPT-4.1 Nano Fine-Tuning in the API
  • www.marktechpost.com: OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
  • Techzine Global: OpenAI opens the door to reinforcement fine-tuning for o4-mini

@the-decoder.com //
OpenAI is making strides in AI customization and application development with the release of Reinforcement Fine-Tuning (RFT) on its o4-mini reasoning model and the appointment of Fidji Simo as the CEO of Applications. The RFT release allows organizations to tailor their versions of the o4-mini model to specific tasks using custom objectives and reward functions, marking a significant advancement in model optimization. This approach utilizes reinforcement learning principles, where developers provide a task-specific grader that evaluates and scores model outputs based on custom criteria, enabling the model to optimize against a reward signal and align with desired behaviors.

Reinforcement Fine-Tuning is particularly valuable for complex or subjective tasks where ground truth is difficult to define. By using RFT on o4-mini, a compact reasoning model optimized for text and image inputs, developers can fine-tune for high-stakes, domain-specific reasoning tasks while maintaining computational efficiency. Early adopters have demonstrated the practical potential of RFT. This capability allows developers to tweak the model to better fit their needs using OpenAI's platform dashboard, deploy it through OpenAI's API, and connect it to internal systems.

In a move to scale its AI products, OpenAI has appointed Fidji Simo, formerly CEO of Instacart, as the CEO of Applications. Simo will oversee the scaling of AI products, leveraging her extensive experience in consumer tech to drive revenue generation from OpenAI's research and development efforts. Previously serving on OpenAI's board of directors, Simo's background in leading development at Facebook suggests a focus on end-users rather than businesses, potentially paving the way for new subscription services and products aimed at a broader audience. OpenAI is also rolling out a new GitHub connector for ChatGPT's deep research agent, allowing users with Plus, Pro, or Team subscriptions to connect their repositories and ask questions about their code.

Recommended read:
References :
  • AI News | VentureBeat: You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning
  • www.computerworld.com: OpenAI was founded a decade ago with a focus on research, but it has since expanded into products and infrastructure. Now it is looking to again broaden its presence into user-facing apps. The company announced this week that Fidji Simo will join as CEO of applications, a newly-created position. Simo is the current CEO and chair at grocery delivery company Instacart. She will begin her new role at OpenAI later this year, reporting directly to Sam Altman, who will remain overall CEO and oversee research, compute, and applications.
  • the-decoder.com: OpenAI has appointed Fidji Simo as CEO of its new Applications division, reporting directly to OpenAI CEO Sam Altman.
  • www.marktechpost.com: OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
  • the-decoder.com: OpenAI is expanding its fine-tuning program for o4-mini, introducing Reinforcement Fine-Tuning (RFT) for organizations. The method is designed to help tailor models like o4-mini to highly specific tasks with the help of a programmable grading system.
  • Maginative: OpenAI brings reinforcement fine-tuning and GPT-4.1 Nano Fine-Tuning in the API
  • MarkTechPost: OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
  • Techzine Global: OpenAI opens the door to reinforcement fine-tuning for o4-mini
  • THE DECODER: OpenAI is expanding its fine-tuning program for o4-mini, introducing Reinforcement Fine-Tuning (RFT) for organizations. The method is designed to help tailor models like o4-mini to highly specific tasks with the help of a programmable grading system.
  • AI News | VentureBeat: Last night, OpenAI published a blog post on its official website authored by CEO and co-founder Sam Altman announcing a major new hire: Fidji Simo, currently CEO and Chair at grocery delivery company Instacart, will join OpenAI as CEO of Applications, a newly created executive position. Simo will …
  • techxplore.com: OpenAI offers to help countries build AI systems
  • The Register - Software: OpenAI drafts Instacart boss as CEO of Apps to lure in the normies

@analyticsindiamag.com //
OpenAI has unveiled a new GitHub connector for its ChatGPT Deep Research tool, empowering developers to analyze their codebases directly within the AI assistant. This integration allows seamless connection of both private and public GitHub repositories, enabling comprehensive analysis to generate reports, documentation, and valuable insights based on the code. The Deep Research agent can now sift through source code and engineering documentation, respecting existing GitHub permissions by only accessing authorized repositories, streamlining the process of understanding and maintaining complex projects.

This new functionality aims to simplify code analysis and documentation processes, making it easier for developers to understand and maintain complex projects. Developers can leverage the connector to implement new APIs by finding real examples in their codebase, break down product specifications into manageable technical tasks with dependencies mapped out, or generate summaries of code structure and patterns for onboarding new team members or creating technical documentation. OpenAI Product Leader Nate Gonzalez stated that users found ChatGPT's deep research agent so valuable that they wanted it to connect to their internal sources, in addition to the web.

The GitHub connector is currently rolling out to ChatGPT Plus, Pro, and Team users. Enterprise and Education customers will gain access soon. OpenAI emphasizes that the connector respects existing permissions structures and honors GitHub permission settings. This launch follows the recent integration of ChatGPT Team with tools like Google Drive, furthering OpenAI's goal of seamlessly integrating ChatGPT into internal workflows by pulling relevant context from various platforms where knowledge typically resides within organizations. OpenAI also plans to add more deep research connectors in the future.

Recommended read:
References :
  • Analytics India Magazine: Based on the queries, the deep research agent will retrieve pertinent information from a GitHub repository to compile reports.
  • the-decoder.com: OpenAI is rolling out a new GitHub connector for ChatGPT's deep research agent.
  • Maginative: OpenAI launches GitHub connector for ChatGPT Deep Research that lets developers analyze their actual codebases to generate comprehensive reports and documentation.
  • analyticsindiamag.com: OpenAI’s GitHub Integration Brings ‘Deep Research for Your Code Base’

@docs.anthropic.com //
Anthropic, the generative AI startup, has officially entered the internet search arena with the launch of its new web search API for Claude. This positions Claude as a direct challenger to traditional search engines like Google, offering users real-time access to information through its large language models. This API enables developers to integrate Claude’s search capabilities directly into their own applications, expanding the reach of AI-powered information retrieval.

The Claude web search API provides access to current web information, allowing the AI assistant to conduct multiple, iterative searches to deliver more complete and accurate answers. Claude uses its "reasoning" capabilities to determine if a user’s query would benefit from a real-time search, generating search queries and analyzing the results to inform its responses. The responses it delivers will come with citations that link to the source articles it uses, offering users transparency and enabling them to verify the information for themselves.
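
A minimal call looks roughly like the sketch below, using Anthropic's Python SDK. The tool type string and model alias follow Anthropic's documentation at the time of writing, so treat the exact identifiers as assumptions that may change.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Server-side web search tool: Claude decides whether and what to search,
# and answers come back with citations to the source pages it used.
response = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=1024,
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 5}],
    messages=[{"role": "user", "content": "What changed in this week's AI model releases?"}],
)
print(response.content)
```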

This move comes amid signs of a potential shift in the search landscape, with growing user engagement with AI-driven alternatives. Apple is reportedly exploring AI search engines like ChatGPT, Perplexity, and Anthropic's Claude as options in Safari, signaling a shift away from Google’s $20 billion deal to be the default search engine. The decline in traditional search volume is attributed to the conversational and context-aware nature of AI platforms. Together, these shifts point to a growing trend towards conversational AI in information retrieval, one that may reshape how people access and use the internet.

Recommended read:
References :
  • SiliconANGLE: The generative artificial intelligence startup Anthropic PBC is joining rivals such as OpenAI and Perplexity AI Inc. in an effort to overhaul the internet search industry.
  • siliconangle.com: The generative artificial intelligence startup Anthropic PBC is joining rivals such as OpenAI and Perplexity AI Inc. in an effort to overhaul the internet search industry. Today it announced the launch of a new application programming interface that enables its flagship Claude large language models to search the internet for up-to-date information in real time.
  • venturebeat.com: Anthropic launches Claude web search API, betting on the future of post-Google information access
  • the-decoder.com: Anthropic launches a web search feature for its Claude API, letting developers combine Claude models with up-to-date web data without building their own search infrastructure.
  • Simon Willison's Weblog: Introducing web search on the Anthropic API
  • THE DECODER: Anthropic adds web search to Claude API for real-time data and research

Ellie Ramirez-Camara@Data Phoenix //
Microsoft is expanding its AI capabilities with enhancements to its Phi-4 family and the integration of the Agent2Agent (A2A) protocol. The company's new Phi-4-Reasoning and Phi-4-Reasoning-Plus models are designed to deliver strong reasoning performance with low latency. In addition, Microsoft is embracing interoperability by adding support for the open A2A protocol to Azure AI Foundry and Copilot Studio. This move aims to facilitate seamless collaboration between AI agents across various platforms, fostering a more connected and efficient AI ecosystem.

Microsoft's integration of the A2A protocol into Azure AI Foundry and Copilot Studio will empower AI agents to work together across platforms. The A2A protocol defines how agents formulate tasks and execute them, enabling them to delegate tasks, share data, and act together. With A2A support, Copilot Studio agents can call on external agents, including those outside the Microsoft ecosystem and built with tools like LangChain or Semantic Kernel. Microsoft reports that over 230,000 organizations are already utilizing Copilot Studio, with 90 percent of the Fortune 500 among them. Developers can now access sample applications demonstrating automated meeting scheduling between agents.
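
For a feel of the wire format, an A2A client delegates work to another agent with a JSON-RPC message along the lines of the sketch below. This follows the public A2A draft specification, so field names may evolve; the task id and message text are placeholder examples.

```python
import json

# Rough shape of an A2A "tasks/send" request (per the public draft spec;
# field names may evolve). Task id and text are placeholder examples.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": "task-42",
        "message": {
            "role": "user",
            "parts": [
                {"type": "text", "text": "Find a 30-minute slot we can all meet on Friday."}
            ],
        },
    },
}
print(json.dumps(request, indent=2))
```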

Independent developer Simon Willison has been testing the phi4-reasoning model and reported that the 11GB download (available via Ollama) may well overthink things: it produced 56 sentences of reasoning output in response to a simple prompt of "hi". Microsoft is actively contributing to the A2A specification work on GitHub and intends to play a role in driving its future development. A public preview of A2A in Azure Foundry and Copilot Studio is anticipated to launch soon. Microsoft envisions protocols like A2A as the bedrock of a novel software architecture where interconnected agents automate daily workflows and collaborate across platforms with auditability and control.

Recommended read:
References :
  • bsky.app: Microsoft's phi4-reasoning model, an 11GB download (via Ollama) which may well overthink things
  • Simon Willison: Simon Willison Published some notes on Microsoft's phi4-reasoning model
  • the-decoder.com: Microsoft leverages Google's open A2A protocol for interoperable AI agents
  • the-decoder.com: Microsoft's Phi 4 responds to a simple "Hi" with 56 thoughts
  • Data Phoenix: Microsoft has introduced three new small language models—Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning—that reportedly deliver complex reasoning capabilities comparable to much larger models while maintaining efficiency for deployment across various computing environments.
  • www.microsoft.com: In this issue: New research on compound AI systems and causal verification of the Confidential Consortium Framework; release of Phi-4-reasoning; enriching tabular data with semantic structure, and more.

Matthias Bastian@THE DECODER //
Google has launched an enhanced version of its Gemini 2.5 Pro AI model, specifically tailored for improved coding performance. The Gemini 2.5 Pro Preview, also known as the I/O Edition, is now available to developers ahead of the Google I/O 2025 developer conference. This early release aims to provide developers with advanced tools for building more sophisticated and interactive web applications, responding to what Google describes as "overwhelming enthusiasm" for the model's potential. The updated model demonstrates leadership in coding, solidifying Google’s commitment to advancing AI-driven development tools.

This latest pre-release version of Gemini 2.5 Pro brings major improvements for front-end development and complex programming tasks. The model excels in building full, interactive web apps or simulations from a single prompt and has achieved the top rank on the WebDev Arena leaderboard, which evaluates a model’s ability to develop visually pleasing and functional web applications. According to Google, the model update was accelerated due to positive feedback from users. The Gemini 2.5 Pro Preview also delivers state-of-the-art video understanding, scoring 84.8% on the VideoMME benchmark, enabling new flows such as creating interactive learning apps based on YouTube videos.

The updated Gemini 2.5 Pro, available through the Gemini API in Google AI Studio and Vertex AI, improves efficiency in feature development by automating tasks such as matching style properties and writing CSS code. It also enhances collaboration with companies like Cognition and Replit, pushing the frontiers of agentic programming. Furthermore, the updated model addresses key developer feedback around function calling, with improvements in error reduction and trigger reliability. With its strong coding capabilities and advanced reasoning, the Gemini 2.5 Pro continues to position itself as a leading AI tool for developers.

Recommended read:
References :
  • Analytics India Magazine: Is this Google’s response towards OpenAI’s reaching an agreement to buy Windsurf?
  • Google DeepMind Blog: We’ve seen developers doing amazing things with Gemini 2.5 Pro, so we decided to release an updated version a couple of weeks early to get into developers hands sooner.
  • Developer Tech News: Google has pushed early access to a souped-up Gemini 2.5 Pro Preview ahead of their I/O 2025 developer conference in a couple of weeks. Why the hurry? Well, Google put it down to “overwhelming enthusiasm” and the “amazing things” developers were already cooking up with the previous version of Gemini 2.5 Pro.
  • LearnAI: Learn AI Singapore on Google's Gemini 2.5 Pro Preview model.
  • learn.aisingapore.org: We’ve seen developers doing amazing things with Gemini 2.5 Pro, so we decided to release an updated version a couple of weeks early to get into developers hands sooner.
  • THE DECODER: The latest pre-release version of Google's Gemini 2.5 Pro language model brings major improvements for front-end development and complex programming tasks.
  • AI News | VentureBeat: One of the standout features of the update is its ability to build full, interactive web apps or simulations from a single prompt.
  • www.zdnet.com: The model update was scheduled to debut at Google I/O, but Google released it early in response to positive feedback from users.
  • the-decoder.com: The latest pre-release version of Google's Gemini 2.5 Pro language model brings major improvements for front-end development and complex programming tasks.
  • The Official Google Blog: Build rich, interactive web apps with an updated Gemini 2.5 Pro
  • TestingCatalog: Google debuts Gemini 2.5 Pro I/O Edition with major upgrades for web development
  • AI & Machine Learning: Guide to build MCP servers using vibe coding with Gemini 2.5 Pro
  • LearnAI: Today we’re releasing early access to Gemini 2.5 Pro Preview (I/O edition), an updated version of 2.5 Pro that has significantly improved capabilities for coding, especially building compelling interactive web apps.
  • MarkTechPost: Google Launches Gemini 2.5 Pro I/O: Outperforms GPT-4 in Coding, Supports Native Video Understanding and Leads WebDev Arena
  • learn.aisingapore.org: Coding, web apps with Gemini
  • Google DeepMind Blog: Build rich, interactive web apps with an updated Gemini 2.5 Pro
  • www.developer-tech.com: Google improves Gemini 2.5 Pro ahead of I/O 2025
  • the-decoder.com: Google's new caching feature for Gemini 2.5 aims to reduce costs by up to 75 percent
  • THE DECODER: Google introduces implicit caching in Gemini 2.5, aiming to cut developer costs by as much as 75 percent.

@venturebeat.com //
Nvidia has launched Parakeet-TDT-0.6B-V2, a fully open-source transcription AI model, on Hugging Face. This represents a new standard for Automatic Speech Recognition (ASR). The model, boasting 600 million parameters, has quickly topped the Hugging Face Open ASR Leaderboard with a word error rate of just 6.05%. This level of accuracy positions it near proprietary transcription models, such as OpenAI’s GPT-4o-transcribe and ElevenLabs Scribe, making it a significant advancement in open-source speech AI. Parakeet operates under a commercially permissive CC-BY-4.0 license.

The speed of Parakeet-TDT-0.6B-V2 is a standout feature. According to Hugging Face’s Vaibhav Srivastav, it can "transcribe 60 minutes of audio in 1 second." Nvidia reports this is achieved with a real-time factor of 3386, meaning it processes audio 3386 times faster than real-time when running on Nvidia's GPU-accelerated hardware. This speed is attributed to its transformer-based architecture, fine-tuned with high-quality transcription data and optimized for inference on NVIDIA hardware using TensorRT and FP8 quantization. The model also supports punctuation, capitalization, and detailed word-level timestamping.
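
Running the model locally takes only a few lines with NVIDIA's NeMo toolkit, following the usage documented on the Hugging Face model card. The audio file below is a placeholder, and a CUDA-capable GPU is assumed for anything close to the quoted real-time factor.

```python
# pip install -U "nemo_toolkit[asr]"
import nemo.collections.asr as nemo_asr

# Download the checkpoint from Hugging Face and load it
# (a CUDA-capable GPU is assumed for the quoted speeds).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt-0.6b-v2"
)

# Transcribe a local file; timestamps=True also returns word-level timing.
output = asr_model.transcribe(["meeting.wav"], timestamps=True)
print(output[0].text)
```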

Parakeet-TDT-0.6B-V2 is aimed at developers, researchers, and industry teams building various applications. This includes transcription services, voice assistants, subtitle generators, and conversational AI platforms. Its accessibility and performance make it an attractive option for commercial enterprises and indie developers looking to build speech recognition and transcription services into their applications. With its release on May 1, 2025, Parakeet is set to make a considerable impact on the field of speech AI.

Recommended read:
References :
  • Techmeme: Nvidia launches open-source transcription model Parakeet-TDT-0.6B-V2, topping the Hugging Face Open ASR Leaderboard with a word error rate of 6.05% (Carl Franzen/VentureBeat)
  • @techmeme.com - Techmeme: Nvidia launches open-source transcription model Parakeet-TDT-0.6B-V2, topping the Hugging Face Open ASR Leaderboard with a word error rate of 6.05% (Carl Franzen/VentureBeat)
  • venturebeat.com: An attractive proposition for commercial enterprises and indie developers looking to build speech recognition and transcription services...
  • www.marktechpost.com: NVIDIA Open Sources Parakeet TDT 0.6B: Achieving a New Standard for Automatic Speech Recognition ASR and Transcribes an Hour of Audio in One Second
  • AI News | VentureBeat: Reports Nvidia launches fully open source transcription AI model Parakeet-TDT-0.6B-V2 on Hugging Face
  • MarkTechPost: Reports NVIDIA Open Sources Parakeet TDT 0.6B: Achieving a New Standard for Automatic Speech Recognition ASR and Transcribes an Hour of Audio in One Second
  • www.eweek.com: NVIDIA’s AI Transcription Tool Produces 60 Minutes of Text in 1 Second
  • eWEEK: NVIDIA has released a new version of its Parakeet transcription tool, boasting the lowest error rate of any of its competitors. In addition, the company made the code public on GitHub. Parakeet TDT 0.6B is a 600-million-parameter automatic speech recognition model. It can transcribe 60 minutes of audio per second, Hugging Face data scientist Vaibhav […]

@the-decoder.com //
Anysphere, the company behind the AI code editor Cursor, has reportedly secured a massive $900 million in a new funding round. The financing was spearheaded by Thrive Capital, with significant participation from Andreessen Horowitz (a16z) and Accel. This latest investment values Anysphere at an impressive $9 billion, a substantial leap from its previous valuation of $2.5 billion in January of this year.

Demand to invest in Anysphere and other AI coding startups has been incredibly high. The surge in valuation likely reflects the company's rapid sales growth and the increasing prominence of AI-powered coding tools. The company's annual recurring revenue reportedly topped $200 million last month, indicating the growing adoption of and reliance on its Cursor code editor by developers. According to its website, Cursor produces nearly a billion working lines of code each day.

Cursor, Anysphere's flagship product, features a split-screen interface that combines a traditional code editor with an AI chatbot. This allows developers to use natural language prompts to instruct the chatbot to make code changes, streamlining the coding process. The AI is capable of generating multiple lines of code at once and can search for additional information on the web and in a software project’s documentation when given a challenging task. Under the hood, Cursor is powered by language models from OpenAI, Google LLC and other providers, with the addition of an internally developed model dubbed Cursor-Fast. Anysphere clients include Stripe, Spotify, and OpenAI.

Recommended read:
References :
  • www.techmeme.com: Sources: Anysphere, maker of AI coding tool Cursor, raised $900M at a $9B valuation, up from $2.5B in January, led by Thrive, with a16z and Accel participating
  • @techmeme.com - Techmeme: Sources: Anysphere, maker of AI coding tool Cursor, raised $900M at a $9B valuation, up from $2.5B in January, led by Thrive, with a16z and Accel participating
  • siliconangle.com: AI code editor startup Anysphere reportedly closes $900M funding round
  • the-decoder.com: Cursor developer Anysphere closes mega financing round
  • TechCrunch: Anysphere, which makes Cursor, has reportedly raised $900M at $9B valuation
  • SiliconANGLE: AI code editor startup Anysphere reportedly closes $900M funding round
  • www.ft.com: Anysphere closes $900mn funding round from investors including Thrive Capital and Andreessen Horowitz
  • Techmeme: Sources: Anysphere, maker of AI coding tool Cursor, raised $900M at a $9B valuation, up from $2.5B in January, led by Thrive, with a16z and Accel participating

Mels Dees@Techzine Global //
Microsoft is reportedly preparing to host Elon Musk's Grok AI model within its Azure AI Foundry platform, signaling a potential shift in its AI strategy. The move, stemming from discussions with xAI, Musk's AI company, could make Grok accessible to a broad user base and integrate it into Microsoft's product teams via the Azure cloud service. Azure AI Foundry serves as a generative AI development hub, providing developers with the necessary tools and models to host, run, and manage AI-driven applications, potentially positioning Microsoft as a neutral platform supporting multiple AI models. This follows reports indicating Microsoft is exploring third-party AI models like DeepSeek and Meta for its Copilot service.

Microsoft's potential hosting of Grok comes amid reports that its partnership with OpenAI may be evolving. While Microsoft remains quiet about any deal with xAI, sources indicate that Grok will be available on Azure AI Foundry, providing developers with access to the model. However, Microsoft reportedly intends only to host the Grok model and will not be involved in training future xAI models. This collaboration with xAI could strengthen Microsoft's position as an infrastructure provider for AI models, offering users more freedom of choice in selecting which AI models they want to use within their applications.

Alongside these developments, Microsoft is enhancing its educational offerings with Microsoft 365 Copilot Chat agents. These specialized AI assistants can personalize student support and provide instructor assistance. Copilot Chat agents can be tailored to offer expertise in instructional design, cater to unique student preferences, and analyze institutional data. These agents are designed to empower educators and students alike, transforming education experiences through customized support and efficient access to resources.

Recommended read:
References :
  • www.microsoft.com: Discover how Microsoft 365 Copilot Chat agents in education can enhance learning with personalized student support, instructor assistance, and more.
  • Techzine Global: Microsoft is preparing to host the Grok AI model from xAI, Elon Musk’s AI company, within its Azure AI Foundry platform.
  • www.windowscentral.com: Microsoft is reportedly planning to host Elon Musk's Grok AI model. However, it won't host xAI's servers to train any of its future AI models.
  • thetechbasic.com: Microsoft is getting ready to add Elon Musk’s Grok AI model to its Azure cloud service. This move could help developers build new apps using Grok’s technology.

Isaac Sacolick@drive.starcio.com //
Microsoft is significantly expanding its AI infrastructure and coding capabilities. CEO Satya Nadella recently revealed that Artificial Intelligence now writes between 20% and 30% of the code powering Microsoft's software. In some projects, AI may even write the entirety of the code. This adoption of AI in coding highlights its transformative impact on software development, streamlining repetitive and data-heavy tasks to boost corporate efficiency.

The increasing reliance on AI for code generation is not without concerns, particularly for new programmers. While AI excels at predictable tasks, senior-developer oversight remains crucial to ensure the stability and accuracy of the code. Microsoft reports better results with AI-generated Python than with C++, attributed partly to Python's simpler syntax and automatic memory management.

Beyond coding, Microsoft is expanding its digital commitments and infrastructure in Europe. Elsewhere, Appian is transforming low-code app development through AI agents that make app creation easier and more scalable, fostering collaboration and innovation in the development process. Microsoft has also released its 2025 Work Trend Index, highlighting the emergence of the "Frontier Firm" in Singapore, where businesses are embracing AI agents to extend workforce capabilities and close capacity gaps.

Recommended read:
References :
  • drive.starcio.com: How Appian is Inspiring with AI Agents and Transforming Low-Code App Development
  • www.tomshardware.com: Microsoft's CEO reveals that AI writes up to 30% of its code — some projects may have all of its code written by AI
  • news.microsoft.com: Microsoft Releases 2025 Work Trend Index: The Frontier Firm Emerges in Singapore

@developer.nvidia.com //
NVIDIA is significantly advancing the capabilities of AI development with the introduction of new tools and technologies. The company's latest innovations focus on enhancing the performance of AI agents, improving integration with various software and hardware platforms, and streamlining the development process for enterprises. These advancements include NVIDIA NeMo microservices for creating data-driven AI agents and a G-Assist plugin builder that enables users to customize AI functionalities on GeForce RTX AI PCs.

NVIDIA's NeMo microservices are designed to empower enterprises to build AI agents that can access and leverage data to enhance productivity and decision-making. These microservices provide a modular platform for building and customizing generative AI models, offering features such as prompt tuning, supervised fine-tuning, and knowledge retrieval tools. NVIDIA envisions these microservices as essential building blocks for creating data flywheels, enabling AI agents to continuously learn and improve from enterprise data, business intelligence, and user feedback. The initial use cases include AI agents used by AT&T to process nearly 10,000 documents and a coding assistant used by Cisco Systems.
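
As a rough illustration of the microservices workflow, the sketch below submits a supervised fine-tuning job to a NeMo Customizer endpoint over REST; the host, endpoint path, and payload fields are assumptions for illustration, so consult NVIDIA's NeMo microservices documentation for the actual contract.

```python
# A rough sketch of submitting a fine-tuning job to a NeMo Customizer
# microservice over REST. The host, endpoint path, and payload fields are
# assumptions, not NVIDIA's documented API.
import requests

NEMO_CUSTOMIZER_URL = "http://nemo-customizer.example.internal:8000"  # hypothetical host

job = {
    "config": "meta/llama-3.1-8b-instruct",     # assumed base-model identifier
    "dataset": {"name": "support-tickets-v1"},  # assumed pre-registered dataset
    "hyperparameters": {
        "training_type": "sft",     # supervised fine-tuning on labeled examples
        "finetuning_type": "lora",  # parameter-efficient adapter tuning
        "epochs": 2,
    },
}

resp = requests.post(f"{NEMO_CUSTOMIZER_URL}/v1/customization/jobs", json=job, timeout=30)
resp.raise_for_status()
print("submitted customization job:", resp.json().get("id"))
```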

The introduction of the G-Assist plugin builder marks a significant step forward in AI-assisted PC control. This tool allows developers to create custom commands to manage both software and hardware functions on GeForce RTX AI PCs. By enabling integration with large language models (LLMs) and other software applications, the plugin builder expands G-Assist's functionality beyond its initial gaming-focused applications. Users can now tailor AI functionalities to suit their specific needs, automating tasks and controlling various PC functions through voice or text commands. The G-Assist tool runs a lightweight language model locally on RTX GPUs, enabling inference without relying on a cloud connection.

Recommended read:
References :
  • developer.nvidia.com: Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices
  • www.tomshardware.com: NVIDIA introduces G-Assist plug-in builder, allowing its AI to integrate with LLMs and software
  • developer.nvidia.com: Benchmarking Agentic LLM and VLM Reasoning for Gaming with NVIDIA NIM
  • techstrong.ai: NVIDIA Corp. on Wednesday announced general availability of neural module (NeMo) microservices, the software tools behind artificial intelligence (AI) agents for enterprises.
  • the-decoder.com: With its G-Assist tool and a new plug-in builder, Nvidia introduces a system for AI-assisted PC control. Developers can create their own commands to manage both software and hardware functions.

Matthias Bastian@THE DECODER //
OpenAI has expanded access to its multimodal image generation model, GPT-Image-1, by making it available to developers through the API. This allows for the integration of high-quality image generation capabilities into various applications and platforms. Previously, GPT-Image-1 was primarily used within ChatGPT, where it gained popularity and generated over 700 million images for more than 130 million users within its first week. The move to offer it via API will likely increase these numbers as developers incorporate the technology into their projects. Leading platforms like Adobe and Figma are already integrating the model, showcasing its appeal and potential impact across different industries.

The GPT-Image-1 model is known for faithfully following prompts and for its versatility in creating images across diverse styles, including rendering text within images. The API gives developers granular control over image creation, with options to adjust quality settings, the number of images produced, background transparency, and output format. Notably, developers can also adjust moderation sensitivity, balancing flexibility with OpenAI's safety guidelines. This includes the implementation of C2PA metadata watermarking, which identifies images as AI-generated.

The pricing model for the GPT-Image-1 API is based on tokens, with separate rates for text input tokens, image input tokens, and image output tokens. Text input tokens are priced at $5 per million, image input tokens at $10 per million, and image output tokens at $40 per million. In practical terms, the cost per generated image ranges from approximately $0.02 for a low-quality square image to $0.19 for a high-quality square image. The API accepts various image formats, including PNG, JPEG, WEBP, and non-animated GIF, with the model capable of interpreting visual content like objects, colors, shapes, and embedded text.
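
A minimal sketch of these controls, assuming the official openai Python SDK and an OPENAI_API_KEY in the environment:

```python
# A minimal sketch of the OpenAI Images API with gpt-image-1, assuming the
# openai Python SDK (>=1.x) and an OPENAI_API_KEY in the environment.
import base64

from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="gpt-image-1",
    prompt="A watercolor fox reading a newspaper with the headline 'AI NEWS'",
    size="1024x1024",
    quality="low",             # "low" | "medium" | "high" -- drives per-image cost
    background="transparent",  # transparency requires png or webp output
    output_format="png",
    moderation="auto",         # "low" relaxes filtering within OpenAI's guidelines
    n=1,
)

# gpt-image-1 returns base64-encoded image data rather than hosted URLs.
image_bytes = base64.b64decode(result.data[0].b64_json)
with open("fox.png", "wb") as f:
    f.write(image_bytes)
```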

Recommended read:
References :
  • THE DECODER: OpenAI adds ChatGPT image model GPT-Image-1 to API for developers
  • Simon Willison's Weblog: Simon Willison's notes on OpenAI Images API
  • the-decoder.com: OpenAI adds ChatGPT image model "GPT-Image-1" to API for developers
  • AI News | VentureBeat: OpenAI makes ChatGPT’s image generation available as API
  • MarkTechPost: OpenAI has officially announced the release of its image generation API, powered by the gpt-image-1 model. This launch brings the multimodal capabilities of ChatGPT into the hands of developers, enabling programmatic access to image generation—an essential step for building intelligent design tools, creative applications, and multimodal agent systems.
  • www.marktechpost.com: OpenAI Launches gpt-image-1 API: Bringing High-Quality Image Generation to Developers
  • Maginative: Developers Can Now Tap OpenAI’s Image Model Through API
  • www.analyticsvidhya.com: How to Generate and Edit Images Using OpenAI gpt-image-1 API
  • Analytics Vidhya: How to Generate and Edit Images Using OpenAI gpt-image-1 API
  • analyticsindiamag.com: OpenAI Brings Image Generation Model to API via gpt-image-1

@the-decoder.com //
OpenAI is actively benchmarking its language models, including o3 and o4-mini, against competitors like Gemini 2.5 Pro to evaluate reasoning performance and tool-use efficiency. On the Aider polyglot coding benchmark, o3 leads with a new state-of-the-art score of 79.60%, compared with 72.90% for Gemini 2.5, though that performance comes at a significantly higher price. The o4-mini model offers a more balanced price-performance ratio, costing less than o3 while still surpassing Gemini 2.5 on certain tasks. Hands-on testing suggests Gemini 2.5 excels at context awareness and iterating on code, making it preferable for real-world use cases, while o4-mini surprisingly excels at competitive programming.

OpenAI has also just opened its GPT-Image-1 image generation model to developers via API; previously, the model was accessible only through ChatGPT. Its versatility means it can create images across diverse styles, follow custom guidelines, draw on world knowledge, and accurately render text. The company's blog post says this unlocks countless practical applications across multiple domains.

Several enterprises and startups are already incorporating the model into creative projects, products, and experiences. Image processing with GPT-Image-1 is billed by tokens: text input tokens (the prompt) cost $5 per million, image input tokens cost $10 per million, and image output tokens (the generated image) cost a steep $40 per million. Depending on the selected image quality, costs typically range from $0.02 to $0.19 per image.
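
To make the arithmetic concrete, here is a quick back-of-the-envelope check against those rates; the token counts are placeholders, since actual counts vary with image size and quality:

```python
# Back-of-the-envelope cost check using the published per-token rates. The
# token counts below are placeholders: actual usage varies with image size
# and quality, which is what produces the ~$0.02-$0.19 per-image range.
TEXT_IN_PER_M, IMAGE_IN_PER_M, IMAGE_OUT_PER_M = 5.00, 10.00, 40.00

def image_cost(text_in: int, image_in: int, image_out: int) -> float:
    """Dollar cost of one generation given token usage."""
    return (text_in * TEXT_IN_PER_M
            + image_in * IMAGE_IN_PER_M
            + image_out * IMAGE_OUT_PER_M) / 1_000_000

# e.g. a ~50-token prompt plus a hypothetical 4,000 output tokens for a
# high-quality square image lands near the top of the quoted range:
print(f"${image_cost(50, 0, 4000):.2f}")  # -> $0.16
```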

Recommended read:
References :
  • composio.dev
  • THE DECODER

Derek Egan@AI & Machine Learning //
Google Cloud is enhancing its MCP Toolbox for Databases to provide simpler and more secure access to enterprise data for AI agents. Announced at Google Cloud Next 2025, this update includes support for Model Context Protocol (MCP), an emerging open standard developed by Anthropic, which aims to standardize how AI systems connect to various data sources. The MCP Toolbox for Databases, formerly known as the Gen AI Toolbox for Databases, acts as an open-source MCP server, allowing developers to connect GenAI agents to enterprise databases like AlloyDB for PostgreSQL, Spanner, and Cloud SQL securely and efficiently.

The enhanced MCP Toolbox for Databases reduces boilerplate code, improves security through OAuth2 and OIDC, and offers end-to-end observability via OpenTelemetry integration. These features simplify the development process, allowing developers to build agents with the Agent Development Kit (ADK). The ADK, an open-source framework, supports the full lifecycle of intelligent agent development, from prototyping and evaluation to production deployment. ADK provides deterministic guardrails, bidirectional audio and video streaming capabilities, and a direct path to production deployment via Vertex AI Agent Engine.
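
As a sketch of how the pieces fit together, the snippet below defines an ADK agent whose tool stands in for one that the MCP Toolbox would serve from a real database; the model id is an assumption and query_orders is a hypothetical placeholder:

```python
# A minimal Agent Development Kit (ADK) sketch. In a real deployment the tool
# would be loaded from an MCP Toolbox server fronting AlloyDB, Cloud SQL, or
# Spanner; query_orders here is a hypothetical stand-in.
from google.adk.agents import Agent

def query_orders(customer_id: str) -> dict:
    """Hypothetical stand-in for a Toolbox-served tool that would run a
    parameterized SQL query against the database and return the rows."""
    return {"customer_id": customer_id, "open_orders": 2}

root_agent = Agent(
    name="orders_agent",
    model="gemini-2.0-flash",  # assumed model id
    instruction="Answer questions about customer orders using the tools provided.",
    tools=[query_orders],  # ADK wraps plain Python functions as callable tools
)
```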

This update represents a significant step toward secure, standardized ways for AI agents to communicate with one another and access enterprise data. Because the Toolbox is fully open source, it also includes third-party contributions adding support for databases such as Neo4j and Dgraph. By supporting MCP, the Toolbox lets developers use a single, standardized protocol to query a wide range of databases, improving interoperability and streamlining the development of agentic applications. New customers can also use Google Cloud's offer of $300 in free credit to begin building and testing their AI solutions.

Recommended read:
References :
  • cloud.google.com: Announcement of Gen AI Toolbox for Databases.
  • AI & Machine Learning: Google Cloud Blog post about MCP Toolbox for Databases
  • github.com: Google Gen AI Toolbox GitHub repository.
  • TheSequence: The Sequence Engineering #528: Inside Google's New Agent Development Kit
  • Analytics Vidhya: Looking to build intelligent agents with real-world capabilities? Use Google ADK for building agents that can reason, delegate, and respond dynamically.
  • www.github.com: GitHub repository for the Agent Development Kit (ADK) in Python.

Giovanni Galloro@AI & Machine Learning //
Google is enhancing the software development process with its Gemini Code Assist, a tool designed to accelerate the creation of applications from initial requirements to a working prototype. According to a Google Cloud Blog post, Gemini Code Assist integrates directly with Google Docs and VS Code, allowing developers to use natural language prompts to generate code and automate project setup. The tool analyzes requirements documents to create project structures, manage dependencies, and set up virtual environments, reducing the need for manual coding and streamlining the transition from concept to prototype.

Gemini Code Assist facilitates collaborative workflows by extracting and summarizing application features and technical requirements from documents within Google Docs. This allows developers to quickly understand project needs directly within their code editor. By using natural language prompts, developers can then iteratively refine the generated code based on feedback, fostering efficiency and innovation in software development. This approach enables developers to focus on higher-level design and problem-solving, significantly speeding up the application development lifecycle.

The tool supports multiple languages and frameworks, including Python, Flask, and SQLAlchemy, making it versatile for developers with varied skill sets. A Google Codelabs tutorial further highlights Gemini Code Assist's capabilities across key stages of the Software Development Life Cycle (SDLC), such as design, build, test, and deployment. The tutorial demonstrates how to use Gemini Code Assist to generate OpenAPI specifications, develop Python Flask applications, create web front-ends, and even get assistance on deploying applications to Google Cloud Run. Developers can also use features like Code Explanation and Test Case generation.
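
For a sense of the output, here is the kind of starter scaffold a prompt such as "a Flask app with a SQLAlchemy Task model and a list endpoint" might yield; this is illustrative only, not actual Gemini Code Assist output:

```python
# Illustrative only: a starter scaffold of the kind a natural-language prompt
# to Gemini Code Assist might generate. Nothing here is actual tool output.
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///tasks.db"
db = SQLAlchemy(app)

class Task(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(120), nullable=False)
    done = db.Column(db.Boolean, default=False)

@app.route("/tasks")
def list_tasks():
    # Serialize every task row as a small JSON object.
    return jsonify([{"id": t.id, "title": t.title, "done": t.done}
                    for t in Task.query.all()])

if __name__ == "__main__":
    with app.app_context():
        db.create_all()  # create the sqlite schema on first run
    app.run(debug=True)
```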

Recommended read:
References :
  • AI & Machine Learning: Google Cloud Blog post detailing Gemini Code Assist's capabilities in streamlining application prototyping from requirements documents.
  • codelabs.developers.google.com: Codelabs tutorial on Gemini Code Assist and the Software Development Lifecycle (SDLC).
  • developers.google.com: Google Gemini Code Assist tool configuration documentation.
  • TestingCatalog: Google readies native image generation in Gemini ahead of possible I/O reveal

@techhq.com //
Google has introduced a "reasoning dial" for its Gemini 2.5 Flash AI model, a new feature designed to give developers control over the amount of AI processing power used for different tasks. This innovative approach aims to address the issue of AI models "overthinking" simple questions and wasting valuable computing resources. The reasoning dial allows developers to fine-tune the system's computational effort, balancing thorough analysis with resource efficiency, ultimately making AI usage more cost-effective and practical for commercial applications.

The motivation behind the reasoning dial stems from the growing inefficiency observed in advanced AI systems when handling basic prompts. As Tulsee Doshi, Director of Product Management for Gemini, explained, models often expend more resources than necessary on simple tasks. By adjusting the reasoning dial, developers can reduce computational intensity for less complex questions, optimizing performance and reducing costs. This approach prioritizes efficient reasoning, offering an alternative to relying solely on larger models that may consume more resources for similar tasks.

The reasoning dial also tackles a significant economic challenge. According to Google’s documentation, fully activating reasoning capabilities can increase output generation costs sixfold. For developers building commercial applications, this cost increase can quickly become unsustainable. The introduction of the reasoning dial reflects a shift in AI development, prioritizing efficient resource utilization and controlled AI processing, highlighting Google's focus on practical, cost-effective AI solutions.
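
In the public Gemini API, the dial surfaces as a thinking budget measured in tokens. A minimal sketch with the google-genai Python SDK follows; the preview model id is an assumption, so verify against current docs:

```python
# A sketch of the "reasoning dial" as exposed in the google-genai SDK, where
# it appears as a thinking budget in tokens. A budget of 0 disables extended
# reasoning for cheap prompts; larger budgets buy deeper analysis.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY / GOOGLE_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # assumed preview model id
    contents="What is 7 * 8?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)  # dial reasoning down
    ),
)
print(response.text)
```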

Recommended read:
References :
  • techhq.com: Google unveils "reasoning dial" for Gemini 2.5 Flash: thinking vs. cost

@simonwillison.net //
OpenAI has recently unveiled its latest AI reasoning models, the o3 and o4-mini, marking a significant step forward in the development of AI agents capable of utilizing tools effectively. These models are designed to pause and thoroughly analyze questions before providing a response, enhancing their reasoning capabilities. The o3 model is presented as OpenAI's most advanced in this category, demonstrating superior performance across various benchmarks, including math, coding, reasoning, science, and visual understanding. Meanwhile, the o4-mini model strikes a balance between cost-effectiveness, speed, and overall performance, offering a versatile option for different applications.

OpenAI's o3 and o4-mini are equipped with the ability to leverage tools within the ChatGPT environment, such as web browsing, Python code execution, image processing, and image generation. This integration allows the models to augment their capabilities by cropping or transforming images, searching the web for relevant information, and analyzing data using Python, all within their thought process. A variant of o4-mini, named "o4-mini-high," is also available, catering to users seeking enhanced performance. These models are accessible to subscribers of OpenAI's Pro, Plus, and Team plans, reflecting the company's commitment to providing advanced AI tools to a wide range of users.
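
For API users, here is a minimal sketch of invoking o4-mini through the Responses API with a built-in tool enabled, assuming the openai Python SDK; tool availability can vary by model and account, so treat this as illustrative:

```python
# A minimal sketch of calling o4-mini via the OpenAI Responses API with a
# built-in tool enabled. Tool availability and option names can differ by
# model and account tier -- illustrative, not definitive.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="o4-mini",
    reasoning={"effort": "medium"},          # low | medium | high
    tools=[{"type": "web_search_preview"}],  # let the model search while reasoning
    input="Find the most recent AIME results and compute the median score.",
)
print(response.output_text)
```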

Interestingly, the system card for o3 and o4-mini shows that the o3 model tends to make more claims overall. This can lead to both more accurate and more inaccurate claims, including hallucinations, compared to earlier models like o1. OpenAI's internal PersonQA benchmark shows that the hallucination rate increases from 0.16 for o1 to 0.33 for o3. The o3 and o4-mini models also exhibit a limited capability to "sandbag," which, in this context, refers to the model concealing its full capabilities to better achieve a specific goal. Further research is necessary to fully understand the implications of these observations.

Recommended read:
References :
  • Last Week in AI: OpenAI's new GPT-4.1 AI models focus on coding, OpenAI launches a pair of AI reasoning models, o3 and o4-mini, Google's newest Gemini AI model focuses on efficiency, and more!
  • Simon Willison's Weblog: Wrote up some notes on the o3/o4-mini system card, including my frustration at "sandbagging" joining the ever-growing collection of AI terminology with more than one competing definition
  • Towards AI: TAI#149: OpenAI’s Agentic o3; New Open Weights Inference Optimized Models (DeepMind Gemma, Nvidia Nemotron-H)
  • composio.dev: OpenAI o3 and o4-mini are out: two state-of-the-art reasoning models, expensive, multimodal, and super efficient at tool use.
  • pub.towardsai.net: This week, OpenAI finally released its anticipated o3 and o4-mini models, shifting the focus towards AI agents that skillfully use tools.
  • insideAI News: Dataiku Brings AI Agent Creation to AI Platform
  • techstrong.ai: AI Leadership Insights: Tracking and Ranking AI Agents

Megan Crouse@techrepublic.com //
Microsoft has unveiled BitNet b1.58 2B4T, a groundbreaking AI model designed for exceptional efficiency. Developed by Microsoft's General Artificial Intelligence group, the model uses ternary quantization: each neural network weight takes one of only three discrete values (-1, 0, or +1), so it carries just log2(3) ≈ 1.58 bits of information, drastically reducing memory usage. The result is an AI model that can operate on standard CPUs without the need for specialized, energy-intensive GPUs.
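
To make the storage idea concrete, here is a toy sketch of absmean-style ternary quantization in the spirit of the BitNet papers; the scaling rule is a simplified assumption, not Microsoft's training code:

```python
# A toy sketch of absmean-style ternary quantization: scale weights by their
# mean absolute value, then round each one to {-1, 0, +1}. Real BitNet
# training quantizes during the forward pass; this only illustrates storage.
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    scale = np.mean(np.abs(w)) + eps          # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)   # each weight becomes -1, 0, or +1
    return q.astype(np.int8), scale           # dequantize as q * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = ternary_quantize(w)
print(q)
# Storage check: 2B ternary weights need roughly 2e9 * 1.58 / 8 bytes,
# about 0.4 GB, consistent with the ~400MB figure quoted for this model.
```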

Unlike conventional AI models that rely on 16- or 32-bit floating-point numbers, BitNet's unique architecture allows it to run smoothly on hardware like Apple's M2 chip, requiring only 400MB of memory. To compensate for its low-precision weights, BitNet b1.58 2B4T was trained on a massive dataset of four trillion tokens, the equivalent of approximately 33 million books. This extensive training enables the model to perform on par with, and in some cases even better than, other leading models of similar size, such as Meta's Llama 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B.

To facilitate the deployment and adoption of this innovative model, Microsoft has released a custom software framework called bitnet.cpp, optimized to take full advantage of BitNet's ternary weights. This framework is available for both GPU and CPU execution, including a lightweight C++ version. The model has demonstrated strong performance across a variety of tasks including math and common sense reasoning in benchmark tests. Microsoft plans to expand BitNet to support longer texts, additional languages, and multimodal inputs like images, while also working on the Phi series, another family of efficient AI models.

Recommended read:
References :
  • the-decoder.com: BitNet: Microsoft shows how to put AI models on a diet
  • TechSpot: Microsoft's BitNet shows what AI can do with just 400MB and no GPU
  • www.techrepublic.com: Microsoft’s model BitNet b1.58 2B4T is available on Hugging Face but doesn’t run on GPU and requires a proprietary framework.
  • www.tomshardware.com: Microsoft researchers developed a 1-bit AI model that's efficient enough to run on traditional CPUs without needing specialized chips like NPUs or GPUs.