News from the AI & ML world

DeeperML - #llms

Matthias Bastian@THE DECODER //
OpenAI has announced the integration of the GPT-4.1 and GPT-4.1 mini models into ChatGPT, aimed at enhancing coding and web development capabilities. GPT-4.1, a specialized model that excels at coding tasks and instruction following, is now available to ChatGPT Plus, Pro, and Team users. According to OpenAI, GPT-4.1 is faster and a great alternative to OpenAI o3 and o4-mini for everyday coding needs, providing more help to developers creating applications.

OpenAI is also rolling out GPT-4.1 mini, which will be available to all ChatGPT users, including those on the free tier, replacing the previous GPT-4o mini model. This model serves as the fallback option once GPT-4o usage limits are reached. The release notes confirm that GPT-4.1 mini offers improvements over GPT-4o mini in instruction following, coding, and overall intelligence. This initiative is part of OpenAI's effort to make advanced AI tools more accessible and useful for a broader audience, particularly those engaged in programming and web development.

Johannes Heidecke, Head of Systems at OpenAI, has emphasized that the new models build upon the safety measures established for GPT-4o, ensuring parity in safety performance. According to Heidecke, no new safety risks have been introduced: GPT-4.1 doesn't introduce new modalities or ways of interacting with the AI, and it doesn't surpass o3 in intelligence. The rollout marks another step in OpenAI's increasingly rapid model release cadence, significantly expanding access to specialized capabilities in web development and coding.

Recommended read:
References :
  • twitter.com: GPT-4.1 is a specialized model that excels at coding tasks & instruction following. Because it’s faster, it’s a great alternative to OpenAI o3 & o4-mini for everyday coding needs.
  • www.computerworld.com: OpenAI adds GPT-4.1 models to ChatGPT
  • gHacks Technology News: OpenAI releases GPT-4.1 and GPT-4.1 mini AI models for ChatGPT
  • Maginative: OpenAI Brings GPT-4.1 to ChatGPT
  • www.windowscentral.com: “Am I crazy or is GPT-4.1 the best model for coding?” ChatGPT gets new models with exemplary web development capabilities — but OpenAI is under fire for allegedly skimming through safety processes
  • the-decoder.com: OpenAI brings its new GPT-4.1 model to ChatGPT users
  • www.ghacks.net: OpenAI releases GPT-4.1 and GPT-4.1 mini AI models for ChatGPT
  • www.techradar.com: OpenAI just gave ChatGPT users a huge free upgrade – 4.1 mini is available today
  • AI News | VentureBeat: OpenAI is rolling out GPT-4.1, its new non-reasoning large language model (LLM) that balances high performance with lower cost, to users of ChatGPT.
  • www.marktechpost.com: OpenAI has introduced Codex, a cloud-native software engineering agent integrated into ChatGPT, signaling a new era in AI-assisted software development.

Kevin Okemwa@windowscentral.com //
OpenAI has released GPT-4.1 and GPT-4.1 mini, enhancing coding capabilities within ChatGPT. According to OpenAI on Twitter, GPT-4.1 "excels at coding tasks & instruction following" and serves as a faster alternative to OpenAI o3 & o4-mini for everyday coding needs. GPT-4.1 mini replaces GPT-4o mini as the default for all ChatGPT users, including those on the free tier. The models are available via the “more models” dropdown selection in the top corner of the chat window within ChatGPT.

GPT-4.1 is now accessible to ChatGPT Plus, Pro, and Team users, with Enterprise and Education user access expected in the coming weeks. While initially intended for use only by third-party developers via OpenAI's API, GPT-4.1 was added to ChatGPT following strong user feedback. OpenAI Chief Product Officer Kevin Weil said "We built it for developers, so it's very good at coding and instruction following—give it a try!".

These models support the standard context windows for ChatGPT and are optimized for enterprise-grade practicality. GPT-4.1 delivers improvements over GPT-4o on the SWE-bench Verified software engineering benchmark and Scale’s MultiChallenge benchmark. Safety remains a priority, with OpenAI reporting that GPT-4.1 performs at parity with GPT-4o across standard safety evaluations.

Recommended read:
References :
  • Maginative: OpenAI has integrated its GPT-4.1 model into ChatGPT, providing enhanced coding and instruction-following capabilities to paid users, while also introducing GPT-4.1 mini for all users.
  • pub.towardsai.net: AI Passes Physician-Level Responses in OpenAI’s HealthBench
  • THE DECODER: OpenAI is rolling out its GPT-4.1 model to ChatGPT, making it available outside the API for the first time.
  • AI News | VentureBeat: OpenAI is rolling out GPT-4.1, its new non-reasoning large language model (LLM) that balances high performance with lower cost, to users of ChatGPT.
  • www.zdnet.com: OpenAI's HealthBench shows AI's medical advice is improving - but who will listen?
  • www.techradar.com: OpenAI just gave ChatGPT users a huge free upgrade – 4.1 mini is available today
  • Simon Willison's Weblog: GPT-4.1 will be available directly in ChatGPT starting today. GPT-4.1 is a specialized model that excels at coding tasks & instruction following.
  • www.windowscentral.com: OpenAI is bringing GPT-4.1 and GPT-4.1 mini to ChatGPT, and the new AI models excel in web development and coding tasks compared to OpenAI o3 & o4-mini.
  • www.zdnet.com: GPT-4.1 makes ChatGPT smarter, faster, and more useful for paying users, especially coders
  • www.computerworld.com: OpenAI adds GPT-4.1 models to ChatGPT
  • gHacks Technology News: OpenAI releases GPT-4.1 and GPT-4.1 mini AI models for ChatGPT
  • twitter.com: By popular request, GPT-4.1 will be available directly in ChatGPT starting today. GPT-4.1 is a specialized model that excels at coding tasks & instruction following. Because it’s faster, it’s a great alternative to OpenAI o3 & o4-mini for everyday coding needs.
  • www.ghacks.net: Reports on GPT-4.1 and GPT-4.1 mini AI models in ChatGPT, noting their accessibility to both paid and free users.
  • x.com: Provides initial tweet about the availability of GPT-4.1 in ChatGPT.
  • the-decoder.com: OpenAI brings its new GPT-4.1 model to ChatGPT users
  • eWEEK: OpenAI rolls out GPT-4.1 and GPT-4.1 mini to ChatGPT, offering smarter coding and instruction-following tools for free and paid users.

@learn.aisingapore.org //
Anthropic's Claude 3.7 model is making waves in the AI community due to its enhanced reasoning capabilities, specifically through a "deep thinking" approach. This method utilizes chain-of-thought (CoT) techniques, enabling Claude 3.7 to tackle complex problems more effectively. This development represents a significant advancement in Large Language Model (LLM) technology, promising improved performance in a variety of demanding applications.

The implications of this enhanced reasoning are already being seen across different sectors. FloQast, for example, is leveraging Anthropic's Claude 3 on Amazon Bedrock to develop an AI-powered accounting transformation solution. The integration of Claude’s capabilities is assisting companies in streamlining their accounting operations, automating reconciliations, and gaining real-time visibility into financial operations. The model’s ability to handle the complexities of large-scale accounting transactions highlights its potential for real-world applications.

Furthermore, recent reports highlight the competitive landscape where models like Mistral AI's Medium 3 are being compared to Claude Sonnet 3.7. These comparisons focus on balancing performance, cost-effectiveness, and ease of deployment. Simultaneously, Anthropic is also enhancing Claude's functionality by allowing users to connect more applications, expanding its utility across various domains. These advancements underscore the ongoing research and development efforts aimed at maximizing the potential of LLMs and addressing potential security vulnerabilities.

Recommended read:
References :
  • learn.aisingapore.org: This article describes how FloQast utilizes Anthropic’s Claude 3 on Amazon Bedrock for its accounting transformation solution.
  • Last Week in AI: LWiAI Podcast #208 - Claude Integrations, ChatGPT Sycophancy, Leaderboard Cheats
  • techcrunch.com: Anthropic lets users connect more apps to Claude
  • Towards AI: The New AI Model Paradox: When “Upgrades” Feel Like Downgrades (Claude 3.7)
  • Towards AI: How to Achieve Structured Output in Claude 3.7: Three Practical Approaches

@the-decoder.com //
OpenAI is expanding its global reach through strategic partnerships with governments and the introduction of advanced model customization tools. The organization has launched the "OpenAI for Countries" program, an initiative designed to collaborate with governments worldwide on building robust AI infrastructure. This program aims to assist nations in setting up data centers and adapting OpenAI's products to meet local language and specific needs. OpenAI envisions this initiative as part of a broader global strategy to foster cooperation and advance AI capabilities on an international scale.

This expansion also includes technological advancements, with OpenAI releasing Reinforcement Fine-Tuning (RFT) for its o4-mini reasoning model. RFT enables enterprises to fine-tune their own versions of the model using reinforcement learning, tailoring it to their unique data and operational requirements. Developers can customize the model through OpenAI's platform dashboard, tweaking it for internal terminology, goals, processes, and more. Once deployed, employees can use the RFT version of the model through a custom internal chatbot or custom OpenAI GPT to pull up private, proprietary company knowledge, answer specific questions about company products and policies, or generate new communications and collateral in the company's voice.
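At the core of RFT is a grader that scores each model output; those scores serve as the reward signal during training. A minimal toy sketch of the idea follows. The grader function below is a hypothetical illustration, not OpenAI's actual grader configuration or API:

```python
# Illustrative sketch of the "programmable grader" idea behind RFT:
# a grader scores each model output against a reference, and those
# scores act as rewards for the RL update. Hypothetical example only.

def grade_answer(model_output: str, reference: str) -> float:
    """Return a reward in [0, 1]: exact match scores 1.0,
    partial token overlap scores proportionally."""
    if model_output.strip().lower() == reference.strip().lower():
        return 1.0
    out_tokens = set(model_output.lower().split())
    ref_tokens = set(reference.lower().split())
    if not ref_tokens:
        return 0.0
    return len(out_tokens & ref_tokens) / len(ref_tokens)

# A batch of (output, reference) pairs yields per-sample rewards
# that the reinforcement learning step would use to update the model.
samples = [
    ("The refund window is 30 days", "The refund window is 30 days"),
    ("Refunds take two weeks", "The refund window is 30 days"),
]
rewards = [grade_answer(out, ref) for out, ref in samples]
print(rewards[0])  # exact match -> 1.0
```

In the real service, the grading logic is supplied as part of the fine-tuning job configuration rather than run client-side, but the principle is the same: whatever the grader rewards is what the tuned model learns to produce.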

The "OpenAI for Countries" program is slated to begin with ten international projects, supported by funding from both OpenAI and participating governments. Chris Lehane, OpenAI's vice president of global policy, indicated that the program was inspired by the AI Action Summit in Paris, where several countries expressed interest in establishing their own "Stargate"-style projects. Moreover, the release of RFT on o4-mini signifies a major step forward in custom model optimization, offering developers a powerful new technique for tailoring foundation models to specialized tasks. This allows for fine-grained control over how models improve, by defining custom objectives and reward functions.

Recommended read:
References :
  • the-decoder.com: OpenAI launches a program to partner with governments on global AI infrastructure
  • AI News | VentureBeat: You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning
  • www.marktechpost.com: OpenAI releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
  • MarkTechPost: OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
  • www.computerworld.com: OpenAI was founded a decade ago with a focus on research, but it has since expanded into products and infrastructure. Now it is looking to again broaden its presence into user-facing apps. The company announced this week that Fidji Simo will join as CEO of applications, a newly-created position. Simo is the current CEO and chair at grocery delivery company Instacart. She will begin her new role at OpenAI later this year, reporting directly to Sam Altman, who will remain overall CEO and oversee research, compute, and applications.
  • the-decoder.com: OpenAI has appointed Fidji Simo as CEO of its new Applications division, reporting directly to OpenAI CEO Sam Altman.
  • the-decoder.com: OpenAI is expanding its fine-tuning program for o4-mini, introducing Reinforcement Fine-Tuning (RFT) for organizations. The method is designed to help tailor models like o4-mini to highly specific tasks with the help of a programmable grading system.
  • Maginative: OpenAI brings reinforcement fine-tuning and GPT-4.1 Nano Fine-Tuning in the API
  • AI News | VentureBeat: OpenAI names Instacart leader Fidji Simo as new CEO of Applications
  • techxplore.com: OpenAI offers to help countries build AI systems
  • THE DECODER: OpenAI is expanding its fine-tuning program for o4-mini, introducing Reinforcement Fine-Tuning (RFT) for organizations. The method is designed to help tailor models like o4-mini to highly specific tasks with the help of a programmable grading system.
  • Techzine Global: OpenAI opens the door to reinforcement fine-tuning for o4-mini
  • The Register - Software: OpenAI drafts Instacart boss as CEO of Apps to lure in the normies

@www.marktechpost.com //
Meta is making significant strides in the AI landscape, highlighted by the release of Llama Prompt Ops, a Python package aimed at streamlining prompt adaptation for Llama models. This open-source tool helps developers enhance prompt effectiveness by transforming inputs to better suit Llama-based LLMs, addressing the challenge of inconsistent performance across different AI models. Llama Prompt Ops facilitates smoother cross-model prompt migration and improves performance and reliability, featuring a transformation pipeline for systematic prompt optimization.
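The transformation-pipeline idea can be illustrated with a small sketch: each transform rewrites the prompt a step closer to what a Llama-family model expects. The function names and transforms below are hypothetical stand-ins, not the real llama-prompt-ops API:

```python
# Toy sketch of a prompt-transformation pipeline in the spirit of
# Llama Prompt Ops. All names here are illustrative assumptions,
# not the package's actual interface.

def strip_gpt_style_system_prefix(prompt: str) -> str:
    # Drop a "System: " prefix that another model family may expect.
    prefix = "System: "
    return prompt[len(prefix):] if prompt.startswith(prefix) else prompt

def wrap_llama_chat(prompt: str) -> str:
    # Wrap the cleaned prompt in Llama-style instruction markers.
    return f"[INST] {prompt} [/INST]"

def optimize_prompt(prompt: str, pipeline) -> str:
    # Apply each transform in order: the core pipeline pattern.
    for transform in pipeline:
        prompt = transform(prompt)
    return prompt

pipeline = [strip_gpt_style_system_prefix, wrap_llama_chat]
print(optimize_prompt("System: Summarize the report.", pipeline))
# -> [INST] Summarize the report. [/INST]
```

Composing small, testable transforms is what makes cross-model prompt migration systematic instead of ad hoc: the same source prompt can be re-targeted by swapping the pipeline.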

Meanwhile, Meta is expanding its AI strategy with the launch of a standalone Meta AI app, powered by Llama 4, to compete with rivals like Microsoft’s Copilot and ChatGPT. This app is designed to function as a general-purpose chatbot and a replacement for the “Meta View” app used with Meta Ray-Ban glasses, integrating a social component with a public feed showcasing user interactions with the AI. Meta also previewed its Llama API, designed to simplify the integration of its Llama models into third-party products, attracting AI developers with an open-weight model that supports modular, specialized applications.

However, Meta's AI advancements are facing legal challenges, as a US judge is questioning the company's claim that training AI on copyrighted books constitutes fair use. The case, focusing on Meta's Llama model, involves training data including works by Sarah Silverman. The judge raised concerns that using copyrighted material to create a product capable of producing an infinite number of competing products could undermine the market for original works, potentially obligating Meta to pay licenses to copyright holders.

Recommended read:
References :
  • the-decoder.com: US judge questions Meta's claim that training AI on copyrighted books is fair use
  • Ken Yeung: IN THIS ISSUE: Meta hosts its first-ever event around its Llama model, launching a standalone app to take on Microsoft’s Copilot and ChatGPT.
  • MarkTechPost: Meta AI has released Llama Prompt Ops, a Python package designed to streamline the process of adapting prompts for Llama models.
  • Towards AI: Meta AI has unveiled Llama 4, the latest iteration of its open large language models, marking a substantial breakthrough with native multimodality at its core.

@docs.llamaindex.ai //
LlamaIndex is advancing agentic systems design by focusing on the optimal blend of autonomy and structure, particularly through its innovative Workflows system. Workflows provide an event-based mechanism for orchestrating agent execution, connecting individual steps implemented as vanilla functions. This approach enables developers to create chains, branches, loops, and collections within their agentic systems, aligning with established design patterns for effective agents. The system, available in both Python and TypeScript frameworks, is fundamentally simple yet powerful, allowing for complex orchestration of agentic tasks.

LlamaIndex Workflows support hybrid systems by allowing decisions about control flow to be made by LLMs, traditional imperative programming, or a combination of both. This flexibility is crucial for building robust and adaptable AI solutions. Furthermore, Workflows not only facilitate the implementation of agents but also enable the use of sub-agents within each step. This hierarchical agent design can be leveraged to decompose complex tasks into smaller, more manageable units, enhancing the overall efficiency and effectiveness of the system.
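The step-and-event pattern described above can be sketched in plain Python: each step consumes one event type and emits the next, and a dispatcher runs the chain until a stop event appears. This is an illustration of the pattern only, not the actual LlamaIndex Workflows API; the event and step names are invented for the example:

```python
# Plain-Python sketch of event-driven step orchestration in the
# spirit of Workflows. Not the LlamaIndex API itself.
from dataclasses import dataclass

@dataclass
class StartEvent:
    topic: str

@dataclass
class OutlineEvent:
    outline: str

@dataclass
class StopEvent:
    result: str

def outline_step(ev: StartEvent) -> OutlineEvent:
    return OutlineEvent(outline=f"outline for {ev.topic}")

def draft_step(ev: OutlineEvent) -> StopEvent:
    return StopEvent(result=f"draft from {ev.outline}")

def run_workflow(start: StartEvent) -> str:
    # Dispatch each event to the step registered for its type,
    # looping until a StopEvent ends the run (a simple "chain").
    steps = {StartEvent: outline_step, OutlineEvent: draft_step}
    event = start
    while not isinstance(event, StopEvent):
        event = steps[type(event)](event)
    return event.result

print(run_workflow(StartEvent(topic="agents")))
# -> draft from outline for agents
```

Branches and loops fall out naturally: a step can return different event types to route execution, or re-emit an earlier event type to iterate.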

The introduction of Workflows underscores LlamaIndex's commitment to providing developers with the tools they need to build sophisticated knowledge assistants and agentic applications. By offering a system that balances autonomy with structured execution, LlamaIndex is addressing the need for design principles when building agents. The company draws from its experience with LlamaCloud and its collaboration with enterprise customers to offer a system that integrates agents, sub-agents, and flexible decision-making capabilities.

Recommended read:
References :
  • Blog on LlamaIndex: LlamaIndex Blog post Bending without breaking: optimal design patterns for effective agents
  • docs.llamaindex.ai: Bending without breaking: optimal design patterns for effective agents

@the-decoder.com //
References: composio.dev, THE DECODER
OpenAI is actively benchmarking its language models, including o3 and o4-mini, against competitors like Gemini 2.5 Pro, to evaluate their performance in reasoning and tool use efficiency. Benchmarks like the Aider polyglot coding test show that o3 leads in some areas, achieving a new state-of-the-art score of 79.60% compared to Gemini 2.5's 72.90%. However, this performance comes at a higher cost, with o3 being significantly more expensive. o4-mini offers a slightly more balanced price-performance ratio, costing less than o3 while still surpassing Gemini 2.5 on certain tasks. Testing reveals Gemini 2.5 excels in context awareness and iterating on code, making it preferable for real-world use cases, while o4-mini surprisingly excels in competitive programming.

OpenAI has just launched its GPT-Image-1 model for image generation to developers via the API. Previously, this model was only accessible through ChatGPT. The model is versatile: it can create images across diverse styles, follow custom guidelines, draw on world knowledge, and accurately render text. The company's blog post said this unlocks countless practical applications across multiple domains.

Several enterprises and startups are already incorporating the model for creative projects, products, and experiences. Image processing with GPT-Image-1 is billed by tokens. Text input tokens, or the prompt text, will cost $5 per 1 million tokens. Image input tokens will be $10 per million tokens, while image output tokens, or the generated image, will be a whopping $40 per million tokens. Depending on the selected image quality, costs typically range from $0.02 to $0.19 per image.
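Based on the per-token rates above, a rough per-request cost estimate works out as follows. The token counts in the example call are hypothetical, chosen only to land inside the quoted $0.02 to $0.19 per-image range:

```python
# Cost estimate from the per-token rates quoted in the article:
# $5 / $10 / $40 per million text-in / image-in / image-out tokens.

RATE_TEXT_IN = 5.00     # USD per 1M text input tokens
RATE_IMAGE_IN = 10.00   # USD per 1M image input tokens
RATE_IMAGE_OUT = 40.00  # USD per 1M image output tokens

def gpt_image_1_cost(text_in: int, image_in: int, image_out: int) -> float:
    """Total USD cost for one request, given token counts per category."""
    return (text_in * RATE_TEXT_IN
            + image_in * RATE_IMAGE_IN
            + image_out * RATE_IMAGE_OUT) / 1_000_000

# e.g. a 50-token prompt producing a ~4,000-token image
# (hypothetical counts for illustration)
print(round(gpt_image_1_cost(50, 0, 4000), 4))  # -> 0.1603
```

Because output tokens are eight times the price of text input tokens, the generated image dominates the bill; prompt length is nearly negligible.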


Michael Nuñez@AI News | VentureBeat //
Amazon Web Services (AWS) has announced significant advancements in its AI coding and Large Language Model (LLM) infrastructure. A key highlight is the introduction of SWE-PolyBench, a comprehensive multi-language benchmark designed to evaluate the performance of AI coding assistants. This benchmark addresses the limitations of existing evaluation frameworks by assessing AI agents across a diverse range of programming languages like Python, JavaScript, TypeScript, and Java, using real-world scenarios derived from over 2,000 curated coding challenges from GitHub issues. The aim is to provide researchers and developers with a more accurate understanding of how well these tools can navigate complex codebases and solve intricate programming tasks involving multiple files.

The latest Amazon SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4, further enhances LLM capabilities. This version supports a wider array of open-source models, including Meta’s Llama 4 models and Google’s Gemma 3, providing users with more flexibility in model selection. LMI v15 delivers significant performance improvements through an async mode and support for the vLLM V1 engine, resulting in higher throughput and reduced CPU overhead. This enables seamless deployment and serving of large language models at scale, with expanded API schema support and multimodal capabilities for vision-language models.

AWS is also launching new Amazon EC2 Graviton4-based instances with NVMe SSD storage. These compute optimized (C8gd), general purpose (M8gd), and memory optimized (R8gd) instances offer up to 30% better compute performance and 40% higher performance for I/O intensive database workloads compared to Graviton3-based instances. They also include larger instance sizes with up to 3x more vCPUs, memory, and local storage. These instances are ideal for storage-intensive Linux-based workloads, including containerized and microservices-based applications built using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (Amazon ECR), Kubernetes, and Docker, as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP.

Recommended read:
References :
  • venturebeat.com: Amazon’s SWE-PolyBench just exposed the dirty secret about your AI coding assistant
  • www.marktechpost.com: AWS Introduces SWE-PolyBench: A New Open-Source Multilingual Benchmark for Evaluating AI Coding Agents

@www.microsoft.com //
Microsoft Research is delving into the transformative potential of AI as "Tools for Thought," aiming to redefine AI's role in supporting human cognition. At the upcoming CHI 2025 conference, researchers will present four new research papers and co-host a workshop exploring this intersection of AI and human thinking. The research includes a study on how AI is changing the way we think and work along with three prototype systems designed to support different cognitive tasks. The goal is to explore how AI systems can be used as Tools for Thought and reimagine AI’s role in human thinking.

As AI tools become increasingly capable, Microsoft has unveiled new AI agents designed to enhance productivity in various domains. The "Researcher" agent can tackle complex research tasks by analyzing work data, emails, meetings, files, chats, and web information to deliver expertise on demand. Meanwhile, the "Analyst" agent functions as a virtual data scientist, capable of processing raw data from multiple spreadsheets to forecast demand or visualize customer purchasing patterns. The new AI agents unveiled over the past few weeks can help people every day with things like research, cybersecurity and more.

Johnson & Johnson has reportedly found that only a small percentage, between 10% and 15%, of AI use cases deliver the vast majority (80%) of the value. After encouraging employees to experiment with AI and tracking the results of nearly 900 use cases over about three years, the company is now focusing resources on the highest-value projects. These high-value applications include a generative AI copilot for sales representatives and an internal chatbot answering employee questions. Other AI tools being developed include one for drug discovery and another for identifying and mitigating supply chain risks.


@www.searchenginejournal.com //
Recent advances show that large language models (LLMs) are expanding past basic writing and are now being used to generate functional code. These models can produce full scripts, browser extensions, and web applications from natural language prompts, opening up opportunities for those without coding skills. Marketers and other professionals can now automate repetitive tasks, build custom tools, and experiment with technical solutions more easily than ever before. This unlocks a new level of efficiency, allowing individuals to create one-off tools for tasks that previously seemed too time-consuming to justify automation.

Advances in AI are also focusing on improving the accuracy of code generated by LLMs. Researchers at MIT have developed a new approach that guides LLMs to generate code that adheres to the rules of the specific programming language. This method allows the LLM to prioritize outputs that are likely to be valid and accurate, improving computational efficiency. This new architecture has enabled smaller LLMs to outperform larger models in generating accurate outputs in fields like molecular biology and robotics. The goal is to allow non-experts to control AI-generated content by ensuring that the outputs are both useful and correct, potentially improving programming assistants, AI-powered data analysis, and scientific discovery tools.
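The underlying idea, filtering candidate tokens so that only continuations valid under the language's rules survive, can be shown with a toy balanced-parentheses grammar. This is a heavily simplified illustration of constrained generation in general, not the MIT method itself:

```python
# Toy illustration of grammar-constrained generation: at each step,
# candidate tokens that would violate the grammar are filtered out
# before one is chosen, so the output is valid by construction.
# A much-simplified sketch of the idea, not the MIT technique.

def valid_next_tokens(prefix: str, max_len: int = 6) -> list[str]:
    """Allow only tokens that keep a balanced-parentheses string valid."""
    depth = prefix.count("(") - prefix.count(")")
    candidates = []
    if len(prefix) < max_len and depth < max_len - len(prefix):
        candidates.append("(")   # room remains to open (and later close) a pair
    if depth > 0:
        candidates.append(")")   # may close only if something is open
    return candidates

def generate(prefix: str = "", max_len: int = 6) -> str:
    # Greedy generation: prefer ")" to close early (a stand-in for
    # the model's own token scores), stopping once the string balances.
    while True:
        depth = prefix.count("(") - prefix.count(")")
        if prefix and depth == 0:
            return prefix
        options = valid_next_tokens(prefix, max_len)
        prefix += ")" if ")" in options else "("

print(generate())  # -> "()"
```

A real system applies the same masking idea over a programming language's grammar and the LLM's actual token probabilities, which is why the constrained model can never emit a syntactically invalid program.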

New tools are emerging to aid developers, such as Amazon Q Developer and OpenAI Codex CLI. Amazon Q Developer is an AI-powered coding assistant that integrates into IDEs like Visual Studio Code, providing context-aware code recommendations, snippets, and unit test suggestions. The service uses advanced generative AI to understand the context of a project and offers features like intelligent code generation, integrated testing and debugging, seamless documentation and effective code review and refactoring. Similarly, OpenAI Codex CLI is a terminal-based AI assistant that allows developers to interact with OpenAI models using natural language to read, modify, and run code. These tools aim to boost coding productivity by assisting with tasks like bug fixing, refactoring, and prototyping.

Recommended read:
References :
  • hackernoon.com: Amazon Q Developer: The Future of AI-Enhanced Coding Productivity
  • Search Engine Journal: LLMs That Code: Why Marketers Should Care (Even If You’ve Never Touched An IDE)
  • www.marktechpost.com: Anthropic Releases a Comprehensive Guide to Building Coding Agents with Claude Code

@www.quantamagazine.org //
Recent developments in the field of large language models (LLMs) are focusing on enhancing reasoning capabilities through reinforcement learning. This approach aims to improve model accuracy and problem-solving, particularly in challenging tasks. While some of the latest LLMs, such as GPT-4.5 and Llama 4, were not explicitly trained using reinforcement learning for reasoning, the release of OpenAI's o3 model shows that strategically investing in compute and tailored reinforcement learning methods can yield significant improvements.

Competitors like xAI and Anthropic have also been incorporating more reasoning features into their models, such as the "thinking" or "extended thinking" button in xAI Grok and Anthropic Claude. The somewhat muted response to GPT-4.5 and Llama 4, which lack explicit reasoning training, suggests that simply scaling model size and data may be reaching its limits. The field is now exploring ways to make language models work better, including the use of reinforcement learning.

One way researchers are making language models work better is to sidestep the requirement for language as an intermediary step. Language isn't always necessary, and having to turn ideas into language can slow down the thought process. LLMs process information in mathematical spaces within deep neural networks, yet they must often leave this latent space for the much more constrained one of individual words. Recent papers suggest that deep neural networks can allow language models to continue thinking in these mathematical spaces before producing any text.

Recommended read:
References :
  • pub.towardsai.net: The article discusses the application of reinforcement learning to improve the reasoning abilities of LLMs.
  • Sebastian Raschka, PhD: This blog post delves into the current state of reinforcement learning in enhancing LLM reasoning capabilities, highlighting recent advancements and future expectations.
  • Quanta Magazine: This article explores the use of reinforcement learning to make Language Models work better, especially in challenging reasoning tasks.

Chris McKay@Maginative //
OpenAI has unveiled its latest advancements in AI technology with the launch of the GPT-4.1 family of models. This new suite includes GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, all accessible via API, and represents a significant leap forward in coding capabilities, instruction following, and context processing. Notably, these models feature an expanded context window of up to 1 million tokens, enabling them to handle larger codebases and extensive documents. The GPT-4.1 family aims to cater to a wide range of developer needs by offering different performance and cost profiles, with the goal of creating more advanced and efficient AI applications.

These models demonstrate superior results on various benchmarks compared to their predecessors, GPT-4o and GPT-4o mini. Specifically, GPT-4.1 scores 54.6% on the SWE-bench Verified coding test and 38.3% on Scale's MultiChallenge benchmark for instruction following, substantial gains over GPT-4o. Each model is designed with a specific purpose in mind: GPT-4.1 excels in high-level cognitive tasks like software development and research, GPT-4.1 mini offers balanced performance with reduced latency and cost, while GPT-4.1 nano provides the quickest and most affordable option for tasks such as classification. All three models have knowledge updated through June 2024.

The introduction of the GPT-4.1 family also brings about changes in OpenAI's existing model offerings. The GPT-4.5 Preview model in the API is set to be deprecated on July 14, 2025, due to GPT-4.1 offering comparable or better utility at a lower cost. In terms of pricing, GPT-4.1 is 26% less expensive than GPT-4o for median queries, along with increased prompt caching discounts. Early testers have already noted positive outcomes, with improvements in code review suggestions and data retrieval from large documents. OpenAI emphasizes that many underlying improvements are being integrated into the current GPT-4o version within ChatGPT.

Recommended read:
References :
  • TestingCatalog: OpenAI debuts GPT-4.1 family offering 1M token context window
  • venturebeat.com: OpenAI slashes prices for GPT-4.1, igniting AI price war among tech giants
  • Interconnects: OpenAI's latest models optimizing on intelligence per dollar.
  • THE DECODER: OpenAI launches GPT-4.1: New model family to improve agents, long contexts and coding
  • Simon Willison's Weblog: OpenAI three new models this morning: GPT-4.1, GPT-4.1 mini and GPT-4.1 nano. These are API-only models right now, not available through the ChatGPT interface (though you can try them out in OpenAI's ).
  • Analytics Vidhya: All About OpenAI’s Latest GPT 4.1 Family
  • pub.towardsai.net: TAI #148: New API Models from OpenAI (4.1) & xAI (grok-3); Exploring Deep Research’s Scaling Laws
  • Towards AI: The GPT-4.1 models, accessible via API, provide a significant advancement in AI capabilities and offer an intriguing alternative for developers looking for high performance at lower cost.
  • Towards AI: TAI #148: New API Models from OpenAI (4.1) & xAI (grok-3); Exploring Deep Research’s Scaling Laws
  • venturebeat.com: OpenAI’s new GPT-4.1 models can process a million tokens and solve coding problems better than ever
  • techstrong.ai: Just days after announcing its plans to retire GPT-4 in ChatGPT, OpenAI on Monday launched a new set of flagship models named GPT-4.1. The release, which The Verge anticipated in an article last week, included the standard version GPT-4.1 model, along with two smaller models — GPT-4.1 mini, and GPT-4.1 nano which OpenAI touts as […]
  • the-decoder.com: OpenAI launches GPT-4.1: New model family to improve agents, long contexts and coding
  • www.tomsguide.com: OpenAI's latest model is here but it isn't GPT-5, it's 4.1, a model all about coding
  • shellypalmer.com: Shelly Palmer discusses the launch of GPT-4.1 and its improved capabilities.
  • felloai.com: OpenAI Quietly Launched GPT‑4.1 – A GPT-4o Successor That’s Crushing Benchmarks
  • thezvi.wordpress.com: The Zvi discusses the mini upgrade from GPT-4.1.
  • bdtechtalks.com: GPT-4.1: OpenAI’s most confusing model
  • Fello AI: OpenAI Quietly Launched GPT‑4.1 – A GPT-4o Successor That’s Crushing Benchmarks
  • www.eweek.com: eWeek reports on the pros and cons of OpenAI's new GPT-4.1 model.
  • Last Week in AI: Last Week in AI discusses the new GPT 4.1 model release by OpenAI
  • Fello AI: OpenAI’s language models have become part of everyday life for millions of people—whether you’re using ChatGPT to get quick answers, brainstorm ideas, or even generate code. With each new version, the models get faster, smarter, and more capable.
  • thezvi.wordpress.com: Yesterday’s news alert, nevertheless: The verdict is in. GPT-4.1-Mini in particular is an excellent practical model, offering strong performance at a good price. The full GPT-4.1 is an upgrade to OpenAI’s more expensive API offerings, it is modestly better but …
  • composio.dev: GPT-4.1 vs. Deepseek v3 vs. Sonnet 3.7 vs. GPT-4.5
  • hackernoon.com: OpenAI announced GPT-4.1, featuring a staggering 1M-token context window and perfect needle-in-a-haystack accuracy.
  • Shelly Palmer: OpenAI has launched GPT-4.1, along with GPT-4.1 Mini and GPT-4.1 Nano. These models are for developers and will not show up in your ChatGPT model picker.
  • eWEEK: OpenAI is releasing new language models, GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano.

@www.thecanadianpressnews.ca //
Meta Platforms, the parent company of Facebook and Instagram, has announced it will resume using publicly available content from European users to train its artificial intelligence models. This decision comes after a pause last year following privacy concerns raised by activists. Meta plans to use public posts, comments, and interactions with Meta AI from adult users in the European Union to enhance its generative AI models. The company says this data is crucial for developing AI that understands the nuances of European languages, dialects, colloquialisms, humor, and local knowledge.

Meta emphasizes that it will not use private messages or data from users under 18 for AI training. To address privacy concerns, Meta will notify EU users through in-app and email notifications, providing them with a way to opt out of having their data used. These notifications will include a link to a form allowing users to object to the use of their data, and Meta has committed to honoring all previously and newly submitted objection forms. The company states its AI is designed to cater to diverse perspectives and to acknowledge the distinctive attributes of various European communities.

Meta claims its approach aligns with industry practice, noting that companies like Google and OpenAI have already used European user data for AI training, and defends the move as necessary to develop AI services that are relevant and beneficial to European users. Meta also highlights that a panel of EU privacy regulators "affirmed" that its original approach met legal obligations. Groups like NOYB had previously complained and urged regulators to intervene, advocating for an opt-in system where users actively consent to the use of their data for AI training.

Recommended read:
References :
  • cyberinsider.com: Meta has announced it will soon begin using public data from adult users in the European Union — including posts, comments, and AI interactions — to train its generative AI models, raising concerns about the boundaries of consent and user awareness across its major platforms.
  • discuss.privacyguides.net: Meta to start training its AI models on public content in the EU after Est. reading time: 3 minutes If you are an EU resident with an Instagram or Facebook account, you should know that Meta will start training its AI models on your posted content.
  • Malwarebytes: Meta users in Europe will have their public posts swept up and ingested for AI training, the company announced this week.
  • : Meta says it will start using publicly available content from European users to train its artificial intelligence models, resuming work put on hold last year after activists raised concerns about data privacy.
  • bsky.app: Meta announced today that it will soon start training its artificial intelligence models using content shared by European adult users on its Facebook and Instagram social media platforms. https://www.bleepingcomputer.com/news/technology/meta-to-resume-ai-training-on-content-shared-by-europeans/
  • BleepingComputer: Meta to resume AI training on content shared by Europeans
  • oodaloop.com: Meta says it will resume AI training with public content from European users
  • BleepingComputer: Meta announced today that it will soon start training its artificial intelligence models using content shared by European adult users on its Facebook and Instagram social media platforms.
  • techxplore.com: Social media company Meta said Monday that it will start using publicly available content from European users to train its artificial intelligence models, resuming work put on hold last year after activists raised concerns about data privacy.
  • finance.yahoo.com: Meta says it will resume AI training with public content from European users
  • www.theverge.com: Reports on Meta resuming AI training with public content from European users.
  • The Hacker News: Meta Resumes E.U. AI Training Using Public User Data After Regulator Approval
  • www.socialmediatoday.com: Meta Begins Training its AI Tools on EU User Data
  • Meta Newsroom: Today, we’re announcing our plans to train AI at Meta using public content —like public posts and comments— shared by adults on our products in the EU.
  • Synced: Meta’s Novel Architectures Spark Debate on the Future of Large Language Models
  • securityaffairs.com: Meta will use public EU user data to train its AI models
  • about.fb.com: Today, we’re announcing our plans to train AI at Meta using public content —like public posts and comments— shared by adults on our products in the EU. People’s interactions with Meta AI – like questions and queries – will also be used to train and improve our models.
  • www.bitdegree.org: Meta Cleared to Train AI with Public Posts in the EU
  • MEDIANAMA: Meta to begin using EU users’ data to train AI models
  • www.medianama.com: Meta to begin using EU users’ data to train AI models
  • The Register - Software: Meta to feed Europe's public posts into AI brains again
  • www.artificialintelligence-news.com: Meta will train AI models using EU user data
  • AI News: Meta will train AI models using EU user data
  • techxmedia.com: Meta announced it will use public posts and comments from adult EU users to train its AI models, ensuring compliance with EU regulations.
  • Digital Information World: Despite all the controversy that arose, tech giant Meta is now preparing to train its AI systems on data belonging to Facebook and Instagram users in the EU.
  • TechCrunch: Meta will start training its AI models on public content in the EU

Chris McKay@Maginative //
OpenAI has launched a new series of GPT-4.1 models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. These API-only models are not accessible via the ChatGPT interface but offer significant improvements in coding, instruction following, and context handling. All three models support a 1 million-token context window and share a knowledge cutoff of May 31, 2024.

GPT-4.1 demonstrates enhanced coding performance, surpassing GPT-4o by 21.4 percentage points on coding benchmarks such as SWE-bench Verified. The models are also more cost-effective: GPT-4.1 is 26% cheaper than GPT-4o and offers lower latency. GPT-4.1 nano is OpenAI's cheapest model yet, priced at $0.10 per million input tokens and $0.40 per million output tokens. Given GPT-4.1's improved performance, OpenAI will deprecate GPT-4.5 Preview on July 14, 2025.
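Using the per-million-token prices quoted above, a request's cost can be estimated with a few lines of arithmetic. This is a rough sketch based on the launch prices reported here; current rates should be checked against OpenAI's pricing page.

```python
# Estimate GPT-4.1 nano API cost from token counts,
# using the launch prices quoted above (USD per 1M tokens).
NANO_INPUT_PER_M = 0.10   # $ per 1M input tokens
NANO_OUTPUT_PER_M = 0.40  # $ per 1M output tokens

def nano_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one GPT-4.1 nano request."""
    return (input_tokens * NANO_INPUT_PER_M
            + output_tokens * NANO_OUTPUT_PER_M) / 1_000_000

# Example: a full 1M-token context prompt with a 2,000-token answer
print(round(nano_cost_usd(1_000_000, 2_000), 4))  # → 0.1008
```

Even a maximal 1M-token prompt costs only about ten cents at these rates, which is what makes the long-context window practical for everyday use.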

The GPT-4.1 series excels in several key areas, including coding capabilities and instruction following. The models have achieved impressive scores on benchmarks like SWE-bench Verified and Scale’s MultiChallenge, demonstrating real-world software engineering skills and enhanced adherence to requested formats. Several companies have reported significant improvements in their specialized applications, with GPT-4.1 scoring higher on internal coding benchmarks, providing better code review suggestions, and improving the extraction of granular financial data from complex documents.

Recommended read:
References :
  • Simon Willison's Weblog: Simon Willison reports on three new million token input models from OpenAI, including their cheapest model yet.
  • Maginative: OpenAI has rolled out GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano—faster, cheaper models with sharper coding, better instruction following, and support for 1 million-token context windows.
  • bsky.app: New release of my llm-openai plugin supporting today's three new GPT-4.1 models from OpenAI: llm install -U llm-openai-plugin llm -m openai/gpt-4.1 "Generate an SVG of a pelican riding a bicycle"
  • TestingCatalog: OpenAI debuts GPT-4.1 family offering 1M token context window
  • venturebeat.com: VentureBeat's report on the launch of GPT-4.1.
  • Interconnects: OpenAI's GPT-4.1 and separating the API from ChatGPT
  • the-decoder.com: OpenAI launches GPT-4.1: New model family to improve agents, long contexts and coding
  • THE DECODER: OpenAI launches GPT-4.1: New model family to improve agents, long contexts and coding
  • venturebeat.com: OpenAI’s new GPT-4.1 models can process a million tokens and solve coding problems better than ever
  • Analytics Vidhya: All About OpenAI’s Latest GPT 4.1 Family
  • pub.towardsai.net: TAI #148: New API Models from OpenAI (4.1) & xAI (grok-3); Exploring Deep Research’s Scaling Laws
  • Latent.Space: GPT 4.1: The New OpenAI Workhorse
  • www.tomsguide.com: OpenAI launches another model before GPT 5 — here’s what this one can do
  • techstrong.ai: Details the launch of a new set of flagship models named GPT-4.1 which included the standard version GPT-4.1 model, along with two smaller models.
  • Towards AI: Details the launch of GPT-4.1 models, emphasizing coding and instruction-following capabilities.
  • Towards AI: The GPT-4.1 model series release through Azure AI Foundry represents a major step forward in AI capabilities.
  • techstrong.ai: OpenAI Introduces GPT-4.1 with Improved Coding
  • www.analyticsvidhya.com: OpenAI's new models show improvement in multiple benchmarks, excelling in long-context processing (up to 1 million tokens).
  • thezvi.wordpress.com: GPT-4.1 Is a Mini Upgrade
  • felloai.com: OpenAI has just launched a brand-new series of GPT models—GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano—that promise major advances in coding, instruction following, and the ability to handle incredibly long contexts.
  • shellypalmer.com: While admitting that they "suck at naming their models," OpenAI has launched GPT-4.1, along with GPT-4.1 Mini and GPT-4.1 Nano.
  • www.analyticsvidhya.com: How to Build Agentic RAG Using GPT-4.1?
  • felloai.com: Ultimate Comparison of GPT-4.1 vs GPT-4o: Which One Should You Use?
  • www.eweek.com: OpenAI released GPT-4.1, the newest successor to its GPT-4o series of AI language models.
  • Fello AI: OpenAI has just launched a brand-new series of GPT models—GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano—that promise major advances in coding, instruction following, and the ability to handle incredibly long contexts.
  • Shelly Palmer: While admitting that they "suck at naming their models," OpenAI has launched GPT-4.1, along with GPT-4.1 Mini and GPT-4.1 Nano. These models are for developers and will not show up in your ChatGPT model picker.
  • eWEEK: OpenAI announced on Monday the release of GPT-4.1, the newest successor to its GPT-4o series of AI language models.
  • bdtechtalks.com: GPT-4.1: OpenAI’s most confusing model
  • composio.dev: GPT-4.1 vs. Deepseek v3 vs. Sonnet 3.7 vs. GPT-4.5
  • thezvi.wordpress.com: OpenAI has upgraded its entire suite of models. By all reports, they are back in the game for more than images. GPT-4.1 and especially GPT-4.1-mini are their new API non-reasoning models. All reports are that GPT-4.1-mini especially is very good.
  • thezvi.wordpress.com: Greg Brockman (OpenAI): Just released o3 and o4-mini! These models feel incredibly smart.
  • Last Week in AI: Analyzes OpenAI's new AI models, focusing on their enhanced coding capabilities and concerns about reduced resources for safety testing.
  • techcrunch.com: OpenAI's new GPT-4.1 AI models focus on coding, Google’s newest Gemini AI model focuses on efficiency, and more!
  • TheSequence: The Sequence Radar #526: The OpenAI Blitz: From GPT-4.1 to Windsurf

Megan Crouse@techrepublic.com //
Researchers from DeepSeek and Tsinghua University have recently made significant advances in AI reasoning. By combining reinforcement learning with a self-reflection mechanism, they have created AI models that reach a deeper understanding of problems and solutions without external supervision. This approach enables models to reason, self-correct, and explore alternative solutions more effectively, and it demonstrates that outstanding performance and efficiency do not require secrecy.

Researchers have implemented the Chain-of-Action-Thought (COAT) approach in these enhanced AI models. This method leverages special tokens such as "continue," "reflect," and "explore" to guide the model through distinct reasoning actions. This allows the AI to navigate complex reasoning tasks in a more structured and efficient manner. The models are trained in a two-stage process.
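A minimal sketch of how such action tokens could structure a reasoning trace is shown below. The token names come from the article; the `<|action|>` delimiter format and the parsing logic are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: split a COAT-style reasoning trace into
# (action, text) segments using the special action tokens named
# in the article. The "<|action|>" format is an assumption.
import re

COAT_ACTIONS = ("continue", "reflect", "explore")
_TOKEN_RE = re.compile(r"<\|(" + "|".join(COAT_ACTIONS) + r")\|>")

def parse_coat_trace(trace: str) -> list[tuple[str, str]]:
    """Return (action, segment_text) pairs from a reasoning trace."""
    parts = _TOKEN_RE.split(trace)
    # re.split with a capturing group interleaves matches:
    # [prefix, action, text, action, text, ...]
    return [(parts[i], parts[i + 1].strip())
            for i in range(1, len(parts), 2)]

trace = ("<|continue|>Compute 12*7.<|reflect|>Check: 12*7=84, correct."
         "<|explore|>Alternative: 12*7 = 12*5 + 12*2.")
for action, text in parse_coat_trace(trace):
    print(action, "->", text)
```

Structuring the trace this way is what lets the model (or a supervising process) distinguish forward progress from self-checking and from branching to alternatives.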

DeepSeek has also released papers expanding on reinforcement learning for LLM alignment. Building on prior work, they introduce Rejective Fine-Tuning (RFT) and Self-Principled Critique Tuning (SPCT). In RFT, a pre-trained model produces multiple responses, each of which is evaluated and assigned a reward score based on generated principles; only well-scored responses are kept, helping the model refine its output. In SPCT, reinforcement learning improves the model's ability to generate critiques and principles without human intervention, creating a feedback loop in which the model learns to self-evaluate and improve its reasoning.
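The rejection-sampling idea behind RFT can be sketched in a few lines: sample several candidate responses, score each with a reward function, and keep only the best for fine-tuning data. The scorer below is a toy stand-in, not DeepSeek's learned reward model.

```python
# Hedged sketch of the selection step in rejection-sampling
# fine-tuning: keep the highest-reward candidate response.
# The reward function here is a toy stand-in for illustration.
from typing import Callable

def select_for_rft(candidates: list[str],
                   reward_fn: Callable[[str], float]) -> str:
    """Return the highest-reward candidate, kept as training data."""
    return max(candidates, key=reward_fn)

# Toy scorer: prefer answers that show their working (longer text).
toy_reward = len
candidates = ["84", "12*7 = 84", "12*7 = 12*5 + 12*2 = 60 + 24 = 84"]
print(select_for_rft(candidates, toy_reward))  # → the fully worked answer
```

In the real setting the reward comes from principle-based evaluation rather than length, but the loop structure — sample, score, filter, fine-tune — is the same.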

Recommended read:
References :
  • hlfshell: DeepSeek released another cool paper expanding on reinforcement learning for LLM alignment. Building off of their prior work (which I talk about here), they introduce two new methods.
  • www.techrepublic.com: Researchers from DeepSeek and Tsinghua University say combining two techniques improves the answers the large language model creates with computer reasoning techniques.

@www.marktechpost.com //
OpenAI is making headlines on multiple fronts, from model releases and capabilities to government consultations. The company is actively engaging with the UK government amid ongoing discussions about AI training and copyright regulations. OpenAI is advocating for broad access to text and data mining for AI development, arguing it's crucial for the UK to maintain its competitive edge in the AI landscape. They warn that restrictive opt-out systems, similar to those in the EU, could create uncertainty and hinder innovation, suggesting the US approach has been more effective in fostering technological leadership.

OpenAI is also gearing up to release its O3 and O4-mini models in the coming weeks, followed by the highly anticipated GPT-5 in a few months. While the GPT-5 launch is delayed, OpenAI assures that the extra time will allow them to significantly improve the model. The company attributes the delay to the challenges of integrating everything smoothly and ensuring sufficient capacity to handle the expected high demand, likely spurred by the recent launch of GPT-4o image generation, which experienced overwhelming usage.

In a separate development, OpenAI's GPT-4.5 model has achieved a significant milestone by passing the Turing test. A study conducted by researchers at the University of California at San Diego found that GPT-4.5 successfully fooled human participants into believing it was human 73% of the time during text-based conversations. This surpasses even the ability of real humans to convince others of their identity in the same context. Sam Altman, CEO of OpenAI, recently shared insights on the future of AI, emphasizing India's crucial role and the transformative impact of AI on jobs and various industries, from image generation to software development.

Recommended read:
References :
  • THE DECODER: OpenAI plans GPT-5 release in "a few months," shifts strategy on reasoning models
  • venturebeat.com: OpenAI just made ChatGPT Plus free for millions of college students — and it’s a brilliant competitive move against Anthropic
  • eWEEK: OpenAI Model Fools People Into Thinking It’s Human 73% of Exchanges, Passing the Turing Test
  • www.techradar.com: ChatGPT-5 is on hold as OpenAI changes plans and releases new o3 and o4-mini models
  • Analytics Vidhya: 11 Insights from Sam Altman on the Future of AI, Jobs, and India
  • www.itpro.com: OpenAI woos UK government amid consultation on AI training and copyright
  • www.tomsguide.com: OpenAI delaying GPT 5 launch 'for a few months' — but we're still getting new models
  • felloai.com: GPT‑4.5 Passes the Turing Test: A New Milestone in Human‑Like AI

Matthias Bastian@THE DECODER //
OpenAI is adjusting its release strategy for the highly anticipated GPT-5. The model was originally planned to integrate the new reasoning models o3 and o4-mini, but OpenAI will now ship those as standalone systems in the coming weeks, delaying the GPT-5 release by a few months.

CEO Sam Altman cited the difficulty of integrating components into a unified system as a primary factor, along with the potential for GPT-5 to exceed initial expectations. Ensuring adequate computing capacity to meet anticipated demand also played a role. Altman highlighted significant improvements in the o3 model since its initial preview.

OpenAI is also making moves to increase accessibility: it is now offering free ChatGPT Plus subscriptions to college students, giving them access to advanced AI tools like GPT-4o, image generation, and voice interaction. The offer coincides with Anthropic's recent introduction of "Claude for Education," setting the stage for fierce competition as the tech giants battle for dominance in the $80 billion education AI market.

Recommended read:
References :
  • THE DECODER: OpenAI plans GPT-5 release in "a few months," shifts strategy on reasoning models
  • Simon Willison's Weblog: change of plans: we are going to release o3 and o4-mini after all, probably in a couple of weeks, and then do GPT-5 in a few months
  • www.techradar.com: ChatGPT-5 is delayed, but o3 and o4 mini LLMs to be released in a couple of weeks.
  • The Algorithmic Bridge: GPT-5, o3, and o4-mini Are Coming Soon
  • Maginative: OpenAI has reversed its earlier decision to cancel o3, now planning to release both o3 and o4-mini within weeks while delaying GPT-5 by several months to make substantial improvements.
  • TechCrunch: OpenAI says it’ll release o3 after all, delays GPT-5