News from the AI & ML world

DeeperML

Carl Franzen @ AI News | VentureBeat
OpenAI has brought GPT-4.1, an enhanced version of its language model, to ChatGPT. The move gives ChatGPT Plus, Pro, and Team subscribers access to the model's improved coding and instruction-following capabilities, with Enterprise and Education users slated to gain access in the coming weeks. OpenAI is also replacing GPT-4o mini with GPT-4.1 mini for all users, including those on the free tier, where it serves as the fallback model once GPT-4o usage limits are reached. According to OpenAI, GPT-4.1 and GPT-4.1 mini match GPT-4o's safety performance while offering better coding and instruction following.

GPT-4.1 was designed for enterprise-grade practicality, prioritizing developer needs and production use cases. It delivers significant improvements on software engineering and instruction-following benchmarks, and its responses are less verbose, a trait enterprise users favored during testing. The API versions of GPT-4.1 can process up to one million tokens, which lets API users feed entire codebases or large legal and financial documents into the model. That expanded capacity is not yet available in ChatGPT, though OpenAI has hinted at future support. In ChatGPT, the model keeps the standard context windows: 8,000 tokens for free users, 32,000 tokens for Plus users, and 128,000 tokens for Pro users.
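To make the context-window figures above concrete, here is a minimal Python sketch that checks which tiers could hold a document of a given size. The tier names, the helper functions, and the figure of roughly 4 characters per token are illustrative assumptions, not anything published by OpenAI; a real tokenizer will produce different counts.

```python
# Context-window limits (in tokens) as quoted in the article.
# The tier keys are hypothetical labels chosen for this sketch.
CONTEXT_LIMITS = {
    "chatgpt_free": 8_000,
    "chatgpt_plus": 32_000,
    "chatgpt_pro": 128_000,
    "api": 1_000_000,
}

# Assumption: about 4 characters per token, a rough rule of thumb
# for English text. Actual token counts depend on the tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate using ceiling division."""
    return -(-len(text) // CHARS_PER_TOKEN)


def tiers_that_fit(token_count: int) -> list[str]:
    """Return the tiers whose context window can hold token_count tokens."""
    return [tier for tier, limit in CONTEXT_LIMITS.items() if token_count <= limit]


if __name__ == "__main__":
    document = "x" * 200_000  # hypothetical 200,000-character document
    tokens = estimate_tokens(document)
    print(tokens, tiers_that_fit(tokens))
```

Under these assumptions, a 200,000-character document comes to about 50,000 tokens, which would exceed the free and Plus windows but fit within the Pro and API limits. In practice the window also has to leave room for the prompt and the model's reply, so the usable budget is smaller than the headline figure.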

In addition to model upgrades, OpenAI has introduced HealthBench, a new open-source benchmark for evaluating AI in healthcare scenarios. Developed with 262 physicians, HealthBench uses multi-turn conversations and rubric criteria to grade models. OpenAI's o3 leads with an overall score of 0.60 on HealthBench. The most provocative result concerns human-AI interaction: when physicians were given responses from the latest April 2025 models (o3, GPT-4.1) as a starting point, they did not, on average, improve them further (AI alone and AI plus physician both scored roughly 0.48–0.49). For the specific task of crafting HealthBench responses, the newest models appear to perform at a level that human experts could not improve upon, even when starting from a strong AI draft.



References:
  • Maginative: OpenAI Brings GPT-4.1 to ChatGPT
  • pub.towardsai.net: AI Passes Physician-Level Responses in OpenAI’s HealthBench
  • THE DECODER: OpenAI brings its new GPT-4.1 model to ChatGPT users
  • AI News | VentureBeat: OpenAI brings GPT-4.1 and 4.1 mini to ChatGPT — what enterprises should know
  • www.zdnet.com: OpenAI's HealthBench shows AI's medical advice is improving - but who will listen?