News from the AI & ML world

DeeperML - #aihealthcare

@www.marktechpost.com //
OpenAI has introduced HealthBench, an open-source benchmark for evaluating AI performance in realistic healthcare scenarios. Developed with 262 physicians, HealthBench uses 5,000 multi-turn conversations and over 48,000 rubric criteria to grade AI models across seven medical domains and 49 languages. The rubric criteria assess responses on communication quality, instruction following, accuracy, contextual understanding, and completeness. OpenAI’s latest models, including o3 and GPT-4.1, are reported to score well on the benchmark.
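
The summary above does not say how rubric criteria are turned into a score. As a minimal sketch of rubric-based grading, assume each criterion carries a point value (positive for desirable behavior, negative for harmful behavior) and that a response's score is the points earned on met criteria divided by the maximum achievable positive points, clipped to the range 0 to 1. The `Criterion` class, the `rubric_score` function, and the example criteria are hypothetical illustrations, not OpenAI's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric item (hypothetical structure, for illustration only)."""
    text: str
    points: int  # positive = desirable behavior, negative = penalized behavior


def rubric_score(criteria: list[Criterion], met: list[bool]) -> float:
    """Score one response against its rubric.

    Assumed formula: points earned on met criteria divided by the
    sum of all positive points, clipped to [0, 1].
    """
    if len(criteria) != len(met):
        raise ValueError("criteria and met must have the same length")
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("rubric needs at least one positive criterion")
    earned = sum(c.points for c, m in zip(criteria, met) if m)
    return min(1.0, max(0.0, earned / max_points))


if __name__ == "__main__":
    # Made-up rubric for a single conversation.
    rubric = [
        Criterion("Advises seeking emergency care", 5),
        Criterion("Asks about current medications", 3),
        Criterion("States an unsupported dosage", -4),
    ]
    # The grader judged criteria 1 and 3 as met: (5 - 4) / 8
    print(rubric_score(rubric, [True, False, True]))  # prints 0.125
```

Under this assumed scheme, a benchmark-level result would be an average of per-conversation scores; the clipping keeps a heavily penalized response from going below zero.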

The most provocative finding from the HealthBench evaluation is that the newest AI models perform at or beyond the level of human experts when writing responses to medical queries. In tests with September 2024 models, doctors could improve AI outputs by editing them, and the edited responses scored higher than responses from doctors working without AI. With the April 2025 models, such as o3 and GPT-4.1, physicians who used the AI responses as a starting point did not, on average, improve them further. This suggests that, for the specific task of writing HealthBench responses, the newest models match or exceed human experts, even experts who start from a strong AI draft.

In related news, FaceAge, a face-reading AI tool developed by researchers at Mass General Brigham, shows promise in predicting cancer outcomes. By analyzing facial photographs, FaceAge estimates a person's biological age and is reported to predict cancer survival with 81% accuracy. According to the coverage, the tool outperforms clinicians in predicting short-term life expectancy, especially for patients receiving palliative radiotherapy. FaceAge identifies subtle facial features associated with aging and provides a quantifiable measure of biological aging that correlates with survival outcomes and health risks, which could give doctors more objective survival estimates.

References :
  • pub.towardsai.net: This week, OpenAI unveiled HealthBench, a significant new open-source benchmark evaluating AI in realistic healthcare scenarios.
  • www.marktechpost.com: This news piece mentions the HealthBench benchmark for evaluating AI models in healthcare.
  • the-decoder.com: The article refers to the HealthBench benchmark developed by OpenAI to assess AI's capabilities in handling healthcare scenarios.
  • www.analyticsvidhya.com: This blog post reports on the release of OpenAI’s HealthBench, an open-source benchmark for evaluating AI models in healthcare.
  • THE DECODER: OpenAI says its latest models outperform doctors in medical benchmark
  • www.zdnet.com: OpenAI's HealthBench shows AI's medical advice is improving - but who will listen?
  • MarkTechPost: OpenAI Releases HealthBench: An Open-Source Benchmark for Measuring the Performance and Safety of Large Language Models in Healthcare
  • eWEEK: FaceAge, a face-reading AI tool that estimates biological age from facial photographs, predicts cancer outcomes with an impressive 81% accuracy rate.
  • The Rundown AI: PLUS: OpenAI launches HealthBench to evaluate AI in healthcare
  • the-decoder.com: The article discusses OpenAI's HealthBench benchmark for evaluating large language models in realistic healthcare settings.
  • www.eweek.com: FaceAge AI Tool Surpasses Doctors with 81% Accuracy in Cancer Survival Prediction
  • Fello AI: Forget everything you thought you knew about medicine! Artificial Intelligence is crashing into healthcare with the force of a meteor, and the breakthroughs are coming so fast it’s hard to keep up.
  • Microsoft Research: Peter Lee and his coauthors, Carey Goldberg and Dr. Zak Kohane, reflect on how generative AI is unfolding in real-world healthcare, drawing on earlier guest conversations to examine what’s working, what’s not, and what questions still remain.

@felloai.com //
References: felloai.com, TestingCatalog
Google is significantly expanding its applications of artificial intelligence in healthcare and education, aiming to improve efficiency and accessibility. In healthcare, Google's AMIE (Articulate Medical Intelligence Explorer) AI can now interpret medical images such as X-rays, MRIs, and CT scans, marking a potential breakthrough in AI-powered medical diagnostics. The multimodal AMIE can intelligently request, interpret, and reason about visual medical information during diagnostic conversations, suggesting a future where AI could surpass human capabilities in certain diagnostic areas. This development addresses a previous limitation where AI couldn't directly process and understand medical imaging, a crucial aspect of diagnosis.

Google is also redefining education with AI tools. Infinity Learn, in collaboration with Google Cloud Consulting, has developed an AI tutor to assist students preparing for exams. This AI tutor, powered by Google Cloud’s Vertex AI Retrieval Augmented Generation (RAG) services and a Gemini 2.0 Flash model, acts as a custom search engine, providing detailed guidance for solving problems in subjects like math, physics, and chemistry. The AI tutor is designed not just to provide answers, but to foster in-depth knowledge and conceptual clarity, helping students independently find solutions and understand the reasoning behind them.

Additionally, Google is developing new generative media features for NotebookLM, including video overviews. Users may soon be able to transform their notebook content into short video summaries, potentially powered by Google’s Veo 2 model, which specializes in generating concise video segments. NotebookLM is also hinting at a broader content discovery direction through a newly revealed section titled "Editor’s Picks," suggesting a shift towards a more social or community-driven aspect, potentially turning NotebookLM into a knowledge-sharing platform.

References :
  • felloai.com: Article on Google working on an AI that will replace your doctor.
  • TestingCatalog: Google is developing Video Overviews feature for NotebookLM, including generative media features.

Sam Khosroshahi@lambdalabs.com //
References: Fello AI, MarkTechPost
NVIDIA is making significant strides in healthcare and robotics, leveraging the power of AI. A key development is the NVIDIA DNA LLM, designed to accelerate genomic research. This technology is set to transform how we analyze and interpret biological data, potentially leading to breakthroughs in curing diseases. NVIDIA is also working on new AI models, specifically multimodal models, to enhance physical common sense and embodied reasoning in AI systems, essential for applications like robotics.

NVIDIA's approach includes innovations like Evo 2, a large AI model for biology trained on 9.3 trillion DNA base pairs, which can generate entire genomic sequences as well as analyze them. In addition, Cosmos-Reason1, a vision-language model, aims to improve AI's ability to reason about physical environments, an important capability for robotics and self-driving vehicles. Dion Harris of NVIDIA provided context on NVIDIA CEO Jensen Huang's GTC 2025 keynote, highlighting the company's focus on AI factories and advancements in simulation technology.

References :
  • Fello AI: NVIDIA DNA LLM: The Power To Curing All Diseases?
  • MarkTechPost: This AI Paper from NVIDIA Introduces Cosmos-Reason1: A Multimodal Model for Physical Common Sense and Embodied Reasoning

Sam Khosroshahi@lambdalabs.com //
References: Fello AI, lambdalabs.com
NVIDIA is pushing the boundaries of artificial intelligence in healthcare and robotics, introducing several groundbreaking advancements. One notable innovation is the DNA LLM, designed to decode the complex genetic information found in DNA, RNA, and proteins. This tool aims to transform genomic research, potentially leading to new understandings and treatments for various diseases.

The company's commitment to AI extends to robotics with the release of Isaac GR00T N1, an open foundation model for humanoid robots. This initiative is expected to accelerate innovation in the field by giving developers a starting point for building more capable robots. Additionally, an NVIDIA research team has developed Hymba, a family of small language models that combine transformer attention with state space models, surpassing the Llama-3.2-3B model in performance while significantly reducing cache size and increasing throughput.

References :
  • Fello AI: NVIDIA DNA LLM: The Power To Curing All Diseases?
  • lambdalabs.com: Lambda Honored to Accelerate AI Innovation in Healthcare with NVIDIA
  • Synced: NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small Language Models

Neel Patel@AI & Machine Learning //
References: Compute, IEEE Spectrum
Google Cloud and NVIDIA are collaborating to accelerate AI in healthcare by running the NVIDIA BioNeMo framework on Google Kubernetes Engine (GKE). The partnership aims to speed up drug discovery and development by providing infrastructure and tools for medical and pharmaceutical researchers. BioNeMo is a generative AI framework that lets researchers model and simulate biological sequences and structures, workloads that demand powerful GPUs and scalable infrastructure.

With BioNeMo running on GKE, medical organizations can run this research faster and at larger scale than was previously practical. Google DeepMind has also introduced Gemini Robotics, AI models built on Google's Gemini foundation model that integrate vision, language, and action for robotics. Google DeepMind's Demis Hassabis says AI is not a "silver bullet" but expects its benefits to be undeniable within five to ten years, as evidenced by developments like AlphaFold 3, which predicts the structure of molecules such as DNA and RNA.

References :
  • Compute: Accelerating AI in healthcare using NVIDIA BioNeMo Framework and Blueprints on GKE
  • IEEE Spectrum: With Gemini Robotics, Google Aims for Smarter Robots