News from the AI & ML world

DeeperML - #mit

Zach Winn@news.mit.edu //
MIT spinout Themis AI is tackling a critical issue in the field of artificial intelligence: AI "hallucinations," instances where AI systems confidently provide incorrect or fabricated information. These inaccuracies can have serious consequences, particularly in high-stakes applications like drug development, autonomous driving, and information synthesis. Themis AI has developed a tool called Capsa, designed to quantify model uncertainty and enable AI models to recognize their own limitations. Capsa works by modifying AI models to identify patterns in their data processing that indicate ambiguity, incompleteness, or bias. This allows the AI to "admit when it doesn't know," improving the reliability and transparency of AI systems.

The core idea behind Themis AI's Capsa platform is to wrap existing AI models, identify uncertainties and potential failure modes, and then enhance the model's capabilities. Founded in 2021 by MIT Professor Daniela Rus, Alexander Amini, and Elaheh Ahmadi, Themis AI aims to enable safer and more trustworthy AI deployments across various industries. Capsa can be integrated with any machine-learning model to detect and correct unreliable outputs in seconds. The platform has already demonstrated its value in diverse sectors, including helping telecom companies with network planning, assisting oil and gas firms in analyzing seismic imagery, and contributing to the development of more reliable chatbots.
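Capsa's internals are proprietary, but the general idea of wrapping a model to surface its uncertainty can be illustrated with a standard technique. The sketch below uses Monte Carlo dropout as the uncertainty source; the model, weights, and function names are all invented for illustration and are not Themis AI's implementation.

```python
import random
import statistics

random.seed(0)

def noisy_model(x, drop_p=0.2):
    # Stand-in for a network with dropout left active at inference time.
    # (Hypothetical toy model; Capsa's actual mechanism is not public.)
    w = [2.0, -1.0]
    out = 0.0
    for xi, wi in zip(x, w):
        if random.random() >= drop_p:          # randomly drop each weight
            out += xi * wi / (1 - drop_p)      # rescale to preserve the mean
    return out

def predict_with_uncertainty(model, x, n_samples=500):
    """Run the wrapped model repeatedly; the spread of its outputs serves
    as a proxy for how uncertain the model is on this particular input."""
    preds = [model(x) for _ in range(n_samples)]
    return statistics.mean(preds), statistics.stdev(preds)

mean, std = predict_with_uncertainty(noisy_model, [1.0, 1.0])
print(f"prediction = {mean:.2f} +/- {std:.2f}")
```

A wrapper like this leaves the underlying model untouched and adds an uncertainty estimate on top, which is the spirit of the "wrap existing AI models" approach described above.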

Themis AI’s work builds upon years of research at MIT into model uncertainty. Professor Rus's lab, with funding from Toyota, studied the reliability of AI for autonomous driving, a safety-critical application where accurate model understanding is paramount. The team also developed an algorithm capable of detecting and mitigating racial and gender bias in facial recognition systems. Amini emphasizes that Themis AI's software adds a crucial layer of self-awareness that has been missing in AI systems. The goal is to enable AI to forecast and predict its own failures before they occur, ensuring that these systems are used responsibly and effectively in critical decision-making processes.


@www.marktechpost.com //
MIT researchers are making significant strides in artificial intelligence, focusing on enhancing AI's ability to learn and interact with the world more naturally. One project involves developing AI models that can learn connections between vision and sound without human intervention. This approach aims to mimic how humans learn, by associating what they see with what they hear. The model could be useful in applications such as journalism and film production, where it could help curate multimodal content through automatic video and audio retrieval.

The new machine-learning model can pinpoint exactly where a particular sound occurs in a video clip, eliminating the need for manual labeling. By adjusting how the original model is trained, it learns a finer-grained correspondence between a particular video frame and the audio that occurs in that moment. The enhancements improved the model’s ability to retrieve videos based on an audio query and predict the class of an audio-visual scene, like the sound of a roller coaster in action or an airplane taking flight. Researchers also made architectural tweaks that help the system balance two distinct learning objectives, which improves performance.
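The retrieval step described above can be pictured as nearest-neighbor search in a shared embedding space, with one embedding per video frame rather than per clip. The sketch below is a toy illustration: the embeddings and frame names are invented, and in the real system both audio and frame vectors come from learned neural encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy per-frame embeddings (invented values for illustration).
frame_embeddings = {
    "frame_00": [0.9, 0.1, 0.0],
    "frame_12": [0.1, 0.8, 0.2],
    "frame_30": [0.0, 0.2, 0.9],
}
audio_query = [0.05, 0.9, 0.1]   # embedding of an audio snippet

# Fine-grained matching: score the audio against every frame rather
# than a single clip-level embedding, then pick the best match.
best_frame = max(frame_embeddings,
                 key=lambda f: cosine(audio_query, frame_embeddings[f]))
print(best_frame)
```

Scoring per frame rather than per clip is what lets the model say where in the video a sound occurs, not just whether the clip matches.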

Additionally, researchers from the National University of Singapore have introduced 'Thinkless,' an adaptive framework that reduces unnecessary reasoning in language models by up to 90%. Its novel algorithm, Decoupled Group Relative Policy Optimization (DeGRPO), separates the training focus between selecting the reasoning mode and improving the accuracy of the generated response. The framework equips a language model to dynamically decide between short or long-form reasoning, addressing the resource-intensive and wasteful reasoning sequences that simple queries would otherwise trigger.
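At inference time, the Thinkless idea amounts to the model first emitting a control token that selects a reasoning mode. The sketch below is a toy illustration of that decision flow only: `mock_policy` is a hard-coded stand-in for the learned selector (which the paper trains with DeGRPO), and the token names and heuristic are placeholders.

```python
def mock_policy(query: str) -> str:
    # Stand-in for the learned mode selector. The real policy is a
    # language model trained with DeGRPO, which applies separate
    # gradient weighting to the control token and the answer tokens.
    return "<short>" if len(query.split()) < 8 else "<think>"

def answer(query: str) -> str:
    mode = mock_policy(query)
    if mode == "<short>":
        # Cheap path: answer directly, no chain-of-thought.
        return f"[concise answer to: {query}]"
    # Expensive path: full step-by-step reasoning before the answer.
    return f"[step-by-step reasoning, then answer to: {query}]"

print(answer("What is 2 + 2?"))
print(answer("Prove that the sum of the first n odd numbers equals n squared"))
```

The point of the decoupled objective is that the selector and the answer quality are trained without one overwhelming the other, so easy queries reliably take the cheap path.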

Recommended read:
References :
  • learn.aisingapore.org: AI learns how vision and sound are connected, without human intervention | MIT News
  • news.mit.edu: AI learns how vision and sound are connected, without human intervention
  • www.marktechpost.com: Researchers from the National University of Singapore Introduce ‘Thinkless,’ an Adaptive Framework that Reduces Unnecessary Reasoning by up to 90% Using DeGRPO
  • news.mit.edu: Learning how to predict rare kinds of failures

@learn.aisingapore.org //
MIT researchers have uncovered a critical flaw in vision-language models (VLMs) that could have serious consequences in high-stakes settings like medical diagnosis. The study, published May 14, 2025, reveals that these AI models, widely used to analyze medical images, struggle with negation words such as "no" and "not." This deficiency causes them to misinterpret queries, leading to potentially catastrophic errors when retrieving images based on the absence of certain objects. In one example, a radiologist using a VLM to find reports of patients with tissue swelling but without an enlarged heart could instead be shown reports with both conditions, leading to an inaccurate diagnosis.

Researchers tested the ability of vision-language models to identify negation in image captions and found the models often performed as well as a random guess. To address this issue, the MIT team created a dataset of images with corresponding captions that include negation words describing missing objects. Retraining a vision-language model with this dataset resulted in improved performance when retrieving images that do not contain specific objects, and also boosted accuracy on multiple choice question answering with negated captions.
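The data-augmentation idea behind the retraining can be sketched simply: given object annotations for an image, generate captions that explicitly negate the objects that are absent. The vocabulary, wording template, and function below are invented for illustration and are not the MIT team's actual dataset pipeline.

```python
# Toy object vocabulary (invented); the real dataset covers far more.
VOCAB = ["dog", "cat", "car"]

def negated_captions(present_objects):
    """Generate one caption per vocabulary object, explicitly negating
    the objects that are absent from the image."""
    captions = []
    for obj in VOCAB:
        if obj in present_objects:
            captions.append(f"a photo with a {obj}")
        else:
            captions.append(f"a photo with no {obj}")
    return captions

print(negated_captions({"dog"}))
# → ['a photo with a dog', 'a photo with no cat', 'a photo with no car']
```

Training on captions that state what is *not* in an image forces the model to attach meaning to negation words it would otherwise ignore.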

Kumail Alhamoud, the lead author of the study, emphasized the significant impact of negation words and the potential for catastrophic consequences if these models are used blindly. While the researchers were able to improve model performance through retraining, they caution that more work is needed to address the root causes of this problem. They hope their findings will alert potential users to this previously unnoticed shortcoming, especially in settings where these models are used to determine patient treatments or identify product defects. Marzyeh Ghassemi, the senior author, warned against using large vision/language models without intensive evaluation if something as fundamental as negation is broken.

Recommended read:
References :
  • learn.aisingapore.org: Study shows vision-language models can’t handle queries with negation words | MIT News
  • www.sciencedaily.com: Study shows vision-language models can't handle queries with negation words

Adam Zewe@news.mit.edu //
MIT researchers have unveiled a "periodic table of machine learning," a groundbreaking framework that connects more than 20 common machine-learning algorithms through a single unifying equation. This approach allows scientists to combine elements from different methods, potentially leading to improved algorithms or the creation of entirely new ones. The researchers believe this framework will significantly fuel further AI discovery and innovation by providing a structured approach to understanding and developing machine learning techniques.

The core concept behind this "periodic table" is that all these algorithms, while seemingly different, learn a specific kind of relationship between data points. Although the way each algorithm accomplishes this may vary, the fundamental mathematics underlying each approach remains consistent. By identifying a unifying equation, the researchers were able to reframe popular methods and arrange them into a table, categorizing each based on the relationships it learns. Shaden Alshammari, an MIT graduate student and lead author of the related paper, emphasizes that this is not just a metaphor, but a structured system for exploring machine learning.

Just like the periodic table of chemical elements, this new framework contains empty spaces, representing algorithms that should exist but haven't been discovered yet. These spaces act as predictions, guiding researchers toward unexplored areas within machine learning. To illustrate the framework's potential, the researchers combined elements from two different algorithms, resulting in a new image-classification algorithm that outperformed current state-of-the-art approaches by 8 percent. The researchers hope that this "periodic table" will serve as a toolkit, allowing researchers to design new algorithms without needing to rediscover ideas from prior approaches.
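According to public summaries of the work, the unifying equation frames each method as learning an embedding whose induced "neighbor distribution" over data points approximates a target distribution supplied by the method's supervisory signal, typically measured with KL divergence. The numeric sketch below illustrates only that reading; the distributions are invented and this is not the paper's notation or code.

```python
import math

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Target neighbor distribution: which points the supervisory signal says
# are related (e.g. same class, nearby in space). Invented numbers.
p = [0.5, 0.5, 0.0]
# Distribution induced by the current learned embedding. Invented numbers.
q = [0.4, 0.4, 0.2]

loss = kl(p, q)
print(f"loss = {loss:.3f}")
```

Under this view, swapping the definition of `p` or `q` moves you between cells of the table, which is what makes recombining elements of different methods mechanical rather than ad hoc.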

Recommended read:
References :
  • news.mit.edu: Researchers have created a unifying framework that can help scientists combine existing ideas to improve AI models or create new ones.
  • www.sciencedaily.com: After uncovering a unifying algorithm that links more than 20 common machine-learning approaches, researchers organized them into a 'periodic table of machine learning' that can help scientists combine elements of different methods to improve algorithms or create new ones.
  • techxplore.com: MIT researchers have created a periodic table that shows how more than 20 classical machine-learning algorithms are connected. The new framework sheds light on how scientists could fuse strategies from different methods to improve existing AI models or come up with new ones.
  • learn.aisingapore.org: "Periodic table of machine learning" could fuel AI discovery | MIT News

Jennifer Chu@news.mit.edu //
MIT researchers have recently made significant strides in artificial intelligence, focusing on enhancing robotics, code generation, and system optimization. One project involves a novel robotic system designed to efficiently identify and prioritize objects relevant to assisting humans. By cutting through data noise, the robot can focus on crucial features in a scene, making it ideal for collaborative environments like smart manufacturing and warehouses. This innovative approach could lead to more intuitive and safer robotic helpers in various settings.

Researchers have also developed a new method to improve the accuracy of AI-generated code in any programming language. This approach guides large language models (LLMs) to produce error-free code that adheres to the rules of the specific language being used. By allowing LLMs to focus on outputs most likely to be valid and accurate, while discarding unpromising outputs early on, the system achieves greater computational efficiency. This advancement could help non-experts control AI-generated content and enhance tools for AI-powered data analysis and scientific discovery.

A new methodology for optimizing complex coordinated systems has emerged from MIT, utilizing simple diagrams to refine software optimization in deep-learning models. This diagram-based "language," rooted in category theory, simplifies the process of designing computer algorithms that control various system components. By revealing relationships between algorithms and parallelized GPU hardware, this approach makes it easier to optimize resource usage and manage the intricate interactions between different parts of a system, potentially revolutionizing the way complex systems are designed and controlled.

Recommended read:
References :
  • learn.aisingapore.org: This document is about designing a new way to optimize complex coordinated systems.
  • news.mit.edu: A robotic system that zeroes in on objects most relevant for helping humans

@learn.aisingapore.org //
MIT researchers have achieved a breakthrough in artificial intelligence, specifically aimed at enhancing the accuracy of AI-generated code. This advancement focuses on guiding large language models (LLMs) to produce outputs that strictly adhere to the rules and structures of various programming languages, preventing common errors that can cause system crashes. The new technique, developed by MIT and collaborators, ensures that the AI's focus remains on generating valid and accurate code by quickly discarding less promising outputs. This approach not only improves code quality but also significantly boosts computational efficiency.

This efficiency gain allows smaller LLMs to outperform larger models in producing accurate, well-structured outputs across diverse real-world scenarios, including molecular biology and robotics. The new method also avoids the pitfalls of existing approaches, which either distort the model's intended meaning or are too time-consuming for complex tasks.
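The core pruning idea can be sketched with a deliberately tiny "grammar." The real MIT system works over full programming-language structure with a probabilistic approach; the toy below only illustrates the principle of discarding partial outputs the moment they can no longer complete to valid code. All names and the parenthesis-balance check are invented for illustration.

```python
def still_valid_prefix(code: str) -> bool:
    # Toy validity check: parentheses must never close more than
    # they have opened. A real system checks full grammar rules.
    depth = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return True

def extend(candidates, tokens):
    # Keep only extensions that can still lead to valid code;
    # invalid prefixes are discarded early instead of being completed.
    return [c + t for c in candidates for t in tokens
            if still_valid_prefix(c + t)]

beams = [""]
for _ in range(3):
    beams = extend(beams, ["(", ")"])
print(beams)   # only prefixes that could still complete to balanced code
```

Pruning doomed prefixes early is where the computational savings come from: effort is spent only on outputs that can still become valid programs.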

The implications of this research extend beyond academic circles, potentially revolutionizing programming assistants, AI-driven data analysis, and scientific discovery tools. By enabling non-experts to control AI-generated content, such as business professionals creating complex SQL queries using natural language prompts, this architecture could democratize access to advanced programming and data manipulation. The findings will be presented at the International Conference on Learning Representations.

Recommended read:
References :
  • LearnAI: Making AI-generated code more accurate in any language | MIT News Programmers can now use large language models (LLMs) to generate computer code more quickly. However, this only makes programmers’ lives easier if that code follows the rules of the programming language and doesn’t cause a computer to crash.
  • news.mit.edu: A new technique automatically guides an LLM toward outputs that adhere to the rules of whatever programming language or other format is being used.
  • learn.aisingapore.org: Making AI-generated code more accurate in any language | MIT News
  • techxplore.com: Making AI-generated code more accurate in any language