News from the AI & ML world

DeeperML - #universities

@www.linkedin.com //
Universities are increasingly integrating artificial intelligence into education, not only to enhance teaching methodologies but also to equip students with the essential AI skills they'll need in the future workforce. There's a growing understanding that students should learn how to use AI tools effectively and ethically, rather than simply relying on them as a shortcut for completing assignments. This shift involves incorporating AI into the curriculum in meaningful ways, ensuring students understand both the capabilities and limitations of these technologies.

Estonia is taking a proactive approach with the launch of AI chatbots designed specifically for high school classrooms. This initiative aims to familiarize students with AI in a controlled educational environment. The goal is to empower students to use AI tools responsibly and effectively, moving beyond basic applications to more sophisticated problem-solving and critical thinking.

Furthermore, Microsoft is introducing new AI features for educators within Microsoft 365 Copilot, including Copilot Chat for teens. Microsoft's 2025 AI in Education Report highlights that over 80% of surveyed educators are using AI, but a significant portion still lack confidence in its effective and responsible use. These initiatives aim to provide necessary training and guidance to teachers and administrators, ensuring they can integrate AI seamlessly into their instruction.

@www.marktechpost.com //
Apple researchers are challenging the perceived reasoning capabilities of Large Reasoning Models (LRMs), sparking debate within the AI community. A recent paper from Apple, titled "The Illusion of Thinking," suggests that these models, which generate intermediate thinking steps like Chain-of-Thought reasoning, struggle with fundamental reasoning tasks. The research indicates that current evaluation methods relying on math and code benchmarks are insufficient, as they often suffer from data contamination and fail to assess the structure or quality of the reasoning process.

To address these shortcomings, Apple researchers introduced controllable puzzle environments, including the Tower of Hanoi, River Crossing, Checker Jumping, and Blocks World, allowing for precise manipulation of problem complexity. These puzzles require diverse reasoning abilities, such as constraint satisfaction and sequential planning, and are free from data contamination. The Apple paper concluded that state-of-the-art LRMs ultimately fail to develop generalizable problem-solving capabilities, with accuracy collapsing to zero beyond certain complexities across different environments.
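
To make the setup concrete, here is a minimal sketch of what such a controllable environment can look like, using the Tower of Hanoi: complexity is a single parameter (the disk count), the optimal solution is generated programmatically, and any candidate move sequence can be verified exactly. The code is an illustration of the idea, not Apple's benchmark implementation.

```python
# Illustrative sketch of a controllable puzzle environment (not Apple's code):
# difficulty is set by a single parameter n, and solutions are exactly checkable.

def solve_hanoi(n, src=0, aux=1, dst=2):
    """Return the optimal move list for n disks as (from_peg, to_peg) pairs."""
    if n == 0:
        return []
    return (solve_hanoi(n - 1, src, dst, aux)
            + [(src, dst)]
            + solve_hanoi(n - 1, aux, src, dst))

def verify(n, moves):
    """Check a move sequence against the rules: move one disk at a time,
    never place a larger disk on a smaller one, end with all disks on peg 2."""
    pegs = [list(range(n, 0, -1)), [], []]  # peg 0 holds disks n..1, largest at bottom
    for src, dst in moves:
        if not pegs[src]:
            return False  # illegal: no disk to move
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # illegal: larger disk on smaller one
        pegs[dst].append(disk)
    return pegs[2] == list(range(n, 0, -1))

for n in range(1, 11):
    moves = solve_hanoi(n)
    assert verify(n, moves)
    print(f"{n} disks -> {len(moves)} moves")  # 2^n - 1: grows exponentially
```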

However, the Apple research has faced criticism. Professor Seok Joon Kwon argues that Apple's lack of high-performance hardware, such as a large GPU-based cluster comparable to those operated by Google or Microsoft, could be a factor in its findings. Some argue that the models perform better on familiar puzzles, suggesting their success may stem from training exposure rather than genuine problem-solving skill. Others, such as Alex Lawsen and "C. Opus," contend that the results don't support claims about fundamental reasoning limitations but rather highlight engineering constraints, such as output token limits and evaluation methods that penalize truncated answers.
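
The token-limit point can be checked with simple arithmetic: an optimal Tower of Hanoi solution takes 2^n − 1 moves, so a fully written-out move list grows exponentially and outruns any fixed output budget well before extreme disk counts. The per-move token count and output budget below are illustrative assumptions, not measured values.

```python
# Back-of-the-envelope check of the token-budget critique. Both constants
# are assumptions for illustration, not measurements.
TOKENS_PER_MOVE = 10      # assumed tokens needed to write out one move
OUTPUT_BUDGET = 64_000    # assumed output-token limit

for n in range(5, 21):
    tokens = (2**n - 1) * TOKENS_PER_MOVE
    status = "exceeds budget" if tokens > OUTPUT_BUDGET else "fits"
    print(f"{n:2d} disks: ~{tokens:,} tokens ({status})")
```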

References:
  • TheSequence: The Sequence Research #663: The Illusion of Thinking, Inside the Most Controversial AI Paper of Recent Weeks
  • chatgptiseatingtheworld.com: Research: Did Apple researchers overstate “The Illusion of Thinking” in reasoning models? Opus, Lawsen think so.
  • www.marktechpost.com: Apple Researchers Reveal Structural Failures in Large Reasoning Models Using Puzzle-Based Evaluation
  • arstechnica.com: New Apple study challenges whether AI models truly “reason” through problems
  • 9to5Mac: New paper pushes back on Apple’s LLM ‘reasoning collapse’ study

@www.quantamagazine.org //
Researchers are making strides in AI reasoning and efficiency, tackling both complex problem-solving and the energy consumption of these systems. One promising area is reversible computing, in which programs can run backward as easily as forward. Because such machines never erase information, they can in principle avoid the energy cost that Landauer's principle assigns to deleting each bit. Michael Frank, a researcher focused on the physical limits of computation, found that reversible computing could keep computational progress going as conventional computing slows against those limits. Christof Teuscher at Portland State University emphasized the significant power savings the approach could offer.
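
A toy example makes the idea tangible: if every step of a computation is a bijection on the state, nothing is ever erased and the program can be run backward exactly. The round function below is an illustration of that principle, not one of Frank's circuit designs.

```python
# Toy reversible computation: every step is a bijection on the state,
# so no information is deleted and execution can be undone exactly.

MASK = 0xFFFFFFFF  # work in 32-bit words

def step(a, b):
    """One reversible round: mix b into a, then swap roles."""
    a = (a + b) & MASK               # invertible: subtraction undoes it
    a ^= (b << 7 | b >> 25) & MASK   # XOR with a rotation is its own inverse
    return b, a                      # swapping is trivially reversible

def unstep(b, a):
    """Exact inverse of step, recovering the prior state."""
    a ^= (b << 7 | b >> 25) & MASK
    a = (a - b) & MASK
    return a, b

state = (0xDEADBEEF, 0x12345678)
forward = state
for _ in range(10):
    forward = step(*forward)
backward = forward
for _ in range(10):
    backward = unstep(*backward)
assert backward == state  # ran the computation backward to its start
print("recovered initial state:", tuple(hex(x) for x in backward))
```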

An evolution of the LLM-as-a-Judge paradigm is emerging. Meta AI has introduced J1, a framework that shifts LLMs from passive generators to active, deliberative evaluators that reason before rendering judgment. This approach, detailed in "J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning," addresses the growing need for rigorous and scalable evaluation as AI systems become more capable and widely deployed. By reframing judgment as a structured reasoning task trained through reinforcement learning, J1 aims to produce consistent, interpretable, and high-fidelity evaluations.
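
The core training signal is straightforward to sketch. In broad strokes, the judge is prompted to reason before issuing a verdict, and a verifiable reward compares that verdict to a known preference label; this is the quantity reinforcement learning would maximize. The prompt and reward function below are a simplification of that framing, not Meta's implementation.

```python
# Hedged sketch of a J1-style verifiable reward (not Meta's code): the judge
# must think, then end with a parseable verdict that is scored against a label.

import re

JUDGE_PROMPT = """Compare the two responses to the question.
Think step by step about correctness and helpfulness, then end with
exactly one line: "Verdict: A" or "Verdict: B".

Question: {question}
Response A: {a}
Response B: {b}
"""

def verdict_reward(judge_output: str, gold_label: str) -> float:
    """Return 1.0 if the judge's final verdict matches the preference
    label, else 0.0. RL training would maximize this reward."""
    match = re.search(r"Verdict:\s*([AB])\s*$", judge_output.strip())
    if not match:
        return 0.0  # malformed output earns no reward
    return 1.0 if match.group(1) == gold_label else 0.0

# Example: a judge rollout that reasons, then answers.
rollout = "Response B cites the correct formula and checks units.\nVerdict: B"
print(verdict_reward(rollout, gold_label="B"))  # 1.0
```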

Soheil Feizi, an associate professor at the University of Maryland, has received a $1 million federal grant to advance foundational research in reasoning AI models. The funding, stemming from a Presidential Early Career Award for Scientists and Engineers (PECASE), will support his work on defending large language models (LLMs) against attacks, identifying weaknesses in how these models learn, encouraging transparent, step-by-step logic, and understanding the "reasoning tokens" that drive decision-making. Feizi plans to explore approaches such as live activation probing and novel reinforcement-learning designs, aiming to turn theoretical advances into practical, real-world applications.
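
Activation probing itself is a well-established technique, and a minimal version is easy to sketch: register a forward hook on a hidden layer, capture activations during inference, and fit a linear probe that reads a property off them. The tiny model and synthetic data below are stand-ins to keep the example runnable; Feizi's "live activation probing" may differ substantially.

```python
# Minimal activation-probing sketch (a generic illustration, not Feizi's method):
# hook a hidden layer, collect activations, train a linear probe on them.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "model": a tiny MLP instead of an LLM, to keep the sketch runnable.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

captured = {}
def hook(module, inputs, output):
    captured["h"] = output.detach()  # live read of hidden activations
model[1].register_forward_hook(hook)

# Synthetic data where the probed "property" is the sign of one input feature.
X = torch.randn(512, 16)
y = (X[:, 0] > 0).long()
model(X)  # forward pass populates captured["h"]

# Linear probe trained on the captured activations.
probe = nn.Linear(32, 2)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(probe(captured["h"]), y)
    loss.backward()
    opt.step()

acc = (probe(captured["h"]).argmax(dim=1) == y).float().mean()
print(f"probe accuracy on hidden activations: {acc:.2f}")
```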

Chris McKay@Maginative //
Anthropic has unveiled Claude for Education, a specialized AI assistant designed to cultivate critical thinking skills in students. Unlike conventional AI tools that simply provide answers, Claude employs a Socratic-based "Learning Mode" that prompts students with guiding questions, encouraging them to engage in deeper reasoning and problem-solving. This innovative approach aims to address concerns about AI potentially hindering intellectual development by promoting shortcut thinking.
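
Learning Mode is a built-in product behavior, but the Socratic pattern it describes can be approximated with an ordinary system prompt through Anthropic's API. The prompt below is an illustration, not Anthropic's actual Learning Mode prompt, and the model alias is an assumption.

```python
# Approximating a Socratic "learning mode" with a plain system prompt via the
# Anthropic Python SDK. The prompt is illustrative, NOT Anthropic's actual
# Learning Mode prompt; the model alias is an assumption.

import anthropic

SOCRATIC_SYSTEM = (
    "You are a tutor. Never give the final answer directly. "
    "Ask one guiding question at a time that helps the student reason "
    "toward the answer, and point out gaps in their logic."
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

reply = client.messages.create(
    model="claude-3-5-sonnet-latest",  # assumed alias; pin a specific model in practice
    max_tokens=300,
    system=SOCRATIC_SYSTEM,
    messages=[{"role": "user",
               "content": "Why doesn't a heavier object fall faster than a lighter one?"}],
)
print(reply.content[0].text)  # expect a guiding question, not a lecture
```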

Partnerships with Northeastern University, the London School of Economics, and Champlain College will integrate Claude across multiple campuses, reaching tens of thousands of students, a significant bet that AI can improve how learning happens. Faculty can use Claude to generate rubrics aligned with learning outcomes and create chemistry equations, while administrative staff can analyze enrollment trends and simplify policy documents. The institutions are testing the system across teaching, research, and administrative workflows.

References:
  • THE DECODER: Anthropic brings AI assistant Claude to university campuses
  • venturebeat.com: Anthropic flips the script on AI in education: Claude’s Learning Mode makes students do the thinking
  • Maginative: Anthropic Launches Claude for Education with Learning Mode to Foster Critical Thinking in Universities
  • Analytics India Magazine: Anthropic Launches Claude for Education to Support Universities with AI
  • eWEEK: Claude AI Goes to College With Anthropic’s New Education-Specific Learning Mode
  • www.zdnet.com: No, it won't just do their homework for them. Plus, it helps teachers create rubrics and provide feedback.
  • TheSequence: Anthropic's recent journey into the mind of Claude.
  • thezvi.wordpress.com: A new Anthropic paper reports that reasoning model chain of thought (CoT) is often unfaithful. They test on Claude Sonnet 3.7 and r1, I’d love to see someone try this on o3 as well. Note that this does not have …
  • LearnAI: Anthropic’s Claude for Education offering provides specialized AI assistance for educational purposes, designed to enhance critical thinking.
  • Analytics Vidhya: Anthropic has introduced a new AI learning tool called Claude for Education.