News from the AI & ML world

DeeperML - #alphaevolve

@www.marktechpost.com //
Large Language Models (LLMs) are facing significant challenges in handling real-world conversations, particularly those involving multiple turns and underspecified tasks. Researchers from Microsoft and Salesforce have recently revealed a substantial performance drop of 39% in LLMs when confronted with such conversational scenarios. This decline highlights the difficulty these models have in maintaining contextual coherence and delivering accurate outcomes as conversations evolve and new information is incrementally introduced. Instead of flexibly adjusting to changing user inputs, LLMs often make premature assumptions, leading to errors that persist throughout the dialogue.

These findings underscore a critical gap in how LLMs are currently evaluated. Traditional benchmarks often rely on single-turn, fully-specified prompts, which fail to capture the complexities of real-world interactions where information is fragmented and context must be actively constructed from multiple exchanges. This discrepancy between evaluation methods and actual conversational demands contributes to the challenges LLMs face in integrating underspecified inputs and adapting to evolving user needs. The research emphasizes the need for new evaluation frameworks that better reflect the dynamic and iterative nature of real-world conversations.

In contrast to these challenges, Google's DeepMind has developed AlphaEvolve, an AI agent designed to optimize code and reclaim computational resources. AlphaEvolve autonomously rewrites critical code, resulting in a 0.7% reduction in Google's overall compute usage. This system not only pays for itself but also demonstrates the potential for AI agents to significantly improve efficiency in complex computational environments. AlphaEvolve's architecture, featuring a controller, fast-draft models, deep-thinking models, automated evaluators, and versioned memory, represents a production-grade approach to agent engineering. This allows for continuous improvement at scale.

Share: bluesky twitterx--v2 facebook--v1 threads


References :
  • AI News | VentureBeat: Google’s AlphaEvolve: The AI agent that reclaimed 0.7% of Google’s compute – and how to copy it.
  • MarkTechPost: LLMs Struggle with Real Conversations: Microsoft and Salesforce Researchers Reveal a 39% Performance Drop in Multi-Turn Underspecified Tasks.
Classification:
@Google DeepMind Blog //
Google DeepMind has announced AlphaEvolve, an AI coding agent that enhances the problem-solving abilities of large language models (LLMs) like Gemini. Unveiled in May 2025, AlphaEvolve utilizes a blend of LLMs' creative generation capabilities with genetic algorithms, marking a significant advancement in AI's capacity to automate and optimize coding processes. This system is not merely a code generator but a self-evolving AI designed to discover novel and efficient algorithms through automated code generation and testing.

AlphaEvolve operates through a process akin to natural selection for software. It begins with an initial code, potentially a "skeleton" provided by a human, and iteratively refines it. The system intelligently crafts prompts for the underlying Gemini LLM, instructing it to act as a world-class expert in a specific domain, utilizing context from previous attempts. The LLM then generates a diverse pool of candidate solutions, exploring different approaches to the problem at hand.

The AI agent has already demonstrated its prowess by improving Google’s data center energy efficiency by 0.7%, optimizing matrix multiplications to speed up Gemini’s training by 23%, and co-designing a portion of Google’s next AI chip. Furthermore, AlphaEvolve has tackled over 50 open problems in mathematics, even discovering a new lower bound for the 300-year-old kissing number problem in 11-dimensional space, surpassing the progress of expert mathematicians. This represents a significant stride toward production-grade AI agent engineering.

Share: bluesky twitterx--v2 facebook--v1 threads


References :
  • LearnAI: Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer
  • The Next Web: 5 impressive feats of DeepMind’s new self-evolving AI coding agent
  • Towards Data Science: A blend of LLMs' creative generation capabilities with genetic algorithms
  • www.unite.ai: Google DeepMind has unveiled AlphaEvolve, an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions. Presented in the paper titled “AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery,†this research represents a foundational step toward Artificial General Intelligence (AGI) and even Artificial Superintelligence (ASI).
Classification:
@Google DeepMind Blog //
Google DeepMind has unveiled AlphaEvolve, an AI agent powered by Gemini, that is revolutionizing algorithm discovery and scientific optimization. This innovative system combines the creative problem-solving capabilities of large language models (LLMs) with automated evaluators to verify solutions and iteratively improve upon promising ideas. AlphaEvolve represents a significant leap in AI's ability to develop sophisticated algorithms for both scientific challenges and everyday computing problems, expanding upon previous work by evolving entire codebases rather than single functions.

AlphaEvolve has already demonstrated its potential by breaking a 56-year-old mathematical record, discovering a more efficient matrix multiplication algorithm that had eluded human mathematicians. The system leverages an ensemble of state-of-the-art large language models, including Gemini Flash and Gemini Pro, to propose and refine algorithmic solutions as code. These programs are then evaluated using automated metrics, providing an objective assessment of accuracy and quality. This approach makes AlphaEvolve particularly valuable in domains where progress can be clearly and systematically measured, such as math and computer science.

The impact of AlphaEvolve extends beyond theoretical breakthroughs, with algorithms discovered by the system already deployed across Google's computing ecosystem. Notably, AlphaEvolve has enhanced the efficiency of Google's data centers, chip design, and AI training processes, including the training of the large language models underlying AlphaEvolve itself. It has also optimized a matrix multiplication kernel used to train Gemini models and found new solutions to open mathematical problems. By optimizing Google’s massive cluster management system, Borg, AlphaEvolve recovers an average of 0.7% of Google’s worldwide computing resources continuously, which translates to substantial cost savings.

Share: bluesky twitterx--v2 facebook--v1 threads


References :
  • Google DeepMind Blog: New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators
  • venturebeat.com: VentureBeat article about Google DeepMind's AlphaEvolve system.
  • SiliconANGLE: Google DeepMind develops AlphaEvolve AI agent optimized for coding and math
  • MarkTechPost: Google DeepMind introduces AlphaEvolve: A Gemini-powered coding AI agent for algorithm discovery and scientific optimization
  • Maginative: Google's DeepMind Unveils AlphaEvolve, an AI System that Designs and Optimizes Algorithms
  • THE DECODER: AlphaEvolve is Google DeepMind's new AI system that autonomously creates better algorithms
  • www.marktechpost.com: Google DeepMind Introduces AlphaEvolve: A Gemini-Powered Coding AI Agent for Algorithm Discovery and Scientific Optimization
  • the-decoder.com: AlphaEvolve is Google DeepMind's new AI system that autonomously creates better algorithms
  • thetechbasic.com: DeepMind’s AlphaEvolve: A New Era of AI-Driven Problem Solving
  • The Tech Basic: DeepMind’s AlphaEvolve: A New Era of AI-Driven Problem Solving
  • LearnAI: AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
  • The Next Web: Google DeepMind’s AI systems have taken big scientific strides in recent years — from predicting the 3D structures of almost every known protein in the universe to forecasting weather more accurately than ever before.  The UK-based lab today unveiled its latest advancement: AlphaEvolve, an AI coding agent that makes large language models (LLMs) like Gemini better at solving complex computing and mathematical problems.  AlphaEvolve is powered by the same models that it’s trying to improve.
  • learn.aisingapore.org: AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
  • LearnAI: Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer
  • Towards Data Science: Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer
  • deepmind.google: Provides an overview of AlphaEvolve and its capabilities in designing advanced algorithms.
  • gregrobison.medium.com: AlphaEvolve: How AI-Driven Algorithm Discovery Is Rewriting Computing
  • www.unite.ai: Google DeepMind has unveiled AlphaEvolve, an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions. Presented in the paper titled “AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery,” this research represents a foundational step toward Artificial General Intelligence (AGI) and even Artificial Superintelligence (ASI).
  • learn.aisingapore.org: Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer
  • Unite.AI: Google DeepMind has unveiled AlphaEvolve, an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions.
  • AI News | VentureBeat: Google's AlphaEvolve is the epitome of a best-practice AI agent orchestration. It offers a lesson in production-grade agent engineering. Discover its architecture & essential takeaways for your enterprise AI strategy.
  • AlternativeTo: Google has unveiled AlphaEvolve, an evolving coding agent that uses Gemini large language models and automated evaluators for discovering, evaluating, and optimizing computer algorithms for math and practical applications.
  • Last Week in AI: Last Week in AI #209 - OpenAI non-profit, US diffusion rules, AlphaEvolve
  • TheSequence: The model is pushing the boundaries of algorithmic discovery.
Classification: