News from the AI & ML world

DeeperML - #aicoding

Steve Newman@Second Thoughts //
New research suggests that the integration of AI coding tools into the development process may not be the productivity silver bullet many have assumed. A recent study conducted by METR, a non-profit AI benchmarking group, observed experienced open-source developers working on complex, mature codebases. Counterintuitively, the findings indicate that these AI tools actually increased task completion time by 19%. The slowdown is attributed to time spent prompting the AI, waiting for responses, and reviewing and correcting the generated output. Despite this, many developers continued to use the tools, reporting that the work felt less effortful even if it wasn't faster.

The study involved 16 seasoned developers and 246 real-world programming tasks. Before using the AI tools, participants predicted a 24% increase in their productivity. After the trial, they still overestimated the gains, believing AI had sped up their work by 20%, a stark contrast to the observed slowdown of 19%. Furthermore, fewer than 44% of the AI-generated code suggestions were accepted by the developers, and a significant portion of their time went to refining or rewriting the AI's output. Lack of contextual knowledge and the complexity of existing repositories were cited as key reasons for the reduced effectiveness of the AI suggestions.
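To make the gap between perception and measurement concrete, the figures above can be turned into a small worked example. This is only an illustrative sketch: the 60-minute baseline task is hypothetical, and reading each percentage as a change in task completion time is an assumption made here for the arithmetic, not a statement of how the study defined its metrics.

```python
# Illustrative arithmetic only. The baseline task length is hypothetical.
BASELINE_MINUTES = 60.0

# Assumption: each reported percentage is read as a fractional change
# in task completion time (negative = faster, positive = slower).
FORECAST_CHANGE = -0.24   # developers' prediction before the study
PERCEIVED_CHANGE = -0.20  # developers' estimate after the study
OBSERVED_CHANGE = 0.19    # slowdown reported in the study


def minutes_with_ai(baseline: float, change: float) -> float:
    """Return the task time after applying a fractional change."""
    return baseline * (1.0 + change)


def perception_gap(perceived: float, observed: float) -> float:
    """Return the gap, in percentage points, between belief and measurement."""
    return (observed - perceived) * 100.0


if __name__ == "__main__":
    scenarios = [
        ("forecast", FORECAST_CHANGE),
        ("perceived", PERCEIVED_CHANGE),
        ("observed", OBSERVED_CHANGE),
    ]
    for label, change in scenarios:
        print(f"{label:>9}: {minutes_with_ai(BASELINE_MINUTES, change):.1f} min")
    gap = perception_gap(PERCEIVED_CHANGE, OBSERVED_CHANGE)
    print(f"perception gap: {gap:.0f} percentage points")
```

Under these assumptions, a task that takes an hour without AI would be expected to take under 50 minutes by the developers' own estimates, but over 70 minutes by the measured result, a gap of roughly 39 percentage points between belief and observation.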

While the study highlights a potential downside for experienced developers working on established projects, the researchers acknowledge that AI tools may offer greater benefits in other settings. These could include smaller projects, less experienced developers, or situations with different quality standards. This research adds a crucial layer of nuance to the broader narrative surrounding AI's impact on software development, suggesting that the benefits are not universal and may require careful evaluation on a case-by-case basis as the technology continues to evolve.

References :
  • Marcus on AI: Coding has been the strongest use case. But a new study from METR just dropped.
  • Erik Moeller: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases: That matches my experience so far; they're still too likely to make dumb or destructive suggestions or go in circles.
  • Bernard Marr: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • Second Thoughts: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • NextBigFuture.com: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • Peter Lawrey: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... However, despite the slowdown, many developers continued to use AI tools because the work felt less effortful, making work feel more pleasant even if it wasn't faster."
  • The Register - Software: AI coding tools make developers slower but they think they're faster, study finds
  • www.infoworld.com: AI coding tools can slow down seasoned developers by 19%
  • www.techradar.com: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, analysis reveals that AI actually increased task completion time by 19%.
  • bsky.app: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... https://www.techradar.com/pro/using-ai-might-actually-slow-down-experienced-devs
  • metr.org: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases
  • PCMag Middle East ai: Tasks like prompting the AI, waiting for responses, and reviewing its output for errors actually slowed down developers in the study by 19% compared to the control group.
  • Digital Information World: Conducted by the non-profit group METR, the research tracked the performance of 16 long-time contributors to open-source projects as they completed a series of real-world programming tasks.

M.G. Siegler@Spyglass //
In a significant development in the AI landscape, Google DeepMind has successfully recruited Windsurf's CEO, Varun Mohan, and key members of his R&D team. This strategic move follows the collapse of OpenAI's rumored $3 billion acquisition deal for the AI coding startup Windsurf. The unexpected twist saw Google swooping in to license Windsurf's technology for $2.4 billion and securing top talent for its own advanced projects. This development signals a highly competitive environment for AI innovation, with major players actively seeking to bolster their capabilities.

Google's hiring of Windsurf's leadership and licensing of its technology is primarily aimed at strengthening its DeepMind division, particularly for agentic coding projects and the enhancement of its Gemini model. Varun Mohan and co-founder Douglas Chen are expected to spearhead efforts to develop AI agents capable of writing test code, refactoring projects, and automating developer workflows. The move is poised to boost Google's position in the AI coding sector, directly countering OpenAI's attempts to build expertise in this critical area. Google has not publicly detailed the terms of its non-exclusive license for Windsurf's technology, but the reported $2.4 billion price indicates the high value placed on Windsurf's innovations.

The fallout from the failed OpenAI deal has left Windsurf in a precarious position. While the company remains independent and will continue to license its technology, it has lost its founding leadership and a portion of its technical advantage. Jeff Wang has stepped up as interim CEO to guide the company, with the majority of its 250 employees remaining. The situation highlights the intense competition and the fluid nature of talent acquisition in the rapidly evolving AI industry, where startups like Windsurf can become caught between tech giants vying for dominance.

References :
  • Maginative: OpenAI's Windsurf Deal is Dead — Google just Poached the CEO Instead
  • TestingCatalog: Countdown starts for Deep Think rollout while Agent Mode surfaces in code
  • bdtechtalks.com: Google reaps the rewards as OpenAI’s deal to acquire Windsurf collapses
  • The Tech Basic: Google DeepMind Snaps Up Windsurf CEO After OpenAI Deal Unravels
  • bdtechtalks.com: The post details the collapse of OpenAI's deal to acquire Windsurf.
  • devops.com: OpenAI’s $3 billion bid to buy artificial intelligence (AI) coding startup Windsurf crumbled late Friday, and rival Alphabet Inc.’s Google quickly picked up the pieces

@www.infoq.com //
Google has launched Gemini CLI, a new open-source AI command-line interface that brings the full capabilities of its Gemini 2.5 Pro model directly into developers' terminals. Designed for flexibility, transparency, and developer-first workflows, Gemini CLI provides high-performance, natural language AI assistance through a lightweight, locally accessible interface. Last Week in AI #314 also mentioned Gemini CLI, placing it alongside other significant AI developments. Google aims to empower developers by providing a tool that enhances productivity and streamlines AI workflows.

This move has potentially major implications for the AI coding assistant market, especially for developers who previously relied on costly tools. An article on Towards AI argues that Gemini CLI could effectively eliminate the need for $200/month AI coding tools because, in the author's view, it matches or beats those tools at no cost. The open-source nature of Gemini CLI fosters community-driven development and transparency, enabling developers to customize and extend the tool to suit their specific needs.

Google is also integrating Gemini with other development tools to create a more robust AI development ecosystem. An article titled "Build Smarter AI Workflows with Gemini + AutoGen + Semantic Kernel" suggests that Gemini CLI can be combined with other frameworks to improve AI workflows, a further step toward giving developers a complete suite of tools. Google's launch of Gemini CLI not only underscores its commitment to open-source AI development but also democratizes access to advanced AI capabilities, making them available to a wider range of developers.

References :
  • Towards AI: Google Just Killed $200/Month AI Coding Tools With This Free Terminal Assistant
  • Last Week in AI: Google is bringing Gemini CLI to developers’ terminals, Anthropic now lets you make apps right from its Claude AI chatbot, and more!
  • www.infoq.com: Google Launches Gemini CLI: Open-Source Terminal AI Agent for Developers
  • www.theverge.com: Google is bringing Gemini CLI to developers’ terminals

@www.apple.com //
References: Nicola Iarocci, IEEE Spectrum
AI is rapidly changing the landscape of software development, presenting both opportunities and challenges for developers. While AI coding tools are boosting productivity on stable and mature technologies, some developers worry about the potential loss of the creative aspect of coding. Many developers enjoy the deep immersion and problem-solving that come from traditional coding methods. The rise of AI-assisted coding necessitates a careful evaluation of which tasks should be delegated to AI and which should remain in the hands of human developers.

AI coding is particularly beneficial for well-established technologies like the C#/.NET stack, significantly increasing efficiency. Tools like Claude Code allow developers to delegate routine tasks, leading to faster development cycles. However, this shift can also lead to a sense of detachment from the creative process, where developers become more like curators, evaluating and tweaking AI-generated code rather than crafting each function from scratch. The concern is whether this new workflow will lead to an industry full of highly productive but less engaged developers.

Despite these concerns, it appears that agentic coding is here to stay due to its efficiency, especially in smaller teams. Experts suggest preserving space for creative flow in some projects, perhaps by resisting the temptation to fully automate tasks in open-source projects. AI coding tools are also becoming more accessible, with platforms like VS Code extending support for Model Context Protocol (MCP) servers, which integrate AI agents with various external tools and services. The future of software development will likely involve a balance between AI assistance and human creativity, requiring developers to adapt to new workflows and prioritize tasks that require human insight and innovation.

References :
  • Nicola Iarocci: I’ve been doing “agentic coding” for some time, and well, it’s weird. On stable, mature technology (in my case, the C#/.NET stack), it is beneficial, as it significantly boosts productivity.
  • IEEE Spectrum: The Best AI Coding Tools You Can Use Right Now
  • github.blog: Why developer expertise matters more than ever in the age of AI

Matthew S.@IEEE Spectrum //
References: Matt Corey, IEEE Spectrum
AI coding tools are transforming software development, offering developers increased speed and greater ambition in their projects. Tools like Anthropic's Claude Code and Cursor are gaining traction for their ability to assist with code generation, debugging, and adaptation across different platforms. This assistance is translating into substantial time savings, enabling developers to tackle more complex projects that were previously considered too time-intensive.

Developers are reporting significant improvements in their workflows with the integration of AI. Matt Corey (@matt1corey@iosdev.space) highlighted that Claude Code has not only accelerated his work but has also empowered him to be more ambitious in the types of projects he undertakes. Tools like Claude have allowed users to add features they might not have bothered with previously due to time constraints.

The benefits extend to code adaptation as well. balloob (@balloob@fosstodon.org) shared an experience of using Claude to adapt code from one integration to another in Home Assistant. By pointing Claude at a change in one integration and instructing it to apply the same change to another similar integration, balloob was able to save days of work. This capability demonstrates the power of AI in streamlining repetitive tasks and boosting overall developer productivity.

References :
  • Matt Corey: User testimonial about increased speed and ambition due to Claude Code.
  • IEEE Spectrum: Overview of AI coding tools, including Cursor and Anthropic's Claude Code.
  • Matt Corey: With Claude Code, I did all of this work in 2 days, PLUS refined some animations in the app and fixed a few small bugs that I found. And I only started using Claude Code 3 weeks ago. I can't wait to see the kind of impact this will have on my business.

@siliconangle.com //
Anysphere Inc., the company behind the AI-powered code editor Cursor, has announced a massive $900 million funding round, rocketing its valuation to $9.9 billion. The Series C funding was led by Thrive Capital, with significant participation from Andreessen Horowitz, Accel, and DST Global. This funding round confirms recent rumors and highlights the immense investor confidence in the future of AI-driven software development. The company, launched in 2023 by MIT alumni, has rapidly become a popular AI-first coding environment.

The valuation increase reflects Anysphere's impressive sales growth, reaching $500 million in annualized recurring revenue (ARR) just three years after launching. Cursor is reportedly generating nearly a billion lines of AI-assisted code per day. Investors estimate that this growth rate makes Anysphere the fastest-growing software startup of all time. Cursor's adoption at major tech firms such as NVIDIA, Uber, and Adobe, along with its reported use by more than half of the Fortune 500, further solidifies its market position.

Cursor is based on VS Code and is designed to automate programming tasks through its AI capabilities, including an embedded chatbot that generates code and provides technical explanations. Developers can describe a task in natural language and have Cursor generate the corresponding terminal commands. The editor also includes a feature that works like a spell checker for code, identifying and correcting both obvious and subtle bugs. Anysphere generates revenue through paid versions of Cursor, with Pro and Enterprise tiers offering increased usage limits and enhanced features. This new funding should enable Anysphere to further its AI coding research and address competition.

References :
  • siliconangle.com: Anysphere raises $900M for its AI-powered Cursor code editor
  • NextBigFuture.com: AI Programming Company Cursor Raises $900 Million
  • www.unite.ai: Cursor AI Rockets to $9.9 Billion Valuation with Massive $900 Million Raise