News from the AI & ML world

DeeperML - #openai

Steve Newman@Second Thoughts //
New research suggests that the integration of AI coding tools into the development process may not be the productivity silver bullet many have assumed. A recent study conducted by METR, a non-profit AI benchmarking group, observed experienced open-source developers working on complex, mature codebases. Counterintuitively, the findings indicate that these AI tools actually increased task completion time by 19%. The slowdown is attributed to time spent prompting the AI, waiting for responses, and meticulously reviewing and correcting the generated output. Despite the slowdown, many developers continued to use the tools, reporting that the work felt less effortful, even if it wasn't faster.

The study involved 16 seasoned developers and 246 real-world programming tasks. Before engaging with the AI tools, participants optimistically predicted a 24% increase in their productivity. Even after the trial, they still overestimated the gains: they believed AI had sped up their work by 20%, a stark contrast to the observed slowdown of 19%. Furthermore, developers accepted fewer than 44% of the AI-generated code suggestions and spent a significant portion of their time refining or rewriting the AI's output. Lack of contextual knowledge and the complexity of existing repositories were cited as key reasons for the reduced effectiveness of the AI suggestions.
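
To make the gap between perception and measurement concrete, here is a minimal arithmetic sketch. It assumes all three figures (24% forecast, 20% perceived, 19% observed) can be read as percentage changes in task completion time, which is a simplification of how the summary above phrases them. The 60-minute baseline and the `completion_time` helper are hypothetical and do not come from the study.

```python
def completion_time(baseline_minutes: float, pct_change: float) -> float:
    """Task time after a percent change in completion time.

    Negative pct_change means faster; positive means slower.
    """
    return baseline_minutes * (1 + pct_change / 100)


# Hypothetical baseline: a task that takes 60 minutes without AI (not from the study).
BASELINE = 60.0

forecast = completion_time(BASELINE, -24)   # what developers expected beforehand
perceived = completion_time(BASELINE, -20)  # what they believed afterwards
observed = completion_time(BASELINE, +19)   # what the study measured

# Difference between the measured time and the time developers thought they took.
perception_gap = observed - perceived

print(f"forecast:  {forecast:.1f} min")
print(f"perceived: {perceived:.1f} min")
print(f"observed:  {observed:.1f} min")
print(f"gap between observed and perceived: {perception_gap:.1f} min")
```

Under these assumptions, a task the developer remembers as taking 48 minutes would actually have taken about 71, so the misperception is larger than either percentage suggests on its own.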

While the study highlights a potential downside for experienced developers working on established projects, the researchers acknowledge that AI tools may offer greater benefits in other settings. These could include smaller projects, less experienced developers, or situations with different quality standards. This research adds a crucial layer of nuance to the broader narrative surrounding AI's impact on software development, suggesting that the benefits are not universal and may require careful evaluation on a case-by-case basis as the technology continues to evolve.

References :
  • Marcus on AI: Coding has been the strongest use case. But a new study from METR just dropped.
  • Erik Moeller: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases: That matches my experience so far; they're still too likely to make dumb or destructive suggestions or go in circles.
  • Bernard Marr: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • Second Thoughts: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • NextBigFuture.com: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
  • Peter Lawrey: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... However, despite the slowdown, many developers continued to use AI tools because the work felt less effortful, making work feel more pleasant even if it wasn't faster."
  • The Register - Software: AI coding tools make developers slower but they think they're faster, study finds
  • www.infoworld.com: AI coding tools can slow down seasoned developers by 19%
  • www.techradar.com: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, analysis reveals that AI actually increased task completion time by 19%.
  • bsky.app: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... https://www.techradar.com/pro/using-ai-might-actually-slow-down-experienced-devs
  • metr.org: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases
  • PCMag Middle East ai: Tasks like prompting the AI, waiting for responses, and reviewing its output for errors actually slowed down developers in the study by 19% compared to the control group.
  • Digital Information World: Conducted by the non-profit group METR, the research tracked the performance of 16 long-time contributors to open-source projects as they completed a series of real-world programming tasks.

M.G. Siegler@Spyglass //
In a significant development in the AI landscape, Google DeepMind has successfully recruited Windsurf's CEO, Varun Mohan, and key members of his R&D team. This strategic move follows the collapse of OpenAI's rumored $3 billion acquisition deal for the AI coding startup Windsurf. The unexpected twist saw Google swooping in to license Windsurf's technology for $2.4 billion and securing top talent for its own advanced projects. This development signals a highly competitive environment for AI innovation, with major players actively seeking to bolster their capabilities.

Google's hiring of Windsurf's leadership and licensing of its technology are primarily aimed at strengthening its DeepMind division, particularly for agentic coding projects and the enhancement of its Gemini model. Varun Mohan and co-founder Douglas Chen are expected to spearhead efforts in developing AI agents capable of writing test code, refactoring projects, and automating developer workflows. This integration is poised to boost Google's position in the AI coding sector, directly countering OpenAI's attempts to enhance its expertise in this critical area. The license for Windsurf's technology is non-exclusive, and the reported $2.4 billion price indicates the high value placed on Windsurf's innovations.

The fallout from the failed OpenAI deal has left Windsurf in a precarious position. While the company remains independent and will continue to license its technology, it has lost its founding leadership and a portion of its technical advantage. Jeff Wang has stepped up as interim CEO to guide the company, with the majority of its 250 employees remaining. The situation highlights the intense competition and the fluid nature of talent acquisition in the rapidly evolving AI industry, where startups like Windsurf can become caught between tech giants vying for dominance.

References :
  • Maginative: OpenAI's Windsurf Deal is Dead — Google just Poached the CEO Instead
  • TestingCatalog: Countdown starts for Deep Think rollout while Agent Mode surfaces in code
  • bdtechtalks.com: Google reaps the rewards as OpenAI’s deal to acquire Windsurf collapses
  • The Tech Basic: Google DeepMind Snaps Up Windsurf CEO After OpenAI Deal Unravels
  • bdtechtalks.com: The post details the collapse of OpenAI's deal to acquire Windsurf.
  • devops.com: OpenAI’s $3 billion bid to buy artificial intelligence (AI) coding startup Windsurf crumbled late Friday, and rival Alphabet Inc.’s Google quickly picked up the pieces
  • thetechbasic.com: Google DeepMind Snaps Up Windsurf CEO After OpenAI Deal Unravels

Rashi Shrivastava@Rashi Shrivastava //
OpenAI is expanding its ambitions in AI hardware and infrastructure. Sam Altman, CEO of OpenAI, envisions a new type of computer designed specifically for AI, suggesting current devices are not optimized for advanced AI capabilities. This new hardware aims to support always-on, context-aware AI assistants that can understand and act on a user's environment, schedule, and preferences in real time. These AI-first computers could handle tasks like booking travel, summarizing content, and planning daily schedules through an intelligent interface.

OpenAI is also actively involved in initiatives to improve AI literacy. The company is backing a new AI training academy for teachers, indicating a focus on integrating AI more effectively into education. Furthermore, OpenAI continues to refine its language models, such as those behind ChatGPT, for diverse applications, including creating and grading assignments in the classroom. This effort reflects a broader push to apply the models to everyday work, from classroom tasks to coding workflows.

Adding to its suite of AI tools, OpenAI is reportedly preparing to launch a new AI-powered web browser. The browser is expected to rival Google Chrome and is designed with a ChatGPT-like interface. Instead of traditional website navigation, interactions would be handled through the AI, streamlining tasks and potentially offering a more direct way to access information. Such a move could give OpenAI direct access to user data, which is crucial for enhancing its AI models and improving targeted advertising capabilities.

References :
  • www.tomsguide.com: OpenAI's Sam Altman says your computer isn’t built for AI — so it’s creating something entirely new
  • Towards AI: AI in the Classroom: Create and Grade Assignments with ChatGPT
  • Rashi Shrivastava: The Prompt: OpenAI Backs New AI Training Academy For Teachers