News from the AI & ML world
Steve Newman@Second Thoughts
New research suggests that AI coding tools may not be the productivity silver bullet many have assumed. A recent study by METR, a non-profit AI benchmarking group, observed experienced open-source developers working on complex, mature codebases. Counterintuitively, the tools increased task completion time by 19%. The slowdown is attributed to time spent prompting the AI, waiting for responses, and reviewing and correcting the generated output. Even so, many developers continued to use the tools, reporting that the work felt less effortful even if it wasn't faster.
The study involved 16 seasoned developers and 246 real-world programming tasks. Before using the AI tools, participants predicted that AI would cut their task completion time by 24%. After the trial, they still overestimated the gains: they believed AI had sped up their work by 20%, a stark contrast to the observed 19% slowdown. Developers also accepted fewer than 44% of the AI-generated code suggestions and spent a significant portion of their time refining or rewriting the AI's output. The AI's lack of contextual knowledge and the complexity of existing repositories were cited as key reasons for the reduced effectiveness of its suggestions.
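To make the gap between perception and measurement concrete, here is a minimal arithmetic sketch. It assumes all three percentages describe changes in task completion time, and it uses a hypothetical 60-minute baseline task; the baseline and the helper name `projected_minutes` are illustrative and do not come from the study.

```python
def projected_minutes(baseline_minutes: float, pct_change_in_time: float) -> float:
    """Scale a baseline task duration by a percentage change in completion time."""
    return baseline_minutes * (1 + pct_change_in_time / 100)


# Hypothetical task length without AI, in minutes (an assumption for illustration).
BASELINE = 60.0

forecast = projected_minutes(BASELINE, -24)   # developers' prediction before the trial
perceived = projected_minutes(BASELINE, -20)  # developers' estimate after the trial
measured = projected_minutes(BASELINE, +19)   # slowdown reported by the study

# Difference between what was measured and what developers believed happened.
gap = measured - perceived

print(f"forecast:  {forecast:.1f} min")
print(f"perceived: {perceived:.1f} min")
print(f"measured:  {measured:.1f} min")
print(f"perception gap: {gap:.1f} min")
```

Under these assumptions, a task that developers believed took about 48 minutes with AI would actually have taken a little over 71 minutes, a gap of roughly 23 minutes on a one-hour baseline.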
While the study highlights a potential downside for experienced developers working on established projects, the researchers acknowledge that AI tools may offer greater benefits in other settings. These could include smaller projects, less experienced developers, or situations with different quality standards. The research adds nuance to the broader narrative about AI's impact on software development: the benefits are not universal and may need to be evaluated case by case as the technology continues to evolve.
References:
- Marcus on AI: Coding has been the strongest use case. But a new study from METR just dropped.
- Erik Moeller: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases: That matches my experience so far; they're still too likely to make dumb or destructive suggestions or go in circles.
- Bernard Marr: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
- Second Thoughts: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
- NextBigFuture.com: Study Shows That Even Experienced Developers Dramatically Overestimate Gains
- Peter Lawrey: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... However, despite the slowdown, many developers continued to use AI tools because the work felt less effortful, making work feel more pleasant even if it wasn't faster."
- The Register - Software: AI coding tools make developers slower but they think they're faster, study finds
- www.infoworld.com: AI coding tools can slow down seasoned developers by 19%
- www.techradar.com: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, analysis reveals that AI actually increased task completion time by 19%.
- bsky.app: It's a mistake to assume AI saves time, especially for experienced developers. For senior developers, "analysis reveals that AI actually increased task completion time by 19%. ... https://www.techradar.com/pro/using-ai-might-actually-slow-down-experienced-devs
- metr.org: Pretty sensibly designed study that focuses on Cursor use in particular and shows that agents slow things down rather than speeding them up for experienced folks maintaining large, complex codebases
- PCMag Middle East ai: Tasks like prompting the AI, waiting for responses, and reviewing its output for errors actually slowed down developers in the study by 19% compared to the control group.
- Digital Information World: Conducted by the non-profit group METR, the research tracked the performance of 16 long-time contributors to open-source projects as they completed a series of real-world programming tasks.