this post was submitted on 19 Jul 2025
454 points (96.9% liked)

Technology


Coders spent more time prompting and reviewing AI generations than they saved on coding. On the surface, METR's results seem to contradict other benchmarks and experiments that demonstrate increases in coding efficiency when AI tools are used. But those often also measure productivity in terms of total lines of code or the number of discrete tasks/code commits/pull requests completed, all of which can be poor proxies for actual coding efficiency. These factors lead the researchers to conclude that current AI coding tools may be particularly ill-suited to "settings with very high quality standards, or with many implicit requirements (e.g., relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn." While those factors may not apply in "many realistic, economically relevant settings" involving simpler code bases, they could limit the impact of AI tools in this study and similar real-world situations.

[–] Sagan_Wept@lemmynsfw.com 39 points 18 hours ago (4 children)

Their sample size was 16 people...

[–] tankfox@midwest.social 9 points 14 hours ago* (last edited 14 hours ago)

Who are in the process of learning to do something new, versus the workflow that they've been trained in and have a lot of experience in.

Where was the sample of non-coders tasked with doing the same thing, using AI to help or learning without assistance?

Where was the sample of coders prohibited from looking anything up and having to rely solely on their prior knowledge to do the job?

It might help refine what's actually being tested.

[–] kromem@lemmy.world 1 points 12 hours ago

Where the most experienced minority only had a few weeks of using AI inside an IDE like Cursor.

[–] bulwark@lemmy.world 3 points 16 hours ago

I'm not really sure why the sample size was so small. It definitely casts doubt on some of their conclusions. I also have issues with some of the methodology used. I think a better study was the one that came out a week or two ago showing visible neurological decline from AI use.

[–] mspencer712@programming.dev 2 points 16 hours ago

I got flamed pretty hard for pointing out that this sample size really needs to be in the title, but it needs to be said. Thank you. Sixteen people is basically a forum thread, and not a very popular one.

It’s still useful information and a good read, but a lot of people don’t click through to the article; they just remember the title and move on.