this post was submitted on 03 Jun 2026
98 points (87.7% liked)

TechTakes


Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 3 years ago
[–] MoonMelon@lemmy.ml 17 points 2 weeks ago (8 children)

Reading his response, I think calling it "slop" isn't being totally fair, but it does sound like he should hand it off again or close the project. Not having test coverage for something is bad, but it happens. It sounds like the alternatives have this issue also. But the sailing comment is kind of tragic. Just go sailing, dude. Unless you have a phylactery under your desk the project will outlive you anyway, and honestly that's the best compliment a developer can get.

[–] dgerard@awful.systems 11 points 2 weeks ago (7 children)

It literally is slop. It's always correct to call slop slop.

[–] MoonMelon@lemmy.ml 21 points 2 weeks ago (6 children)

"I rewrote the rsync test suite in python from the old shell script design. I did the design for that myself (and I’m really quite pleased with it), but used claude with cross-checks from codex and gemini to do the grunt work. I did not just vibe-code “convert test suite to python”.... I used AI tools to do the grunt work because they are good at that. I reviewed every part of it myself and ran through a huge amount of CI time getting it right"

If what he claims is true then he's using LLMs for test coverage with significant editing by hand. I hate LLMs, but even I have to admit this seems like one of the few valid use cases of LLM-assisted coding. Unless "slop" has become one of those words that's just lost all meaning.

[–] diz@awful.systems 11 points 2 weeks ago* (last edited 2 weeks ago)

It's a perfect example of how "using LLMs for test coverage" can also be harmful. He expected the tests to prevent introduction of said regressions, probably based on a combination of the quantity of tests and their style (they look like what decent human-written tests look like). But the tests are AI slop, and so they give a lot less value per line of code than he expects, hence a significant regression.

It is literally useful to call these tests AI slop, and the problem is in part caused by not calling them AI slop, and having consequent inflated expectations. LLMs are not any better at writing tests than at writing other code! It is merely that the bar for tests can, legitimately, be a lot lower (in projects where there would otherwise be no tests at all). Making an exception to calling AI-generated tests "slop" is thus counterproductive, because it leads people to act as if LLMs are actually better at writing tests than at writing other code, when really the bar for tests is just frequently very low.

edit: actually, scratch that. I looked at the PR, and those tests look like dogshit, worse even than the tests I've seen claude write at a workplace that was into vibecoding (which I have since quit).
