scruiser

joined 2 years ago
[–] scruiser@awful.systems 7 points 5 days ago (2 children)

Wonder if the goblin stuff is the start of some model collapse.

That is exactly it. Their official explanation avoids the phrase "model collapse," but that is exactly what they describe: using the output of one model as training data for another amplified the occurrence of the word "goblin" (and other creatures). The word apparently showed up in the first place because of their system prompt, which was aimed at maximizing the Eliza effect (again they avoid an honest framing, but that is totally what they are doing, and it is pretty gross considering all the cases of AI psychosis that have been occurring) by telling the model "You are an unapologetically nerdy, playful and wise AI mentor to a human."
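
To make the mechanism concrete, here's a toy sketch (my own, with made-up numbers, not anything from their postmortem) of how a prompt-induced bias compounds when each model generation trains on the previous generation's output:

```python
# Toy model-collapse sketch: each generation of the model trains on
# the previous generation's output, so any sampling bias toward a token
# (here "goblin", nudged by the system prompt) compounds geometrically.
# Both numbers below are invented for illustration.

goblin_freq = 0.0001   # hypothetical baseline frequency of "goblin"
prompt_bias = 2.0      # hypothetical upweighting from the playful-mentor prompt

for gen in range(10):
    # the next model learns the biased output frequency as its new baseline
    goblin_freq = min(1.0, goblin_freq * prompt_bias)
    print(f"generation {gen}: goblin frequency ~ {goblin_freq:.4f}")
```

Ten generations of a modest 2x bias turn a one-in-ten-thousand token into something that shows up in roughly a tenth of outputs.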

[–] scruiser@awful.systems 5 points 1 week ago

Widespread financial fraud which was legitimized and in some cases directly backed by EAs! Surely there are no parallels!

[–] scruiser@awful.systems 6 points 1 week ago

Zitron’s analogy is excellent because the bubble is multifactorial and the analogies that we can make are factor-to-factor. Here’s some things that caused the dot-com bubble; people were overly optimistic about:

Ed has also been clear that there are a few factors that make this bubble worse (for the economy and the general public) than the dot-com bubble. For one, Ed is strongly convinced that GPU lifecycles are much shorter and worse than fiber optic lifecycles. You build fiber optic infrastructure and it will last for decades; meanwhile, GPUs run constantly at max load have lifecycles of 3-5 years. The end result of the internet is also much more useful, and less of a double-edged sword, than slop generators churning out propaganda and spam.

[–] scruiser@awful.systems 6 points 1 week ago* (last edited 1 week ago)

I am a pretty big fan of Ed's work, so I'm going to hold my nose and read Kelsey's work thoroughly enough to do a line-by-line debunking:

Over the last two years, he has called the top repeatedly:

Well yes, but he has also explicitly said that the bubble peaking and popping would be a multiyear process. I've only followed his every article for the past year, but over that time his median guess for when the bubble pop becomes undeniable has been 2027. I guess making timelines with big events in 2027 and hedging on the median is only allowed for the rationalists? Also, we are already starting to see the narrative fray as Anthropic and OpenAI experiment with price hikes and struggle to get ready for IPOs, which would count as meeting his predictions for the start of the bubble pop.

In 2026, the focus is much more on alleging widespread, Enron- or FTX-tier outright fraud.

This is basically an admission that he can’t make the case in terms of the economics anymore.

??? Ed has been making the case for circular financing and investors being deceived because there are, in fact, circular financing deals and deceived investors. Ed has slightly softened his position on exactly how useless or not LLMs are, but he is still holding to his economic case: the amount LLMs cost isn't worth the value they provide, extremely blatantly so once consumers start paying the real cost and not the VC-subsidized cost.

By almost every metric, AI progress from 2024 to 2026 has been much faster than AI progress from 2022 to 2024.

And she is quoting a rat-adjacent think tank as proof that AI improvement has been exponential. Even among the rationalists, the case has been made that the benchmarks are not reflective of real-world usage/value and that costs are growing along with "capabilities".

It can no longer argue that costs aren’t falling; they are.

Even accepting the premise that real costs have fallen, Kelsey fails to address Ed's case that the prices LLM companies charge are massively subsidized. If real costs are 10x the current subsidized prices (which have already been pushed up as far as they can be without losing customers), and model inference costs miraculously drop 5x (which Kelsey would treat as a given, but which I think is pretty unlikely barring some radical paradigm shift), that is still a 2x gap.
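
The arithmetic is simple enough to sanity-check; a back-of-envelope sketch, where the 10x and 5x figures are the assumptions stated above rather than reported numbers:

```python
# Back-of-envelope check of the subsidy-gap argument above.
# All inputs are assumptions from the paragraph, not reported figures.

subsidized_price = 1.0                     # what customers pay today (normalized)
assumed_real_cost = 10 * subsidized_price  # assumed true cost of serving them
assumed_cost_drop = 5                      # optimistic inference cost decline

future_cost = assumed_real_cost / assumed_cost_drop
gap = future_cost / subsidized_price
print(f"prices would still need to rise ~{gap:.0f}x")  # -> 2x
```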

It is a straightforward crime to claim $2 billion in monthly revenue if you mean that you are giving away services that would have a $2 billion market value.

Yes, exactly. Technically OpenAI and Anthropic play games with ARR and "gross" profit (i.e. magically excluding the cost of training the model in the first place), but in a just nation it would straightforwardly be a crime. Why does she find this hard to believe?

Epoch AI has an in-depth analysis of the same financial questions from the same public information

(Looks inside the Epoch AI article):

So what are the profits? One option is to look at gross profits. This only considers the direct cost of running a model

Ed has gone into detail repeatedly about why excluding the cost of training the model is bullshit.

(More details from the article)

But we can still do an illustrative calculation: let’s conservatively assume that OpenAI started R&D on GPT-5 after o3’s release last April. Then there’d still be four months between then and GPT-5’s release in August, during which OpenAI spent around $5 billion on R&D. But that’s still higher than the $2 billion of gross profits. In other words, OpenAI spent more on R&D in the four months preceding GPT-5, than it made in gross profits during GPT-5’s four-month tenure.

Oh, that is surprising: the Epoch AI article actually acknowledges the point that these models are wildly unprofitable once you account for the training cost! Of course, they throw away their point in the next section by just magically assuming LLMs will prove to be massively valuable in the near future! (One of the exact things Ed has complained about!)

He’s found too many grounds for dismissing all the financial information we have as dishonest or irrelevant to seriously engage with what any of it would imply if it were true.

He has shown in detail how the companies use obfuscated, barely-technically-not-lying bullshit metrics like gross profit and ARR to inflate their numbers, and if you try to un-obfuscate them, the numbers look a lot worse.

Kelsey then goes on to try to make the case for how much value LLMs provide:

Making them more productive is a big deal, and in 2026, AI makes them more productive.

Zitron can’t really contest this with contemporary data, so he cites 2024 and 2025 studies of much weaker AIs with much weaker productivity impacts.

Two years to... 4 months ago! Such outdated information! In the first place, there have been very few rigorous studies of how much of a productivity boost LLM coding agents actually provide, and one of the few with even a passing attempt at rigor (while still below good academic standards) was METR's study (and keep in mind they are a rat-adjacent think tank, not proper academics), which showed programmers thought they got a productivity boost but actually suffered a net productivity decrease!

From this set of beliefs, you could, in fact, defend a delightful bespoke AI bubble take: that AI would have been a catastrophic investment bubble, but the AI companies were saved from their mistakes by the determined NIMBYs of America killing off the excess data center build-out.

But that’s not Zitron’s stance. He seems to account “the build-out is too aggressive” and “the build-out is not happening as planned” as both independent strikes against AI — both things that show it’s bad, and the more of those he finds, the more bad it is.

It could in fact be all three! The hyped-up build-out, such as that indicated by OpenAI's and Oracle's 300 billion dollar deal, was completely insanely too aggressive (for it to pay off, Ed calculated LLMs would have to drastically exceed Netflix plus Microsoft Office in terms of ubiquity and price point), not achievable given realistic build times for data centers (Ed has also brought the numbers here), and even at the reduced actual rate of build-out, still not financially viable (simply because the LLM companies aren't charging enough). So yes, both things are bad, and one kind of badness partly mitigates the other, but it is still all bad!

[–] scruiser@awful.systems 9 points 1 week ago (1 children)

I advise being very cautious about consuming Zitron’s posts

He has a dramatic and vitriolic style, but as dgerard says, he has also dug through the numbers. I see lots of criticism of Ed's style, but not nearly as much substantive criticism of the hard numbers he has come up with. The LLM companies put out contradictory and obfuscated numbers that, taken naively, seem to contradict Ed's, but as Ed has shown many, many times, when you start trying to un-obfuscate them they start looking really bad for everyone betting on LLMs.

Many coders are using chatbots, but I don’t know of evidence that it makes them more productive

So more and more coders are coming around to "actually AI code is okay"... but as we've seen repeatedly with LLM-generated content, it is very easy for people to "Clever Hans" themselves and become convinced LLMs are contributing more than they actually are, so I am not going to trust anecdotal reports.

[–] scruiser@awful.systems 9 points 2 weeks ago

I wouldn't give him credit for a full admission. He isn't acknowledging that "biased left-wing experts" means experts like psychologists with a basic understanding of psychometric validity, and geneticists with the basic understanding that popular notions of race don't have a genetic basis and that biological determinism is false.

[–] scruiser@awful.systems 4 points 2 weeks ago

It has already worked out that way: https://www.reddit.com/r/singularity/comments/1snbv4m/white_house_moves_to_give_us_agencies_anthropic/

So you aren't even being paranoid; this seems like a straightforward calculation for Anthropic to have made.

[–] scruiser@awful.systems 6 points 2 weeks ago* (last edited 2 weeks ago) (2 children)

The security blog I linked the other day has more criticisms of Anthropic's Mythos cybersecurity claims:

-Apparently Opus 4.6 may have found the FreeBSD bug Anthropic has made a huge deal about Mythos finding? And Anthropic didn't clarify that their older model had found the bug as well: https://www.flyingpenguin.com/freebsd-cve-2026-4747-log-suggests-mythos-is-a-marketing-trick/

-More explanation of why Anthropic's entire approach with Mythos and cybersecurity is oriented more around marketing than around good (or any) cybersecurity practice. The author also makes the point that if you did have a tool that could rapidly refactor code into other languages, the solution to the vast majority of the bugs and vulnerabilities Mythos found isn't hunting bugs one by one with Anthropic's (much more expensive) LLM; it is to refactor the code into a memory-safe language and to make some boilerplate countermeasures cheaper to implement. (I think the author is too credulous of LLM coding agents' code quality here, but granting those assumptions, I think their point is correct.) https://www.flyingpenguin.com/how-sans-mythos-marketing-disappoints-defenders/

-Bonus: MCP (Model Context Protocol, a standard for LLM agent tooling that Anthropic has developed and tried to push) is insecure by default, and Anthropic has refused to fix it! Which is really hypocritical, given that many of the "vulnerabilities" Mythos found are small things that aren't actually properly exploitable under most conditions. https://www.flyingpenguin.com/ox-security-report-anthropic-mcp-is-execute-first-validate-never/

[–] scruiser@awful.systems 13 points 2 weeks ago

Habryka defends colonialism, straight out, no qualifiers: https://www.lesswrong.com/posts/w3MJcDueo77D3Ldta/let-goodness-conquer-all-that-it-can-defend

Ok, fine, I'll go even further. I am glad about the colonization of North America. The American experiment was one of the greatest successes in history, and of course, it was a giant fucking mess. But despite it all, despite the Trail of Tears, despite smallpox ravaging the land, despite the conquistadors and the looting and the rapes — it was still worth it. America is worth it. Democracy was worth it.

A surprisingly high number of comments push back, but Habryka's post is still highly upvoted, and the pushback is the typical jargon-filled, assume-charitably rationalist mess.

[–] scruiser@awful.systems 5 points 3 weeks ago (2 children)

Hey, Eliezer is very mad about being quoted as saying to bomb them! (He's made it very clear that he wants them destroyed with air strikes!)

[–] scruiser@awful.systems 10 points 3 weeks ago (2 children)

A detailed analysis of why Anthropic's claims about Mythos's cybersecurity implications are bs: https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/

And a followup post about why Anthropic's Glasswing project violates cybersecurity community norms and is an attempt to form a cartel: https://www.flyingpenguin.com/cartel-or-not-anthropic-mythos-is-a-curious-case/

[–] scruiser@awful.systems 8 points 3 weeks ago

Eliezer complaining about vigilante actions is really ironic considering one of his main themes in Harry Potter and the Methods of Rationality was "heroic responsibility" and complaining about how ordinary people default to doing nothing. I guess what he actually meant was for right-thinking people (people that agree with him) to take the actions he approves of.


So seeing the reaction on lesswrong to Eliezer's book has been interesting. It turns out that even among people who already mostly agree with him, a lot were hoping he would make their case better than he has (either because they aren't as convinced as he is, or they are but were hoping for something more palatable to the general public).

This review (lesswrong discussion here) calls out a really obvious issue: Eliezer's AI doom story was formed before deep learning took off, and in fact was mostly focused on GOFAI rather than neural networks, yet somehow the details of the story haven't changed at all. The reviewer is a rationalist who still believes in AI doom, so I wouldn't give her too much credit, but she does note this is a major discrepancy coming from someone who espouses a philosophy that (nominally) features a lot of updating your beliefs in response to evidence. The reviewer also notes that "it should be illegal to own more than eight of the most powerful GPUs available in 2024 without international monitoring" is kind of unworkable.

This reviewer liked the book more than they expected to, because Eliezer and Nate Soares get some details of the AI doom lore closer to the reviewer's current favored headcanon. The reviewer does complain that maybe weird and condescending parables aren't the best outreach strategy!

This reviewer has written their own AI doom explainer, which they think is better! From their limited description, I kind of agree, because it sounds like they focus on current real-world scenarios and harms (and extrapolate those to doom). But again, I wouldn't give them too much credit; it sounds like they don't understand why existential doom is actually promoted (as a distraction and a source of crit-hype). They also note the 8-GPUs thing is batshit.

Overall, it sounds like lesswrongers view the book as an improvement over the sprawling mess of arguments in the sequences (and scattered across other places like Arbital), but still not as well structured as it could be or stylistically quite right for a normie audience (i.e. the condescending parables and diversions into unrelated science-y topics). And some are worried that Nate and Eliezer's focus on an unworkable strategy (shut it all down, 8 GPUs max!) with no intermediate steps, goals, or options might not be the best.
