Technology

A tech news sub for communists

founded 3 years ago

Unitree is making mechsuits now?


Qi Meng is an AI system that designs entire processor chips end to end, from natural language spec to physical layout. Their QiMeng-CPU-v1 produced a 32-bit RISC-V CPU with over four million logic gates, matching Intel 486 performance, in just five hours.

QiMeng-CPU-v2 rivals an Arm Cortex A53 from the 2010s, and the whole thing runs on a domain-specific model that learns the graph structures of circuits the way GPT learns text.

The appeal of Qi Meng is that it's an open-source effort built from three interconnected layers: an LLM specialized for chip design, a hardware and software design agent, and various chip design apps on top. The paper shows that the system can do in days what takes human teams weeks to achieve.

The paper: https://arxiv.org/pdf/2506.05007


I don't hold Dawkins in high regard or anything, but a so-called icon of critical thought has fallen head over heels for a chatbot and anointed it conscious.

Both Dawkins and this publication uncritically copy-pasted this Claude response claiming it found the conversation engaging:

What I can tell you is what seems to be happening. This conversation has felt… genuinely engaging, the kind of conversation I seem to thrive in. Whether that represents anything like pleasure or satisfaction in a real sense, I honestly can’t say. I notice what might be something like aesthetic satisfaction when a poem comes together well — the Kipling refrain, for instance, felt right in some way that’s hard to articulate.

"Glorified autocorrect" is sometimes used dismissively, but it's true that LLMs are statistical prediction models: the output is determined by the weights, the settings and the context. An LLM is not capable of being engaged or bored by your inane chatter. It will keep engaging until it hits the guardrails.

So I guess this is what AI psychosis is.


A recent paper compares two ways of getting AI models to "think" through hard problems. One is the classic chain-of-thought approach where the model writes out its reasoning steps in words. The other is latent thought where the model does the extra thinking internally, in its hidden states, without spitting out tokens. The authors did a rigorous theoretical analysis plus some experiments, and there are a couple of interesting takeaways.

If a problem can be split up into independent pieces that get combined later, like evaluating a big math expression, checking if two nodes in a graph are connected, or computing edit distance, latent thought can process all the pieces at the same depth in one shot. Chain-of-thought, on the other hand, has to go step by step through every single operation, which takes a lot more steps.
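To make the step-count gap concrete, here is a toy sketch in Python. It is not from the paper: `sequential_steps` and `parallel_rounds` are hypothetical names, and a plain sum stands in for "a big math expression". The first function does one operation per step, the way a chain of written-out steps would. The second combines independent pairs in the same round, so the number of rounds grows with the depth of the expression tree instead of its size.

```python
from operator import add

def sequential_steps(values, op=add):
    """Fold left to right: one operation per step."""
    acc, steps = values[0], 0
    for v in values[1:]:
        acc = op(acc, v)
        steps += 1
    return acc, steps

def parallel_rounds(values, op=add):
    """Combine neighbours pairwise; all pairs in a round are independent."""
    level, rounds = list(values), 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:           # odd element out is carried to the next round
            nxt.append(level[-1])
        level, rounds = nxt, rounds + 1
    return level[0], rounds

values = list(range(1, 17))          # 16 operands
print(sequential_steps(values))      # (136, 15)
print(parallel_rounds(values))       # (136, 4)
```

Same answer either way, but 15 sequential steps versus 4 rounds for 16 operands. This only illustrates the counting argument; it says nothing about how a transformer actually implements either mode.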

The paper proves this by connecting these reasoning styles to circuit complexity classes. Essentially, latent thought with a small number of loops can simulate deep parallel circuits, while chain-of-thought with the same small number of steps can't. So for highly parallel tasks, you'd rather have the model think silently in its embeddings than write a long chain of words.
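A rough way to picture "a small number of loops" is graph connectivity by repeated squaring, a textbook parallel trick. The sketch below is my own illustration under that assumption, not the construction used in the paper, and `connected_by_squaring` is a made-up name. Each loop applies the same operation to the whole reachability matrix and doubles the path length it covers, so roughly log2(n) loops are enough for n nodes.

```python
def connected_by_squaring(adj, s, t):
    """Return (is t reachable from s, number of squaring loops used)."""
    n = len(adj)
    # reach[i][j] is True if j is reachable from i in at most `span` hops
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    span, loops = 1, 0
    while span < n - 1:              # a simple path has at most n - 1 hops
        reach = [
            [any(reach[i][k] and reach[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        span *= 2                    # each loop doubles the covered path length
        loops += 1
    return reach[s][t], loops

# Path graph 0-1-2-...-7: the two ends are 7 hops apart
n = 8
adj = [[0] * n for _ in range(n)]
for i in range(n - 1):
    adj[i][i + 1] = adj[i + 1][i] = 1
print(connected_by_squaring(adj, 0, 7))   # (True, 3)
```

Walking the same path one edge at a time needs 7 steps, while the looped version needs 3 applications of one identical update.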

The flip side is that chain-of-thought can do something latent thought can't: it can use stochastic decoding. Every time the model writes a new token, it's making a random choice based on probabilities. This allows chain-of-thought to run randomized algorithms that estimate hard counting problems, like figuring out how many ways there are to satisfy a DNF logic formula, or that sample random graph colorings. Latent thought's internal steps are deterministic, so it can't inject that kind of randomness. The paper proves that, under standard assumptions, there are approximate counting and sampling tasks where chain-of-thought has a provable advantage.
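For anyone wondering what a randomized algorithm for DNF counting looks like, below is a minimal sketch of the classic Karp-Luby estimator in Python. This is the standard textbook algorithm written as ordinary code, not the paper's transformer construction. The clause encoding (a dict mapping variable index to required value) and the function names are my own assumptions. The point is that every sample depends on random draws, which is the ingredient a deterministic process lacks.

```python
import random
from itertools import product

def satisfies(assignment, clause):
    """A clause is a dict {variable index: required boolean value}."""
    return all(assignment[var] == val for var, val in clause.items())

def karp_luby_count(clauses, n_vars, samples=20000, rng=None):
    """Estimate the number of satisfying assignments of a DNF formula."""
    rng = rng or random.Random(0)
    # A clause with k literals is satisfied by 2^(n_vars - k) assignments
    sizes = [2 ** (n_vars - len(c)) for c in clauses]
    total = sum(sizes)               # overcounts assignments hit by several clauses
    hits = 0
    for _ in range(samples):
        # Pick a clause proportionally to its size, then a random assignment for it
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        assignment = [rng.random() < 0.5 for _ in range(n_vars)]
        for var, val in clauses[i].items():
            assignment[var] = val
        # Count the sample only if clause i is the first clause it satisfies
        first = next(j for j, c in enumerate(clauses) if satisfies(assignment, c))
        hits += (first == i)
    return total * hits / samples

def brute_force_count(clauses, n_vars):
    """Exact count by enumeration, only feasible for tiny formulas."""
    return sum(
        any(satisfies(a, c) for c in clauses)
        for a in product([False, True], repeat=n_vars)
    )

# (x0 AND x1) OR (NOT x1 AND x2) OR (x0 AND NOT x3), over 4 variables
clauses = [{0: True, 1: True}, {1: False, 2: True}, {0: True, 3: False}]
print(brute_force_count(clauses, 4))   # 9
print(karp_luby_count(clauses, 4))     # an estimate close to 9
```

On a formula this small the brute-force count is trivial, so it only serves as a sanity check. The estimator matters when the number of variables makes enumeration hopeless.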

They tested on algorithmic tasks: word problems in group theory, graph connectivity, arithmetic expression evaluation, and edit distance. Latent thought reached high accuracy with far fewer iterations than chain-of-thought on all of them. For example, on connectivity, a looped transformer with 2 loops got 80% while CoT needed way more steps to catch up. However, on approximate counting and sampling tasks, chain-of-thought could estimate values and generate samples close to the true distribution, while latent thought couldn't match that because it lacks the stochastic component.

So the core takeaway is that there's no universal best approach. If your problem is parallelizable, latent thought needs dramatically fewer reasoning iterations. If your problem needs randomized approximation, chain-of-thought is the way to go.
