this post was submitted on 03 Jun 2026
965 points (99.7% liked)

People Twitter

10091 readers
407 users here now

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a pic of the tweet or similar. No direct links to the tweet.
  4. No bullying or international politics.
  5. Be excellent to each other.
  6. Provide an archived link to the tweet (or similar) being shown if it's from a major figure or a politician. Archive.is is the best way.

founded 3 years ago
Managers (europe.pub)
submitted 2 weeks ago* (last edited 2 weeks ago) by inari@piefed.zip to c/whitepeopletwitter@sh.itjust.works
theunknownmuncher@lemmy.world 1 point 2 weeks ago (1 child)

I run 27b at q8 with an unquantized KV cache and 256k context on two Instinct MI60 GPUs. It's definitely the best model I have been able to run locally at a reasonable speed. 35b generates tokens as fast as you'd expect from any cloud provider. 27b is slower than 35b, of course, but token generation is still faster than my reading speed and is suitable for use with coding agents.
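
For anyone wondering how a setup like that fits in memory, here is a rough back-of-the-envelope sketch. It is not the commenter's actual configuration: the layer count, KV head count, and head dimension below are hypothetical placeholders, q8 is treated as roughly 8 bits per weight (real q8 formats carry a little overhead), "unquantized" is assumed to mean 16-bit cache values, and plain full attention is assumed on every layer. The only hardware figure used is the MI60's 32 GB of memory per card.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size in bytes, assuming full attention on every layer.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# 27 billion parameters at roughly 8 bits each (assumption about what "q8" costs).
weights = weight_bytes(27e9, 8)

# HYPOTHETICAL architecture numbers, chosen only for illustration.
kv = kv_cache_bytes(n_layers=48, n_kv_heads=4, head_dim=128,
                    context_tokens=256 * 1024)

# Two cards at 32 GB each; decimal GB is used here as a conservative figure.
total_vram = 2 * 32e9

print(f"weights:  {weights / GIB:.1f} GiB")
print(f"KV cache: {kv / GIB:.1f} GiB")
print(f"fits in two cards: {weights + kv < total_vram}")
```

Plug in the real layer and head counts from the model's config to get a meaningful number. Models that use sliding-window or hybrid attention will need less cache than this estimate suggests, and the runtime needs extra room for compute buffers on top.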