PewDiePie releases Codex/ClaudeCode/Cursor killer, Odysseous (FOSS) (youtu.be)

submitted 3 weeks ago by appauled@sh.itjust.works to c/selfhosted@lemmy.world

cross-posted from: https://sh.itjust.works/post/61139432

I seriously can't believe how much progress he's made for the FOSS community. He actually might take a bite out of the big 3's profits with this.

[–] Rhaedas@fedia.io 33 points 3 weeks ago (1 children)

16GB is plenty, even for older model setups. There are now a few models designed so that you load only parts of the model onto the GPU (Mixture of Experts) and use the CPU for the less frequently used sections, so you get both reasonable speed and a much more complex model.
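
To put rough numbers on the Mixture of Experts point above: in a MoE model only a few experts run for each token, so the parameters touched per token are a small slice of the total. The sketch below is plain arithmetic, and every figure in it (shared size, expert size, expert count, experts per token) is a made-up assumption for illustration, not the spec of any real model.

```python
def moe_param_split(shared_b, expert_b, n_experts, top_k):
    """Return (total, active_per_token) parameter counts, in billions.

    shared_b  : parameters every token uses (attention, embeddings, ...)
    expert_b  : parameters in one expert
    n_experts : experts stored in the model
    top_k     : experts actually run for each token
    All values passed below are hypothetical.
    """
    total = shared_b + expert_b * n_experts
    active = shared_b + expert_b * top_k
    return total, active


# Hypothetical model: 2B shared, 128 experts of 0.25B each, 8 used per token.
total, active = moe_param_split(shared_b=2.0, expert_b=0.25, n_experts=128, top_k=8)
print(f"total: {total}B, active per token: {active}B")
```

With these assumed numbers the model stores 34B parameters but only touches 4B per token, which is roughly why keeping most of the expert weights in system RAM can still give usable speed.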

[–] onlinepersona@programming.dev 2 points 2 weeks ago (1 children)

Oh nice. Does that depend on just the model or are there other requirements like CUDA or something?

[–] Rhaedas@fedia.io 2 points 2 weeks ago (1 children)

Most models are going to require CUDA. There are some AMD ones out there, but that's totally different math and setup. As for the one I mentioned, it's a pretty new idea, so there are only a few out there, maybe just one (Qwen-based). But I did get a 31B model to work on my 12GB card; I just had to move from Ollama to llama.cpp to gain the control needed to set the parameters, and fine-tune how much it put on the GPU, up to the max it would take. I had Claude help me along the way.

It's new enough that there aren't any good abliterated/uncensored models yet.
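
For anyone wanting to try the same kind of GPU/CPU split, here is a minimal sketch of the idea, not the settings actually used above. It guesses how many layers fit in VRAM and builds a llama.cpp `llama-server` command using its `-m` (model path) and `-ngl` (number of layers to offload to the GPU) options. The model size, layer count, reserve for KV cache and the `model.gguf` path are all assumptions, and treating every layer as the same size is a simplification.

```python
def layers_that_fit(model_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many layers fit in VRAM.

    Assumes all layers are the same size (a simplification) and holds back
    reserve_gb for KV cache and scratch buffers (an assumed figure).
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


def llama_server_args(model_path, n_gpu_layers):
    """Build an argument list for llama.cpp's llama-server."""
    return ["llama-server", "-m", model_path, "-ngl", str(n_gpu_layers)]


# Hypothetical: an 18 GB GGUF file with 48 layers on a 12 GB card.
n = layers_that_fit(model_gb=18.0, n_layers=48, vram_gb=12.0)
print(" ".join(llama_server_args("model.gguf", n)))
```

In practice the estimate is only a starting point: raise or lower the `-ngl` value until the model loads without running out of VRAM.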

[–] Jayjader@jlai.lu 3 points 2 weeks ago

I'm surprised that you're talking about models being CUDA-specific or AMD-specific. I've had a bunch of models running on my AMD-only PC, using Ollama, Lemonade, and LM Studio, through either ROCm or Vulkan. None of these models were billed as AMD-specific. I had to do some config tweaking for Ollama to use my graphics card, but that's more because I have a weird in-between-generations card that also predates the LLM hype (6700 XT).

However, I did generally need to look for the GGUF format versions of things. Accounts like unsloth usually have them uploaded on Hugging Face barely a day or two after the original version gets posted.