this post was submitted on 27 Jun 2026

Selfhosted & AI (anarchist.nexus)

submitted 2 days ago by curbstickle@anarchist.nexus to c/selfhosted@lemmy.world

Yup, I'm posting another this week. Sorry.

This week I'm hoping we can wrangle a solution around AI and our selfhosted community. There are plenty of strong opinions (both pro and con), but one thing is for certain - there needs to be better disclosure in promo posts. Two options (that aren't mutually exclusive):

Any post about AI-focused, AI-developed, etc. software gets an [AI] tag. No, a [Not-AI] tag is not needed to accomplish this; that's kind of a "non-golfer" sort of tag.
A comment requesting an AI disclosure response on every promo post, if it's not detailed in the post itself. Specifics (generating docs for commands, translation, whole-boat vibe-coded this app, etc.) would be requested.

I will say that having disclosure and/or tagging would mean that comments that just say "slop" or "fuck ai" or whatever would be off topic at that point. That information is already provided, so it's just noise (and sometimes pretty uncivil - I've been light on that for now due to the need for a rule on this).

The [AI] tag would make it easy to filter out (or search for, if that's your thing), but there are wildly different degrees of AI use out there. For the posts with a positive score, it's usually due to responsible AI use (translations, a snippet they had to do something obscure with, available to use with AI but doesn't require it, whatever), which is why I think disclosure has a place as a benefit to everyone.

Please provide any input or alternative options on this, and I can then put it to a vote like the last one. Comments seem to be the best approach without involving something off-site, but if you have a better idea/option, please share.
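
As a rough illustration of the filtering the [AI] tag would allow, here is a minimal Python sketch. It assumes the tag sits at the start of the post title and is matched case-insensitively; the function names are hypothetical and this is not how Lemmy itself filters posts.

```python
import re

# Assumption: the tag appears at the very start of the title, e.g. "[AI] My app".
AI_TAG = re.compile(r"^\s*\[AI\]\s*", re.IGNORECASE)


def has_ai_tag(title: str) -> bool:
    """Return True if the title starts with an [AI] tag."""
    return AI_TAG.match(title) is not None


def split_by_tag(titles):
    """Split titles into (tagged, untagged) lists, preserving their order."""
    tagged = [t for t in titles if has_ai_tag(t)]
    untagged = [t for t in titles if not has_ai_tag(t)]
    return tagged, untagged
```

A reader who wants to hide tagged posts would keep only the second list; someone searching for them would keep the first. The tag alone says nothing about the degree of AI use, which is the gap the disclosure comment is meant to fill.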

[–] curbstickle_lw@lemmy.world 16 points 1 day ago (2 children)

Just to point out a few projects that allow AI contributions:

Firefox
Node.js
Chromium
curl
Go
InfluxDB
MariaDB
Prometheus
Linux
OpenSSL
Blender
Mattermost
Caddy

If you want all projects related to AI in a different community, it may be easier for you to start "selfhosted_without_ai" or something.

[–] mereo@piefed.ca 12 points 1 day ago* (last edited 1 day ago) (1 children)

Indeed, AI is a tool, and the human should be an expert who verifies its work. What I don't like are posts about apps that are completely vibe-coded without any thought put into them, which pose dangerous security risks.

[–] curbstickle@anarchist.nexus 7 points 1 day ago (1 children)

That's the reasoning behind the disclosure bit. I agree it's a tool, and great when used correctly.

But if you try and use a hammer like a drill, you're gonna have a bad time.

[–] nullpotential@lemmy.dbzer0.com 2 points 1 day ago (1 children)

It is not 'just a tool.' It is not "great." Too many people focus on how it is used and not how it is created, how it affects us, and how it affects the world.

[–] curbstickle@anarchist.nexus 6 points 1 day ago (1 children)

I'm just going to shortcut this and say two things:

I can guarantee the overwhelming majority (if not all) of your issues have nothing to actually do with LLMs and everything to do with corporations. Power use, data center buildouts, market impact, whatever - none of it is an LLM problem. LLMs are just another piece of software, that's all.
Your personal opinion on this, as well as mine, does not change the overall conversation here. So how about we just stick to the topic at hand?

[–] midribbon_action@lemmy.blahaj.zone 1 points 10 hours ago (1 children)

LLMs are inextricably tied to nvidia gpus. Local or cloud, the technology exists to help the shovel salespeople. The gold diggers, everyone this tag is supposed to segregate, have been misled by corporations. Without their lies, and a pliant media, this tag would be unnecessary, and llms would be rolled out in a more limited and responsible way. To promote different uses of gold during a gold rush is going to inflate the bubble and enrich the rich, unless it is properly contextualized. Technology does not exist in a void; pretending it does digs us a deeper hole.

[–] curbstickle@anarchist.nexus 1 points 10 hours ago (1 children)

I run my local models on a Mac mini M2, but I could also be using AMD (ROCm), Intel with OpenVINO, or just CPU. For simple edge applications I could even use something like an RK3588.

Being tied to NVidia is marketing from NVidia, not the reality of LLMs.

[–] midribbon_action@lemmy.blahaj.zone 1 points 9 hours ago* (last edited 9 hours ago) (1 children)

Congrats, you are in the 1%. And that is exactly the type of additional context I think is necessary when discussing it, thank you.

[–] curbstickle@anarchist.nexus 1 points 9 hours ago (1 children)

"Congrats, you are in the 1%"

Sorry, I'm not following here. Are you suggesting I'm in the 1% of income earners? Because if so... LOL not even close, I'm barely in the top 40% by rough math.

Just to note, an Apple silicon Mac is one of the most efficient (dollars, wattage, whatever, take your pick) ways to run an LLM. Mine was a build target for a client's iOS stuff that I'm now repurposing; it was a refurb that cost about $600.

I can't even buy a GPU for that price these days.

[–] midribbon_action@lemmy.blahaj.zone 1 points 8 hours ago (1 children)

You are in the 1% of llm users.

Do you think anybody is choosing M2s over nvidia? Everybody running on cpu dreams of being rich enough to do real gold digging on an Ampere. AMD is the only direct competitor; it is an order of magnitude smaller and supports fewer models. I don't think any hobbyist has one of these alternatives at the top of their wishlist, they are substitutes.

[–] curbstickle@anarchist.nexus 1 points 8 hours ago (1 children)

I just set up on-prem LLMs for two clients with a cluster of Mac Studios. Another already had about a half dozen custom servers with Radeon Pros doing a different job, which got repurposed for on-prem.

I don't think the wishlist really matters, honestly. That'd just be pointing to the marketing team at NVidia IMO.

I actually don't have any really current NVidia hardware myself. I have mostly AMD GPUs, though I've been looking to pick up an Intel for AV1 purposes. I'd also mention that I recommend Apple often (for this specific purpose only) due to their efficiency and power use.

In any case, that doesn't change the reality here - there is no single specific manufacturer that must be used, and all an LLM is, is software. It's not to blame for what companies are doing any more than Linux is to blame because it's preferred as a server OS.

[–] midribbon_action@lemmy.blahaj.zone 1 points 7 hours ago (1 children)

OK, but still, just a clarification like "I'm not buying nor will I ever buy nvidia chips, now here's a thing I made with ai..." is enough context, in my opinion. Just blindly saying "ai is good for this" is promoting the bubble, as the vast vast majority of llm users are running nvidia chips.

Also, whatever financials you showed to convince them it was a good investment, whatever type of business it is, I think it was bullshit. I haven't seen any proof of efficiency gains at any company rollout of ai, certainly not on macs. Most companies are currently pulling back on token usage is my understanding. ROI has been unmeasurable so far. At best, you informed your clients explicitly this was a highly speculative purchase and may not benefit them.

[–] curbstickle@anarchist.nexus 1 points 7 hours ago* (last edited 7 hours ago) (1 children)

That's certainly a take.

"Also, whatever financials you showed to convince them it was a good investment"

I don't show financials or propose these decisions, I get paid to design and sometimes implement.

As far as whether or not it's a benefit, I'm going to have to completely disagree. As I previously mentioned - it's a tool. They are great at detecting potential security issues in code, data extraction and classification (especially with unstructured or poorly structured sources, like PDFs), knowledge base searches (especially where that knowledge may be spread across multiple internal sources like a wiki, memos, miscellaneous docs, etc.), doc review for tone to meet standards, etc.

Your statement that it essentially doesn't pay to use LLMs is intrinsically tied to the OpenAI/MS/NVidia/Anthropic/etc. "everything can be done with AI now!" marketing nonsense, just the same as believing it has a payoff for all scenarios. You recognize "all uses are good uses" as being complete bs, but you're jumping to "that means no uses are good uses".

And that is decidedly not true.

The best example is that data ingest I mentioned. I had a client looking to bring in a bunch of differently formatted forms to a database. What they had been doing was taking their regular employees who handle these forms and using them for data entry - a pretty poor use of their time.

Instead, these scans were evaluated by a tuned model specific to their needs. Each form has a unique ID (though the numbering schemes varied widely) and gets assigned to one of these folks for review at ingest. Each form is given a new unique number and a verification flag (3 stages - first employee review, second employee review, and final import acceptance), which was basically the same flow as the previous setup.

The difference is that each person didn't need to hunt across the form to find the details. When the comparison comes up for approval at each stage, they get the snippet being brought in and the field it's being applied to. It can be approved for that field, sent back for reevaluation, or sent for human only review (often this is because the scan sucked).

The project took less than 10% of the original timeframe, and the people handling the forms (and previously assigned for ingest) didn't end up with the stupidly increased workload that originally got assigned.

Again, using a tool for what it's good at is what's important. Using it for what you think it can do (i.e. the executive method) is just piss poor practice, driven by an easily convinced C-suite who gobble up marketing nonsense.

Edit: For the record, hardware costs were under $50k. The consulting costs themselves were higher than that, and considering it was originally going to take over a year to do, I'd easily bet it was a cost benefit even if they threw out the hardware after (they didn't, it got repurposed, it's not needed for new forms).
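
The three-stage review flow described in this comment can be sketched as a small state machine. This is a minimal Python sketch under stated assumptions, not the client's actual system: the class and function names are hypothetical, and sending a field back for reevaluation is assumed to restart it at the first review stage, which the comment does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    FIRST_REVIEW = 1       # first employee review
    SECOND_REVIEW = 2      # second employee review
    FINAL_ACCEPTANCE = 3   # final import acceptance
    IMPORTED = 4           # all three stages passed


class Decision(Enum):
    APPROVE = "approve"
    REEVALUATE = "reevaluate"   # send back to the model
    HUMAN_ONLY = "human_only"   # e.g. the scan was unreadable


@dataclass
class FieldCandidate:
    """One extracted snippet and the database field it is proposed for."""
    field_name: str
    snippet: str
    stage: Stage = Stage.FIRST_REVIEW
    needs_reevaluation: bool = False
    human_only: bool = False


def apply_decision(candidate: FieldCandidate, decision: Decision) -> FieldCandidate:
    """Advance, reset, or divert a candidate based on a reviewer's decision."""
    if candidate.stage is Stage.IMPORTED:
        raise ValueError("candidate has already been imported")
    if decision is Decision.APPROVE:
        candidate.stage = Stage(candidate.stage.value + 1)
        candidate.needs_reevaluation = False
    elif decision is Decision.REEVALUATE:
        # Assumption: reevaluation restarts the review from the first stage.
        candidate.stage = Stage.FIRST_REVIEW
        candidate.needs_reevaluation = True
    else:
        candidate.human_only = True
    return candidate
```

Three approvals in a row move a candidate to `IMPORTED`; any other decision either resets it or flags it for manual handling. The reviewer only ever sees one snippet and one target field at a time, which is the time saving the comment describes.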

[–] midribbon_action@lemmy.blahaj.zone 1 points 6 hours ago* (last edited 5 hours ago) (1 children)

I don't think you need hardly any hardware to do OCR. USPS started doing reliable OCR on 80s hardware. You really think an AI cluster is necessary for that?

Anyways, cool anecdote, not an actual financial study or report, and very long-winded honestly.

Post-edit reply: wow, that's kinda fucked up not to disclose that they disassembled it already. Looks like they found better uses. That's your success story?

[–] curbstickle@anarchist.nexus 1 points 6 hours ago (1 children)

OCR <> data ingest

OCR wouldn't work, as I mentioned, because of the varying structures of the forms.

I'm sorry my answer was too "long winded" for you, I was trying to be informative, but clearly you aren't interested in that. Enjoy your day.

[–] midribbon_action@lemmy.blahaj.zone 1 points 6 hours ago (1 children)

Don't think that's true. You can run the whole form through, come out with an identical pdf with searchable/copyable text. Even a completely novel form uses the same alphabet. Add some regex to pull out the fields you need to enter, and on failure give it to a human. All of that can be done with python on a raspberry pi. A decade ago.

https://github.com/ocrmypdf/OCRmyPDF
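
The regex-with-human-fallback approach proposed in this comment might look roughly like the Python sketch below. It assumes OCR has already been run (for example with OCRmyPDF) and the page text is available as a string. The field names and patterns are hypothetical, and the sketch only covers fixed-pattern fields, so it does not settle whether this is enough for forms whose fields depend on each other.

```python
import re

# Hypothetical field patterns; real forms would need their own.
FIELD_PATTERNS = {
    "form_id": re.compile(r"Form ID:\s*([A-Z0-9-]+)"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_fields(text: str):
    """Return (fields, missing): matched values and names of unmatched fields."""
    fields, missing = {}, []
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
        else:
            missing.append(name)
    return fields, missing


def route(text: str):
    """Send a page to automatic import, or to a human if any field is missing."""
    fields, missing = extract_fields(text)
    if missing:
        return ("human_review", missing)
    return ("auto", fields)
```

Anything the patterns cannot match is routed to a person instead of being guessed at, which is the "on failure give it to a human" step.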

[–] curbstickle@anarchist.nexus 1 points 6 hours ago (1 children)

You'd be wrong.

The fields aren't all the same kinds of values, which requires the relationships between the data to be evaluated for entry.

You're assuming this is transposing contents, which was not the issue. Your example is what was initially planned and halted before transitioning to the approach I helped deploy.

[–] midribbon_action@lemmy.blahaj.zone 1 points 6 hours ago (1 children)

That's wrong, you didn't know that there's another if/else statement required by them. That's what the supercomputer is for.

That's how you sound.

[–] curbstickle@anarchist.nexus 1 points 6 hours ago (1 children)

So I'll go back to my previous comment; you're not actually interested in understanding the use, you have a pre-determined (and uninformed) view of use and operation, and providing that information as an example is "long-winded".

I'll be done with this discussion now. Enjoy your day.

[–] midribbon_action@lemmy.blahaj.zone 1 points 5 hours ago

"The difference is that each person didn't need to hunt across the form to find the details. When the comparison comes up for approval at each stage, they get the snippet being brought in and the field it's being applied to."

This is the only technical detail in the whole 500 word comment.

[–] technocrit@lemmy.dbzer0.com -2 points 1 day ago* (last edited 1 day ago) (1 children)

There is absolutely zero "AI" involved in the development of any of these. They just use computer programs. No actual intelligence apart from humans.

[–] curbstickle_lw@lemmy.world 3 points 1 day ago* (last edited 1 day ago)

I think you may be misunderstanding the terminology here.

AI is a general term. LLMs are a subset, as are ML, DL, ANNs, NLP, CV, Expert models, etc.

Today you would define what we have as ANI, where the "N" stands for "Narrow". This is also known as "weak" AI.

What you're referring to would be called AGI, where the "G" stands for "General", where an AI would have a human degree of intelligence. This is pure concept today, and does not exist.

Also on the list would be ASI, where the "S" is for "Super", where the AI in question has more collective intelligence than humanity across all domains. This is purely hypothetical.

But AI has existed for decades. The first application I know of is Dendral, which was created in the 1960s to analyze mass spectrometry data to identify organic molecules. This was what's called an Expert model - basically a lot of if-then statements - and it led to things like MYCIN.

We don't need to redefine words here.
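
To make the "lot of if-then statements" description concrete, here is a toy rule-based sketch in Python. The rules and fact names are invented for illustration and have nothing to do with the actual rule sets of Dendral or MYCIN. It only shows the general shape of an expert model: known facts plus if-then rules, applied repeatedly until nothing new can be concluded.

```python
# Toy rules: (set of facts that must all be known, fact to conclude).
# These are made up for illustration only.
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
]


def infer(facts, rules=RULES):
    """Apply the if-then rules repeatedly until no new facts are derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known
```

Given `{"has_feathers", "cannot_fly", "swims"}`, the first rule adds `is_bird`, which then lets the second rule add `is_penguin`. There is no learning involved, only hand-written rules, yet this style of program has long been classified as AI.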