this post was submitted on 31 Mar 2026

Selfhosted

I’m considering starting a Lemmy instance with a limited federation model. From the start, I want to plan how to support and maintain it as it grows while spending as little attention as possible on infrastructure management itself.

Because of that, I’m especially interested in hearing from admins who host Lemmy instances, particularly larger ones. I’d like to understand what your actual workflow looks like in practice: how you organize administration, what methodologies you use, how you handle backups, data recovery, upgrades, monitoring, and infrastructure maintenance in general. I’m also interested in whether there are any best practices or operational patterns that have proven reliable over time.

From what I’ve found so far, the official Lemmy documentation on backup and restore seems reasonably good for small instances, but as the instance grows, more nuances and complications appear. So ideally, I’d like to find or assemble something closer to a real guideline or runbook based on practices that are actually used by admins running larger instances.

If you run or have run a Lemmy instance, especially one that had to scale beyond a small personal or experimental setup, I’d really appreciate hearing about your experience. Even brief notes, links to documentation, internal checklists, or descriptions of what has and hasn’t worked for you would be very useful.

nachitima@lemmy.ml, 1 week ago

Hey, super helpful comment.

A few of the details you mentioned are exactly the kind of practical stuff I’m trying to collect, so I wanted to ask a bit more:

  • When you say you pushed federation workers up to 128, which exact setting are you referring to?
  • Roughly how big is your instance in practice — users, subscriptions, remote communities, storage size, daily activity?
  • What were the first signs that federation was falling behind, besides the “Waiting for X workers” log message?
  • Did increasing workers fully solve it, or did it just move the bottleneck somewhere else?
  • What kind of Postgres tuning ended up mattering most for you?
  • For backups, are you only doing weekly pg_dump + VPS backups, or also separately backing up pictrs, configs, secrets, and proxy setup?
  • Have you tested full restore end-to-end on another machine?
  • For pictrs growth, have you found any good way to keep storage under control, or is it mostly just “plan for it to grow”?
  • For monitoring/logging, if you were starting over, what would you set up from day one?

I’m mostly interested in the boring operational side of running Lemmy long-term: backup/restore, federation lag, storage growth, and early warning signs before things get messy.

Sorry if some of these questions are a bit basic or oddly specific — I’m using AI to help gather as much real-world Lemmy hosting experience as possible, and it generated most of these follow-up questions for me.