this post was submitted on 28 Nov 2025
609 points (94.6% liked)

Selfhosted

53242 readers
1045 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

  1. Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don't duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

Resources:

Any issues on the community? Report them using the report flag.

Questions? DM the mods!

founded 2 years ago

I got into the self-hosting scene this year when I wanted to start up my own website on an old recycled ThinkPad. A lot of time was spent learning about ufw, reverse proxies, header security hardening, and fail2ban.

Despite all that, I still had a problem with bots knocking on my ports and spamming my logs. I tried some hackery getting fail2ban to read Caddy logs, but that didn't work for me. I nearly gave up and went with Cloudflare like half the internet does, but my stubbornness about open-source self-hosting, plus the Cloudflare outages this year, encouraged me to try alternatives.

Coinciding with that, I'd been seeing this thing more and more in the places I frequent, like Codeberg. This is Anubis, a proxy-style firewall that forces the browser client to do a proof-of-work security check, plus some other clever things to stop bots from knocking. I got interested and started thinking about beefing up security.

I'm here to tell you to try it if you have a public-facing site and want to break away from Cloudflare. It was VERY easy to install and configure with a Caddyfile on a Debian distro with systemctl. Within an hour it had filtered multiple bots, and so far the knocks seem to have slowed down.

https://anubis.techaro.lol/

My botspam woes have been seriously mitigated, if not completely eradicated. I'm very happy with tonight's little security upgrade project, which took no more than an hour of my time to install and read through the documentation. Current chain: Caddy reverse proxy -> Anubis -> services.
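For anyone wanting to replicate that chain, a minimal Caddyfile sketch looks something like this; the port numbers are assumptions (8923 is the bind port Anubis's docs use by default, and the backend port is whatever your service listens on):

```caddyfile
# Caddy terminates TLS and hands all traffic to Anubis
example.com {
    reverse_proxy localhost:8923
}
```

Anubis's own environment file then points its TARGET variable at the real service (e.g. TARGET=http://localhost:3000), completing Caddy -> Anubis -> service.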

A good place to start for installation is here:

https://anubis.techaro.lol/docs/admin/native-install/

[–] Deathray5@lemmynsfw.com 5 points 1 day ago (1 children)

Unrelated, but one day I won't get gender envy from a random cartoon woman

[–] Holytimes@sh.itjust.works 1 points 18 hours ago

At least you don't have ear and tail envy, it's so fluffy

[–] termaxima@slrpnk.net 2 points 23 hours ago

I am very annoyed that I have to enable cloudflare's JavaScript on so many websites, I would much prefer if more of them used Anubis so I didn't have third-party JavaScript running as often.

( coming from an annoying user who tries to enable the fewest things possible in NoScript )

[–] daniskarma@lemmy.dbzer0.com 39 points 1 day ago* (last edited 1 day ago)

I don't think you have a use case for Anubis.

Anubis is mainly aimed against bad AI scrapers, plus some DDoS mitigation if you run a heavy service.

You are getting hit exactly the same; Anubis doesn't put up a block list or anything, it just puts itself in front of the service. The load on your server and the risk you take are very similar with or without Anubis here. Most bots are not AI scrapers, they are just probing, so the hit on your server is the same.

What you want is to properly set up fail2ban or, even better, CrowdSec. That would actually block and ban bots that try to probe your server.

If you are just self-hosting, the only thing Anubis does is divert the log noise into Anubis's logs and make your own devices do a PoW every once in a while when you want to use your services.

Being honest, I don't know what you are self-hosting, but unless it's something that's going to get DDoSed or AI-scraped, there's not much point in Anubis.

Also, Anubis is not a substitute for fail2ban or CrowdSec. You need something to detect and ban brute-force attacks; otherwise an attacker would only need to execute the Anubis challenge once, get the token for the week, and then they are free to attack your services as they like.
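Since the OP mentioned fail2ban-on-Caddy-logs not working: it can be made to work with a custom filter. A rough sketch, where the filter/jail names and log path are made up for illustration, and the JSON field may be `client_ip` instead of `remote_ip` depending on your Caddy version:

```ini
# /etc/fail2ban/filter.d/caddy-4xx.conf (hypothetical filter name)
[Definition]
# Caddy writes JSON access logs; ban IPs that rack up 401/403/404 responses
failregex = ^.*"remote_ip":"<HOST>".*"status":(?:401|403|404)

# /etc/fail2ban/jail.d/caddy-4xx.local
[caddy-4xx]
enabled  = true
port     = http,https
filter   = caddy-4xx
logpath  = /var/log/caddy/access.log
maxretry = 10
findtime = 60
bantime  = 1h
```

Caddy's epoch timestamps can confuse fail2ban's date detection, so test the filter with fail2ban-regex against a real log line before enabling the jail.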

[–] sudoer777@lemmy.ml 15 points 1 day ago* (last edited 1 day ago) (5 children)

I host my main server on my own hardware, and a VPN on Hetzner because my shitty ISP doesn't let me port forward. For the past year, bots were hitting my Forgejo instance hard. I forgot to disable registration and they generated hundreds of accounts with hundreds of repos full of sketchy links, generating terrabytes of traffic from my VPS and costing me money. I disabled registration and deleted the spam, but bots still kept hitting my server for several months, which would cause memory leaks over time, crash it, consume CPU, and still cost me money with terrabytes of traffic per month. A few weeks ago, I put Anubis on the VPS. Now zero bots hit my Forgejo instance and I don't pay for their traffic anymore. Problem solved.

[–] LOLseas@lemmy.zip 1 points 1 day ago (1 children)

This is the first time I've ever seen it misspelled like that. It's 'terabyte/terabytes'. 1,024 GBs worth of data.

[–] sudoer777@lemmy.ml 2 points 19 hours ago

Oops, although a terabyte is 1000 GB; 1024 GiB is a tebibyte

[–] Jason2357@lemmy.ca 6 points 1 day ago

It's always code forges and wikis that are affected by this, because the scrapers spider down into every commit or edit in your entire history, then come back the next day and check every “page” again to see if any changed. Consider just blocking pages that are commit history at your reverse proxy.
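In Caddy that blocking can be a couple of lines; the path pattern here is hypothetical (loosely Forgejo-shaped) and needs adjusting to your forge's actual URL scheme:

```caddyfile
# Refuse the expensive per-commit pages outright before they hit the forge
@history path_regexp ^/[^/]+/[^/]+/(commit|blame|compare)/
respond @history 403
```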

[–] drkt_@lemmy.dbzer0.com 13 points 1 day ago

Stop playing wack-a-mole with these fucking people and build TARPITS!

Make it HURT to crawl your site illegitimately.

[–] smh@slrpnk.net 20 points 1 day ago

The creator is active on a professional slack I'm on and they're lovely and receptive to user feedback. Their tool is very popular in the online archives/cultural heritage scene (we combine small budgets and juicy, juicy data).

My site has enabled js-free screening when the site load is low, under the theory that if the site load is too high then no one's getting in anyway.

[–] non_burglar@lemmy.world 193 points 2 days ago (9 children)

Anubis is an elegant solution to the ai bot scraper issue, I just wish the solution to everything wasn't just spending compute everywhere. In a world where we need to rethink our energy consumption and generation, even on clients, this is a stupid use of computing power.

[–] Dojan@pawb.social 114 points 2 days ago* (last edited 2 days ago) (11 children)

It also doesn’t function without JavaScript. If you’re security or privacy conscious chances are not zero that you have JS disabled, in which case this presents a roadblock.

On the flip side of things, if you are a creator and you’d prefer to not make use of JS (there’s dozens of us) then forcing people to go through a JS “security check” feels kind of shit. The alternative is to just take the hammering, and that feels just as bad.

No hate on Anubis. Quite the opposite, really. It just sucks that we need it.

[–] SmokeyDope@piefed.social 60 points 2 days ago* (last edited 2 days ago) (4 children)

There's a compute option that doesn't require JavaScript. The responsibility lies with site owners to configure it properly, IMO, though you can make the argument that it's not the default, I guess.

https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh

From docs on Meta Refresh Method

Meta Refresh (No JavaScript)

The metarefresh challenge sends a browser a much simpler challenge that makes it refresh the page after a set period of time. This enables clients to pass challenges without executing JavaScript.

To use it in your Anubis configuration:

# Generic catchall rule
- name: generic-browser
  user_agent_regex: >-
    Mozilla|Opera
  action: CHALLENGE
  challenge:
    difficulty: 1 # Number of seconds to wait before refreshing the page
    algorithm: metarefresh # Specify a non-JS challenge method

This is not enabled by default while this method is tested and its false positive rate is ascertained. Many modern scrapers use headless Google Chrome, so this will have a much higher false positive rate.

[–] quick_snail@feddit.nl 8 points 1 day ago

This is why we need these sites to have .onions. Tor Browser has a PoW that doesn't require js

[–] quick_snail@feddit.nl 26 points 2 days ago (2 children)

Kinda sucks how it makes websites inaccessible to folks who have to disable JavaScript for security.

[–] poVoq@slrpnk.net 26 points 1 day ago (6 children)

It kinda sucks how AI scrapers make websites inaccessible to everyone 🙄

[–] elbarto777@lemmy.world 1 points 1 day ago

You are both right.

[–] WhyJiffie@sh.itjust.works 14 points 1 day ago (3 children)

there's a fork that has non-js checks. I don't remember the name but maybe that's what should be made more known

[–] url@feddit.fr 23 points 2 days ago (1 children)

Honestly I'm not a big fan of Anubis. It fucks users with slow devices.

https://lock.cmpxchg8b.com/anubis.html

[–] url@feddit.fr 15 points 2 days ago

Did I forget to mention it doesn't work without JS, which I keep disabled?

[–] TerHu@lemmy.dbzer0.com 15 points 2 days ago (1 children)

yes, please be mindful when using cloudflare. with them you’re possibly inviting in a much much bigger problem

https://www.devever.net/~hl/cloudflare

[–] quick_snail@feddit.nl 8 points 1 day ago* (last edited 1 day ago)

Great article, but I disagree about WAFs.

Try to secure a nonprofit's web infrastructure as one IT guy with no budget for devs or security.

It would be nice if we could update servers constantly and patch unmaintained code, but sometimes you just need to front it with something that plugs those holes until you have the capacity to do updates.

But 100% the WAF should be run locally, not a MiTM from evil US corp in bed with DHS.

[–] sudo@programming.dev 48 points 2 days ago (8 children)

I've repeatedly stated this before: Proof-of-Work bot management is only Proof-of-JavaScript bot management. It is nothing for a headless browser to bypass. Proof of JavaScript does work and will stop the vast majority of bot traffic; that's how Anubis actually works. You don't need to punish actual users by abusing their CPU. POW is a far higher cost on your actual users than the bots.

Last I checked, Anubis has a JavaScript-less strategy called "Meta Refresh". It first serves you a blank HTML page with a <meta> tag instructing the browser to refresh and load the real page. I highly advise using the Meta Refresh strategy. It should be the default.

I'm glad someone is finally making an open source and self hostable bot management solution. And I don't give a shit about the cat-girls, nor should you. But Techaro admitted they had little idea what they were doing when they started and went for the "nuclear option". Fuck Proof of Work. It was a Dead On Arrival idea decades ago. Techaro should strip it from Anubis.

I haven't caught up with what's new with Anubis, but if they want to get stricter bot-management, they should check for actual graphics acceleration.

[–] rtxn@lemmy.world 13 points 2 days ago* (last edited 2 days ago) (11 children)

POW is a far higher cost on your actual users than the bots.

That sentence tells me that you either don't understand or consciously ignore the purpose of Anubis. It's not to punish the scrapers, or to block access to the website's content. It is to reduce the load on the web server when it is flooded by scraper requests. Bots running headless Chrome can easily solve the challenge, but every second a client is working on the challenge is a second that the web server doesn't have to waste CPU cycles on serving clankers.

POW is an inconvenience to users. The flood of scrapers is an existential threat to independent websites. And there is a simple fact that you conveniently ignored: it fucking works.
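The asymmetry described above (client burns many hashes, server spends one) is the core of any hash-based proof of work. A minimal sketch of the general scheme, not Anubis's actual challenge format or difficulty encoding:

```python
import hashlib
import secrets

def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce so that SHA-256(challenge + nonce)
    starts with `difficulty` zero hex digits (4 leading zero bits each)."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash call checks the claimed solution."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

challenge = secrets.token_hex(16)   # server issues a random challenge
nonce = solve(challenge, 4)         # client does ~65k hashes on average
assert verify(challenge, nonce, 4)  # server spends one hash to check
```

Whether that per-client cost is a feature (shedding load) or a bug (taxing real users) is exactly what this subthread is arguing about.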

[–] SmokeyDope@piefed.social 35 points 2 days ago* (last edited 2 days ago) (1 children)

Something that hasn't been mentioned much in discussions about Anubis is that it has a graded tier system for how sketchy a client is, changing the kind of challenge based on a weighted priority system.

The default bot policies it comes with pass squeaky-clean regular clients straight through; only slightly weighted clients/IPs get the metarefresh, and it's only at moderate-suspicion level that the JavaScript proof of work kicks in. The bot policy and weight triggers for these levels, the challenge action, and the duration of a client's validity are all configurable.

It seems to me that the sites which heavy-hand the proof of work for every client, with validity that only lasts 5 minutes, are the ones giving Anubis a bad rap. The default bot policy settings Anubis ships with don't trigger PoW on the regular Firefox Android clients I've tried, including hardened IronFox, while other sites show the finger wag on every connection no matter what.

It's understandable why some choose strict policies, but they give the impression this is the only way it should be done, which is overkill. I'm glad there are config options to mitigate the impact on the normal user experience.

[–] sudo@programming.dev 7 points 1 day ago

Anubis has a graded tier system of how sketchy a client is, changing the kind of challenge based on a weighted priority system.

Last I checked, that was just User-Agent regexes and IP lists. But that's where Anubis should continue development, and hopefully they've improved since. Discerning real users from bots is how you do proper bot management, not imposing a flat tax on all connections.

[–] 0_o7@lemmy.dbzer0.com 30 points 2 days ago (2 children)

I don't mind Anubis but the challenge page shouldn't really load an image. It's wasting extra bandwidth for nothing.

Just parse the challenge and move on.

[–] Allero@lemmy.today 23 points 2 days ago (1 children)

AFAIK, you can set it up not to have any image, or to use a different one.

[–] Voroxpete@sh.itjust.works 8 points 2 days ago* (last edited 2 days ago) (1 children)

It's actually a brilliant monetization model. If you want to use it as is, it's free, even for large corporate clients.

If you want to get rid of the puppygirls though, that's when you have to pay.

[–] kilgore_trout@feddit.it 18 points 2 days ago* (last edited 2 days ago) (3 children)

It's a palette of 10 colours. I would guess it uses an indexed colorspace, reducing the size to a minimum.
edit: 28 KB on disk

[–] A_norny_mousse@feddit.org 16 points 2 days ago* (last edited 2 days ago) (7 children)

At the time of commenting, this post is 8h old. I read all the top comments, many of them critical of Anubis.

I run a small website and don't have problems with bots. Of course I know what a DDOS is - maybe that's the only use case where something like Anubis would help, instead of the strictly server-side solution I deploy?

I use CrowdSec (it seems to work with caddy btw). It took a little setting up, but it does the job.
(I think it's quite similar to fail2ban in what it does, plus community-updated blocklists)

Am I missing something here? Why wouldn't that be enough? Why do I need to heckle my visitors?

Despite all that I still had a problem with bots knocking on my ports spamming my logs.

By the time Anubis gets to work, the knocking already happened so I don't really understand this argument.

If the system is set up to reject a certain type of request, these are microsecond transactions doing no harm (DDoS excepted).

[–] poVoq@slrpnk.net 12 points 1 day ago* (last edited 1 day ago) (1 children)

AI scraping is a massive issue for specific types of websites, such as git forges, wikis and, to a lesser extent, Lemmy etc., that rely on complex database operations that cannot be easily cached. Unless you massively overprovision your infrastructure, these web applications come to a grinding halt by constantly maxing out the available CPU power.

The vast majority of the critical commenters here seem to talk from a point of total ignorance about this, or assume operators of such web applications have time for hypervigilance, constantly monitoring and manually blocking AI scrapers (which do their best to circumvent more basic blocks). The realistic options for such operators right now are Anubis (or similar), Cloudflare, or shutting down their servers. Of these, Anubis is clearly the least bad option.

[–] daniskarma@lemmy.dbzer0.com 6 points 1 day ago* (last edited 1 day ago)

You are right. For most self-hosting use cases Anubis is not only irrelevant, it actually works against you: a false sense of security, plus making your devices do extra work for nothing.

Anubis is meant for public-facing services that may get DDoSed or AI-scraped by some untargeted bot (for a targeted bot it's trivial to get past Anubis and scrape anyway).

And it's never a substitute for CrowdSec or fail2ban. Getting an Anubis token is just a matter of executing the PoW challenge; you still need a way to detect and ban malicious attacks.

I also used CrowdSec for almost a year, but as AI scrapers became more aggressive, CrowdSec alone wasn't enough. The scrapers used distributed IP ranges and spoofed user agents, making them hard to detect, and they kept hammering my Forgejo instance's expensive routes. I tried custom CrowdSec rules but hit its limits.

Then I discovered Anubis. It's been an excellent complement to CrowdSec, and I now run both. In my experience they work very well together, so the question isn't "A or B?" but rather "how can I combine them, if needed?"

[–] quick_snail@feddit.nl 7 points 1 day ago (1 children)

With varnish and wazuh, I've never had a need for Anubis.

My first recommendation for anyone struggling with bots is to fix their cache.
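Fixing the cache first is sound advice. As a rough illustration only, a minimal Varnish VCL that caches anonymous GETs might look like this; the backend address is an assumption and real-world VCL needs considerably more care (cookie stripping, TTLs, purging):

```vcl
vcl 4.0;

# Assumed backend: your web app on localhost:3000
backend default {
    .host = "127.0.0.1";
    .port = "3000";
}

sub vcl_recv {
    # Only anonymous, idempotent traffic is cacheable; logged-in
    # sessions and non-GET requests go straight to the backend.
    if (req.method != "GET" || req.http.Cookie || req.http.Authorization) {
        return (pass);
    }
}
```

With most scraper traffic being anonymous GETs, even a crude cache like this absorbs the bulk of the load before it reaches the application.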

[–] kalleboo@lemmy.world 2 points 1 day ago (1 children)

Anubis was originally created to protect git web interfaces since they have a lot of heavy-to-compute URLs that aren't feasible to cache (revision diffs, zip downloads etc).

After that I think it got adopted by a lot of people who didn't actually need it, they just don't like seeing AI scrapers in their logs.

[–] quick_snail@feddit.nl 1 points 23 hours ago

Yes!

Also, another very simple solution is to authwall expensive pages that can't be cached.

[–] henfredemars@infosec.pub 30 points 2 days ago (1 children)

I appreciate a simple piece of software that does exactly what it’s supposed to do.
