this post was submitted on 18 Jun 2025
879 points (98.8% liked)

Fediverse

A community to talk about the Fediverse and all its related services using ActivityPub (Mastodon, Lemmy, KBin, etc).

If you wanted to get help with moderating your own community then head over to !moderators@lemmy.world!

Learn more at these websites: Join The Fediverse Wiki, Fediverse.info, Wikipedia Page, The Federation Info (Stats), FediDB (Stats), Sub Rehab (Reddit Migration)

[–] Steven_T_Baxter@lemmy.world 1 points 15 hours ago

I don't mind training AI to give everyone smarter answers. It just seems the more civic, community-minded thing to do. BUT what I want is a % of the revenue they make selling my metadata, and a list of who they sold it to.

[–] it_depends_man@lemmy.world 191 points 4 days ago (1 children)

Mastodon dot SOCIAL did, the big public instance. Mastodon the software doesn't have these restrictions.

[–] Scrollone@feddit.it 27 points 4 days ago

It wouldn't even make sense for the Mastodon software to have such a restriction... The article title is misleading.

[–] ShinkanTrain@lemmy.ml 101 points 4 days ago

AI scrapers:

[–] Fizz@lemmy.nz 92 points 4 days ago (5 children)

Yeah this will do absolutely nothing.

[–] Cris_Color@lemmy.world 123 points 4 days ago (1 children)

I agree, but I'm glad they did it anyway.

[–] Fizz@lemmy.nz 7 points 4 days ago

Fair, there is no reason not to.

[–] SmolSteely@lemmynsfw.com 55 points 4 days ago (1 children)

It does provide for the possibility of future legal action. This should have been done a year or two ago.

[–] drmoose@lemmy.world 13 points 4 days ago* (last edited 4 days ago) (6 children)

No, it doesn't, because all Mastodon data is public and does not require ToS agreement to be collected.

Mastodon could only argue damages, but that would be impossible to litigate to any extent due to the decentralized and free nature of Mastodon and the Fediverse. Except for some backward countries like China or Japan, where there are no information freedom protections and any corporation can sue you for damages for any information infringement (even if it's not yours).

This is a good thing. Mastodon shouldn't control anything related to the legality of data flowing in the fediverse - that's the entire point.

[–] umbraroze@slrpnk.net 6 points 4 days ago (1 children)

The way copyright law works, by default you don't have any right to make use of anything, even if it's posted publicly. Why do people allow Fediverse platforms to do the thing they do? Leniency on their part.

Gathering data from Mastodon for AI training is technically feasible, but that doesn't mean it's legally justified. Many people will object to that. Many already do!

[–] drmoose@lemmy.world 3 points 4 days ago (4 children)

No, that's not how copyright works. Copyright prohibits distribution, not copying.

[–] umbraroze@slrpnk.net 3 points 4 days ago (1 children)

Er, yes, my point was copyright very much concerns what you're allowed to do with data. But that goes beyond distribution. Derivative works are a complicated topic.

My point stands: whether you technically can copy stuff has no bearing on whether you're allowed to use it and for what purpose.

[–] drmoose@lemmy.world 3 points 4 days ago (1 children)

Well, it depends on the use. If it's a movie that I copied, then I can watch it; if it's a picture, I can print it and put it on a wall at my home. Even for AI training, it's currently considered to be entirely legal to train on copyrighted data. You can even parse copyrighted data for analytics, which is entirely legal as well.

So you can do a lot with copyrighted data without breaching the copyright, including AI training, which is the article's topic.

[–] umbraroze@slrpnk.net 2 points 4 days ago (1 children)

Private use of the copyrighted works is pretty much a separate topic entirely.

And while the law isn't settled on the topic, it's wrong to argue AI training is something that happens entirely in a private setting, especially when that work is made available publicly in some form or another.

Sure, there's a problem with the current copyright laws that has to be addressed. It's quite similar to the "TiVo loophole" in OSS licenses. It was addressed, and certainly not in favour of the loophole exploiters. That one could be fixed on licence level because it was ultimately a licence question, but the AI training question, however, needs to be taken to the legislation level. Internationally, too.

[–] drmoose@lemmy.world 3 points 4 days ago

I don't think this precedent will ever get set because we don't have universal global IP protections. The West will never set it due to fear of China winning the AI race.

In their opinion (which I agree with) this is the greater good, and someone's Mastodon posts or similar being fed to an AI training machine is a lesser evil compared to losing technological advantage to the biggest authoritarian state in the world.

[–] ideonek@piefed.social 5 points 4 days ago

I think that the point is that instances can choose their own rules. The article is about an instance, not about the entire platform.

[–] Ulrich@feddit.org 11 points 4 days ago* (last edited 4 days ago)

It potentially gives them grounds for a lawsuit. Probably not but potentially. There's no reason not to explicitly deny permission. They have everything to gain and nothing to lose.

[–] fmstrat@lemmy.nowsci.com 8 points 4 days ago* (last edited 4 days ago)

Gives them legal standing against scraping, in case it is needed in the future.

[–] anothermember@feddit.uk 77 points 4 days ago (1 children)

That's a really misleading headline; a Mastodon instance has done this, Mastodon as a whole can't do this because it's free software, it can be used for any purpose.

[–] froufox@lemmy.blahaj.zone 5 points 4 days ago (1 children)

I'm wondering, is it possible to include that restriction in the public license for the Mastodon software?

[–] anothermember@feddit.uk 12 points 4 days ago (2 children)

It wouldn't be a free software licence by the FSF definition (rule zero). Of interest: the FSF rejects the original JSON licence because it contains the clause “The Software shall be used for Good, not Evil.” Since Mastodon uses AGPL, it wouldn't be compatible.

[–] trevor@lemmy.blahaj.zone 5 points 4 days ago (2 children)

This is why I hope to see rule zero get shit-canned. It's a naive vestige from a time long before we hit late-stage capitalism. Corporate interests have slithered their way into every facet of our lives and we should be working to make software that we write hostile to their practices as much as we can.

If that means that the organizations that have a stranglehold on Open Source™️ don't like it, so be it. We can follow in the spirit of open source without the naivety or captured interests of organizations that define the arbitrary terms by which we categorize software licenses.

[–] anothermember@feddit.uk 6 points 4 days ago

It just means that the decision comes down to the instance owner not the software developer, which I think is right. Everyone should be able to decide what their computer does, that's important to hold on to.

[–] carotte@lemmy.blahaj.zone 4 points 4 days ago (3 children)

this reminds me of the Hippocratic License, which comes with a bunch of modules restricting the use of software based on ethical considerations (for example, there’s a module forbidding the use by police, and another one forbidding the use by any institution on the BDS list)

i think the FSF, in their eternal and unchallengeable wisdom (/s), also declared that it wasn’t foss

[–] xor@lemmy.blahaj.zone 5 points 3 days ago

I mean, they're right that it's not FOSS - the F is free as in available to anybody who may wish to use it, which is incompatible with defining who is allowed

the Hippocratic License

Interesting link, thanks for the discovery!

[–] melmi@lemmy.blahaj.zone 2 points 3 days ago

This is interesting! I've been exploring this and it seems like a neat little license.

I'm not a lawyer, but one funny edge case I noticed is that the Extractive Industries module seems like it makes it a breach of license for crystal shops to use your software since you're involved in the sale of minerals.

I would tend to agree with FSF that it's not FOSS, though. There are so many restrictions on this license and who can use it, based on fairly arbitrary things like "if CBP claims you're doing forced labor" or "you do business in this specific region". It might be more moral, but it's a different approach than FOSS, which is less restrictive than more and prioritizes "Freedom" above everything else. Maybe it's time for a different approach, though?

[–] froufox@lemmy.blahaj.zone 1 points 4 days ago

cool, didn't know about this nuance. based JSON license by the way.

[–] rumba@lemmy.zip 29 points 4 days ago

Wait, they changed the TOS on a site to say that you can't scrape it, when the entirety of the site is available without agreeing to the TOS?

[–] Ascend910@lemmy.ml 29 points 4 days ago
[–] Suavevillain@lemmy.world 15 points 3 days ago

It is better than nothing even if it is hard to enforce.

[–] D06M4@lemmy.zip 21 points 4 days ago

This was one of the few ToS updates I was actually glad to read. ToS changes usually mean a company is slowly rephrasing them to fuck us over.

[–] daniskarma@lemmy.dbzer0.com 18 points 4 days ago

I wonder how that works with federation.

If a second instance does not have that restriction, is there any "legal" effect on the federated content?

[–] bizza@lemmy.zip 14 points 4 days ago (3 children)

Just like when mastodon.social condemned Meta for their horrible moderation decisions and inability to act properly in the interest of its users, and said that the instance would be cutting ties/not federating with Threads, they kept on federating like nothing happened.

I don't believe anything coming out of mastodon.social unless I can see action being taken with my own two eyes.

Also, blocking scrapers is very easy, and it has nothing to do with a robots.txt (which they ignore).

[–] lazynooblet@lazysoci.al 25 points 4 days ago (1 children)

How is blocking scrapers easy?

This instance receives 500+ IPs with differing user agents all connecting at once but keeping within rate limits by distribution of bots.

The only way I know it's a scraper is if they do something dumb like using "google.com" as the referrer for every request or by eyeballing the logs and noticing multiple entries from the same /12.
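Those two tells can be checked automatically. Here is a rough Python sketch, assuming the access log has already been parsed into `(ip, referrer)` pairs; the function names, the thresholds, and the IPv4-only handling are illustrative assumptions, not anything this instance actually runs.

```python
import ipaddress
from collections import defaultdict


def crowded_networks(requests, prefix_len=12, min_ips=3):
    """Return {network: sorted distinct IPs} for every /prefix_len block
    that contains at least `min_ips` distinct client addresses.

    `requests` is an iterable of (ip, referrer) pairs (IPv4 assumed).
    """
    ips_by_net = defaultdict(set)
    for ip, _referrer in requests:
        # strict=False masks off the host bits, e.g. 172.20.1.9/12 -> 172.16.0.0/12
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        ips_by_net[net].add(ip)
    return {
        str(net): sorted(ips)
        for net, ips in ips_by_net.items()
        if len(ips) >= min_ips
    }


def constant_referrer_clients(requests, referrer="google.com", min_requests=2):
    """Return the set of IPs that sent at least `min_requests` requests
    and used exactly `referrer` on every single one of them."""
    referrers_by_ip = defaultdict(list)
    for ip, ref in requests:
        referrers_by_ip[ip].append(ref)
    return {
        ip
        for ip, refs in referrers_by_ip.items()
        if len(refs) >= min_requests and all(r == referrer for r in refs)
    }


if __name__ == "__main__":
    # Hypothetical log entries, for illustration only.
    sample = [
        ("172.16.0.5", "google.com"),
        ("172.16.0.5", "google.com"),
        ("172.20.1.9", "google.com"),
        ("172.31.200.1", ""),
        ("203.0.113.7", "https://lemmy.world/"),
    ]
    print(crowded_networks(sample))
    print(constant_referrer_clients(sample))
```

Both checks are heuristics: a large ISP or a shared VPN exit can also put many real users in one /12, so anything flagged still needs a human look before blocking.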

[–] rumba@lemmy.zip 7 points 4 days ago

Exactly this, you can only stop scrapers that play by the rules.

Each one of those books powering GPT had like protection on them already.

[–] Ulrich@feddit.org 15 points 4 days ago

blocking scrapers is very easy

The entirety of the internet disagrees.

[–] andypiper@lemmy.world 2 points 3 days ago

and said that the instance would be cutting ties/not federating with Threads,

Can you please show exactly where this was said?

[–] Cocopanda@lemmy.world 7 points 3 days ago (1 children)

Looks like I’m joining Mastodon officially.

It's honestly not bad, definitely the most mature fediverse service

[–] mintiefresh@lemmy.ca 9 points 4 days ago

Well done Mastodon.social.

Even if it may not do much, it's still better than not doing it.

[–] Phegan@lemmy.world 5 points 3 days ago
[–] LainTrain@lemmy.dbzer0.com 6 points 4 days ago (1 children)

I will create a masto instance where this is mandatory to counterbalance

[–] papigkos@lemmy.wtf 6 points 4 days ago

Failing to train an AI model using your posts as part of the training data within 7 days of posting will result in a permanent ban.

[–] RickyRigatoni@retrolemmy.com 2 points 3 days ago

Terms of Service are a joke and not legally binding. This is just a useless feel-good motion.
