this post was submitted on 07 Apr 2025
53 points (96.5% liked)

Technology


Which posts fit here?

Anything that is at least tangentially connected to technology, social media platforms, information technologies, and tech policy.


Rules

1. English only: Title and associated content have to be in English.
2. Use original link: Post URL should be the original link to the article (even if paywalled), with archived copies left in the body. This avoids duplicate posts when cross-posting.
3. Respectful communication: All communication has to be respectful of differing opinions, viewpoints, and experiences.
4. Inclusivity: Everyone is welcome here regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
5. Ad hominem attacks: Personal attacks of any kind are expressly forbidden. If you can't argue your position without attacking a person's character, you have already lost the argument.
6. Off-topic tangents: Stay on topic. Keep it relevant.
7. Instance rules may apply: If something is not covered by community rules but is against lemmy.zip instance rules, those rules will be enforced.


Companion communities

!globalnews@lemmy.zip
!interestingshare@lemmy.zip




If someone is interested in moderating this community, message @brikox@lemmy.zip.

founded 1 year ago
submitted 18 hours ago* (last edited 18 hours ago) by BrikoX@lemmy.zip to c/technology@lemmy.zip
 

We need to talk about the data. Crap data. We’re destroying our environment to create and store trillions of blurred images, half-baked videos, rip-off AI ‘songs’, rip-off AI animations, videos and images, emails with mega attachments, never-to-be-watched-again presentations, never-to-be-read-again reports, files and drawings from cancelled projects, drafts of drafts of drafts, out of date, inaccurate and plain wrong information, and gigabytes and gigabytes of poorly written, meandering content.

top 5 comments
[–] capuccino@lemmy.world 1 points 5 hours ago

"Big data" they call it.

[–] oce@jlai.lu 17 points 17 hours ago

It forgot the server logs that will never be read and are kept with no proper retention strategy.

[–] spankmonkey@lemmy.world 12 points 17 hours ago

The Cloud made the crap data problem infinitely worse. The Cloud is what happens when the cost of storing data is less than the cost of figuring out what to do with the crap.

Yeah, "cheaper to hold it just in case" is actually a best-case scenario for audit trails and the occasional look back. If 99.9999% is useless down the road but one file answers some obscure question, and it would have been more expensive to sort through it all, then the cost savings and benefit were worth it financially.

And nobody in management cares because it’s so ‘cheap’ to store data. And this is what AI is being trained on. And we wonder why AI gets stuff wrong so often? Crap data in. Crap data out. And nobody cares.

Hold up. No, you don't get to blame cheap data retention for AI being shit. AI is shit because they train it on this shitty data instead of curating better quality data. AI gets shit wrong because they are training it on reddit data without taking into account humor subreddits, instead of educationally verified content. Libraries curate their content; AI companies just jam whatever they can find into their AI model.

People and companies are not responsible for AI using their shitty content and presenting it as a reliable source of information.

[–] Illegalmexicant@lemmy.world 8 points 17 hours ago

So are the photos on my phone. I started getting old and now I can't see and can't remember, so everything gets a picture! Went from big butts and boobs to small text and TeamViewer logins.