this post was submitted on 02 Apr 2025
        
      
If the ID is the MD5 of the path, rainbow tables are completely useless. You don't have the hash. You need to derive the hash by guessing the path to an existing file, for each file.
How unique do you suppose file system paths are?
How many hashes would one need to gather to quickly determine the root path for all files? Paths are not random so guessing the path is just a rainbow table.
The scanning for known releases becomes trivial once the file system pattern is known.
If the server is using a standard path prefix, a standard file layout, and standard file names, it isn't that difficult to find the location of a media file, and from there it would be easier to find more files, assuming the paths are consistent.
But even for low entropy strings, long strings are difficult to brute force, and rainbow tables are useless for this use case.
I've not looked, but if the video ID is based on its path, then surely the path includes the filename, no? You can't split a hash into its separate original parts; you either guess the entire thing or you don't. So in that case, the hash is going to be challenging to brute force.
It's not that challenging if you are looking for specific media files, but if you wanted to enumerate the files on a server it's basically impossible.
Well, let's say you're a big movie studio... In the past 10 years you've released 40-50 movies. You pay some law firm to go out and find illegal copies of your movies.
Those 40-50 movies * 1000 or 10000 common paths/names make you a nice table of likely candidates. Prehash that table in MD5. It doesn't take all that much effort to "enumerate" all the movies that your studio cares about. 50000 HTTP requests is child's play, and you can scan a public server within minutes for your list.
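To make the "table of likely candidates" concrete, here is a minimal sketch. It assumes, as this thread does, that an item's ID is simply the hex MD5 of its full path encoded as UTF-8; the real derivation may differ. The function names, titles, prefixes, and naming patterns are all made up for illustration, and nothing here talks to a server.

```python
import hashlib
from itertools import product


def path_id(path: str) -> str:
    """Hypothetical ID scheme: hex MD5 of the UTF-8 encoded path (an assumption)."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def candidate_table(titles, prefixes, patterns):
    """Map every guessed ID back to the path that produced it."""
    table = {}
    for title, prefix, pattern in product(titles, prefixes, patterns):
        path = prefix + pattern.format(title=title)
        table[path_id(path)] = path
    return table


# Illustrative inputs only; a real list would hold the studio's 40-50 titles
# and a few thousand common prefix/pattern combinations.
titles = ["Example Movie (2020)"]
prefixes = ["/media/movies/", "/mnt/media/Movies/"]
patterns = ["{title}/{title}.mkv", "{title}/{title}.mp4"]

table = candidate_table(titles, prefixes, patterns)
print(len(table), "candidate IDs")
```

The table grows as titles × prefixes × patterns, which is where the 50 × 1000 = 50000 figure comes from. A guess only matches if the entire path is right, filename included.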
Fully bruteforcing the thing... yeah that's ridiculous. But I don't think that people are naming bigbucksbunny.mkv as Rp23GXTHp4GN7P6j86HjRdxtfSKKAArj.mkv. So it's not like we're looking for "random" or "all" files anyway.
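A rough back-of-the-envelope comparison of the two search spaces, as a sketch. The hash rate is an assumed round number, not a measurement, and the random case only counts a 32-character alphanumeric filename like the one above, ignoring the directory prefix entirely.

```python
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols
NAME_LENGTH = 32                                  # length of the random-looking name above
ASSUMED_HASHES_PER_SECOND = 10**10                # assumption, not a benchmark
SECONDS_PER_YEAR = 365 * 24 * 3600

# Every possible 32-character alphanumeric filename.
random_space = len(ALPHABET) ** NAME_LENGTH

# 50 titles times 1000 common path/name combinations, as in the comment above.
dictionary_space = 50 * 1000


def years_to_exhaust(candidates: int, rate: int = ASSUMED_HASHES_PER_SECOND) -> float:
    """Time to hash every candidate once at the assumed rate, in years."""
    return candidates / rate / SECONDS_PER_YEAR


print(f"random names: {random_space:.2e} candidates, "
      f"{years_to_exhaust(random_space):.2e} years")
print(f"dictionary:   {dictionary_space} candidates, "
      f"{dictionary_space / ASSUMED_HASHES_PER_SECOND:.2e} seconds to hash")
```

Hashing the dictionary is effectively free; the cost in that case is the HTTP requests, not the MD5.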
I don't think anyone was ever saying that the risk here is full enumeration. Though it is technically possible with sufficient time... just will take a lot of time.
That is possible, but I don't think you need to worry about that. Having a copy of a movie is not normally itself a crime.
Having it publicly accessible on a web server is distribution. And that normally IS a crime unless you have some licenses to do so.
I think in this case whether it's distribution or not would have to go to court. It's not intended to be distribution. Depending on the judge and the lawyers, it could be distribution or not, or the prosecution may have committed a crime in finding it.
Sure. Now who here wants to litigate it and find out?
Web scanners/crawlers aren't illegal though. And since it's not authenticated, there's no attempt to break any security/authentication/encryption. You don't get in trouble for finding a random URL in a Google search and accessing it. You'd get in trouble if you had to bypass some security measure to get there.
The point of all this is that these endpoints have no such measure in place. Seemingly on purpose: it's documented by the maintainers that they don't intend to fix it and that leaving it open is intentional.
You can gamble on it. I won't. I just can't accept the "Jellyfin is better" line that keeps getting pushed when big, gaping, problematic holes like this exist.
Trying hundreds or thousands of hashes against the servers of random unconsenting people on the internet is beyond what I would be comfortable with. People have been prosecuted for less. It's not the same as a crawler where you try a few well known locations and follow links. You're trying to gain access to a system that somebody did not intend for you to have access to.
These endpoints probably don't have protection because they were never designed to have it and it's hard to add later. Theoretically, if the IDs are random that's probably good enough, except that you wouldn't be able to revoke access once somebody had it. The IDs probably aren't random because at some point only the path is used. It's how software evolves. It's not on purpose that somebody may be able to guess the ID to gain access to it.
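The difference between a path-derived ID and a random one can be sketched in a few lines. The MD5-of-path scheme is again the assumption made in this thread, not a confirmed description of the real code, and both function names are hypothetical.

```python
import hashlib
import uuid


def path_derived_id(path: str) -> str:
    """Deterministic: anyone who guesses the path can recompute the ID (assumed scheme)."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def random_id() -> str:
    """Random: 122 random bits from uuid4, unrelated to the path."""
    return uuid.uuid4().hex


path = "/media/movies/Example Movie (2020)/Example Movie (2020).mkv"

# The same path gives the same ID on every server that uses this layout.
print(path_derived_id(path) == path_derived_id(path))

# Two random IDs for the same file will almost certainly differ.
print(random_id() == random_id())
```

A random ID cannot be guessed from the file layout, but as noted above it still acts as a permanent bearer token: once someone has it, there is no way to revoke it short of reissuing the ID.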
And installing a rootkit just because a customer put my music disc in a computer would be beyond what I'm comfortable with. However we know they've done it, and more or less got away with it.