this post was submitted on 04 Jul 2025
Programmer Humor
I think SQLite is a great middle ground. It saves the database as a single .db file, and can do everything an SQL database can do. Querying for data is a lot more flexible and a lot faster. The tools for manipulating the data in any way you want are very good and very robust.
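To make the "single .db file plus SQL queries" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `events` and the columns `ts`, `name`, and `value` are made up for illustration; they are not taken from the data in the post.

```python
import os
import sqlite3
import tempfile


def build_and_query(db_path):
    """Create a small table in a single-file database and run an aggregate query."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: the real data would need its own columns.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events (ts TEXT, name TEXT, value REAL)"
        )
        conn.executemany(
            "INSERT INTO events (ts, name, value) VALUES (?, ?, ?)",
            [
                ("2025-07-01", "build", 12.5),
                ("2025-07-02", "build", 14.0),
                ("2025-07-02", "test", 3.25),
            ],
        )
        conn.commit()
        # Count and average per name, something that takes custom code with YAML.
        rows = conn.execute(
            "SELECT name, COUNT(*), AVG(value) FROM events "
            "GROUP BY name ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return rows


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "log.db")
    summary = build_and_query(path)
    db_exists = os.path.isfile(path)  # the whole database is this one file
```

The grouping and averaging happen inside the query, so the calling code never has to load and walk the full data set itself.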
However, I'm not sure how it would affect file size. It might be smaller because JSON/YAML wastes a lot of characters on redundant information (field names) and storing numbers as text, which the database would store as binary data in a defined structure. On the other hand, extra space is used to make common SQL operations happen much faster using fancy data structures. I don't know which effect is greater so file size could be bigger or smaller.
SQLite would definitely be smaller, faster, and require less memory.
Thing is, it's 2025, roughly 20 years since anybody's given half a shit about storage efficiency, memory efficiency, or even CPU efficiency for anything so small. Presumably this is not something they need to query dynamically.
True (in most contexts, probably including this one), but I think that only makes the case for SQLite stronger. What people do still care about is a good, flexible, usable, and reliable interface. I'm not sure how to get that with YAML.
YAML is not a good format for this. But any line-based or streamable format would be good enough for log data like this. It's really easy to parse with any language, or even directly with shell scripts. No need to even know SQL; any text processing would work fine.
I didn't look too much at the data, but I think CSV might actually be an appropriate format for this?
Nice, simple plaintext, and very easy to parse into a data structure for analysing/using it in Python or similar.
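As a rough sketch of how little code the CSV route needs in Python, assuming a header row and the hypothetical columns `ts`, `name`, and `value` (not the real fields from the post):

```python
import csv
import io

# Stand-in for a real file; the column names are assumptions.
SAMPLE = "ts,name,value\n2025-07-01,build,12.5\n2025-07-02,test,3.25\n"


def load_rows(text_stream):
    """Parse CSV text into a list of dicts, converting the numeric column."""
    reader = csv.DictReader(text_stream)
    return [
        {"ts": r["ts"], "name": r["name"], "value": float(r["value"])}
        for r in reader
    ]


rows = load_rows(io.StringIO(SAMPLE))
```

With a real file you would pass `open(path, newline="")` instead of the `io.StringIO` wrapper.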
CSV would be fine. The big problem with the data as presented is that it is a YAML list, so the whole file needs to be read into memory and decoded before you get any values out of it. Any line-based encoding would be vastly better and would allow line-by-line processing: CSV, JSON objects encoded one per line, or some streaming binary format. It doesn't make much difference overall, as long as it is line-based or at least streamable.
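A minimal sketch of the line-based idea, assuming one JSON object per line and a hypothetical numeric field called `value`. Each record is decoded and discarded in turn, so memory use stays flat no matter how long the log gets:

```python
import io
import json


def iter_records(lines):
    """Yield one decoded record per non-empty line, one line at a time."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def total_value(lines):
    """Sum the (hypothetical) 'value' field without holding all records at once."""
    return sum(rec["value"] for rec in iter_records(lines))


# Stand-in for an open file object; iterating a real file works the same way.
SAMPLE = '{"name": "build", "value": 12.5}\n{"name": "test", "value": 3.25}\n'
total = total_value(io.StringIO(SAMPLE))
```

The same loop works on a file handle or on `sys.stdin`, which is what makes this kind of format easy to combine with shell pipelines.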