The main point is that sync (like RAID) isn't a backup. If ransomware got in and started encrypting all your files, how would you know, or protect yourself? The sync would just copy the encrypted versions over the good ones.
There's a lot of focus on 3-2-1 backups, so offsite is good, but consider your G-F-S (grandfather-father-son, i.e. monthly / weekly / daily) strategy too - as long as this remote copy isn't your only long-term backup option, then sync might be ok for you.
So, syncthing / rsync / etc is fine... but maybe just point it to your monthly / weekly / daily backup folder(s) rather than the main files?
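To make the monthly / weekly / daily idea a bit more concrete, here is a minimal sketch of G-F-S retention: given the dates of existing backups, it picks which ones to keep. The function name `gfs_keep` and the counts (7 dailies, 4 weeklies, 12 monthlies) are assumptions for illustration, not a standard; real backup tools have their own retention settings.

```python
from datetime import date, timedelta

def gfs_keep(snapshot_dates, daily=7, weekly=4, monthly=12):
    """Return the set of snapshot dates to keep under a simple G-F-S scheme."""
    newest_first = sorted(set(snapshot_dates), reverse=True)

    # "Son": the most recent daily snapshots.
    keep = set(newest_first[:daily])

    # Walk newest to oldest, remembering the newest snapshot seen
    # in each ISO week and in each calendar month.
    weeks, months = {}, {}
    for d in newest_first:
        iso = d.isocalendar()
        weeks.setdefault((iso[0], iso[1]), d)
        months.setdefault((d.year, d.month), d)

    # "Father": newest snapshot from each of the last few weeks.
    keep.update(list(weeks.values())[:weekly])
    # "Grandfather": newest snapshot from each of the last few months.
    keep.update(list(months.values())[:monthly])
    return keep

if __name__ == "__main__":
    # 100 consecutive daily snapshots (made-up dates).
    snaps = [date(2024, 1, 1) + timedelta(days=i) for i in range(100)]
    for d in sorted(gfs_keep(snaps)):
        print(d)
```

Anything not in the returned set is a candidate for deletion. The point for ransomware is that an older weekly or monthly copy still exists even if the most recent dailies were taken after the files were encrypted.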
You also had some other suggestions I think, like zfs / btrfs snapshots... which would be a point in time copy of your files.
Or burn the photos to DVD / Bluray and store them at the other location? No power requirements there...
I didn’t consider that, excellent point. Forgive my ignorance because I’m not certain how the backup systems work, and feel free to ignore this if you don’t know. I presume they compare some metadata or hash of a file against another file and then decide if it’s the same or not to back up? Let’s say I have a file that I have already backed up, and then there is some ransomware that encrypted my files. Would the back up software make a second copy of the file?
So for most of the important files, I just do a sync to an external drive periodically. Basically when I know there have been a lot of changes. For example I went on a trip last year and came back with nearly 2 TBs of photos/videos. After ingesting the files to unRAID, I synced my external drive. Since I haven’t done much with those files since that first sync, I haven’t done the periodic sync since then. But now you’ve opened my eyes that even this could be a problem. How would the G-F-S strategy work in this case?
I thought about zfs or btrfs but my Unraid array is unfortunately xfs and it’s too large at this point to restart from scratch.
Haha that would be a lot of blurays.
It depends on the sync / backup software.
Syncthing uses a stored list of hashes (which is why it takes a long time for the initial scan), then it can monitor filesystem activity for changes to know what to sync.
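A rough sketch of the "stored list of hashes" idea, not Syncthing's actual implementation (as far as I know it hashes files in blocks and keeps them in its own database). The helper names here are made up; the sketch hashes whole files with SHA-256 and compares two manifests.

```python
import hashlib
from pathlib import Path

def file_hash(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its content hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def diff_manifests(old, new):
    """Return (added, changed, removed) file lists between two manifests."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, changed, removed
```

Building the manifest means reading every byte once, which is why a first scan of a big library is slow. A file that ransomware has encrypted in place would show up under `changed`, because its content hash no longer matches the stored one.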
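Rsync compares all source and destination files with some magical high speed algorithm (by default a quick check of file size and modification time, then it only transfers the parts of a file that differ).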
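For what it's worth, the part of rsync that decides whether a file needs looking at is less magical than it sounds: by default it does a quick check on size and modification time, and only compares checksums if you pass `--checksum`. Below is a minimal sketch of that decision only, not of the delta-transfer step. The function name and the whole-second comparison of timestamps are my assumptions.

```python
import os

def quick_check_differs(src, dst):
    """True if dst is missing, or differs from src in size or modification time."""
    try:
        d = os.stat(dst)
    except FileNotFoundError:
        return True  # nothing at the destination yet, so it must be copied
    s = os.stat(src)
    # Compare size and whole-second mtime; file contents are never read here.
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)
```

Because the contents are not read, this is fast even on a multi-terabyte photo library. It also means a file whose contents changed but whose size and timestamp did not would be skipped, which is the trade-off `--checksum` exists to cover.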
Then, backup software does... whatever.
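On the "would it make a second copy?" question: a versioned backup would, a plain sync would not. Here is a minimal sketch of the versioned behaviour, assuming a made-up naming scheme where each stored copy gets a short content hash appended to its name. Real backup software does this far more efficiently (chunking, dedupe, indexes), so treat it as an illustration only.

```python
import hashlib
import shutil
from pathlib import Path

def backup_versioned(src, backup_dir):
    """Store src in backup_dir under a name that includes its content hash.

    Existing versions are never overwritten. Returns the stored path.
    """
    src = Path(src)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Reads the whole file into memory: fine for a sketch, not for huge videos.
    digest = hashlib.sha256(src.read_bytes()).hexdigest()[:16]
    target = backup_dir / f"{src.name}.{digest}"

    if not target.exists():
        # New content (including an encrypted version) becomes a new file
        # alongside the older versions.
        shutil.copy2(src, target)
    return target
```

If ransomware rewrites a file, the next run stores the encrypted version as a second file and the earlier good copy is left untouched. Running it again on an unchanged file stores nothing new.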
Back in the day on FAT filesystems they used the archive bit in each file's metadata, which was (IIRC) set by any write to that file and cleared during a backup. The next backup could then just back up the files that had the bit set.
Your current strategy is ok - just doing an offline backup after a bulk update. Maybe it's just a matter of making that more robust by automating it...?
I suspect you have quite a large archive. Photos don't compress well, and that +2TB won't disappear with dedupe... so it's mostly about long-term archival rather than highly dynamic data changes.
So that +2TB... do you drop those files in amongst everything else, or do you have 2 separate locations, i.e. "My Photos" + "To Be Organised"?
Maybe only back up "My Photos" once a year / quarter (for example), but fully sync "To Be Organised"... then you've reduced risk, and volume of backup data...?
Ahh ok, that makes sense. Hah magical algorithm.
Yeah it’s about 30TB of photos/videos. I only recently got into videography which takes up a ton of space. About 25% of that is videos converted into an editing codec, but I don’t have those backed up to external drives. I also have some folders excluded that I know have duplicates. A winter project of mine will be to clear out some of the duplicates, and then cull the photos/videos I definitely don’t need. I got into a bad data hoarding habit and kept everything even after selecting the keepers.
I have an in progress folder where I dump everything, then folders by year/month for projects and keepers. I need to do better with culling as I go.
I like that idea, I will incorporate it into my strategy.
Thank you for taking the time to help me out with this, much appreciated!