I don't have a testing environment, but essentially all my services run in Docker and save their data in a directory mounted from the local filesystem. The dockerfile reads the sha version of the image from an env file. I have a shell script which:
- Triggers a new btrfs snapshot of the volume containing everything
- Pulls the new docker images and stores their hashes in the env file
- Restarts all the containers.
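
A rough sketch of what such a script could look like, assuming Docker Compose and an env file holding pinned image references. Every path, the `APP_IMAGE` style of variable name, and the helper functions (`run`, `set_env_var`, `snapshot_data`, `pin_image`, `restart_services`) are made up for illustration and are not the actual script:

```shell
#!/bin/sh
# Sketch only: all paths and names below are hypothetical.
DATA_SUBVOL="${DATA_SUBVOL:-/srv/data}"        # assumed btrfs subvolume holding the container data
SNAP_DIR="${SNAP_DIR:-/srv/snapshots}"         # assumed snapshot directory on the same filesystem
COMPOSE_FILE="${COMPOSE_FILE:-/srv/compose/docker-compose.yml}"
ENV_FILE="${ENV_FILE:-/srv/compose/.env}"      # assumed env file next to the compose file

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set KEY=VALUE in an env file: drop any existing line for KEY, then append.
set_env_var() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Step 1: read-only snapshot of the data subvolume, named by timestamp.
snapshot_data() {
    run btrfs subvolume snapshot -r "$DATA_SUBVOL" \
        "$SNAP_DIR/data-$(date +%Y%m%d-%H%M%S)"
}

# Step 2: pull an image and store its digest under KEY in the env file.
pin_image() {
    key=$1 image=$2
    run docker pull "$image"
    if [ "${DRY_RUN:-0}" = "1" ]; then return 0; fi
    digest=$(docker image inspect --format '{{index .RepoDigests 0}}' "$image") || return 1
    set_env_var "$ENV_FILE" "$key" "$digest"
}

# Step 3: recreate the containers whose image reference changed.
restart_services() {
    run docker compose -f "$COMPOSE_FILE" up -d
}
```

An update run would then call `snapshot_data`, one `pin_image APP_IMAGE some/image:latest` per service, and finally `restart_services`, with the compose file referring to `image: ${APP_IMAGE}`. Setting `DRY_RUN=1` only prints the btrfs and docker commands.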
If a new Docker version is broken, rolling back is as simple as copying the old version into the env file and recreating the container. If data gets corrupted, I can just copy the last working state from an old snapshot.
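
The rollback could be sketched like this, assuming Docker Compose, a copy of the env file saved before the update, and `rsync` being installed. The paths, the `.env.prev` backup name, and the function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: all paths and names below are hypothetical.
COMPOSE_FILE="${COMPOSE_FILE:-/srv/compose/docker-compose.yml}"
ENV_FILE="${ENV_FILE:-/srv/compose/.env}"
ENV_BACKUP="${ENV_BACKUP:-$ENV_FILE.prev}"   # assumed copy of the env file from before the update
DATA_DIR="${DATA_DIR:-/srv/data}"            # assumed directory mounted into the containers

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Put the previous image versions back and recreate the containers.
rollback_images() {
    run cp "$ENV_BACKUP" "$ENV_FILE"
    run docker compose -f "$COMPOSE_FILE" up -d
}

# Copy the data back from the snapshot directory given as $1.
# Containers are stopped first so nothing writes during the copy.
restore_data() {
    snapshot=$1
    run docker compose -f "$COMPOSE_FILE" stop
    run rsync -a --delete "$snapshot/" "$DATA_DIR/"
    run docker compose -f "$COMPOSE_FILE" start
}
```

Running `DRY_RUN=1` first shows which commands would be executed before anything is overwritten.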
The whole OS is on a btrfs volume which is snapshotted regularly, so ideally if an update fucks it up beyond recovery I can always boot from a rescue image and restore an old snapshot. But I honestly feel this is an extra precaution: in all the years I've run Debian on all my computers, it has never reached the point of being unbootable.
I managed to remove all the kernels instead of all the old kernels. It was a good learning experience fixing it later, and now I pay much more attention when apt warns about "potentially dangerous operations".