Rich Freeman on 11 Aug 2013 17:20:06 -0700
[PLUG] Offline Backup Solutions
After attending yesterday's Bacula talk I am thinking about doing offline backups to an eSATA drive. I'm not sure if Bacula is actually the right tool for the job, though.

I'd like to define the following classes of jobs:

1. MythTV video - 1-2x/wk backup, no retention of deleted/changed files (~1TB with high turnover)
2. Unimportant files - daily backup, short retention of deleted/changed files (~1TB with low turnover)
3. Important files - hourly backup, long retention of deleted/changed files (~30GB with low turnover)

Some of the important files might come from other hosts running Windows (which makes something like Bacula more attractive). I'd like all but #1 to run automatically (optional for #1).

I'd like my large offline storage to remain, well, offline (not physically connected). Automated backups would all go into online storage, and would be migrated to the offline storage when it is connected. Online storage would have a capacity of ~100-200GB tops (i.e. it cannot store a full backup of anything but the important files).

I'm not sure if any of the out-of-the-box solutions will really handle this. For MythTV I'm thinking that just manual rsyncs might be my best option, as that would be fast and accurate (I can trust names/mtimes and just want to mirror). For the unimportant files I'd have to ensure that all full backups are manual and any automated backups are incrementals/differentials, since I can only perform the full backups with the offline storage connected, which I'd need to supervise.

Any suggestions? How are others handling offline storage? I could just manually mirror things, but then I lose the security of automated backups. I could leave the offline storage online, but then that makes it vulnerable to many of the failures that would take out the originals (even if unmounted when not in use).

I was looking at Bacula and it seems like I could sort-of do this.
I'd define the offline storage for full unimportant backups as a pool and only manually trigger those, and then have regular differentials/incrementals directed to the online storage area. I could then migrate that data to the offline storage from time to time to keep the online area from filling up. The only problem with this is that the retention periods in Bacula are a bit kludgy - I'd need many pairs of pools on both physical devices to get all that to work out. I'm not sure if Bacula will even enforce retention during a migration (if you migrate a volume into a pool that is full, will it purge existing volumes to make room for the new ones?).

This just seems more complicated than it needs to be. Surely somebody must be doing backups using offline disks? Most of the logic is built around having a box of tapes and rotating through those, but that is incredibly expensive these days as tape just hasn't kept pace, and I'm not going to rotate disks that will end up being 90% empty, or have the system doing full backups of multiple TB of data with any frequency.

I could just do manual rsyncs/etc, but then if I forget to do it for a week I am taking a fair bit of risk, and managing retention with rsync doesn't sound simple. I could also just leave the drive online but unmounted. One advantage of rsync, though, is that recovery is brain-dead simple. I don't mind the thought of recovering onto bare metal from something like tar/dar/etc, but for something like Bacula the bar is considerably higher.

The important stuff is already being backed up to S3, and I don't think I'm going to change that. This is really about faster recovery in the event of something other than a fire, and about backing up all the other junk that doesn't warrant that kind of treatment. I'm also contemplating moving to btrfs, and I'd really only want to do that if I had a fairly full set of recent backups at all times.

How are others handling offline backup? I may just be over-engineering things.
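
To make the rsync retention question a bit more concrete, here is a rough Python sketch of one common approach: timestamped snapshot directories that hard-link unchanged files against the previous snapshot (rsync's --link-dest option), plus a pruning step that drops snapshots older than a retention window. Everything named here is an assumption made for illustration, not something I'm actually running: the STAMP directory naming scheme, the rsync_command and expired helpers, and the example paths. The script only builds the rsync command line and computes what to prune; it does not touch the disk.

```python
from datetime import datetime, timedelta

# Hypothetical snapshot directory naming scheme, e.g. "2013-08-11T1720".
# Names in this format sort lexically in chronological order.
STAMP = "%Y-%m-%dT%H%M"


def rsync_command(source, dest_root, stamp, previous=None):
    """Build an rsync invocation that writes a new snapshot directory.

    With --link-dest, files unchanged since the previous snapshot are
    hard-linked instead of copied, so each snapshot only costs the space
    of what changed. dest_root is assumed to be an absolute path.
    """
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        cmd.append(f"--link-dest={dest_root}/{previous}")
    cmd += [f"{source}/", f"{dest_root}/{stamp}/"]
    return cmd


def expired(snapshots, now, keep):
    """Return snapshot names older than the retention window, oldest first.

    The newest snapshot is never returned, so one copy always survives
    even if no backup has run for longer than the retention period.
    """
    ordered = sorted(snapshots)
    cutoff = now - keep
    return [name for name in ordered[:-1]
            if datetime.strptime(name, STAMP) < cutoff]


if __name__ == "__main__":
    now = datetime(2013, 8, 11, 17, 20)
    existing = ["2013-07-01T0300", "2013-08-01T0300", "2013-08-10T0300"]
    print(rsync_command("/home", "/mnt/offline", now.strftime(STAMP),
                        previous=max(existing)))
    print(expired(existing, now, keep=timedelta(days=7)))
```

In a real script the command list would be handed to subprocess.run(cmd, check=True) and each expired directory removed with shutil.rmtree. Each job class would get its own retention window, and the MythTV class would skip snapshots entirely and just mirror.
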
I could probably script up manual backups using rsync/sarab/dar fairly easily, and I know those would be easy to restore.

(sarab is a script that wraps around dar, and dar is like tar but with indexing so that most operations don't require scanning the whole file)

Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug