JP Vossen on 27 Mar 2017 08:53:11 -0700


Re: [PLUG] Backups vs Copies: was Avoid Arvixe at all costs!

On 03/27/2017 11:25 AM, Rich Freeman wrote:
On Mon, Mar 27, 2017 at 11:07 AM, Fred Stluka <> wrote:

I get a full backup, plus daily incrementals out of a single rsync
% rsync -rpogtlv --del --backup --backup-dir=sparse src/ full

This updates my full backup tree, but instead of overwriting or
deleting files from that tree, it moves them to a new timestamped
incremental tree that is sparsely populated only with the files
that would have been changed or deleted.

That essentially works, but you might seriously consider using
rsnapshot instead in such a situation.  The only thing that changes is
how the files are organized.

Instead of a bazillion timestamped files all over the place, you
instead end up with timestamped parent directories that are populated
with full backups, which are sparse in the sense that they're full of
hard-links where files didn't change.

The advantage of rsnapshot is that you can just copy/rsync one of
those directories back and you get your filesystem in the state as of
the timestamp you used (presumably the latest), vs getting a
filesystem full of timestamped incremental files all over the place
that you then need to try to clean up.  Either way you can still go
back in time for any particular file.

But, they both get the job done, and depending on how you prefer the
format of your archive and the state after restoration, either could
be the "better" solution.

I forget if it's come up in this thread, but I've been using BackupPC to do that kind of thing for many years. It's written in Perl, uses rsync (or other transports, but I use only rsync), and does file-level de-dup as well. v3 and below use hard links, as discussed above, but v4 does away with those somehow; I haven't investigated. The problem with huge numbers and trees of hard links is that they can be difficult to move to a new volume if you are replacing hard drives for whatever reason.

I can't count the number of times BackupPC has saved some bacon by allowing me to go back in time, usually to rescue a file I accidentally overwrote. I have about 20 point-in-time backups per device, ranging from <1 day to 467.9 days old on the one I just checked via the simple web GUI. I have 10 active devices; the pool stats are:
* Pool is 381.78GB comprising 634427 files and 4369 directories (as of 2017-03-27 01:16)
* Pool hashing gives 160 repeated files with longest chain 37
* Nightly cleanup removed 3857 files of size 6.70GB (around 2017-03-27 01:16)

Note these are data and config backups only, not full bare-metal: /etc/ (and etckeeper), /home/, /var/spool/cron/ and such. I also have 1-to-1 full system backups using straight rsync, but those are latest-copy only, not back-in-time like BackupPC. And I'd argue that these days back-in-time is absolutely *critical*, because of ransomware threats if nothing else, with PEBKAC/ID10T issues a close second.

The "disadvantages" are that the file pool is not directly accessible or copyable, so to restore you use the web GUI (or there may be a CLI tool) to put the files back in place, restore them somewhere else, or make a tarball. Client setup is also not trivial: I use only rsync over SSH, but it can use Samba and other transports, so there are the usual SSH key steps to do, and the configs are Perl code. The configs are very well documented, but there are a great many options and a few hoops to jump through.

In all fairness, I've been using this for so long that there may be better setup methods I'm not familiar with, and I may be doing some things the old way. Whatever the case, it works great for me, and I can share my docs if needed.

--  -------------------------------------------------------------------
JP Vossen, CISSP | |
Philadelphia Linux Users Group         --
Announcements -
General Discussion  --