Rich Freeman on 30 Aug 2013 10:17:19 -0700



Re: [PLUG] tar/pigz question


On Fri, Aug 30, 2013 at 10:52 AM, Carl Johnson
<cjohnson19791979@gmail.com> wrote:
> Along the lines of yesterday's tar discussion, I have been working on
> diffusing a.... err ummm....I mean... (ahem). I've been working on a backup
> routine for a few days now. I have an issue with a part of this routine that
> I'm not sure how to resolve. Here's what I'm doing......

Having spent quite a bit of time recently on backups I'd really
reconsider any use of tar in a backup solution.  If your goal is
something that is really least-common-denominator (such as tarballs
for distribution) then it makes sense.  For just doing backup, I'd
probably use something else unless you're writing to non-seekable
media like tape or a network stream (a network mount is seekable).

For stuff that doesn't compress well something like rsync/rsnapshot is
very simple and very effective.  For something that does compress well
duplicity seems like a really good solution.  Both have the major
advantage that doing something like touch <1TB-file> ends up storing a
kilobyte of metadata rather than re-transmitting the entire file like
tar would.

Oh, and if you don't trust the file times rsync also supports
checksum-based incrementals (though you'll want rsyncd on the far side
if you don't want a TON of network traffic on incrementals - not sure
if rsync+ssh can do remote hash calcs).  Duplicity might or might not
support that (possibly via rsync option passthrough - it is built on
librsync).

FYI - I'm really grokking rsnapshot.  Even for high-turnover stuff
like mythtv video I'm using it for daily "incrementals" - but the
backups are all filled-in with hard links vs the previous backup (not
quite full de-duping but close).  I actually used my rsnapshot backups
to initially populate my btrfs partition.  That let me do the bulk
rsync while the server was up without killing disk performance (since
neither the backup nor the btrfs filesystem was otherwise in use),
followed by one last rsync against the live data just before
migrating.  I'm now running on mirrored btrfs just fine, with snapper
running hourly as an additional layer of protection on top of the
daily rsnapshot + duplicity backups.
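
To illustrate the hard-link fill-in described above, here is a minimal
sketch using only coreutils.  It is not how rsnapshot is implemented
(rsnapshot drives rsync); it only mimics the effect.  The directory
names and sample files are hypothetical, and GNU cp is assumed for the
-l option.

```shell
#!/bin/sh
# Sketch of hard-link snapshots; all paths here are made-up stand-ins.
set -eu

ROOT=$(mktemp -d)
mkdir -p "$ROOT/live"
echo "big recording" > "$ROOT/live/video.mpg"
echo "v1" > "$ROOT/live/notes.txt"

# First snapshot: a full copy of the live data.
cp -a "$ROOT/live" "$ROOT/daily.1"

# A day later, one small file changes.
echo "v2" > "$ROOT/live/notes.txt"

# New snapshot: start as hard links to the previous one (no data copied)...
cp -al "$ROOT/daily.1" "$ROOT/daily.0"
# ...then replace only what changed.  Writing a new file and renaming it
# over the link leaves the older snapshot's copy untouched.
cp "$ROOT/live/notes.txt" "$ROOT/daily.0/notes.txt.tmp"
mv "$ROOT/daily.0/notes.txt.tmp" "$ROOT/daily.0/notes.txt"

# Unchanged file: both snapshots show the same inode number.
ls -li "$ROOT/daily.0/video.mpg" "$ROOT/daily.1/video.mpg"
# Changed file: each snapshot keeps its own version.
cat "$ROOT/daily.1/notes.txt" "$ROOT/daily.0/notes.txt"
```

Every snapshot directory looks like a full backup, but an unchanged
file occupies disk space only once no matter how many snapshots
reference it.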

>
> /bin/tar -c -v --use-compress-program=pigz /mnt/drive_z/ | /usr/bin/pigz >
> /mnt/backupLUN09_SAN/file-server-backup--`date +%m_%d_%y`--.tar.gz

One thing to note in your command line (which Mark did correct) is
that you are using both --use-compress-program AND piping the output
into a compression program.  That will probably double-compress your
data, which is likely to cause confusion during restoration and will
certainly burn a ton of CPU for no additional benefit.
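
For reference, a minimal sketch of compressing exactly once.  This is
not necessarily the form Mark posted.  The temporary directories stand
in for /mnt/drive_z and the backup mount, and the fallback to gzip on
machines without pigz is an assumption added for illustration.

```shell
#!/bin/sh
# Sketch: compress the archive once.  Paths are hypothetical stand-ins.
set -eu

SRC=$(mktemp -d)    # stand-in for /mnt/drive_z
OUT=$(mktemp -d)    # stand-in for the backup mount
echo "sample data" > "$SRC/file.txt"

# Use pigz when installed, otherwise gzip (both write gzip-format output).
if command -v pigz >/dev/null 2>&1; then COMPRESS=pigz; else COMPRESS=gzip; fi

STAMP=$(date +%m_%d_%y)

# Option 1: let tar run the compressor and write the archive file itself.
tar --use-compress-program="$COMPRESS" -cf "$OUT/backup-$STAMP.tar.gz" -C "$SRC" .

# Option 2: uncompressed tar to stdout, compressed once in the pipe.
tar -cf - -C "$SRC" . | "$COMPRESS" > "$OUT/backup-pipe-$STAMP.tar.gz"

# Either archive lists its contents after a single decompression pass.
tar -tzf "$OUT/backup-$STAMP.tar.gz"
tar -tzf "$OUT/backup-pipe-$STAMP.tar.gz"
```

Pick one option or the other.  Combining them is what produces a
gzip stream wrapped inside a second gzip stream.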

Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug