Rich Kulawiec on 17 Mar 2017 12:57:06 -0700


Re: [PLUG] Avoid Arvixe at all costs!

On Fri, Mar 17, 2017 at 11:54:40AM -0400, Steve Litt wrote:
> I tend not to back up my OS, but only data I can't reinstall or
> repurchase: In other words, data created by me. I might revisit that
> stance later.

I'm going to try to convince you to revisit that stance. ;)

My backup philosophy since forever has been "every file on every filesystem".

Yes, this does mean that in some instances I'm backing up 9 copies of /usr
that are putatively identical.  However, since I use dump and incremental
backups (with a numbering scheme that accounts for write-rarely filesystems
like /usr) the cost of doing so is small.  And in relative terms, it's
gotten smaller over the years: once upon a time, the size of /usr was
a significant fraction of total filesystem size.  These days, it's not
uncommon for it to be only a few percent or less. So why not?
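As a rough sketch of what such a numbering scheme might look like (the 7-day cycle and the "every 4 weeks for /usr" rule here are illustrative, not my exact scheme): busy filesystems get a full dump weekly and incrementals in between, while a write-rarely filesystem like /usr only needs a fresh full occasionally.

```shell
# Hypothetical dump-level schedule: level 0 (full) on day 0 of each
# weekly cycle, levels 1-6 on the following days, so each incremental
# captures only changes since the previous day's dump.
level_for_day() {
    day=$1                 # days since the cycle started
    echo $(( day % 7 ))
}

# Write-rarely filesystems can run a longer cycle: a full dump only
# every 4 weeks, incrementals the rest of the time.
usr_level_for_day() {
    day=$1
    echo $(( day % 28 == 0 ? 0 : day % 7 == 0 ? 1 : day % 7 ))
}

level_for_day 0        # full dump day
level_for_day 3        # mid-week incremental
```

The actual invocation would then be something along the lines of `dump -$level -u -f $target $filesystem`, with the level coming from the schedule.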

I do this for several reasons:

	1. Insulation against error.  If those 9 copies of /usr are
	putatively identical -- because manual procedures or automatic
	processes (e.g., cron-driven rsync) are supposed to be making them
	so -- then those backups should be as well.  If they're not:
	investigation is required.
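The kind of automatic process I mean is nothing fancy; a hypothetical crontab entry (host and paths illustrative) might look like:

```shell
# Nightly at 03:15, push the master /usr to a clone.  -a preserves
# permissions/ownership/timestamps, --delete prunes anything that no
# longer exists on the master, so the trees stay byte-identical.
15 3 * * *  rsync -a --delete /usr/ backupclone:/usr/
```

If the backups of master and clone then *differ*, either this job is failing silently or something else is writing to the clone's /usr, and either way I want to know.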

	2. Disaster recovery.  While I might have 9 systems with identical
	/usr, those same 9 systems might have 3 different versions
	of /usr/local.  If the world implodes and I have to recreate on
	bare metal using distribution media and backups, then when I'm
	47 hours into it, I do NOT want to have to stop to think
	"which /usr/local is this supposed to have?"  I want the answer
	to be obvious and automatic.  I want it to be something I can't
	get wrong while under duress.  I want it to be something that
	I can hand over to an assistant who's never done this before
	with some assurance that *they* can't get it wrong.

	3. Security.  While it's possible that an intruder who can
	penetrate one of those 9 systems can penetrate the other 8,
	it's not guaranteed.  Maybe I'll catch a break and notice something
	amiss.  But I can't catch a break if I don't have this covered
	by backups.  Nor can I do all the post-incident forensic analysis
	that I might be able to do.  And I do *not* want to have to say
	to someone "well, ummm, no, I don't actually have a complete
	backup of that system because it was supposed to be
	cookie-cuttered just like the others, and I was trying to
	save some space".

	4. Redundancy.  I'm a big fan of having multiple (partially
	or fully overlapping) instances of backups on multiple
	different devices, in order to insulate against all single
	points-of-failure and as many multiple points-of-failure as I
	can handle without going off the rails WRT complexity and cost.
	This is mostly a product of long and sometimes painful experience,
	including sentences that began "If only there was..." and ended
	in bitter disappointment.  So I tend to over-engineer backup
	systems in the hope that I don't have to go through any of that
	again.  For example: if I'm backing up 9 systems to a backup
	server with a dozen physical disks, I do NOT set those 12 up with
	RAID: I don't need the performance and I want each physical
	disk, if it fails, to be the only thing that fails.  I also
	scatter the backups across them, e.g., today's level 5's go
	to disk 3, tomorrow's level 6's go to disk 7, etc.
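The scattering itself can be automated; a minimal sketch (disk naming and the 12-disk pool are illustrative, and a real version would also skip disks that have failed or filled):

```shell
# Rotate the target disk on the day number, so consecutive nights'
# dumps land on different spindles: losing any one disk costs you
# some nights' backups, never all of them.
pick_disk() {
    day=$1; ndisks=$2
    echo "/backup/disk$(( day % ndisks ))"
}

pick_disk 17 12    # -> /backup/disk5
```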

There's no guarantee that any of this will do the slightest good, but
since it's quite likely the difference between backing up 18.4T and 18.7T,
and since it can all be done with automation, why not?

Note: this only works to a certain scale: it's not going to work
so well with 5K systems, and then different strategies are called for.

Philadelphia Linux Users Group