Keith C. Perry on 24 Apr 2016 17:43:05 -0700



Re: [PLUG] [plug-announce] TOMORROW - Tue, Apr 19, 2016: PLUG North - "Linux Containers" by Jason Plum and Rich Freeman (*6:30 pm* at CoreDial in Blue Bell)


I understand what silent corruption is; however, you have to make a fair comparison.  If cosmic rays can flip bits in hardware or software and it goes undetected, then all bets are off and you're going to have undetected data corruption no matter how your data is stored.

Storage mechanisms are going to use reliable methods to correct, or at least detect, bad data, but there is still a chance that the n+1 plan is defeated by the n+2 event.  In my experience, it's just not that "easy".  Which is to say, it is rare that cosmic rays or random events silently flip a bit such that human inspection is the only thing that reveals the problem.

When you stay within statistical norms, this is just not something you can base a choice of file system on.

I don't see why there would be increased concern as storage systems get larger.  Density increases have been slowing in favor of LVM-style constructs because there are physical limits to how much data can be stored in standard hard disk form factors.  One of those limiters is going to be how durable the data is.  If we can't reliably retrieve data at a certain density, then we will never see those densities.

This brings me back to when I was at Sony and blue lasers were first being discussed.  We already knew red lasers had certain acceptable limits that we were going to hit sooner rather than later.  Initially there were spot-focus problems with blue lasers, which delayed their use, so focus was put into other areas such as error correction and tracking (the ability to keep the laser on track as well as read the bit stream reliably).  When Blu-ray and other products came out, they benefited from those techniques as well.  Fast forward to now...  Consumers and businesses use optical media for data storage because it is considered "reliable".  Sure, there could be a silent failure, but how many people completely verify their burned disks these days?

So, even though BTRFS and other "modern" COW file systems might have one type of advantage, practically speaking, all the factors taken together might not actually yield a detectable net benefit on this single point.
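
If you want to see what btrfs actually surfaces here, a minimal sketch; the mount point /mnt/data is just an example:

  # walk every allocated block and verify it against the stored checksums
  btrfs scrub start -B /mnt/data

  # summary of any checksum errors found (and fixed, if a good copy exists)
  btrfs scrub status /mnt/data

On a single device it can only report the error; with RAID1 or DUP profiles it can repair from the redundant copy.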

This is sort of the curse of new technology.  The new thingy may or may not be "better", but we don't know until we use it.  However, even if we do, failures are still going to happen.  The new thingy is usually blamed first, which may or may not be correct, because you now have to evaluate an entire system and not just a single component - the file system in this case.  That is something that takes lots of confirmation, but if it's a rare occurrence to begin with, there may never be enough observable data to make a definitive statement.

Coming back around to containers, I still think you nailed it earlier.  It would have been better to have some choices - every piece of tech has its fans, so over time we could see how each ephemeral method worked.  Maybe it would have worked as the developers conceived, maybe not.  Maybe in a couple of years we'll be talking about a new method altogether.

Here's something I just found...  (well, "pacman -Ss btrfs" pointed me in the right direction)

http://snapper.io/overview.html

Apparently someone figured out how to do snapshot management for LVM, BTRFS and EXT4  :D
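
A minimal sketch of what that looks like, assuming a btrfs filesystem and a config name I made up:

  # register the filesystem with snapper under a named config
  snapper -c root create-config /

  # take a snapshot by hand, then list what exists
  snapper -c root create --description "before upgrade"
  snapper -c root list

  # diff snapshots 1 and 2, then roll the changes back
  snapper -c root status 1..2
  snapper -c root undochange 1..2

For LVM it reportedly wants thin-provisioned volumes and a --fstype argument on create-config; I haven't tried that path myself.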


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 
Keith C. Perry, MS E.E. 
Owner, DAO Technologies LLC 
(O) +1.215.525.4165 x2033 
(M) +1.215.432.5167 
www.daotechnologies.com

----- Original Message -----
From: "Rich Freeman" <r-plug@thefreemanclan.net>
To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
Sent: Friday, April 22, 2016 8:44:12 PM
Subject: Re: [PLUG] [plug-announce] TOMORROW - Tue, Apr 19, 2016: PLUG North - "Linux Containers" by Jason Plum and Rich Freeman (*6:30 pm* at CoreDial in Blue Bell)

On Fri, Apr 22, 2016 at 3:01 AM, Keith C. Perry
<kperry@daotechnologies.com> wrote:
>
> "Sure, but when you get a silent bit flip on a disk the following happens:
> 1.  mdadm reads the bad data off the disk and notes the drive did not
> return any errors.
> 2.  mdadm hands the data to lvm, which hands it to btrfs.
> 3.  btrfs checks the checksum and it is wrong, so it reports a read failure."
>
> True, but if it was that "easy" to silently flip a bit, software RAID wouldn't be a thing.  That's more of a problem with hardware RAID because the controller can lie about the fsync status.  Worse than that, even on good cards, RAID subsystem issues like the classic battery failure can go unreported as well.  Beyond LVM's own checksum facilities there is also dmeventd, which should detect problems, so issues like what you describe should never happen silently.  When detected, there are a number of things that can be done to deal with failures (i.e. a bad disk in a RAID, JBOD or mirror set).  Every fs would be vulnerable to this situation, but it's just not something I've ever seen personally.  I can't remember the last time I heard of someone working directly with mdadm having such a problem.

I think you're misunderstanding silent corruptions.  A silent
corruption is a disk corruption where the disk returns data other than
what was intended to be stored, but the disk did not return any error
message/etc either during storage or retrieval.  I'm not aware of any
hardware raid cards that handle this, and mdadm certainly doesn't.
dmeventd probably wouldn't detect this either, as I don't believe LVM
does any kind of error detection beyond just passing through errors
returned by the lower layers.
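
To make it concrete, here's a rough end-to-end demo you can run from a
shell (the /tmp paths are made up):

  # write a file and record a known-good checksum
  dd if=/dev/urandom of=/tmp/blob bs=1M count=1
  sha256sum /tmp/blob > /tmp/blob.sum

  # overwrite one byte in place to play the part of the flipped bit;
  # nothing below the filesystem reports any error
  printf 'X' | dd of=/tmp/blob bs=1 seek=4096 count=1 conv=notrunc

  # only the end-to-end check notices
  sha256sum --check /tmp/blob.sum

The last command reports FAILED, which is essentially what btrfs does
per-block with its checksum tree; mdadm and LVM would have handed the
bad data up without complaint.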

I'm not aware of any storage system in common use that can handle such
an error easily if it is detected.  An mdadm scrub would detect an
error, but then all it knows is that the disks are inconsistent.
You'd need to somehow figure out which disk is the bad one, and then
remove that disk from the array, re-add it after wiping its metadata,
and rebuild the array.  Figuring out which disk is the "bad" one would
be difficult.  I put bad in quotes because smartctl/etc would show
that all the drives are good, and the drive is probably perfectly
functional.  It just had some data get corrupted, probably as a result
of a cosmic ray strike/etc.
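
Kicking off that scrub is easy enough; a sketch, assuming the array is
md0:

  # ask md to read and compare all copies/parity across the members
  echo check > /sys/block/md0/md/sync_action

  # nonzero after the check finishes means the members disagree,
  # with no hint as to which one is wrong
  cat /sys/block/md0/md/mismatch_cnt

Writing "repair" instead of "check" rewrites the inconsistent blocks,
but on a mirror it just picks one copy arbitrarily, which is exactly
the problem.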

Historically this was never a big problem, but there is a lot of
concern that it is becoming more of a problem as storage densities
increase.  As the amount of data crammed into a small area of disk
increases, it takes less energy to change its state, and thus
increases the chance that cosmic ray flux/etc can impart that amount
of energy to the region of the disk before its magnetization gets
refreshed.

Keep in mind that unless you're looking for corruptions you wouldn't
know you have them.  That's why they're called "silent."

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug