Rich Freeman on 22 Apr 2016 17:44:19 -0700


Re: [PLUG] [plug-announce] TOMORROW - Tue, Apr 19, 2016: PLUG North - "Linux Containers" by Jason Plum and Rich Freeman (*6:30 pm* at CoreDial in Blue Bell)


On Fri, Apr 22, 2016 at 3:01 AM, Keith C. Perry
<kperry@daotechnologies.com> wrote:
>
> "Sure, but when you get a silent bit flip on a disk the following happens:
> 1.  mdadm reads the bad data off the disk and notes the drive did not
> return any errors.
> 2.  mdadm hands the data to lvm, which hands it to btrfs.
> 3.  btrfs checks the checksum and it is wrong, so it reports a read failure."
>
> True, but if it were that "easy" to silently flip a bit, software RAID wouldn't be a thing.  That's more of a problem with hardware RAID because the controller can lie about the fsync status.  Worse than that, even on good cards, RAID subsystem issues like the classic battery failure can go unreported as well.  Beyond LVM's own checksum facilities, there is also dmeventd, which should detect problems, so issues like what you describe should never happen silently.  When detected, there are a number of things that can be done to deal with failures (i.e., a bad disk in a RAID, JBOD or mirror set).  Every fs would be vulnerable to this situation, but it's just not something I've ever seen personally.  I can't remember the last time I heard of someone working directly with mdadm having such a problem.

I think you're misunderstanding silent corruptions.  A silent
corruption is a disk corruption where the disk returns data other than
what was intended to be stored, but the disk did not return any error
message/etc either during storage or retrieval.  I'm not aware of any
hardware raid cards that handle this, and mdadm certainly doesn't.
dmeventd probably wouldn't detect this either, as I don't believe LVM
does any kind of error detection beyond just passing through errors
returned by the lower layers.
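
To make the distinction concrete, here is a toy sketch in Python (purely
illustrative, and not how btrfs actually lays out its checksums) of why a
checksumming filesystem catches what a pass-through layer cannot: the fake
"disk" can flip a bit without ever returning an error, and only the layer
that stored a checksum at write time notices anything on read.

    import zlib

    class SilentlyCorruptibleDisk:
        """Stores raw blocks and never reports an I/O error."""
        def __init__(self):
            self.blocks = {}

        def write(self, lba, data):
            self.blocks[lba] = bytearray(data)

        def read(self, lba):
            # No error is ever returned, even if the data has rotted.
            return bytes(self.blocks[lba])

        def flip_bit(self, lba, bit):
            # Simulate a cosmic-ray strike: no error, the data just changes.
            self.blocks[lba][bit // 8] ^= 1 << (bit % 8)

    class ChecksummingFS:
        """Keeps a CRC32 per block at write time and verifies it on read."""
        def __init__(self, disk):
            self.disk = disk
            self.csums = {}

        def write(self, lba, data):
            self.csums[lba] = zlib.crc32(data)
            self.disk.write(lba, data)

        def read(self, lba):
            data = self.disk.read(lba)
            if zlib.crc32(data) != self.csums[lba]:
                # This is the point where btrfs reports a read failure
                # (and, with a redundant copy, could try the other copy).
                raise IOError("checksum mismatch on block %d" % lba)
            return data

    disk = SilentlyCorruptibleDisk()
    fs = ChecksummingFS(disk)
    fs.write(0, b"important data")
    disk.flip_bit(0, 3)          # silent: the disk reports nothing
    try:
        fs.read(0)
    except IOError as e:
        print("read fails loudly instead of silently:", e)

Plain mdadm or LVM on top of that fake disk would just hand the flipped
bits up the stack, which is the scenario described above.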

I'm not aware of any storage system in common use that can handle such
an error easily if it is detected.  An mdadm scrub would detect an
error, but then all it knows is that the disks are inconsistent.
You'd need to somehow figure out which disk is the bad one, and then
remove that disk from the array, re-add it after wiping its metadata,
and rebuild the array.  Figuring out which disk is the "bad" one would
be difficult.  I put bad in quotes because smartctl/etc would show
that all the drives are good, and the drive is probably perfectly
functional.  It just had some data get corrupted, probably as a result
of a cosmic ray strike/etc.
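
For anyone who wants to see this first-hand, here is a rough sketch of
kicking off an md "check" scrub through sysfs and reading back the result
(the array name /dev/md0 is just an assumption, and it needs root).  The
point is in the output: md can count mismatched sectors, but it has no
checksum that tells it which member returned the bad copy.

    import time
    from pathlib import Path

    MD = Path("/sys/block/md0/md")   # assumed array; adjust for your system

    def scrub_and_report():
        # Ask md to read all members and compare them.
        (MD / "sync_action").write_text("check\n")

        # Wait for the check to finish (sync_action returns to "idle").
        while (MD / "sync_action").read_text().strip() != "idle":
            time.sleep(5)

        mismatches = int((MD / "mismatch_cnt").read_text())
        if mismatches:
            # md knows the copies disagree, but has nothing to say
            # about which disk holds the corrupted data.
            print("%d mismatched sectors found; cannot tell which "
                  "member is wrong" % mismatches)
        else:
            print("no mismatches found")

    if __name__ == "__main__":
        scrub_and_report()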

Historically this was never a big problem, but there is a lot of
concern that it is becoming more of a problem as storage densities
increase.  As more data gets crammed into a smaller area of disk, it
takes less energy to change its state, which increases the chance
that cosmic ray flux/etc can impart that much energy to a region of
the disk before its magnetization gets refreshed.

Keep in mind that unless you're looking for corruptions you wouldn't
know you have them.  That's why they're called "silent."

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug