Re: [PLUG] Degraded RAID

Rich Freeman on 21 Aug 2011 19:24:21 -0700

From: Rich Freeman <r-plug@thefreemanclan.net>
To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
Subject: Re: [PLUG] Degraded RAID
Date: Sun, 21 Aug 2011 22:24:16 -0400
Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>
Sender: plug-bounces@lists.phillylinux.org

On Sun, Aug 21, 2011 at 10:04 PM, Jeff Bailey <skydiver38@verizon.net> wrote:
> How do I know why it got removed, and whether it is actually failed?  Can I
> just try to re-add it, and if it works, great and if it fails, I'll have an
> idea why?  I know which device it is - where do I go from there?

You can certainly try to re-add it.  To find out why it failed you
probably need to check your logs.  However, I suspect the relevant
messages go to dmesg, and may or may not make it to syslog.  You can
check the status in /proc/mdstat, and if the drive doesn't get added be
sure to check dmesg for anything interesting.
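
If you'd rather not eyeball /proc/mdstat, here is a minimal sketch of
picking out degraded arrays from its text.  It assumes the usual layout
where each "mdN :" line is followed by a line ending in something like
"[3/2] [UU_]"; the function name and regexes are my own, not anything
md provides.

```python
import re

# An array header line looks like: "md1 : active raid5 sdc1[2] sda1[0]"
ARRAY_RE = re.compile(r"^(md\S+)\s*:")
# The status line ends in e.g. "[3/2] [UU_]": expected/active members,
# then one flag per slot (U = up, _ = missing).
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, flags)} for degraded arrays only."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            if active < expected or "_" in flags:
                result[current] = (expected, active, flags)
            current = None
    return result
```

On a real box you would feed it the contents of /proc/mdstat, e.g.
degraded_arrays(open("/proc/mdstat").read()).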

I had a motherboard that was a bit flaky with the IDE controller and
sometimes one of my drives wouldn't get detected.  If I ran in
degraded mode it might or might not get automatically re-added on the
next reboot.  Something like that is always a possibility.

Just be sure you're adding the right drive to the array so that you
don't hose something important (adding a drive overwrites whatever is
on it).  If you actually lose a drive, devices like /dev/sd[abcd] will
get re-ordered, so /dev/sd# might not be what you think it is.  You
could do a "file -s <device>" to try to get an idea of what is already
on the device first, or use "mdadm --examine <device>" to see how it
used to fit into an array.
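
As a rough sketch, the two read-only checks above can be wrapped so you
look at a device before touching the array.  The helper names are
hypothetical, the device path is whatever you pass in, and this only
prints what "file -s" and "mdadm --examine" report (it needs root and
does not interpret the output).

```python
import subprocess


def inspection_commands(device):
    """Build the two read-only commands mentioned above for one device."""
    return [
        ["file", "-s", device],
        ["mdadm", "--examine", device],
    ]


def inspect_device(device):
    """Run each command and return a list of (command, exit code, output)."""
    reports = []
    for cmd in inspection_commands(device):
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        reports.append((" ".join(cmd), proc.returncode, proc.stdout + proc.stderr))
    return reports


def print_report(device):
    """Print everything so a human can decide whether this is the right drive."""
    for command, code, output in inspect_device(device):
        print(f"$ {command}  (exit {code})")
        print(output.rstrip())
```

Nothing here writes to the device; the decision to re-add is still
yours after reading the output.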

If you re-add the old device back to the raid it will probably rebuild
fairly quickly - if the array has a write-intent bitmap, the raid keeps
track of what actually changed and only rebuilds those regions.  If you
want to be extra-safe you can
"echo check > /sys/block/md#/md/sync_action" once it is done
rebuilding - that will force the raid to check the parity on every
stripe and will detect any damage to the array (otherwise you'll find
it the first time you read a damaged stripe).  It can also try to
repair mismatches, but linux software raid has no checksums to tell it
which copy is the good one, so the repair is potentially imperfect.
Filesystems like btrfs that checksum everything provide a higher degree
of assurance that you're overwriting the bad data with good data.
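
For scripting the check, a minimal sketch follows.  It does the same
thing as the echo above by writing "check" to sync_action, and reads
the mismatch_cnt file that sits next to it in sysfs.  The function
names and the sysfs_root parameter are my own (the parameter is only
there so it can be pointed at a scratch directory); run it as root
against the real /sys.

```python
from pathlib import Path


def md_dir(array, sysfs_root="/sys"):
    """Path to the md control directory, e.g. /sys/block/md0/md."""
    return Path(sysfs_root) / "block" / array / "md"


def request_check(array, sysfs_root="/sys"):
    """Equivalent of: echo check > /sys/block/<array>/md/sync_action"""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")


def sync_state(array, sysfs_root="/sys"):
    """Current action as reported by the kernel, e.g. 'idle' or 'check'."""
    return (md_dir(array, sysfs_root) / "sync_action").read_text().strip()


def mismatch_count(array, sysfs_root="/sys"):
    """Mismatch counter from the last check; 0 means nothing was found."""
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text().strip())
```

Typical use would be request_check("md0"), wait until sync_state("md0")
goes back to "idle", then look at mismatch_count("md0").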

Hope that helps!

Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
