Rich Freeman on 21 Aug 2011 19:24:21 -0700
Re: [PLUG] Degraded RAID
On Sun, Aug 21, 2011 at 10:04 PM, Jeff Bailey <skydiver38@verizon.net> wrote:
> How do I know why it got removed, and whether it is actually failed? Can I
> just try to re-add it, and if it works, great and if it fails, I'll have an
> idea why? I know which device it is - where do I go from there?

You can certainly try to re-add it. To know why it failed you probably need to check your logs. However, I suspect the relevant messages go to dmesg, and may or may not make it to syslog. You can check the status in /proc/mdstat, and if the drive doesn't get added be sure to check dmesg for anything interesting.

I had a motherboard that was a bit flaky with the IDE controller, and sometimes one of my drives wouldn't get detected. If I ran in degraded mode it might or might not get automatically re-added on the next reboot. Something like that is always a possibility.

Just be sure you're adding the right drive to the array so that you don't hose something important (it will wipe the drive). If you actually lose a drive, devices like /dev/sd[abcd] will get re-ordered, so /dev/sd# might not be what you think it is. You could do a "file -s <device>" to try to get an idea of what is already on the device first, or use "mdadm --examine <device>" to see how it used to fit into an array.

If you re-add the old device back to the raid it will probably rebuild fairly quickly - assuming the array has a write-intent bitmap, the raid keeps track of what actually changed and only rebuilds those regions. If you want to be extra-safe you can "echo check > /sys/block/md#/md/sync_action" once it is done rebuilding - that will force the raid to check the parity on every stripe and will detect any damage to the array (otherwise you'll find it the first time you read a damaged stripe). It can potentially try to fix errors, but the design of Linux software raid makes that potentially imperfect. Filesystems like btrfs that checksum everything provide a higher degree of assurance that you're overwriting the bad data with good data.

Hope that helps!
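
P.S. If you'd rather not eyeball /proc/mdstat each time, the status check can be scripted. Below is a rough sketch, not a tested tool: it assumes the usual mdstat layout in which each array's line is followed by a line containing something like "[3/2] [_UU]", where "_" marks a missing member. The function name degraded_arrays and the sample text are made up for illustration; the sample is not output from a real machine.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member.

    member_map is the bracketed status string from /proc/mdstat,
    e.g. "_UU", where "U" is an active member and "_" is a missing one.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md1 : active raid5 sdc1[2] sdb1[1]"
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "... [3/2] [_UU]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            member_map = status.group(3)
            if "_" in member_map:
                degraded[current] = member_map
            current = None
    return degraded


# Illustrative sample only (hypothetical device names and sizes).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid5]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdc1[2] sdb1[1]
      1953259520 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

unused devices: <none>
"""
```

On a live system you would pass it the contents of /proc/mdstat, e.g. degraded_arrays(open("/proc/mdstat").read()). It only tells you that a member is missing, not why, so dmesg is still the place to look for the cause.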
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug