I have, or rather HAD, a CentOS 6 box with three software RAID arrays that I upgraded before my first cup of coffee this morning. It's all downhill from here...
For some reason yum thought it knew more about my RAID setup than I did and stuck a few "NAME=" bits and a bunch of other crap into my /etc/mdadm.conf. This, of course, was done before the new initramfs was auto-built for the new kernel image. Cool! Break the RAID config file, then jam it into the ramdisk. Awesome! So, as expected, after a reboot the kernel panicked, telling me that it couldn't find (among other things) its root FS.
Excellent so far, right? It gets better!
I figured, "OK, I'll just fire up sysresccd, chroot what I need, fix the mdadm.conf in the initramfs, 'init 6', and all will be right with the world once again." WRONG! For whatever reason, sysresccd decided to pick a few random disks out of one of the arrays and initiate a resync during boot, before I had an interactive tty. Nice! Thanks for that. Fantastic!
The next thing I did was stop the resync and fix my mdadm.conf in the initramfs. Reboot. Two of the three arrays came back up assembled and clean. One (the RAID6) didn't. Here's what the disks in the affected array look like now:
[root@SAN ~]# mdadm -E /dev/sdj
/dev/sdj:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
[root@SAN ~]# mdadm -E /dev/sdn
/dev/sdn:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : cb360579:c72e69f9:a378bc9e:f7498b21
Name : SAN.iscsi.export:2 (local to host SAN.iscsi.net)
Creation Time : Tue Aug 12 00:35:45 2014
Raid Level : raid6
Raid Devices : 14
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 46882646016 (44710.78 GiB 48007.83 GB)
Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
Data Offset : 249856 sectors
Super Offset : 8 sectors
Unused Space : before=249768 sectors, after=12976 sectors
State : clean
Device UUID : 34a42a25:97152243:aec7c16e:663d6632
Update Time : Fri May 15 12:33:05 2015
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : a93512dd - correct
Events : 28347
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 10
Array State : AAAAAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
Yes, before you ask: more than two of the fourteen disks have borked superblocks like this, so RAID6's two-disk redundancy alone won't cover it.
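For anyone who wants the full tally rather than two samples, something like the following could summarize every member in one pass. It is only a rough sketch: the function names classify_examine and survey_members are made up for illustration, the device list is a placeholder, and it assumes that a healthy member reports the md magic a92b4efc (as /dev/sdn does above) while a clobbered one reports only "MBR Magic" (as /dev/sdj does). It relies on nothing beyond `mdadm -E`, which only reads.

```shell
#!/bin/sh
# Classify one `mdadm -E` report read from stdin.
# Prints "md-superblock" when the md magic a92b4efc is present,
# "mbr-only" when only an MBR / protective-GPT signature is reported,
# and "unknown" for anything else (including empty output).
classify_examine() {
    awk '
        /^[ \t]*Magic[ \t]*:[ \t]*a92b4efc/ { md = 1 }
        /MBR Magic/                         { mbr = 1 }
        END {
            if (md)       print "md-superblock"
            else if (mbr) print "mbr-only"
            else          print "unknown"
        }
    '
}

# Read-only survey of the devices given as arguments (hypothetical helper).
# `mdadm -E` examines the on-disk metadata and does not modify it.
survey_members() {
    for dev in "$@"; do
        printf '%s: %s\n' "$dev" "$(mdadm -E "$dev" 2>/dev/null | classify_examine)"
    done
}
```

Calling it as `survey_members /dev/sdj /dev/sdn` (substituting the real fourteen members) prints one line per disk, which makes it easier to see how many superblocks are actually gone.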
My question now is, what should I do? Do I just lower the flag to half mast, have a moment of silence and start from scratch? Ideas?