mdadm's rules of order


Jeremiah - Posted on 04 September 2012

My GIS-using client recently had a power failure at their office, and their GIS fileserver failed to fully boot once power was restored. The system had a 5-disk RAID5, and according to /proc/mdstat, all five drives were marked as spares and the md device was not active.
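For reference, these are the read-only checks for seeing that state, so they're safe to run at any point (sdc1 is just one of the five member partitions; repeat the --examine for each member):

gis:~# cat /proc/mdstat
gis:~# mdadm --detail /dev/md1
gis:~# mdadm --examine /dev/sdc1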

So I tried to reassemble, but mdadm reported that it had assembled the array from 3 drives and 1 spare, which was not enough to start it. Naturally I had already checked SMART to see if any drive was reporting a failure there, and none was. I know drive manufacturers can fudge their SMART results, but I still figure it's a pretty decent indicator.
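I didn't save my exact smartctl runs, but the SMART check amounts to something like this for each member disk (smartctl comes from the smartmontools package; -H prints the overall health verdict and -a dumps all the SMART attributes and error logs):

gis:~# smartctl -H /dev/sdc
gis:~# smartctl -a /dev/sdc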

So I tried an 'mdadm --assemble --force', and then an 'mdadm --assemble --force --run' (with all the right options after those commands, of course), but both times the reassembly started and then failed at about 1%. At that point it looked like it was time to start over on this RAID. So I zeroed the superblock on each disk ('mdadm --zero-superblock /dev/sdx1') and then recreated the array using the alphabetical order of the disks, which seemed natural to me and was probably how I created it in the first place:
gis:~# mdadm --create /dev/md1 --assume-clean --level=5 --verbose --raid-devices=5 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
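Backing up a step: the assemble attempts and the superblock zeroing I described above were roughly these commands, reconstructed from memory, so treat them as a sketch rather than a transcript:

gis:~# mdadm --stop /dev/md1
gis:~# mdadm --assemble --force /dev/md1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
gis:~# mdadm --assemble --force --run /dev/md1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
gis:~# mdadm --zero-superblock /dev/sdc1

(with the same --zero-superblock repeated for sdd1 through sdg1)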

But when I tried to mount the filesystem or run a filesystem check, I kept getting an error saying the filesystem revision was too high:

gis:~# fsck /dev/md1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.11 (14-Mar-2010)
fsck.ext3: Filesystem revision too high while trying to open /dev/md1
The filesystem revision is apparently too high for this version of e2fsck.
(Or the filesystem superblock is corrupt)

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193

I knew it was an ext3 FS, because I created it myself, and that's what was in my fstab. But just for fun I tried using fsck.ext4. Still no luck. My dmesg and syslog just said: "EXT3-fs: md1: couldn't mount because of unsupported optional features (fd18000)."
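If you hit the same thing, the safe way to keep poking at it is strictly read-only: grep the kernel log, and if you want e2fsck's opinion, give it -n so it answers no to every prompt and can't write anything. Roughly:

gis:~# dmesg | grep -i ext3
gis:~# e2fsck -n /dev/md1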

Just when I was about to give up hope of recovering these terabytes of data (luckily they have backups, but that's a lot of data to have to restore), I found a forum thread pointing out that the order of the drives is important: the array has to be recreated with the devices in the same order as before. Fortunately, I still had the output from /proc/mdstat from before I recreated it.

My mdstat output had said:

md1 : inactive sde1[0](S) sdg1[3](S) sdc1[5](S) sdd1[2](S) sdf1[1](S)

Sorting the devices by those bracketed numbers gives the original order: sde1, sdf1, sdd1, sdg1, sdc1. So I stopped the array and ran 'mdadm --create' again, this time with the devices in that order:

mdadm --create /dev/md1 --assume-clean --level=5 --verbose --raid-devices=5 /dev/sde1 /dev/sdf1 /dev/sdd1 /dev/sdg1 /dev/sdc1
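(For completeness: stopping the half-assembled array first is just 'mdadm --stop /dev/md1'. And if you're not certain you have the order right, mounting read-only is a low-risk first test before letting fsck touch anything; /mnt/gis below is just an example mount point.)

gis:~# mount -o ro /dev/md1 /mnt/gis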

After that, I did a successful fsck, mounted it, and was on my way. Now I know to keep a record of the drive order in all my RAID arrays, because that order is important.
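What I mean by keeping a record: the device order is right there in 'mdadm --detail' and /proc/mdstat, so it only takes a minute to stash it in a file you keep a copy of somewhere off the box. Something like this (the file paths are just examples, and the mdadm.conf location varies by distro: /etc/mdadm/mdadm.conf on Debian-type systems, /etc/mdadm.conf on others):

gis:~# mdadm --detail --scan >> /etc/mdadm/mdadm.conf
gis:~# mdadm --detail /dev/md1 > /root/md1-layout.txt
gis:~# cat /proc/mdstat >> /root/md1-layout.txt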