[ale] RAID mirror boot nightmare
Bob Toxen
transam at VerySecureLinux.com
Wed Jul 11 02:27:41 EDT 2012
All,
PROBLEM SOLVED! Phil's suggestion of the initrd being wrong was
correct. I was starting to suspect this. See below for details on how
I got into this mess and how I got out.
Phil: please send me private email with your name as it should appear
on your $50 check, along with your mailing address. I was deadly
serious about the reward: I was desperate, this system is for a client,
and I want to give a real thanks!
The command to rebuild the initrd under CentOS is mkinitrd.
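For example, to rebuild the initrd for the currently running kernel in
place (a sketch; the -f flag overwrites the existing image, so be
careful):
mkinitrd -f /boot/initrd-`uname -r`.img `uname -r`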
In the md superblock there is a field called "Preferred Minor", i.e.,
preferred minor device for the md device that is created. There seems
to be no command and option to just update this field; apparently one
must use mdadm with --create or --assemble to update the md superblock
on the underlying real disk devices.
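For example (a sketch, assuming the 0.90 superblock format that CentOS 5
uses, where mdadm -E shows the field):
mdadm -E /dev/sda6 | grep 'Preferred Minor'   # show the stored value
mdadm -A /dev/md4 --update=super-minor /dev/sda6 /dev/sdb6
# re-assembling with --update=super-minor rewrites the field to 4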
Due to the "brilliance" of whomever wrote that md code, on first write
when any md device is activated, that md device's minor device is written
into the superblock stored under each underlying device, e.g., /dev/sda6
and /dev/sda6. When I used the CD Rescue code, it generated md devices
of the form /dev/md123, 4, 5, 6, 7. Probably, when I then ran "fsck -f"
(or just read a file which causes the file's access time to be updated)
under the CD Rescue CD's Linux, it changed the preferred minor device
in each underlying disk device, precipitating this nightmare.
Unfortunately, on boot the kernel fails to give useful info on what
device it was trying to mount or why it failed -- very UN-Linux-like.
I booted from a different non-RAID partition, mounted the md partitions
now called /dev/md126 and /dev/md0 as /mnt2 and /mnt2/boot and issued
the command
mkinitrd --fstab=/mnt2/etc/fstab /boot/initrd_126 `uname -r`
and then edited my /boot/grub/grub.conf to change the initrd field to
initrd_126.
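For reference, the stanza I booted was the original entry quoted at the
bottom of this message with only the initrd field changed, roughly (a
sketch):
title CentOS-single-md4
root (hd0,0)
kernel /vmlinuz-2.6.18-308.4.1.el5 ro root=/dev/md4 md=4,/dev/sda6,/dev/sdb6 md=1,/dev/sda2,/dev/sdb2 md-mod.start_dirty_degraded=1 rhgb single noresume
initrd /initrd_126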
I then edited /mnt2/etc/fstab to specify /dev/md126 as my / device
and /dev/md0 as /boot. The kernel doesn't really seem to care about
/etc/mdadm.conf as the newer Linux RAID stores the info in the
"md superblock" (not to be confused with the ext[234] *nix superblock).
The kernel also seemed to ignore the following on the grub kernel line:
md=4,/dev/sda6,/dev/sdb6
I then booted successfully. I then copied this new grub.conf and
initrd_126 to my new raid / and /boot partitions for redundancy.
The above got my md devices running under the new names but with only
sda as I had not yet installed my replacement second disk.
To recover with my new empty disk, I installed it as sdb and did:
sfdisk -d /dev/sda | sfdisk /dev/sdb
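If the arrays had not started rebuilding onto the new disk on their
own, the new partitions could have been added back by hand (a sketch,
assuming the renamed arrays and the partition layout shown below):
mdadm /dev/md126 --add /dev/sdb6   # the / array
mdadm /dev/md0 --add /dev/sdb2     # the /boot array
cat /proc/mdstat                   # watch the resync progress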
NOTE 1: Save your partition tables to files thusly (and then copy those
files to another system for safekeeping):
sfdisk -d /dev/sda > partition_table_sda
sfdisk -d /dev/sdb > partition_table_sdb
To later recover (DANGEROUS):
sfdisk /dev/sda < partition_table_sda
NOTE 2: For those who do full backups with tar, rsync, etc., which do
NOT save inode numbers, it is very important to back up the
inode numbers separately. Then, when you eventually suffer disk
corruption (occasionally possible under ext3 after an unclean
shutdown) and fsck asks about inode 235255, you can grep for it
in your backup listing and know which file may be corrupted and
in need of a restore. Also, if a directory file gets trashed you
will know where to restore the orphan files that ended up in
/lost+found.
One way to capture inodes (prior to backup) is the following,
except prune it to skip /proc and other fake file systems (see
the sketch after the command):
find / -ls > /root/inodes.list
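A pruned version might look like this (a sketch; adjust the list of
pseudo file systems as needed):
find / \( -path /proc -o -path /sys -o -path /dev \) -prune -o -ls > /root/inodes.list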
To change them back to my original preferred names I then did:
1. Booted to my primary non-raid sda5.
2. Ensured that no md devices were mounted with the following
(better than "mount" because it doesn't babble about /dev, /proc,
etc.):
df -h
3. Deactivated the "wrong" md device:
mdadm -S /dev/md126
4. Created the "right" md device (this worked because after installing
replacing my failed sdb disk I allowed RAID automatically to sync
over several hours):
mdadm mdadm -A /dev/md4 -v -U super-minor /dev/sd[ab]6
Verify that md4 was created successfully:
cat /proc/mdstat
mdadm -D /dev/md4
mdadm -E /dev/sd[ab]6
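Roughly what to look for (a sketch of the checks, not actual output):
grep -A 2 '^md4' /proc/mdstat                 # expect "active raid1 ... [UU]"
mdadm -E /dev/sda6 | grep 'Preferred Minor'   # expect 4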
Alternatively (if there had been an out-of-date file system on
/dev/sdb6, such as if the replacement disk had been used before), I
first would have had to scribble over both the md superblock and the
ext3 superblock with:
mdadm --zero-superblock /dev/sdb6 # Dangerous
dd bs=1024 count=2 if=/dev/zero of=/dev/sdb6 # Dangerous; wipes the
# first 2 KiB, which covers the ext3 superblock at byte offset 1024
Then update /boot/grub/grub.conf, /etc/fstab, and /etc/mdadm.conf.
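Following Phil's advice below, the ARRAY lines in the updated
/etc/mdadm.conf could carry only the md node and its uuid (a sketch;
substitute the real uuids reported by mdadm -D):
DEVICE partitions
MAILADDR root
ARRAY /dev/md4 uuid=<uuid of md4>
ARRAY /dev/md1 uuid=<uuid of md1>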
Btw, to show a disk partition's file system UUID (suitable to surround
with `` for command substitution):
blkid -o value /dev/md0 | head -1
blkid -o value /dev/sda6 | head -1
Btw, to set a partition's UUID (maybe after "dd if=/dev/sda5
of=/dev/sda6"), do the following (except some distros use uuid instead
of uuidgen):
tune2fs /dev/sda6 -u `uuidgen`
THANKS also to LinuxGnome (first to respond), Scott McBrien, and Erik
Mathis for their help.
ALE comes through again for Linux!
Best regards,
Bob Toxen
bob at VerySecureLinux.com
transam at VerySecureLinux.com [ALE subscriber]
On Tue, Jul 10, 2012 at 08:40:48AM -0400, Phil Turmel wrote:
> Good morning Bob,
> Might be useful to show us the output of "mdadm -E /dev/sda[26]".
> You might just need to run "update-initrd" or whatever the equivalent is
> for CentOS 5.8. You always need to do this when you rearrange your boot
> devices.
> Phil.
> On 07/10/2012 01:33 AM, Bob Toxen wrote:
> > Additional details on this miserable problem:
> > On Boot the kernel complains of:
> > Creating root device
> > Mounting root filesystem
> > Mount: Could not find filesystem '/dev/root'
> > after talking about md0 apparently being created successfully, and lastly
> > panics.
> This suggests that something in your initrd doesn't match your system
> any more. It's assembling md0 when your mdadm.conf below specifies md1
> and md4.
> > /boot/grub/grub.conf entry being booted:
> > title CentOS-single-md4
> > root (hd0,0)
> > kernel /vmlinuz-2.6.18-308.4.1.el5 ro root=/dev/md4 md=4,/dev/sda6,/dev/sdb6 md=1,/dev/sda2,/dev/sdb2 md-mod.start_dirty_degraded=1 rhgb single noresume
> > initrd /initrd-2.6.18-308.4.1.el5.img
> You shouldn't need the md=n,/dev/... items in this list if your
> mdadm.conf is correct in the initrd.
> > /etc/mdadm.conf (heavily edited by me including switching from uuid to
> > devices; I don't presently list swap as that is not critical and it
> > fails before even thinking about swap):
> > # mdadm.conf written out by anaconda
> > DEVICE /dev/sda[26] /dev/sdb[26]
> > MAILADDR root
> > ARRAY /dev/md4 level=raid1 num-devices=2 devices=/dev/sda6,/dev/sdb6 auto=yes
> > ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/sda2,/dev/sdb2 auto=yes
> I've had the best success when the ARRAY lines have only the md node and
> the uuid.
> > fdisk output:
> > Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
> > 255 heads, 63 sectors/track, 121601 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Device Boot Start End Blocks Id System
> > /dev/sda1 * 1 13 104391 83 Linux
> > /dev/sda2 * 14 26 104422+ fd Linux raid autodetect
> > /dev/sda3 27 4200 33527655 82 Linux swap / Solaris
> > /dev/sda4 4201 121601 943023532+ f W95 Ext'd (LBA)
> > /dev/sda5 4201 62900 471507718+ 83 Linux
> > /dev/sda6 62901 121600 471507718+ fd Linux raid autodetect
> > /etc/fstab:
> > /dev/md4 / ext3 defaults 1 2
> > /dev/md1 /boot ext3 defaults 1 2
> > #normal /dev/md3 / ext3 defaults 1 1
> > #normal /dev/md0 /boot ext3 defaults 1 2
> > #normal /dev/md4 /root2 ext3 defaults 1 2
> > #normal /dev/md1 /boot2 ext3 defaults 1 2
> > tmpfs /dev/shm tmpfs defaults 0 0
> > devpts /dev/pts devpts gid=5,mode=620 0 0
> > sysfs /sys sysfs defaults 0 0
> > proc /proc proc defaults 0 0
> > /dev/md2 swap swap defaults 0 0
> > What magic am I missing? Please help!!!