Upgrading an Ubuntu Dapper 6.06 server to Lucid 10.04 with software RAID 1
Posted by Jim Morris on Mon Nov 29 23:06:57 -0800 2010
I needed to upgrade a remote server from Dapper to Lucid, as Dapper is no longer supported (or will soon be EOL'd). The server uses software RAID 1, and that turned out to be the problem.
I tried to do the upgrade over the network via ssh in a screen session; however, after the upgrade to Hardy the machine failed to boot.
The error: after booting, it dropped me into initramfs saying it could not find /dev/md1. There are several theories as to why this happens, and solutions ranging from adding rootdelay=300 to the GRUB kernel stanza to rebuilding the array. (I recommend adding the rootdelay anyway if you have slower, older machines and SCSI drives as I do.)
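For the rootdelay workaround, the kernel stanza in /boot/grub/menu.lst (these releases use GRUB legacy) would look roughly like this. The kernel version is a placeholder, and the paths assume /boot lives on its own partition as it does here:

```
title  Ubuntu, kernel 2.6.24-xx-server
root   (hd0,0)
kernel /vmlinuz-2.6.24-xx-server root=/dev/md1 ro rootdelay=300
initrd /initrd.img-2.6.24-xx-server
```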
Here is how I finally recovered without doing a clean install.
First you need to use do-release-upgrade as suggested in the release notes: first from Dapper to Hardy, then from Hardy to Lucid. The trouble is that for some reason the RAID lost one of its members, becoming a degraded array. The drive was fine, but something in the upgrade process dropped one of the members.
There is a known bug in Hardy where the system will not boot from a degraded software RAID array. It has since been fixed, but mdadm has to be reconfigured to tell it to boot from a degraded array. I knew this works in Lucid, so after bringing the server home from the remote colo, I wanted to finish the next step and upgrade to Lucid, hoping that would fix the problem.
In any case, you have to get into the system in order to fix anything, and after hours of futzing around I finally found a way to do that from initramfs.
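The two hops, run inside screen so a dropped ssh connection does not kill the upgrade, look roughly like this (update-manager-core provides do-release-upgrade on the server releases):

```shell
# Install the upgrade tool and run each LTS hop in a screen session
sudo apt-get install update-manager-core screen
screen
sudo do-release-upgrade    # Dapper 6.06 -> Hardy 8.04, then reboot
sudo do-release-upgrade    # Hardy 8.04 -> Lucid 10.04, then reboot
```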
Here is the trick for my system, where the boot files are on partition 1, swap is on partition 2, and root is on partition 3. Because this older system needs the boot files near the start of a large drive, I split boot and root into two different RAID arrays: md0 for boot and md1 for root.
Type the following at the initramfs boot prompt...
This will create the md devices, mount them where they need to be, and then leave initramfs to boot normally.
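Something along these lines, assuming the standard busybox initramfs with mdadm available; the member device names (sda/sdb) match my layout and are assumptions for yours, so check /proc/partitions first:

```shell
# At the (initramfs) prompt: assemble the arrays by hand
mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1   # boot array
mdadm --assemble /dev/md1 /dev/sda3 /dev/sdb3   # root array
# Mount the root array where the initramfs scripts expect it, then resume booting
mount /dev/md1 /root
exit
```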
Once you have booted, you need to fix things so the system will boot from a degraded array...
OK, how to actually fix this? Well, it probably depends on why the RAID member was dropped during the upgrade. In my case I needed to make sure the RAID was assembled properly, then...
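Checking the assembly and re-adding the dropped member might look like this; the member device /dev/sdb3 is an assumption, so read /proc/mdstat for your own layout:

```shell
# A degraded RAID 1 shows [U_] or [_U] in /proc/mdstat
cat /proc/mdstat
# Re-add the dropped member and let the array resync
sudo mdadm /dev/md1 --add /dev/sdb3
# Watch the rebuild progress
watch cat /proc/mdstat
```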
> sudo rm /etc/mdadm/mdadm.conf
> sudo dpkg-reconfigure mdadm
and say yes when it asks if you want to boot a degraded array.
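If you prefer to skip the interactive prompt, the same setting can be made directly; this sketch assumes Lucid's mdadm initramfs hook reads BOOT_DEGRADED from this conf.d file, so double-check on your release:

```shell
# Non-interactive equivalent of answering yes to the degraded-boot question
echo "BOOT_DEGRADED=true" | sudo tee /etc/initramfs-tools/conf.d/mdadm
# Rebuild the initramfs so the setting takes effect at boot
sudo update-initramfs -u
```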
Look at the generated /etc/mdadm/mdadm.conf and make sure it is correct, then reboot.
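For reference, a regenerated mdadm.conf for this two-array layout should contain lines roughly like the following; the UUID values are placeholders, and yours come from `sudo mdadm --detail --scan`:

```
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=<uuid-of-md0>
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=<uuid-of-md1>
```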
To test, you may want to deliberately degrade an array and see if the system still boots.
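A sketch of that test, again assuming /dev/sdb3 is the member you can afford to pull; the array will resync after you re-add it:

```shell
# Mark one member failed and remove it, then reboot to test degraded boot
sudo mdadm /dev/md1 --fail /dev/sdb3
sudo mdadm /dev/md1 --remove /dev/sdb3
sudo reboot
# After confirming the degraded boot works, re-add the member and let it resync
sudo mdadm /dev/md1 --add /dev/sdb3
```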