
I did something very stupid today. Now I'm just hoping to find someone who is even smarter than I am stupid.

I have a couple of hot-swappable disks. Since I'm going to screencast Ubuntu 12.04 LTS next week, I thought I'd install a fresh copy on an empty disk. I thought I'd see if I could do that using VirtualBox, so that I could just reboot the host to that disk once it was finished and set up completely. So I created a VMDK that pointed to that disk with raw access. Worked great.
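
For reference, creating the raw-access VMDK goes roughly like this (the filename and /dev/sdd are placeholders, not my actual paths):

# Give the VM raw access to the whole hot-swap disk; adjust the device name to suit
sudo VBoxManage internalcommands createrawvmdk -filename hotswap.vmdk -rawdisk /dev/sdd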

Then I thought I might as well test with Windows first, because that would allow me to screencast how to install Ubuntu from a running Windows machine. The install went fine. I then rebooted to it, but Windows crashed. I wasn't too surprised by that, though; I didn't really expect it to work in the first place. But it asked if it should fix the startup problems for me. Yes please. It told me everything was now fine and I could reboot. Windows still didn't boot, so I gave up on that project and rebooted to Ubuntu.

Unknown RAID level -1000000, Grub now says, and I can't boot from my main disks. But of course I have Ubuntu on my keyring, so I booted from that, installed mdadm, and ran mdadm --assemble --scan. It then tells me:

ubuntu@ubuntu:~$ sudo mdadm --assemble --scan
mdadm: Devices UUID-00000000:00000000:00000000:00000000 and UUID-c00b1e54:78802534:df92b1b7:9e64ccd8 have the same name: /dev/md1
mdadm: Duplicate MD device names in conf file were found.

That's worse; I can't activate the RAID. This is 1.8 TB of data, so I really would rather not restore from backup. As you can imagine, all input is valuable to me now. Here's more information about the setup.

I have three disks of 1.5 TB each. The first partition on each of them is part of a RAID1, which is used for boot. Each disk also has a second partition that is part of a RAID5, and that RAID5 is used for LVM.
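
If it helps, I can also examine the member superblocks one partition at a time; the sdX names below are just my guess at how the disks show up, so they may differ:

ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda1 /dev/sdb1 /dev/sdc1   # RAID1 (boot) members
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda2 /dev/sdb2 /dev/sdc2   # RAID5 (LVM) members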

Now, palimpsest detects the partitions properly, so I'm not in total despair yet. It seems unlikely that Windows would have damaged the second partition, hundreds of megabytes into the disk. My assumption is that it has only damaged something at the beginning of the disk. The question then becomes: how do I fix it?

I've also run


ubuntu@ubuntu:~$ sudo mdadm --examine --scan
ARRAY /dev/md1 UUID=00000000:00000000:00000000:00000000
   spares=2
ARRAY /dev/md0 UUID=37bc1971:5b00e915:2f3fc100:0972a2ae
ARRAY /dev/md1 UUID=c00b1e54:78802534:df92b1b7:9e64ccd8

It doesn't look entirely hopeless, but I'm a little bit stuck. Geniuses, where are you! :)

1 Answer


It's not the partitions that were damaged; it's your metadata. Having the conf file around is nice because it tells you which UUIDs really matter to the original install. So your task is to preserve those UUIDs and delete the other ones, then cross your fingers and pray that your data wasn't corrupted in addition to the metadata mess-up.

Look in man mdadm for metadata manipulation switches.
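
As a rough sketch (not tested against your setup), you could try assembling each array explicitly by the UUID you trust, so that --scan never trips over the duplicate name. The partition lists below are assumptions based on your description of the layout:

# Boot RAID1 -- assuming md0 lives on the first partition of each disk
sudo mdadm --assemble /dev/md0 --uuid=37bc1971:5b00e915:2f3fc100:0972a2ae /dev/sda1 /dev/sdb1 /dev/sdc1
# Data RAID5 -- assuming md1 lives on the second partition of each disk
sudo mdadm --assemble /dev/md1 --uuid=c00b1e54:78802534:df92b1b7:9e64ccd8 /dev/sda2 /dev/sdb2 /dev/sdc2

With --uuid, mdadm only accepts members whose superblock carries that UUID, so anything belonging to the bogus all-zero array is simply ignored.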

As a side note, this is where hardware RAID has value when using multiple operating systems: the hardware RAID's metadata is never exposed to the host. The host gets a "new disk", not a disk that has a partition with some metadata on it, so what you just encountered can't happen.

Good luck. If you really get stuck, you might want to join the linux-raid mailing list and ask for help there: http://marc.info/?l=linux-raid&r=1&b=201204&w=2

ppetraki