Subject: RAID0 (I know, I know) reconstruction on another drive pair.
To: None <netbsd-help@netbsd.org>
From: Marc Tooley <netbsdMLpostNO@spam.quake.ca>
List: netbsd-help
Date: 08/25/2005 12:35:32
I have a RAID0 set in which one of the drives has just gone partly
belly-up, and I am trying to salvage what I can from the setup.
Drive #1 has:

 a:   4195233        63     4.2BSD   1024  8192    16  # (Cyl.      0
 b:   1023120   4195296       swap                     # (Cyl.   4162
 c:  60030369        63     unused      0     0        # (Cyl.      0
 d:  60030432         0     unused      0     0        # (Cyl.      0
 e:  10486224   5218416     4.2BSD   1024  8192    16  # (Cyl.   5177
 f:  44325792  15704640       RAID                     # (Cyl.  15580

Drive #2 has:

 c:  60030369        63     unused      0     0        # (Cyl.      0*-
 d:  60030432         0     unused      0     0        # (Cyl.      0 -
 e:  15704577        63     4.2BSD   1024  8192    16  # (Cyl.      0*- 
 f:  44325792  15704640       RAID                     # (Cyl.  15580 -
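
As a quick sanity check on the two labels above, the partitions tile
each disk without gaps, and f ends exactly where d does. A minimal sh
sketch of that arithmetic, with the sector counts copied from the
tables (the helper name end_of is made up for illustration):

```sh
# end_of OFFSET SIZE
# Prints the first sector past a partition, i.e. where the next one
# should start if there is no gap.
end_of() {
    echo $(( $1 + $2 ))
}

end_of 63 4195233          # drive 1, a: b starts at 4195296
end_of 4195296 1023120     # drive 1, b: e starts at 5218416
end_of 5218416 10486224    # drive 1, e: f starts at 15704640
end_of 63 15704577         # drive 2, e: f starts at 15704640
end_of 15704640 44325792   # f on both drives: d is 60030432 sectors
```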

I have two other drives that aren't identical, but close (2 x 40GB
instead of 2 x 30GB), and I did the following to copy the bad
volumes over to the good ones:

dd conv=noerror if=/dev/rwd0d ibs=64k | \
    progress -l 30g dd of=/dev/rwd2d obs=64k
dd conv=noerror if=/dev/rwd1d ibs=64k | \
    progress -l 30g dd of=/dev/rwd3d obs=64k
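
A note on the commands above, offered as a sketch rather than a
diagnosis: POSIX dd with conv=noerror but without sync leaves
unreadable input blocks out of the output, so data after a bad block
can end up at a lower offset on the target. Adding sync replaces the
missing input with NUL bytes instead. The snippet below only
demonstrates the padding on a pipe; the function name padded_size is
made up for illustration, and the device command in the comment is a
hypothetical variant that has not been run.

```sh
# padded_size STRING BLOCKSIZE
# Pipes STRING through dd with conv=sync and prints how many bytes come
# out. sync pads a short input block up to ibs with NUL bytes.
padded_size() {
    printf '%s' "$1" | dd ibs="$2" conv=sync 2>/dev/null | wc -c | tr -d ' '
}

padded_size abc 8

# Hypothetical variant of the disk copy with sync added (not run here):
#   dd conv=noerror,sync if=/dev/rwd0d ibs=64k | \
#       progress -l 30g dd of=/dev/rwd2d obs=64k
```

With ibs=64k, a single unreadable sector zeroes the whole 64k block, so
a smaller ibs would lose less data at the cost of speed.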

... and this seemed to work. Basically, I was able to run:

disklabel wd2
disklabel wd3

... and both commands returned useful information after a reboot. I was
also able to duplicate the first set's raid.conf file for the new set
and run:

raidctl -c raid1.conf raid1
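
For reference, a RAID 0 configuration for a two-disk set follows the
layout sketched here. The component names and the layout and queue
values are assumptions for illustration, not the contents of the real
raid1.conf; the stripe unit size and the order of the components would
have to match the original set.

```
START array
# numRow numCol numSpare
1 2 0

START disks
# assumed component names
/dev/wd2f
/dev/wd3f

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
# 64 sectors per stripe unit is an assumed value
64 1 1 0

START queue
fifo 100
```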

... and then I got an identical disklabel to the original raid0 set 
from:

disklabel raid1

... my excitement and relief were premature, unfortunately.

I have three problems now:

. Mounting the new RAID0 set gives me *all kinds* of problems. Almost
every file has a bad type on it: files are special device files,
directories are files, strange modes are all over the place, etc.

. The second volume set isn't bootable, while the first one is.

. Doing anything useful is difficult because my kernel:

NetBSD warp 3.99.3 NetBSD 3.99.3

... panics at the first sign of corruption. It's been slow going. :(

My questions:

1. Do you have any suggestions for me to rebuild the RAID set on the new 
volume with the greatest chance of success? Hopefully Mr. Oster might 
be kind enough to reply to the list. :-)

2. Why didn't dd copy over the bootblocks and make the "clean" set
bootable? When I pull the bad drives, the machine insists there's "No
operating system." wd0 and wd2/wd3 all have d partitions that
encompass the whole 30GB region from sector 0 onwards.

Thanks in advance for your comments.

-Marc