Subject: replacing a failed disk in a raidframe raid1 mirror
To: None <netbsd-users@NetBSD.org>
From: Carl Brewer <carl@bl.echidna.id.au>
List: netbsd-users
Date: 08/24/2005 09:28:22
Hello,

I've got my first disk failure in a raidframe array! :

raidctl -s says :
/dev/wd1a status is: failed.  Skipping label.

This is on a box that has a simple RAID1 mirror for
its entire disk setup (it's a simple LAN server). It's got
a pair of Maxtor 80GB HDDs.  See :

mail: {117} df
Filesystem  512-blocks     Used     Avail Capacity  Mounted on
/dev/raid0a     508222   256468    226342    53%    /
/dev/raid0f    4128988  1569328   2353208    40%    /var
/dev/raid0e    8258300  3003380   4842004    38%    /usr
/dev/raid0g  140565428 100568668  32968488    75%    /home
kernfs               2        2         0   100%    /kern

It's a NetBSD 2.0.2 server on i386 hw.  dmesg says this :
mail: {123} grep ^wd /var/run/dmesg.boot
wd0 at atabus0 drive 0: <Maxtor 6Y080L0>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 78167 MB, 158816 cyl, 16 head, 63 sec, 512 bytes/sect x 160086528 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(hptide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd1 at atabus1 drive 0: <Maxtor 6Y080L0>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 78167 MB, 158816 cyl, 16 head, 63 sec, 512 bytes/sect x 160086528 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd1(hptide0:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)

I had a quick read through the guide to setting up raidframe,
http://www.netbsd.org/guide/en/chap-rf.html#chap-rf-intro
but before I actually do this, I want to make sure what I'll be
doing is correct. I didn't see a monkeysheet for how to replace
a failed drive in the guide.  If I missed it, my apologies.

My first question is how do I tell which physical disk is
which?  When I open up the box, is there some way to identify which
disk is wd0 and which is wd1?  I assume it's related to the
hptide0:1:0/hptide0:0:0 values returned in dmesg.boot, but does
that translate to master/slave on an IDE bus?  Or different
channels?  The box has an el-cheapo Adaptec RAID card,
which isn't RAID at all; it's just working as a multi-channel
IDE controller card.
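
To make the question concrete, here is a minimal sketch that pulls the
tuple out of the dmesg lines for each wd disk. It assumes the three
fields read as controller:channel:drive, which is an assumption on my
part and not something confirmed above; wd_channels is a made-up helper
name.

```shell
# Hypothetical helper: print one line per wd disk with the tuple from dmesg.
# ASSUMPTION: the tuple in "wd1(hptide0:1:0)" is controller:channel:drive.
wd_channels() {
    # $1: path to a saved dmesg file
    sed -n 's/^\(wd[0-9][0-9]*\)(\([a-z]*[0-9]*\):\([0-9]*\):\([0-9]*\)).*/\1 controller=\2 channel=\3 drive=\4/p' "$1"
}

# Only run against the real file when it exists on this machine.
if [ -r /var/run/dmesg.boot ]; then
    wd_channels /var/run/dmesg.boot
fi
```

If that reading is right, the "atabus0 drive 0" and "atabus1 drive 0"
lines above would mean both disks are drive 0 on separate channels, so
the failed wd1 would be the one cabled to the second channel.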

Then, is there some howto for rebuilding the array somewhere?
Do I basically replicate the steps in 16.3.4 of the guide? :
http://www.netbsd.org/guide/en/chap-rf.html#chap-rf-second-disk
That seems a lot of mucking about.  Is there an easier, better,
or less complex (so less error-prone) way to do it?  I have backups of
the box, but all the same, I don't want to trash the filesystem and have
to restore!
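
For reference, a dry-run sketch of the shorter path that raidctl(8)
appears to offer: copy the disklabel from the surviving disk, then
rebuild in place with raidctl -R. It only prints the commands and runs
nothing. The assumptions are loud ones: the replacement disk shows up
as wd1 again, it is at least as large as wd0, the component lives on
the "a" partition, and steps such as MBR partitioning and boot blocks
are left out. rebuild_plan is a made-up name.

```shell
# Hypothetical helper: print (do not run) a candidate rebuild sequence.
# ASSUMPTIONS: the replacement disk has the same name as the failed one,
# and the RAID component is partition "a" on each disk.
rebuild_plan() {
    raid=$1   # RAID device, e.g. raid0
    good=$2   # surviving disk, e.g. wd0
    bad=$3    # replaced disk, e.g. wd1
    cat <<EOF
raidctl -s $raid
disklabel $good > /tmp/disklabel.$good
disklabel -R -r $bad /tmp/disklabel.$good
raidctl -R /dev/${bad}a $raid
raidctl -S $raid
EOF
}

rebuild_plan raid0 wd0 wd1
```

Printing the plan first leaves room to check each line against the
man pages before anything touches the disks.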

Thanks for any pointers to doco I may have missed!

Carl