Subject: Raid1 Disk Failure - Diagnosing/Repairing and/or Replacing disk
To: None <netbsd-users@netbsd.org>
From: None <yancm@sdf.lonestar.org>
List: netbsd-users
Date: 09/16/2005 12:14:36
I'm relatively new to RAID. Earlier this year I successfully moved my
home/office NetBSD 2.0-Stable system onto a RAID 1 setup on a 300 MHz
PC with an IDE bus. I confirmed everything worked, and it has been
working great since.

Two days ago I got a message that one of the disks had failed. The
current status is:

# raidctl -s raid0
Components:
           /dev/wd0a: failed
           /dev/wd2a: optimal
No spares.
/dev/wd0a status is: failed.  Skipping label.
Component label for /dev/wd2a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2147483647, Mod Counter: 222
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 156301312
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

I've searched the mailing lists and the manual, but I'm still unsure of
the best next steps.

I've ordered a third disk of the same size (80 GB) and model number just
to be safe, but...

Q1: Is there any way to diagnose the failed disk while it is live in the
system? If so, what are the steps and commands I'd need?

Q2: I think I can fumble my way through the manual, but it doesn't seem
to describe what you need to do to replace a failed disk. Again, a
logical sequence of steps and commands would be very helpful.

Any hand holding would be greatly appreciated.

Thanks,
gene