Subject: Re: Raid1 Disk Failure - Diagnosing/Repairing and/or Replacing disk
To: None <firstname.lastname@example.org>
From: Patrick Welche <email@example.com>
Date: 09/16/2005 18:52:42
On Fri, Sep 16, 2005 at 12:14:36PM -0500, firstname.lastname@example.org wrote:
> I'm relatively new to RAID. Earlier this year I successfully moved my
> home/office NetBSD 2.0-Stable system onto a RAID 1 setup on a 300 MHz
> PC/IDE bus. I confirmed everything worked, and it has been working great.
> 2 days ago I got a message that one of the disks failed.
> # raidctl -s raid0
> /dev/wd0a: failed
> /dev/wd2a: optimal
> No spares.
> /dev/wd0a status is: failed. Skipping label.
> Component label for /dev/wd2a:
> Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
> Version: 2, Serial Number: 2147483647, Mod Counter: 222
> Clean: No, Status: 0
> sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
> Queue size: 100, blocksize: 512, numBlocks: 156301312
> RAID Level: 1
> Autoconfig: Yes
> Root partition: Yes
> Last configured as: raid0
> Parity status: clean
> Reconstruction is 100% complete.
> Parity Re-write is 100% complete.
> Copyback is 100% complete.
> I've searched the mailing lists and the manual and am a bit wary of
> optimal next steps.
> I've ordered a 3rd disk of the same size (80G) and model number just to be
> safe, but...
> Q1: Is there any way to diagnose the disk live in the system? If so, what
> are the steps and commands I'd need?
> Q2: I think I can fumble my way through the manual, but it doesn't seem to
> describe what you need to do to replace a failed disk. Again, a logical
> sequence of steps and commands would be greatly appreciated.
How about the section of raidctl(8) starting at "Dealing with Component Failures"?
Swap out wd0, then raidctl -s raid0 will say something like
I don't know whether this is necessary, but I would then disklabel
wd0 (after checking in dmesg that wd0 really is the new drive, though it
should be, given the above message), then
raidctl -R /dev/wd0a raid0
raidctl -s raid0
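
The steps above could be sketched roughly as the script below. It only
prints the commands rather than running them, so it is safe to read through
first. Assumptions that are not in the original post: the replacement shows
up as wd0, the surviving disk is wd2, the RAID component is partition "a" on
each, and the new disk can take a copy of the old disk's label (same model
and size). It uses the in-place rebuild form (-R) that raidctl(8) documents
for the no-hot-spare case. The function name plan_rebuild and the /tmp path
are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print a plausible rebuild sequence, do not execute it.
# plan_rebuild is a hypothetical helper, not part of any NetBSD tool.

plan_rebuild() {
    new_disk=$1    # replacement drive, e.g. wd0 (assumed)
    good_disk=$2   # surviving mirror half, e.g. wd2 (assumed)
    raid_dev=$3    # RAID device, e.g. raid0

    # 1. Check that the kernel really sees the new drive.
    echo "dmesg | grep ${new_disk}"

    # 2. Copy the disklabel from the surviving disk to the new one
    #    (assumes identical geometry; review the label before writing).
    echo "disklabel ${good_disk} > /tmp/${good_disk}.label"
    echo "disklabel -R -r ${new_disk} /tmp/${good_disk}.label"

    # 3. Rebuild in place onto the replaced component.
    echo "raidctl -R /dev/${new_disk}a ${raid_dev}"

    # 4. Watch reconstruction progress.
    echo "raidctl -S ${raid_dev}"
}

plan_rebuild wd0 wd2 raid0
```

Once the reconstruction finishes, raidctl -s raid0 should list both
components as optimal. On i386 the new disk may also need an MBR/fdisk
partition set up before the disklabel step; that is not shown here.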