Subject: Re: [Raidframe] How to recover a degraded mirror without a
To: <netbsd-users@netbsd.org>
From: Daniel Cox <dc@microbits.com.au>
List: netbsd-users
Date: 12/02/2005 09:42:41

You can simply replace the failed drive (there is no need to remove it
first).
Put a disklabel on the new drive (not needed in your case if using wd0d).

Then:
raidctl -a /dev/wd0d raid0
raidctl -F component1 raid0
 (or component0; run raidctl -s raid0 to see which component is marked
 as failed)
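
If you would rather not pick the component name by eye, here is a rough
sketch of a helper that pulls it out of the status listing. It assumes
raidctl -s prints one "<name>: <state>" line per component;
failed_components is a made-up name for illustration, not part of
raidctl.

```shell
#!/bin/sh
# Sketch only: read a status listing on stdin and print the name of every
# component whose state is "failed".
# Assumption: component lines look like "<name>: <state>".
failed_components() {
    awk 'NF >= 2 && $NF == "failed" {
        name = $1
        sub(/:$/, "", name)   # drop the trailing colon from the name
        print name
    }'
}

# Intended use on the live system (nothing below is executed here):
#   raidctl -s raid0 | failed_components   # prints the failed component's name
#   raidctl -a /dev/wd0d raid0             # add the new disk as a hot spare
#   raidctl -F <name-printed-above> raid0  # fail it and rebuild onto the spare
```

Check the printed name against the full raidctl -s output before running
the -F step.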

Daniel.

>>> Matthew Braithwaite <matt@braithwaite.net> 2/12/05 8:57:50 >>>
After two years of solid service, a drive in my Raidframe mirror has
failed, and I'm having trouble finding information on how to recover.
The man pages talk about recovery in terms of reconstruction onto a
spare, but I don't have a spare in my dinky little 1U box.

Instead, I need to replace the failed drive, then recover the mirror.

The naive sequence of operations would be:

  1. Shutdown
  2. Replace failed drive with new blank drive
  3. Boot

But I'm not sure what I'd do to tell Raidframe that the new wd0d isn't
the wd0d that was there before.

Another worry I had about this procedure is that Raidframe might
become confused on reboot because it can't find a majority of its
database replicas.

So then I came up with this sequence:

  1. Remove the failed wd0d from the configuration with raidctl -c.
     (I think this would leave an intact mirror, but one with a
     missing component?)
  2. Shutdown
  3. Replace failed drive
  4. Boot
  5. Introduce the new wd0d as a spare, and reconstruct onto it.

Is this the best way to go about it?

Thanks in advance for any advice!

				* * *

# raidctl config file for /dev/raid0c

START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd1d
/dev/wd0d  <-- FAILED

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100
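
A rough sketch of how the disk order in this file could be mapped to
component numbers. It assumes components are numbered from 0 in the order
they appear under "START disks", which should be confirmed with
raidctl -s. list_components is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: read a raidctl config file on stdin and print
# "componentN <device>" for each entry in the "START disks" section.
# Assumption: numbering follows the order of the entries, starting at 0.
list_components() {
    awk '
        /^START/ { in_disks = ($2 == "disks"); n = 0; next }
        # Inside the disks section, skip blank lines and comments; only the
        # first field is used, so trailing notes on a line are ignored.
        in_disks && NF > 0 && $1 !~ /^#/ { printf "component%d %s\n", n++, $1 }
    '
}

# Intended use (the config file path is whatever you configured raid0 with):
#   list_components < /path/to/raid0.conf
```

Under that assumption, /dev/wd1d would be component0 and the failed
/dev/wd0d would be component1.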


This is NetBSD 1.6.1 on Sparc64.