Subject: Re: Raid reconstruction fails
To: Christoph Kaegi <kgc@zhwin.ch>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 03/04/2004 07:31:11
Christoph Kaegi writes:
> Hello list
>
> I have a system which has a failed raid-1 set after a power failure.
>
> -------------------------------------- 8< --------------------------------------
> # raidctl -s raid1
> Components:
>           component0: failed
>            /dev/wd0b: optimal
> No spares.
> component0 status is: failed.  Skipping label.
> Component label for /dev/wd0b:
>    Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
>    Version: 2, Serial Number: 2002101, Mod Counter: 232
>    Clean: No, Status: 0
>    sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
>    Queue size: 100, blocksize: 512, numBlocks: 2097536
>    RAID Level: 1
>    Autoconfig: Yes
>    Root partition: No
>    Last configured as: raid1
> Parity status: DIRTY
> Reconstruction is 100% complete.
> Parity Re-write is 100% complete.
> Copyback is 100% complete.
> -------------------------------------- 8< --------------------------------------
>
> Now, when I try to
>
>   raidctl -v -R component0 raid1

"component0" indicates that the real component is missing.  The
question you need to find the answer for is "Where is the missing
component?"

> it says in the logs:
>
> -------------------------------------- 8< --------------------------------------
> Mar  4 09:11:44 sstffw /netbsd: Rebuild: 0 0
> Mar  4 09:11:44 sstffw /netbsd: About to (re-)open the device for rebuilding: component0
> Mar  4 09:11:44 sstffw /netbsd: raid1: rebuilding: raidlookup on device: component0 failed: 2!
> -------------------------------------- 8< --------------------------------------
>
> What does that mean?

You can't rebuild onto "nothing".

> The other raidsets (raid0 and raid2) are ok.

For whatever reason, the component label for one of the components in
raid1 has gone missing, and the autoconfig code was unable to find it
on reboot.

The output of 'raidctl -s raid0' and 'cat /var/run/dmesg.boot' would help
diagnose this further.
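
When checking several RAID sets, it can help to pull just the failed
entries out of the 'raidctl -s' output. What follows is a rough sketch,
not a tested tool: the function name failed_components is made up for
illustration, and it assumes the status output has the same layout as
the listing quoted above (a "Components:" line followed by indented
"name: status" lines).

```sh
#!/bin/sh
# Hypothetical helper: print the names of components that `raidctl -s`
# reports as failed. Reads the raidctl -s output on standard input.
# Assumption: component lines are indented and follow "Components:".
failed_components() {
    awk '
        /^Components:/ { in_comp = 1; next }   # start of the component list
        /^[^ \t]/      { in_comp = 0 }         # any unindented line ends it
        in_comp && $NF == "failed" {
            sub(/:$/, "", $1)                  # strip the trailing colon
            print $1
        }
    '
}

# Example usage on a live system (needs root):
#   raidctl -s raid1 | failed_components
```

A name such as "component0" in the result, rather than a real device
path, would suggest the kernel has no device associated with that slot.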

Later...

Greg Oster