Subject: Re: kern/29540: raidframe can show clean parity on raid1 with a failed disk
To: None <gnats-bugs@netbsd.org, netbsd-bugs@netbsd.org>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-bugs
Date: 02/27/2005 00:01:46
riz@tastylime.net writes:
> >Number:         29540
> >Category:       kern
> >Synopsis:       raidctl can show clean parity on raid1 with failed disk
[snip]
> Components:
>           component0: failed
>            /dev/wd0a: optimal
[snip]
> >How-To-Repeat:
> 
> 	Build a raid1 set, calculate parity.  Then, relabel one disk,
> removing the raid partition, and newfs it.  Reboot normally, and
> see that parity is 'clean'.
> 
> >Fix:
> 	Unknown - this is probably straightforward, and one could
> possibly argue that the behavior is "correct" - but it's certainly
> surprising.

The behaviour is correct.  The parity status does not depend on 
whether or not any parity is actually present.  Another way to think 
about "clean" is "if you end up having to read parity bits, and there 
are parity bits, then they are known to be good".  "Dirty" means 
"we haven't verified the parity to be good, and if you need to 
reconstruct something from that parity, you're on your own."  

What the "clean" status tells you in your case is that when the RAID
set went into degraded mode, the parity bits were in sync with the
data bits (in RAID 1 land, that both components had the same data).  
If it had said "dirty" instead, then that would have meant that the 
system didn't know for sure if /dev/wd0a was in sync with component0, 
and there possibly could be data loss (since component1 is basically 
the "parity" for component0 in the case of a RAID 1 set).
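For anyone wanting to see this for themselves, the situation in the PR 
can be reproduced along these lines (the config file path, serial 
number, and device names are just examples, not from the PR):

```shell
# Build a RAID 1 set from two components and bring parity clean
raidctl -C /etc/raid0.conf raid0   # force-configure the set
raidctl -I 2005022701 raid0        # stamp a serial number
raidctl -iv raid0                  # initialize parity -> "clean"

# Clobber one component behind RAIDframe's back
disklabel -e wd0                   # change wd0a's fstype away from RAID
newfs /dev/rwd0a

# After a normal reboot the set comes up degraded, yet parity
# still reports clean, as described in the PR:
raidctl -s raid0
```

The key point is that nothing here dirtied the parity: the surviving 
component was in sync at the moment the set went degraded, so "clean" 
is exactly what RAIDframe should report.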

Perhaps an explanation of this for the man-page might be in order, 
but the behaviour is correct.

Later...

Greg Oster