Subject: Re: setting the clean-bit in a RAID 0
To: Frederick Bruckman <fb@enteract.com>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 04/07/1999 12:14:30
Frederick Bruckman writes:
> On Wed, 7 Apr 1999, Greg Oster wrote:
> > Paul B Dokas writes:
> > > caligula# raidctl -i raid0
> > > Initiating re-write of parity
> > > raidctl: ioctl (RAIDFRAME_REWRITEPARITY) failed: Invalid argument
> > > 
> > > I've done this before.  It makes no difference.  The clean bits have
> > > not been set.
> > 
> > Doing the above will now succeed trivially, and will set the clean bit for 
> > RAID 0.  It's not the world's most elegant solution, but it'll certainly 
> > suffice until I get a chance to revisit the issue...
> 
> It's reasonable to want to use the clean bits to script recovery after
> a power failure.

Yup! :-)  That's why one can think of the clean bits as indicating not so
much whether the parity is correct as whether the RAID set was shut down
cleanly.  (With the requirement that 'raidctl -i' be used to get the RAID
set into a "known, clean state".)

I agree that there needs to be a 'nice' way of detecting that:
  a) things did not shut down cleanly,
  b) we need to correct the problem, and
  c) we *can* correct the problem.

What I haven't done is figure out the best way of doing that...

> I can think of at least three ways to do this. 1)
> Parse the output of `raidctl -c' with sed. 2) Have `raidctl -c' return
> a distinct value if the underlying partitions aren't clean, whether
> actually mounting or not, or 3) Have another switch that does no more
> than check parity, or rebuilds it only if necessary. (OK, that's 5.)

Yup.  I've considered variants of what you suggest, but haven't got 
anything concrete yet.  In some cases it's nice to be able to just
check the status, and decide from there (with a script) what to do.  
In others, it's nice to say "check it, and fix it, as necessary".  
So options 3 and 4 to me have the most merit... 

> I do think it's important to at least have a fighting chance of coming
> up gracefully if the power is simply cycled, but the old way of
> rebuilding parity on every boot is way too slow. Any thoughts on this?

I nuked the "rebuild parity on every boot" stuff a couple of weeks ago. 

If the 'raidctl -c' succeeds on bootup, chances are pretty good that 
there is "enough stuff" there to fsck the partitions and actually use 
the data.  What needs to be added is a way of detecting that 
a) parity is not clean, and b) the parity disk (or equivalent) has not died, 
and then invoking 'raidctl -i' based on that.
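
As a rough sketch only, the decision described above might look like the
following.  The function name raid_boot_action and its two status arguments
are hypothetical: nothing in raidctl reports the parity or component state
in this form yet, so the sketch covers just the logic that would sit on top
of such a check.

```shell
#!/bin/sh
# Hypothetical helper: decide what to do with a RAID set at boot.
# Both arguments are assumed to come from a status check that does
# not exist yet:
#   $1  parity state:     "clean" or "dirty"
#   $2  component state:  "ok" or "failed"
# Prints one of: "none", "rewrite", "manual".
raid_boot_action() {
    parity=$1
    components=$2
    if [ "$parity" = "clean" ]; then
        echo none        # set was shut down cleanly; no parity rewrite needed
    elif [ "$components" = "ok" ]; then
        echo rewrite     # parity is dirty, but we *can* correct the problem
    else
        echo manual      # parity is dirty and a component has died
    fi
}

# Example wiring (commented out; it needs a configured raid0 device and
# real values for the two hypothetical status variables):
# case $(raid_boot_action "$parity_state" "$component_state") in
#     rewrite) raidctl -i raid0 ;;
#     manual)  echo "raid0: needs operator attention" >&2 ;;
# esac
```

Keeping the decision separate from the 'raidctl -i' call would allow both
styles mentioned earlier: a script can just look at the answer, or it can
act on it straight away.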

Later...

Greg Oster