Subject: Re: Why my life is sucking. Part 2.
To: Luke Mewburn <lukem@wasabisystems.com>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 01/18/2001 08:05:34
Luke Mewburn writes:
> On Thu, Jan 18, 2001 at 10:34:29AM +0200, Alan Barrett wrote:
> > On Wed, 17 Jan 2001, Manuel Bouyer wrote:
> > > What I've done is to make /etc/rc.d/raidframe run after fsck (no problems
> > > as I only use autoconfig). This way the rebuild (run in background anyway)
> > > doesn't compete with fsck for disk access. This speeds things up a lot.
> > 
> > We could split /etc/rc.d/raidframe into two halves, with the
> > configuration part ("raidctl -c") running before fsck, and the parity
> > reconstruction part ("raidctl -P") running in the background after
> > fsck.
> 
> I think that this is a good idea;

I think we need to be very careful here...

There are some serious consequences to doing *ANYTHING* to data on a 
RAID set before the parity rewrite completes.  If the parity is 
not correct for a given stripe, and you write real data to a portion 
of that stripe, old (and *incorrect*) parity will be used to create the 
new parity, and that new parity will also be *incorrect*.  If a component
fails before that incorrect parity gets fixed, and that component was 
a 'data' portion of that stripe, that data *WILL* be reconstructed 
incorrectly.  And it won't matter if fsck had just updated that data block - 
the reconstruction will get it *wrong*.  So even allowing fsck and the 
parity rewrite to happen at the same time is stretching things a little...

Yes, it's a question of "what are the odds?", and "can you afford those odds?"
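
To make the failure mode concrete, here is a toy sketch in sh. It assumes a
3-component RAID 5 stripe (two data blocks plus parity), models each block as
a single byte, and assumes a small write updates parity by read-modify-write.
All values are made up for illustration; this is not RAIDframe code.

```shell
#!/bin/sh
# Toy stripe: two data blocks and one parity block, one byte each.
# Every value here is an illustrative assumption.

d0=165                        # 0xA5, data block 0
d1=60                         # 0x3C, data block 1
good_parity=$(( d0 ^ d1 ))    # what the parity block should contain
stale_parity=255              # 0xFF, parity left wrong by an unclean shutdown

# Small write to d0 (say, by fsck) before the parity rewrite has run.
# Read-modify-write: new_parity = old_parity ^ old_data ^ new_data
new_d0=15                     # 0x0F
new_parity=$(( stale_parity ^ d0 ^ new_d0 ))
d0=$new_d0

# Case 1: the component holding d1 fails; rebuild it from d0 and parity.
rebuilt_d1=$(( d0 ^ new_parity ))

# Case 2: the component holding the freshly written d0 fails instead.
rebuilt_d0=$(( d1 ^ new_parity ))

echo "d1 on disk: $d1   rebuilt: $rebuilt_d1"
echo "d0 on disk: $d0   rebuilt: $rebuilt_d0"
```

In both cases the rebuilt value differs from what was on disk, because the
error in the old parity is carried straight through into the new parity.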

> 	for dev in `iostat -x | awk '/^raid/ { print $1 }'`; do
> 		raidctl -P $dev
> 	done

Adding "yet another option" to raidctl, asking it to list all active RAID
devices (a la 'ifconfig'), would probably work here too...
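
Until such an option exists, a rough sketch of the same idea as a pair of sh
functions might look like the following. The function names and the RAIDCTL
override are hypothetical, and it assumes that "iostat -x" lists each
configured RAID set on a line beginning with its raidN name.

```shell
#!/bin/sh
# Hypothetical helper: read "iostat -x"-style output on stdin and print
# the first field of every line that starts with a raidN device name.
list_raid_devices() {
	awk '/^raid[0-9]/ { print $1 }'
}

# Hypothetical helper: run "raidctl -P" on every device found on stdin.
# RAIDCTL is an assumed override so the loop can be dry-run (RAIDCTL=echo).
rewrite_all_parity() {
	for dev in $(list_raid_devices); do
		${RAIDCTL:-raidctl} -P "$dev"
	done
}

# Usage on a live system:
#   iostat -x | rewrite_all_parity
```

Setting RAIDCTL=echo prints the commands instead of running them, which is a
cheap way to check what the loop would touch.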

Later...

Greg Oster