Subject: Re: Why my life is sucking. Part 2.
To: Luke Mewburn <lukem@wasabisystems.com>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 01/18/2001 08:05:34
Luke Mewburn writes:
> On Thu, Jan 18, 2001 at 10:34:29AM +0200, Alan Barrett wrote:
> > On Wed, 17 Jan 2001, Manuel Bouyer wrote:
> > > What I've done is to make /etc/rc.d/raidframe run after fsck (no problems
> > > as I only use autoconfig). This way the rebuild (run in background anyway)
> > > doesn't compete with fsck for disk access. This speeds things up a lot.
> >
> > We could split /etc/rc.d/raidframe into two halves, with the
> > configuration part ("raidctl -c") running before fsck, and the parity
> > reconstruction part ("raidctl -P") running in the background after
> > fsck.
>
> I think that this is a good idea;
I think we need to be very careful here...
There are some serious consequences to doing *ANYTHING* to data on a
RAID set before the parity rewrite completes. If the parity is
not correct for a given stripe, and you write real data to a portion
of that stripe, old (and *incorrect*) parity will be used to create the
new parity, and that new parity will also be *incorrect*. If a component
fails before that incorrect parity gets fixed, and that component was
a 'data' portion of that stripe, that data *WILL* be reconstructed
incorrectly. And it won't matter if fsck had just updated that data block -
the reconstruction will get it *wrong*. So even allowing fsck and the
parity rewrite to happen at the same time is stretching things a little...
Yes, it's a question of "what are the odds?", and "can you afford those odds?"
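
To make the failure mode concrete, here is a toy sketch of a single stripe.
It assumes three data blocks plus one XOR parity block, with single bytes
standing in for whole blocks. The values and the rebuild_d0 helper are made
up purely for illustration; this is not RAIDframe code, just the arithmetic.

```shell
#!/bin/sh
# Toy model of one stripe: three data "blocks" and one XOR parity block.
# All values and names here are hypothetical, chosen for illustration.

d0=17
d1=34
d2=68

# What the parity should be: the XOR of all data blocks in the stripe.
good_parity=$(( d0 ^ d1 ^ d2 ))     # 119

# What might be on disk after an unclean shutdown (arbitrary wrong value).
stale_parity=99

# rebuild_d0 <parity on disk before the write>
# 1. Write new data to d1 using read-modify-write:
#        new_parity = old_parity ^ old_data ^ new_data
# 2. Pretend the component holding d0 fails, and reconstruct d0 from
#    the parity and the surviving data blocks.
rebuild_d0() {
    parity=$1
    new_d1=200
    parity=$(( parity ^ d1 ^ new_d1 ))
    echo $(( parity ^ new_d1 ^ d2 ))
}

echo "d0 as written:               $d0"
echo "rebuilt, parity was correct: $(rebuild_d0 "$good_parity")"   # 17
echo "rebuilt, parity was stale:   $(rebuild_d0 "$stale_parity")"  # 5
```

The write to d1 carries the stale parity's error forward into the new parity,
so d0 comes back as 5 instead of 17, even though d0 itself was never written.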
> for dev in `iostat -x | awk '/^raid/ { print $1 }'`; do
>     raidctl -P $dev
> done
"Yet another option" to raidctl, asking it to return all active RAID devices
(a la 'ifconfig'), would probably work here too...

Later...

Greg Oster