Subject: Re: parity check with root on raid
To: None <netbsd-help@netbsd.org>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-help
Date: 04/21/2005 15:09:37
Jukka Salmi writes:
> Greg Oster --> netbsd-help (2005-04-21 08:55:31 -0600):
> [...]
> > > > Shouldn't parity be checked (and possibly be rewritten) before
> > > > filesystems are checked and mounted?
> > 
> > In theory, yes, but if you have a huge array that might take hours to
> > check, you probably don't want the array unavailable for that long.  The
> > time it takes to do the check is the time your data is "unprotected"
> > against a component failure, so whether you want to be "live" during
> > that time is the question...
> 
> I see. So what about a user-settable variable which determines whether
> to run 'raidctl -P' in the background and thus to continue booting, or to
> run it in the foreground and thus to wait until it returns?

That'd be fine by me...  but I'll leave that to the /etc/rc.d folks 
to figure out how to make it work :) 
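
A minimal sketch of the idea, not the actual /etc/rc.d/raidframeparity script. The variable name raid_parity_background and the RAIDCTL override are assumptions made for illustration; only "raidctl -P" comes from the thread.

```shell
#!/bin/sh
# Hypothetical sketch; not the actual /etc/rc.d/raidframeparity script.
# raid_parity_background is an assumed rc.conf-style variable (YES/NO).
# RAIDCTL exists only so the command can be replaced when testing.

: "${RAIDCTL:=raidctl}"
: "${raid_parity_background:=YES}"

# check_parity dev
#   Runs "raidctl -P dev", which checks parity and rewrites it if needed.
check_parity()
{
	case "$raid_parity_background" in
	[Yy][Ee][Ss])
		# Background: booting continues while parity is being checked.
		"$RAIDCTL" -P "$1" &
		;;
	*)
		# Foreground: wait until raidctl returns before continuing.
		"$RAIDCTL" -P "$1"
		;;
	esac
}
```

With raid_parity_background=NO, a call such as "check_parity raid0" would not return until the check finishes; with YES it returns immediately.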

> > Note that if you are doing a fsck at the same time as a parity 
> > check, they will be fighting against each other, and the fsck 
> > will take much longer than normal to complete.  If we ever get a 
> > filesystem that doesn't require a long fsck, then we'd certainly want 
> > to move the parity check to make it occur as early as possible.
> 
> Hmm, choosing when to run the parity check doesn't seem possible just
> by setting a variable... We'd probably need two different parity check
> rc.d scripts.

There might be some fancy way to do it with a REQUIRE: option?  I'm not 
sure.
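
One possible shape, as a sketch only: rcorder(8) orders scripts by the PROVIDE, REQUIRE and BEFORE comment lines at the top of each script, so two variants could differ in just those lines. The script names raidparity_early and raidparity_late are made up, and the sketch assumes /etc/rc.d/fsck provides "fsck".

```shell
# /etc/rc.d/raidparity_early (hypothetical): parity check ordered before fsck
# PROVIDE: raidparity_early
# BEFORE:  fsck

# /etc/rc.d/raidparity_late (hypothetical): parity check ordered after fsck
# PROVIDE: raidparity_late
# REQUIRE: fsck
```

Each script would then presumably be switched on or off by its own rc.conf variable, so only one of the two runs.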

> > One fsck doesn't always find all the problems.  If you have a really 
> > nasty crash, multiple "fsck -f"'s might be needed before fsck doesn't 
> > find any further errors.  Just because you did one fsck doesn't mean 
> > it fixed all the problems!  (I wouldn't go blaming RAIDframe if you 
> > just did a single fsck after a nasty crash.)
> 
> Shouldn't /etc/rc.d/fsck handle this? That is, to run fsck up to
> $fsck_max_runs times unless it succeeds?

I think the way fsck runs from /etc/rc.d is that if it encounters
certain serious errors, it punts and forces an admin to run fsck 
by hand.  It's getting the admin to run fsck a couple of times that
could be the hard part...  
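
A rough sketch of the "run fsck up to $fsck_max_runs times" idea, not what /etc/rc.d/fsck actually does. fsck_max_runs is the name proposed in the thread; the FSCK override and the rule "any non-zero exit status means run again" are assumptions. A real script would also have to decide how fsck's questions get answered.

```shell
#!/bin/sh
# Hypothetical sketch; not how /etc/rc.d/fsck behaves.
# FSCK exists only so the command can be replaced when testing.

: "${FSCK:=fsck}"
: "${fsck_max_runs:=3}"

# fsck_until_clean dev
#   Re-runs "fsck -f dev" until it exits 0 or the run limit is reached.
#   Returns 0 if a pass exited 0, 1 if every pass exited non-zero.
fsck_until_clean()
{
	run=1
	while [ "$run" -le "$fsck_max_runs" ]; do
		if "$FSCK" -f "$1"; then
			return 0
		fi
		run=$((run + 1))
	done
	return 1
}
```

A return value of 1 would be the point where the script gives up and leaves the filesystem to the admin.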

Later...

Greg Oster