Subject: RaidFrame enhancements (was Re: Why my life is sucking. Part 2.)
To: Greg Oster <oster@cs.usask.ca>
From: Robert Elz <kre@munnari.OZ.AU>
List: current-users
Date: 01/26/2001 23:49:22
    Date:        Thu, 18 Jan 2001 19:24:30 -0600
    From:        Greg Oster <oster@cs.usask.ca>
    Message-ID:  <200101190124.TAA19029@cs.usask.ca>

I have been catching up on almost a month's worth of NetBSD mail.

This discussion seemed to just peter out about a week ago with ...

  | I need to think about this more... 

I suspect by now that you've all worked out that you're all right,
there was never any real contradiction here.

Greg is 100% right - it would be great to have the raidframe code
keep better track of which parts of a raid device have correct parity,
and what parts need reconstructing after a crash, and to try and
minimise the latter set.   If that can be done, it would vastly
speed up reconstruction and shrink the truly vulnerable window
a lot, which would be great.

But Thor, Manuel, Bill (probably others) were also right: it
needs to be possible for the system to reboot and start working
as soon as possible.   Raid (well, raid > 0) is
typically more likely to be used on servers that need to be operational
all the time (that operational need is what justifies the extra
expense to afford the extra space to hold all this parity info).

Those systems in particular need to be up and running again as
quickly as can possibly be achieved.   And that most likely means
running while the raidframe parity is still being reconstructed.
While minimising the time to do the reconstruction is a great aim,
there will be times when reconstructing the whole device will be
essential (like after a component has failed and been replaced).
Preventing the system from doing any real work while
that is going on makes raid much less attractive.

So, I think that both enhancements are needed.

But I wouldn't put a lot of effort into optimising the
second one - it shouldn't normally be run often.   Just a single
system global ("raid is reconstructing") or perhaps better, and
probably just as easy, one per raid device (I mean, raid0, raid1...
not the components) ("this raid is reconstructing").   When that is
clear, operations proceed just like now.   When it is set, then
do however much work is needed to figure out if the current slice
needs its parity fixed before proceeding with the current operation,
and fix it if so.   This shouldn't be happening often enough in the
grand scheme to be optimised a lot - as long as the system can
proceed, and be used, it doesn't matter if performance is degraded
just a little (and no, it doesn't even matter if that window of
vulnerability is extended by a few minutes - keeping the server
running is far more important).    Of course, if the other optimisation
is done as well, there will usually not be too many slices to
actually reconstruct before the flag can be turned off, which would
be a good thing to have as well.
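
To make the per-device flag idea a bit more concrete, here is a rough
sketch in C.   None of this is raidframe code: the struct, the function
names, NSLICES and the per-slice parity_ok[] array are all made up for
illustration, and the "work needed to figure out if the current slice
needs its parity fixed" is assumed to be a simple table lookup, which
the real thing may well not be.

```c
#include <stdbool.h>
#include <stddef.h>

#define NSLICES 8   /* hypothetical: a tiny device, just for illustration */

struct raid_dev {                 /* one per raid0, raid1, ... (not per component) */
    bool   reconstructing;        /* "this raid is reconstructing" */
    bool   parity_ok[NSLICES];    /* assumed per-slice parity state */
    size_t pending;               /* slices whose parity still needs fixing */
};

/* After an unclean shutdown: trust no slice, and set the flag. */
static void raid_mark_unclean(struct raid_dev *rd)
{
    for (size_t s = 0; s < NSLICES; s++)
        rd->parity_ok[s] = false;
    rd->pending = NSLICES;
    rd->reconstructing = true;
}

/* Stand-in for rewriting one slice's parity; clears the flag when done. */
static void raid_fix_parity(struct raid_dev *rd, size_t slice)
{
    rd->parity_ok[slice] = true;
    if (--rd->pending == 0)
        rd->reconstructing = false;
}

/*
 * One I/O operation on a slice.  When the flag is clear this is the
 * normal path with a single extra test.  When it is set, the slice's
 * parity is fixed first if needed.  Returns 1 if parity was fixed, else 0.
 */
static int raid_io(struct raid_dev *rd, size_t slice)
{
    int fixed = 0;

    if (rd->reconstructing && !rd->parity_ok[slice]) {
        raid_fix_parity(rd, slice);
        fixed = 1;
    }
    /* ... the real read or write would go here ... */
    return fixed;
}
```

A background pass could call the same raid_fix_parity() on whatever
slices normal I/O hasn't touched yet, so the flag eventually clears
even on an idle device.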

kre