Subject: Re: raidframe copyback blocks the whole system !?
To: None <kilbi@rad.rwth-aachen.de>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 06/24/2002 08:42:09

Markus W Kilbinger writes:
> >>>>> "Greg" == Greg Oster <oster@cs.usask.ca> writes:
>
> >> After simulating (raidctl -F) a disk failure of a 4 disk raid5
> >> (3 + 1 spare), the spare disk got in 'used_spare' state and the
> >> whole raid was still usable under the 'reconstructing' phase.
> >> Fine!
> >>
> >> But what's the smooth way (== the system is still usable during
> >> that time) back? 'raidctl -B' did the job, but blocks the whole
> >> system for the complete copyback time!
>
> Greg> It blocks the filesystem(s) on the RAID set... Unfortunately,
> Greg> this is a limitation of the copyback code.
>
> On the positive side, copyback is about twice as fast as
> reconstruction...
>
> >> Trying it with 'raidctl -R ...', while the spare is in use,
> >> reconstructed the formerly failed disk smoothly, but left the
> >> spare disk in 'used_spare' state.
>
> Greg> Hmm!! It really shouldn't be letting you do a 'raidctl -R'
> Greg> after you've already reconstructed to a spare... smells like
> Greg> a bug..
>
> A reboot 'solved' this problem, but that's not the clean way for a
> raid system, anyway! ;-)

Well... if you have hot-swap drives, then you could just stuff the new drive in
place of the old one, and not worry about doing a copyback.. (i.e. the new
drive you put in becomes the hot spare). If you don't have hot-swap drives,
then you'll have to take things off-line anyway, and at that point you can
just shuffle the disks around :)
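
For the hot-swap case, a minimal sketch of what that might look like. The RAID set name raid0 and the replacement disk /dev/sd4e are made-up names, so substitute your own, and check raidctl(8) on your system before trusting the flags. By default the script only prints the commands; set DRY_RUN=no to actually run them.

```shell
#!/bin/sh
# Sketch only: raid0 and /dev/sd4e are hypothetical names.
# Commands are printed rather than executed unless DRY_RUN=no.
run() {
    if [ "${DRY_RUN:-yes}" = "no" ]; then
        "$@"
    else
        echo "$@"
    fi
}

add_hot_spare() {
    raid="$1"
    spare="$2"
    run raidctl -a "$spare" "$raid"   # register the new drive as a hot spare
    run raidctl -s "$raid"            # show component and spare status
}

add_hot_spare raid0 /dev/sd4e
```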
> -> send-pr?

If you'd like. There may already be a PR, but I don't recall for sure.. I do
know that this issue is on my RAIDframe "todo" list.. (unfortunately, fixing
it is going to require a complete rewrite of the copyback code.. :( )

> So, copyback is the only (clean) way back from a used spare to the
> normal raid disk on a running machine?

Technically, yes. However: the used spare should be just as good as a
normal raid disk (unless you're using a slower disk or something.)
If the used spare is exactly the same as the other drives in the array,
I wouldn't even worry about doing the copyback -- at some point when you
reboot, the used spare will be pulled into the array as a normal component
(assuming you're using the autoconfig stuff). And in the meantime, the
spare disk should function just as well as a normal component.
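
If you go the "skip the copyback" route, a rough sketch of the checks might be the following. Again, raid0 is a made-up RAID set name, and the flags are as I understand them from raidctl(8), so verify against your man page. The script prints the commands unless DRY_RUN=no is set.

```shell
#!/bin/sh
# Sketch only: raid0 is a hypothetical RAID set name.
# Commands are printed rather than executed unless DRY_RUN=no.
run() {
    if [ "${DRY_RUN:-yes}" = "no" ]; then
        "$@"
    else
        echo "$@"
    fi
}

keep_used_spare() {
    raid="$1"
    run raidctl -s "$raid"       # confirm the spare shows up as used_spare
    run raidctl -A yes "$raid"   # mark the set as auto-configurable
}

keep_used_spare raid0
```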
Later...
Greg Oster