Subject: Re: raidframe copyback blocks the whole system !?
To: None <kilbi@rad.rwth-aachen.de>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 06/24/2002 08:42:09
Markus W Kilbinger writes:
> >>>>> "Greg" == Greg Oster <oster@cs.usask.ca> writes:
> 
>     >> After simulating (raidctl -F) a disk failure of a 4 disk raid5
>     >> (3 + 1 spare), the spare disk got in 'used_spare' state and the
>     >> whole raid was still usable under the 'reconstructing' phase.
>     >> Fine!
>     >> 
>     >> But what's the smooth way (== the system is still usable during
>     >> that time) back? 'raidctl -B' did the job, but blocks the whole
>     >> system for the complete copyback time!
> 
>     Greg> It blocks the filesystem(s) on the RAID set... Unfortunately,
>     Greg> this is a limitation of the copyback code.
> 
> On the positive side, copyback is about twice as fast as
> reconstruction...
> 
>     >> Trying it with 'raidctl -R ...', while the spare is in use,
>     >> reconstructed the formerly failed disk smoothly, but left the
>     >> spare disk in 'used_spare' state.
> 
>     Greg> Hmm!! It really shouldn't be letting you do a 'raidctl -R'
>     Greg> after you've already reconstructed to a spare... smells like
>     Greg> a bug..
> 
> A reboot 'solved' this problem, but that's not the clean way for a
> raid system, anyway! ;-)

Well... if you have hot-swap drives, then you could just stuff the new drive in 
place of the old one, and not worry about doing a copyback.. (i.e. the new 
drive you put in becomes the hot spare).  If you don't have hot-swap drives, 
then you'll have to take things off-line anyway, and at that point you can 
just shuffle the disks around :)

> -> send-pr?

If you'd like.  There may already be a PR, but I don't recall for sure.. I do 
know that this issue is on my RAIDframe "todo" list..  (unfortunately, fixing 
it is going to require a complete rewrite of the copyback code.. :( )
 
> So, copyback is the only (clean) way back from a used spare to the
> normal raid disk on a running machine?

Technically, yes.  However: the used spare should be just as good as a 
normal raid disk (unless you're using a slower disk or something.)  
If the used spare is exactly the same as the other drives in the array, 
I wouldn't even worry about doing the copyback -- at some point when you 
reboot, the used spare will be pulled into the array as a normal component 
(assuming you're using the autoconfig stuff).  And in the meantime, the 
spare disk should function just as well as a normal component.
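
For reference, a minimal sketch of checking the spare's state and turning on 
autoconfiguration.  The device name "raid0" is an assumption (the set is never 
named in this thread), and the DRYRUN wrapper is purely illustrative: by 
default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: "raid0" is an assumed device name, not taken from the thread.
# With DRYRUN=1 (the default here) commands are printed, not executed.
RAID="${RAID:-raid0}"
DRYRUN="${DRYRUN:-1}"

run() {
    if [ "$DRYRUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show component and spare states (look for 'used_spare').
run raidctl -s "$RAID"

# Mark the set as auto-configurable so it is assembled at boot.
run raidctl -A yes "$RAID"
```

Running it with DRYRUN=0 (as root) would actually execute the two raidctl 
commands against the set.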

Later...

Greg Oster