Subject: Re: raidframe copyback blocks the whole system !?
To: Greg Oster <oster@cs.usask.ca>
From: Markus W Kilbinger <kilbi@rad.rwth-aachen.de>
List: current-users
Date: 06/25/2002 10:53:18
>>>>> "Greg" == Greg Oster <oster@cs.usask.ca> writes:
Greg> It blocks the filesystem(s) on the RAID set... Unfortunately,
Greg> this is a limitation of the copyback code.
Good/important to know!
>> >> Trying it with 'raidctl -R ...', while the spare is in use,
>> >> reconstructed the formerly failed disk smoothly, but left
>> >> the spare disk in 'used_spare' state.
>>
Greg> Hmm!! It really shouldn't be letting you do a 'raidctl -R'
Greg> after you've already reconstructed to a spare... smells like
Greg> a bug..
>>
>> A reboot 'solved' this problem, but that's not the clean way for a
>> raid system, anyway! ;-)
Greg> Well... if you have hot-swap drives, then you could just
Greg> stuff the new drive in place of the old one, and not worry
Greg> about doing a copyback.. (i.e. the new drive you put in
Greg> becomes the hot spare).
Hmm, how would that work!? The old/new drive still has state 'failed',
and the spare is 'used_spare'. So, what will happen if another drive
fails at this stage? How will the 'failed' drive become the new spare
_without_ a reboot?
Greg> If you don't have hot-swap drives, then you'll have to take
Greg> things off-line anyway, and at that point you can just
Greg> shuffle the disks around :)
Yeah, but that's what I want to avoid in my (academic) scenario... ;-)
-> send-pr?
Greg> If you'd like. There may already be a PR, but I don't recall
Greg> for sure.. I do know that this issue is on my RAIDframe
Greg> "todo" list..
Ok, that's sufficient! ;-)
>> So, copyback is the only (clean) way back from a used spare to
>> the normal raid disk on a running machine?
Greg> Technically, yes. However: the used spare should be just as
Greg> good as a normal raid disk (unless you're using a slower
Greg> disk or something.) If the used spare is exactly the same as
Greg> the other drives in the array, I wouldn't even worry about
Greg> doing the copyback -- at some point when you reboot, the
Greg> used spare will be pulled into the array as a normal
Greg> component (assuming you're using the autoconfig stuff). And
Greg> in the meantime, the spare disk should function just as well
Greg> as a normal component.
My thoughts were about the steps after 'used_spare' without rebooting
the machine (== hot swap drives). Copyback seems to be the only way to
accomplish that, then. I was just looking for a similarly smooth
(non-blocking) procedure for the steps during and after the required
drive swap, without rebooting.
So, it's only a very small issue, because the first disk failure
should already have warned you about the disk problem, leaving you
time to plan the next steps.
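
For reference, the copyback path discussed above could be sketched
roughly as follows. This is only a dry-run sketch: it prints the
commands instead of running them. 'raid0' is a hypothetical RAID set
name, and the -s, -B and -S flags are as I understand them from
raidctl(8), so please check the man page of your release first.

```shell
#!/bin/sh
# Dry-run sketch: prints the raidctl commands, does NOT execute them.
# "raid0" is a hypothetical RAID set name; adjust for your system.

copyback_plan() {
    raid_dev="$1"
    # 1. Show the component and spare states ('failed', 'used_spare', ...).
    echo "raidctl -s $raid_dev"
    # 2. Copy the reconstructed data from the used spare back to the
    #    original (replaced) component. Per the discussion above, the
    #    filesystem(s) on the RAID set block while this runs.
    echo "raidctl -B $raid_dev"
    # 3. Check the progress of reconstruction/copyback.
    echo "raidctl -S $raid_dev"
}

copyback_plan raid0
```

Printing the commands first makes it easy to review the sequence before
running anything against a live RAID set.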
Markus.