Re: Question about raidframe use
Brian Buhrow writes:
> Hello. I've been a raidframe user for a long time, but I've recently
> come across a situation that I'm not sure I know how to get out of
> gracefully without rebooting. I'm wondering if anyone can tell me of
> another way to do what I want.
> The scenario looks like this:
> 1. I have a raidframe set up, the type of raidframe is not important for
> purposes of this problem.
> 2. One component goes bad, so I use
> raidframe -a /dev/newdisk
I suspect you mean 'raidctl' here...
> raidframe -F badcomponent
> to get the spare component online.
> 3. Once the spare component is reconstructed, it stays listed in the
> raidframe -s output as a used_spare. It stays that way until the server is rebooted and
> the autoconfig pulls it in as a full fledged component.
> 4. Before I reboot, but after I've reconstructed to the spare, I find I
> need to delete the spare and reconfigure the underlying drive before
> adding it as a spare again.
> raidframe -f /dev/newdisk
> doesn't work because it says that /dev/newdisk is not a component of the
> raid set.
> raidframe -f /dev/badcomponent doesn't work either, for the same reason.
Hmm.. I'm not sure what the goal of this step is... I do know that it
won't let you do it, because /dev/newdisk won't be (currently) marked
as "optimal" (instead, it's a "used_spare"). The issue is that the
code can't handle remapping a failed spare to another spare, and so
hand-failing a spare is disallowed...
> So, the questions are:
> 1. Can the raidframe code be changed to promote used_spares to full
> components once the reconstruction is complete? (I realize this blurs the
> line between spares and full components, but right now, there is no
> auto-reconstruction mechanism, and there doesn't seem to be a way of
> failing a used_spare.)
It can be changed... it's just a not-so-simple matter of programming
that I've not gotten to yet... (It basically requires re-writing a
whole mess of code related to how the spares are handled...)
> 2. Failing that, can the raiddlookup code be changed to permit the manual
> failing of used_spares?
I think I looked at doing that one time and discovered that things
weren't set up to handle that case... (e.g. say you fail component '2'
and rebuild to spare '5'. Then you fail spare '5' and rebuild to
spare '6'. The issue is that RAIDframe knows how to map from '5' back
to '2', but it doesn't know how to map from '6' to '5', and then from '5'
to '2' (or any other sort of transitive steps))
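To make the transitive-mapping problem concrete, here is a toy sketch in Python. It is not RAIDframe code and does not reflect the driver's actual data structures; the names `replaces`, `resolve_one_hop`, and `resolve_chain` are hypothetical, chosen only for illustration. It models the '2' -> '5' -> '6' example above: a single-level lookup gets from '5' back to '2', but only gets from '6' back to '5', whereas a lookup that follows the whole chain gets from '6' to '2'.

```python
# Toy model only: NOT RAIDframe's real data structures or code.
# replaces[x] == y means "spare x was rebuilt in place of component y".
replaces = {5: 2, 6: 5}


def resolve_one_hop(component):
    """Single-level lookup: steps back from a spare exactly once."""
    return replaces.get(component, component)


def resolve_chain(component):
    """Follow the mapping until an original (non-spare) component is reached."""
    seen = set()
    while component in replaces:
        if component in seen:
            # Guard against a corrupt mapping that loops back on itself.
            raise ValueError("cycle in spare mapping")
        seen.add(component)
        component = replaces[component]
    return component
```

With this model, `resolve_one_hop(6)` stops at 5, which is itself a failed spare, while `resolve_chain(6)` continues on to 2. Supporting an endless cycle of sparing and failing would need something like the second behaviour (or promoting a rebuilt spare to a full component so that no chain builds up at all).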
> I like option 1 better, since it implies that you could go through an
> endless cycle of sparing and failing disks without rebooting and always end
> up with the ability to manipulate components that are full components,
> rather than used_spares.
> Am I completely missing something here?
Nope :-} Fixing this one has been on my TODO list for ages, but last
I looked there were some fairly invasive changes required... And I
agree that being able to do an endless cycle of sparing and failing
would be ideal... (I'll have another look at what is involved in
changing this... If I recall, some things got a lot simpler, at the
expense of no longer being able to do a 'copyback' (which I don't
think anyone uses anyway!!))