tech-kern archive


re: raidctl -B syntax



	Hello.  I use raid1 and raid5 every day with raidframe and I've encountered 
a variety of errors and recovery schemes, some of which I've written about on
the tech-kern list over the years.
	To address Edgar's question, I hope, here's what I know:

Take a raid1 set with two components, A and B, as Edgar lays out.

1.  If B fails, you have the following:
A: optimal
B: failed

2.  Add C as a hot spare and reconstruct to it:
Now you have: 
A: optimal
B: failed
C: used_spare

Here are the questions I think Edgar is asking:

O  What happens if an i/o error occurs on B during reconstruction?  Answer:
This can't happen because the contents of B are not used during
reconstruction.  B is what failed, so it is considered dead.

O  What happens if I reboot after reconstruction is complete and the
originally "failed" B comes back to life?
Answer:  It falls out of the raid set entirely and you'll end up with:
A: optimal
C: optimal
The reason is that when reconstruction occurs, the mod counter is
incremented, and the "failed" B will not have been updated, so it will be
ignored during the autoconfiguration process.  However, further scanning
should come across C, which will be up to date, and it will be taken as the
optimal second member of the raid1 set.
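
To make the mod counter argument concrete, here is a small Python sketch of
the selection step.  It is only a toy model of the behavior described above,
not RAIDframe's actual code: the Component class, the select_members function
and the counter values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    mod_counter: int  # hypothetical stand-in for the counter in the component label


def select_members(found, width):
    """Toy model: keep components whose counter matches the highest one seen."""
    newest = max(c.mod_counter for c in found)
    current = [c for c in found if c.mod_counter == newest]
    return current[:width]


# B failed before reconstruction, so its counter was never bumped (assumed values).
found = [Component("A", 8), Component("B", 7), Component("C", 8)]
members = [c.name for c in select_members(found, width=2)]
print(members)
```

In this model the stale B is skipped and A and C fill the two slots, which
matches the A: optimal / C: optimal result after the reboot.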

	I've been using raidframe for as long as it's been in the NetBSD tree
and I've never used the -B flag to perform a copyback from a spare to a
replaced original.  Instead, what I do is add the "replaced" component as an
additional hot spare and then use the -F flag to fail the used_spare back
to the "new" original disk.  The reason for this is that I discovered, and
I believe Greg confirmed, that raidframe can't actually promote a component from
spared status to optimal status.  (I believe this is in the tech-kern
archives.)  So, what you have to do is reboot or, at the very least,
unconfigure the raid set and then reconfigure it with the replaced disk as
if it were an original member of the set.  You can do this with the
configuration file, or if autoconfig is enabled for the raid set, when the
raid set comes back on line, the used spare component will show up as if it were
always a full-fledged member of the raid set.  Because of this state of
affairs, I don't think you can actually use the -B flag with the
expectation that it will do what you want, which is to bring your original raid
set component back on-line as an optimal member of the raid set.  This is
actually a problem in situations where you're running on fully hot swappable
hardware, because it means you still have to take the raid set off-line at
some point to turn your used spares back into optimal members of the set.
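
For reference, the fail-back procedure described above can be laid out as a
dry-run command list.  This Python sketch only prints the commands and does
not run them.  The flags (-a, -F, -S, -u, -c) are the ones documented in
raidctl(8); the device names raid0, /dev/wd1a and /dev/wd2a and the config
file path are assumptions chosen for illustration, and failback_plan is a
hypothetical helper.

```python
import shlex


def failback_plan(raid, new_disk, used_spare, config):
    """Return the raidctl invocations for the fail-back procedure (dry run only)."""
    return [
        ["raidctl", "-a", new_disk, raid],    # add the replaced disk as a hot spare
        ["raidctl", "-F", used_spare, raid],  # fail the used spare, reconstruct to the new spare
        ["raidctl", "-S", raid],              # check reconstruction progress
        ["raidctl", "-u", raid],              # unconfigure the set (it must be off-line)
        ["raidctl", "-c", config, raid],      # reconfigure with the new disk as an original member
    ]


# Assumed names: adjust to the component names that raidctl -s reports.
plan = failback_plan("raid0", "/dev/wd1a", "/dev/wd2a", "/etc/raid0.conf")
for argv in plan:
    print(shlex.join(argv))
```

The last two steps are the off-line window mentioned above; with autoconfig
enabled, a reboot takes their place.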

	I should note that I'm still running NetBSD-5 with all of my raid
sets, but I don't see any commits to lead me to believe that any of this
information has changed in newer versions of the OS.

Hope this helps.
-Brian

On Dec 21, 12:25am, matthew green wrote:
} Subject: re: raidctl -B syntax
} > I am confident that an I/O error during reconstruction will result in the
} > reconstruction failing.
} 
} it does for RAID1.  i've not used RAID5 for years.
} 
} i had a disk failure, followed by the otherside giving read
} errors while reconstructing.  my rebuild failed sort of
} appropriately (it would be nice if the re-rebuild would know
} where to restart from.)
} 
} (i managed to recover the failed blocks from the 2nd disk
} from the 1st one.  at least they managed to fail in different
} regions of the disk, obviating the need for more annoying
} methods of restore :-)
} 
} 
} .mrg.
>-- End of excerpt from matthew green



