Subject: Re: Bad sectors vs RAIDframe
To: Stephen Borrill <netbsd@precedence.co.uk>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-kern
Date: 06/06/2005 12:22:50
On Mon, Jun 06, 2005 at 01:35:14PM +0100, Stephen Borrill wrote:
> On Wed, 4 May 2005, Stephen Borrill wrote:
> 
> Just had another 2 of these drives fail in the same way at a different 
> site, thus killing the RAID array completely (what is it with Maxtor 
> 6Y080M0 drives?). With a regular FFS partition, some data recovery is 

Most IDE drives only spare out sectors on *write*.  (One must ask: what,
exactly, could they do to avoid presenting an error on read?  And note
that you don't want to simply *retry* the read at the RAIDframe level,
since if the drive *did* spare the sector after a failed read, it would
contain all zeroes!)

Luckily, sparing on write gives you an easy way to recover this drive:
add it back to the set (as a spare if necessary) and tell RAIDframe to
rebuild onto it.  The rebuild rewrites every sector, so the drive gets
its chance to spare out the bad ones, and away you go.  But with two bad
drives in a parity RAID set you may have more work to do. :-/

We got a bad run of Samsung Spinpoint drives that we unfortunately
installed in NetBSD Foundation servers about a year ago.  I have had
to recover several of them (all in 2-way RAIDframe mirrors) by using
dd to copy the data from the corresponding sectors on one drive over
the bad sectors of the other, often doing this in both directions to
recover from multi-drive failures within a set.  Since then, RAIDframe
has been changed so that it retries on disk error before failing a
component, and never fails components from non-redundant sets -- so a
newer kernel may let you get somewhere with data recovery, too.
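
For illustration only, here is a minimal Python sketch of that dd-style
copy between the two halves of a mirror.  It is not what I actually ran
(that was plain dd); the function name, the 512-byte sector size, and the
assumption that both components share an identical layout (so the same
sector number means the same data on each) are all mine.  It has only
been reasoned about against ordinary image files, not raw devices.

```python
SECTOR_SIZE = 512  # assumption: verify the real sector size of the drive


def copy_sectors(src_path, dst_path, first_sector, count,
                 sector_size=SECTOR_SIZE):
    """Copy `count` sectors from src to the SAME offset on dst.

    Hypothetical helper: assumes both components have an identical
    layout, as the two halves of a 2-way mirror normally do.
    """
    offset = first_sector * sector_size
    length = count * sector_size
    # Unbuffered I/O so each transfer is one sector-aligned request.
    with open(src_path, "rb", buffering=0) as src, \
            open(dst_path, "r+b", buffering=0) as dst:
        src.seek(offset)
        data = src.read(length)
        if data is None or len(data) != length:
            raise OSError("short read from source component")
        dst.seek(offset)
        # Writing is what gives the drive a chance to spare the sector.
        written = dst.write(data)
        if written != length:
            raise OSError("short write to destination component")
    return written
```

Calling copy_sectors(good, bad, n, 1) overwrites sector n of the bad
component with the good copy; swap the arguments to go the other way
for sectors that failed on the opposite drive.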

What RAIDframe should probably do, in an ideal world, is reconstruct
the contents of a read-errored sector from the data in the other
components, then immediately write it back, forcing good data into a
replacement sector on the failing disk.  But starting such write-backs
from the context of a read failure in RAIDframe may be... ugly.
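
To make the idea concrete, here is a rough Python sketch of the
reconstruction step for a single-parity set, where the missing sector
is the XOR of the corresponding sectors on every surviving component
(data and parity alike).  This is not RAIDframe code, and the function
name is made up; with one surviving component, as in a 2-way mirror,
it degenerates to a plain copy.

```python
def reconstruct_sector(surviving):
    """Rebuild one lost sector from the same-stripe sectors of all
    surviving components (hypothetical helper, single parity only)."""
    if not surviving:
        raise ValueError("need at least one surviving component")
    size = len(surviving[0])
    rebuilt = bytearray(size)
    for sector in surviving:
        if len(sector) != size:
            raise ValueError("sectors must all be the same size")
        # XOR each surviving sector into the accumulator, byte by byte.
        for i, byte in enumerate(sector):
            rebuilt[i] ^= byte
    return bytes(rebuilt)
```

The result would then be written straight back to the failing disk at
the errored sector's address, which is the write that should force the
drive to remap it.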

-- 
 Thor Lancelot Simon	                                      tls@rek.tjls.com

"The inconsistency is startling, though admittedly, if consistency is to be
 abandoned or transcended, there is no problem."		- Noam Chomsky