Subject: Re: CVS commit: src/sys/dev/ata
To: Daniel Carosone <dan@geek.com.au>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: source-changes
Date: 06/02/2004 12:42:16
On Wed, Jun 02, 2004 at 08:28:00PM +1000, Daniel Carosone wrote:
> 
> > XXX WTF is the point of this shit, anyway?  In most cases, the way you're
> > supposed to fix a bad block on an ATA disk is to rewrite it -- which will
> > either just transparently fix it, or spare it.  This code actively prevents
> > that.
> 
> The point, originally, was to prevent systems and drives grinding
> themselves into the ground on a read error - potentially causing more
> damage to the drive in some cases, and in any case taking a long time
> to never fix the error.  Machines could become unusable, even for the
> administrator trying to fix the problem.
> 
> Perhaps it's a little ill-conceived: instead of being a hard list of
> "inaccessible" blocks, perhaps it should be more like a negative cache
> for readable blocks -- but still allow the blocks to be written and
> potentially fix/remap?

I think it's just the read test which is misplaced: if it were in
wdstrategy() instead of wddone(), we could still write to the bad block.
Or, maybe better: in the list of bad blocks, also record whether the error
was for a read or a write. If we got an error on read, allow write operations
to this block.  If we got an error on write, return EIO for both read and write.
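
Something like this, as a rough userland sketch of the second idea. All the
names here (badblk_list, badblk_record(), badblk_check(), BADBLK_MAX) are
made up for illustration; this is not the actual wd(4) code, and a real
driver would need locking and would probably size the list differently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names; not taken from the real driver. */
enum badblk_kind { BADBLK_READ, BADBLK_WRITE };

struct badblk {
	uint64_t blkno;		/* block that failed */
	enum badblk_kind kind;	/* which kind of operation failed */
};

#define BADBLK_MAX 16		/* arbitrary limit for the sketch */

struct badblk_list {
	struct badblk ent[BADBLK_MAX];
	size_t n;
};

static struct badblk *
badblk_find(struct badblk_list *l, uint64_t blkno)
{
	for (size_t i = 0; i < l->n; i++)
		if (l->ent[i].blkno == blkno)
			return &l->ent[i];
	return NULL;
}

/*
 * Record a failed transfer (what the completion path would do).
 * A write error overrides an earlier read error on the same block.
 */
static int
badblk_record(struct badblk_list *l, uint64_t blkno, bool is_write)
{
	struct badblk *b = badblk_find(l, blkno);

	if (b == NULL) {
		if (l->n == BADBLK_MAX)
			return ENOSPC;
		b = &l->ent[l->n++];
		b->blkno = blkno;
		b->kind = BADBLK_READ;
	}
	if (is_write)
		b->kind = BADBLK_WRITE;
	return 0;
}

/*
 * Check before queueing a transfer (what the strategy path would do).
 * Returns 0 to let the I/O through, EIO to fail it early.
 */
static int
badblk_check(struct badblk_list *l, uint64_t blkno, bool is_write)
{
	struct badblk *b = badblk_find(l, blkno);

	if (b == NULL)
		return 0;		/* never failed: go ahead */
	if (b->kind == BADBLK_READ && is_write)
		return 0;		/* read-bad only: let the write try */
	return EIO;			/* read of a bad block, or write-bad */
}
```

With that, a block that failed on read still gets EIO for further reads but
a write to it is passed down to the drive, which then has a chance to fix or
spare it. A block that failed on write gets EIO for both.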

> 
> Administrator intervention is usually required when a drive has
> reached the point where this is being triggered, anyway.
> 
> FWIW, I've found a number of drives that don't seem to remap bad
> blocks while write-cache is on.  I originally suspected bad drive
> firmware, but I've now confirmed this behaviour across a range of
> vendors.  I now wonder whether we're resetting the command/drive on
> errors because of too short timeouts, and it never has a chance to
> complete the process except where the writes are synchronous?

It may be worse than that. I suspect that when the write cache is on,
write errors are not reported (IDE doesn't have the delayed error reporting
SCSI has). Once all the spare sectors have been allocated, you can't remap
new bad blocks any more, but you don't get an error when writing.

> 
> I have a little pattern-test script that uses a random-key cgd to dd
> encrypted-zero's (ie, "random" patterns) over a disk and cmp the
> decrypted zero's afterwards, then re-key and repeat endlessly.  After
> a cycle or two with write cache off, every "failing" disk I've done
> this to bar one has recovered and tested clean, and that one disk was
> very ill indeed. (I don't trust those disks, but I have a lot of
> /scratch space as a result)

Hmm, this would mean that data can get corrupted on write when the cache is
on, even if there are spare sectors free, right?
Even if IDE is crap in the first place, I can't see a reason for such behavior.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference