Subject: Re: CVS commit: src/sys/dev/ata
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Daniel Carosone <dan@geek.com.au>
List: source-changes
Date: 06/02/2004 20:51:51

On Wed, Jun 02, 2004 at 12:42:16PM +0200, Manuel Bouyer wrote:
> On Wed, Jun 02, 2004 at 08:28:00PM +1000, Daniel Carosone wrote:
> > Perhaps it's a little ill-conceived: instead of being a hard list of
> > "inaccessible" blocks, perhaps it should be more like a negative cache
> > for readable blocks -- but still allow the blocks to be written and
> > potentially fix/remap?
>
> Or, maybe better: in the list of bad blocks, also record whether the error
> was for read or write. If we got an error for read, allow write operations
> to this block.  If we got an error for write, return EIO for both read and
> write.

Sounds sensible to me.
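
A minimal sketch of that rule in C, to make the proposal concrete. All the
names here (bb_entry, bb_dir, bb_check) are hypothetical and are not the
structures the driver actually uses; the list is a plain array only for
illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; not the real driver's bad-block bookkeeping. */
enum bb_dir { BB_READ_ERR, BB_WRITE_ERR };

struct bb_entry {
	uint64_t blkno;		/* block that returned an error */
	enum bb_dir dir;	/* direction of the transfer that failed */
};

/*
 * Return 0 if the request should be passed to the drive, or EIO if it
 * should be failed straight away from the bad-block list.
 */
int
bb_check(const struct bb_entry *list, size_t n, uint64_t blkno, bool is_write)
{
	for (size_t i = 0; i < n; i++) {
		if (list[i].blkno != blkno)
			continue;
		if (list[i].dir == BB_WRITE_ERR)
			return EIO;	/* a write failed here: refuse both */
		if (!is_write)
			return EIO;	/* a read failed here: refuse reads */
		/* Read-error entry and this is a write: let it through,
		 * so the drive gets a chance to fix or remap the block. */
	}
	return 0;
}
```

Whether a successful write should then drop the read-error entry is left
open here; the sketch only covers the accept/reject decision.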

> > FWIW, I've found a number of drives that don't seem to remap bad
> > blocks while write-cache is on.  I originally suspected bad drive
> > firmware, but I've now confirmed this behaviour across a range of
> > vendors.  I now wonder whether we're resetting the command/drive on
> > errors because of too short timeouts, and it never has a chance to
> > complete the process except where the writes are synchronous?
>
> It may be worse than that. I suspect that when the write cache is on,
> write errors are not reported (IDE doesn't have the delayed error SCSI has).
> Once all the spare sectors have been allocated, you can't remap new bad
> blocks any more, but don't get an error when writing.

Hm. Certainly, in the work I've done trying to overwrite and recover
bad disks, there were no write errors reported.  However, I've not yet
(that I can recall) seen a case where, on the follow-up read pass, the
drive has returned a good read of bad data (i.e., cmp says "not zero"),
rather than the drive reporting read errors (i.e., the drive wrote to a
bad sector, and didn't detect and remap this).

So, it's still bad, but I get the clear impression that remapping
isn't happening at all - so it seems unlikely you'll run out of spare
sectors.

> > I have a little pattern-test script that uses a random-key cgd to dd
> > encrypted zeros (i.e., "random" patterns) over a disk and cmp the
> > decrypted zeros afterwards, then re-key and repeat endlessly.  After
> > a cycle or two with write cache off, every "failing" disk I've done
> > this to bar one has recovered and tested clean, and that one disk was
> > very ill indeed. (I don't trust those disks, but I have a lot of
> > /scratch space as a result)
>
> Hum, this would mean that data can get corrupted on write when the cache is
> on, even if there are spare sectors free, right?

That is what I have seen.  Overwrite the entire disk, and still get
read errors afterwards.  Turn off write cache, overwrite again, drive
reads fine.
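
For anyone who wants to try the same kind of overwrite-then-verify pass,
here is a rough sketch in C. It is not the script described above: the
cgd/dd/cmp combination is replaced by a keyed xorshift pattern (not
cryptographic), and the names fill_pattern, write_pass and verify_pass are
made up for this example. It uses buffered stdio on a path, so it only
illustrates the logic; it does not control the drive's write cache.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLKSZ 512

/* Fill one block with a pattern derived from (key, blkno).  xorshift64
 * stands in for "encrypted zeros"; it is NOT cryptographic. */
static void
fill_pattern(uint8_t *buf, uint64_t key, uint64_t blkno)
{
	uint64_t x = key ^ (blkno * 0x9E3779B97F4A7C15ULL);

	if (x == 0)
		x = 1;		/* xorshift must not start from zero */
	for (size_t i = 0; i < BLKSZ; i++) {
		x ^= x << 13;
		x ^= x >> 7;
		x ^= x << 17;
		buf[i] = (uint8_t)(x >> 32);
	}
}

/* Overwrite nblocks blocks of the target; 0 on success, -1 on error. */
int
write_pass(const char *path, uint64_t nblocks, uint64_t key)
{
	uint8_t buf[BLKSZ];
	FILE *fp = fopen(path, "wb");

	if (fp == NULL)
		return -1;
	for (uint64_t b = 0; b < nblocks; b++) {
		fill_pattern(buf, key, b);
		if (fwrite(buf, 1, BLKSZ, fp) != BLKSZ) {
			fclose(fp);
			return -1;
		}
	}
	return fclose(fp) == 0 ? 0 : -1;
}

/* Read everything back; returns the number of blocks that could not be
 * read or did not match the pattern, or -1 if the open fails. */
long
verify_pass(const char *path, uint64_t nblocks, uint64_t key)
{
	uint8_t want[BLKSZ], got[BLKSZ];
	long bad = 0;
	FILE *fp = fopen(path, "rb");

	if (fp == NULL)
		return -1;
	for (uint64_t b = 0; b < nblocks; b++) {
		fill_pattern(want, key, b);
		if (fseek(fp, (long)(b * BLKSZ), SEEK_SET) != 0 ||
		    fread(got, 1, BLKSZ, fp) != BLKSZ ||
		    memcmp(want, got, BLKSZ) != 0) {
			bad++;
			clearerr(fp);	/* keep going after a read error */
		}
	}
	fclose(fp);
	return bad;
}
```

Calling write_pass() and then verify_pass() with the same key is one cycle;
changing the key between cycles plays the role of the re-key step.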

> Even if IDE is crap in the first place, I can't see a reason for such
> behavior.

Indeed. If drive manufacturers disable read-after-write for speed when
write-cache is on, it might lead to such behaviour - but I'm told the
spec requires read verification to be on.  I think it's more likely
we're doing something to stop the process running to completion, such
as issuing a reset on a too-short timeout.

FWIW, at least 2 of the drives I have "recovered" in this way failed
while they were running in Linux machines, so if this is indeed the
problem it's not unique to NetBSD.

--
Dan.