Subject: Re: Soft error on disk write corrupted drive (LBA28-1 - problem?)
To: None <port-i386@NetBSD.org, tech-kern@NetBSD.org>
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
List: port-i386
Date: 08/31/2007 13:07:13
[Added tech-kern and enhanced subject line - i.s.]

On Fri, Aug 31, 2007, Stuart Brooks <stuartb@cat.co.za> wrote:
[...]
> I managed to obtain a clean disk (identical model,WDC WD5000AAJS-22TKA0, 
> Rev: 12.01C01) and could reproduce the problem by doing a dd of 1MB 
> blocks across the suspect sector (268435451).
> 
> Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1g: error writing fsbn 216369024 
> of 216369024-216369151 (wd1 bn 268435391; cn 266304 tn 15 sn 14), retrying
> 
> Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1: (id not found)
> Aug 31 11:59:51 30_DEMO_697 /netbsd: wd1: soft error (corrected)
> 
> With a block size of 512 bytes it didn't manifest. Where to from here? 
> At least it's reproducible...

I think you used the buffered (block) disk device (/dev/wd1g). If I'm
not mistaken, you should also try dd against the raw device
(/dev/rwd1g), for completeness.

But anyway, I seem to recall earlier discussions about some disks with a
broken LBA28 boundary and how to handle them: they needed LBA48
addressing even for the last LBA28-addressable block (LBA28 boundary
- 1). If that is the case here, you should also see the problem with
smaller transfers, as soon as block 0xfffffff is touched. Can you try
that, please?
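
For reference, here is a rough sketch of the arithmetic, using only the
numbers from the log quoted above. It assumes 512-byte sectors and that
the start of partition g is simply the disk block number minus the fsbn
from the same log line. The helper names are made up for illustration;
nothing here is taken from the wd driver.

```python
# Sketch only: block arithmetic around the LBA28 limit.
# Assumptions: 512-byte sectors; partition offset = disk bn - fsbn
# (both taken from the quoted kernel message). Helper names are hypothetical.

SECTOR_SIZE = 512                 # assumed sector size in bytes
LBA28_LAST = (1 << 28) - 1        # 0xfffffff, last block addressable with LBA28

# Values from the quoted log line.
FSBN_FIRST = 216369024            # first block of the failed transfer, relative to wd1g
FSBN_LAST = 216369151             # last block of the failed transfer, relative to wd1g
DISK_BN = 268435391               # same first block, relative to the whole disk (wd1)

NBLKS = FSBN_LAST - FSBN_FIRST + 1        # length of the failed transfer in sectors
PART_OFFSET = DISK_BN - FSBN_FIRST        # where wd1g starts on the disk (derived)


def touches_block(start, nblks, block=LBA28_LAST):
    """True if the transfer [start, start + nblks) includes the given disk block."""
    return start <= block < start + nblks


def crosses_lba28_limit(start, nblks):
    """True if the transfer starts at or below 0xfffffff and ends above it."""
    return start <= LBA28_LAST < start + nblks - 1


def partition_block(disk_block):
    """Translate an absolute disk block into a block number inside wd1g."""
    return disk_block - PART_OFFSET


if __name__ == "__main__":
    print("failed transfer length (sectors):", NBLKS)
    print("touches 0xfffffff:", touches_block(DISK_BN, NBLKS))
    print("crosses LBA28 limit:", crosses_lba28_limit(DISK_BN, NBLKS))
    print("0xfffffff inside wd1g is block", partition_block(LBA28_LAST))
```

If the derived offset is right, a single 512-byte write at the block
number printed last (used as the dd seek value with bs=512 count=1 on
/dev/rwd1g) would touch exactly block 0xfffffff and nothing else. Please
check the offset against your disklabel first.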

(Oh, and some real LBA28-1-problem-expert, please speak up!)

Regards,
	-is