Subject: Re: Soft error on disk write corrupted drive (LBA28-1 - problem?)
To: <>
From: Stuart Brooks <stuartb@cat.co.za>
List: tech-kern
Date: 08/31/2007 14:05:12
Ignatios Souvatzis wrote:
> [Added tech-kern and enhanced subject line - i.s.]
>
> On Fri, Aug 31, 2007, Stuart Brooks <stuartb@cat.co.za> wrote:
> [...]
>   
>> I managed to obtain a clean disk (identical model, WDC WD5000AAJS-22TKA0, 
>> Rev: 12.01C01) and could reproduce the problem by doing a dd of 1MB 
>> blocks across the suspect sector (268435451).
>>
>> Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1g: error writing fsbn 216369024 
>> of 216369024-216369151 (wd1 bn 268435391; cn 266304 tn 15 sn 14), retrying
>>
>> Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1: (id not found)
>> Aug 31 11:59:51 30_DEMO_697 /netbsd: wd1: soft error (corrected)
>>
>> With a block size of 512 bytes it didn't manifest. Where to from here? 
>> At least it's reproducible...
>>     
>
> I think you used the buffered disk interfaces (/dev/wd1g) - if I'm not
> wrong, you should try with dd to the raw disk - (/dev/rwd1g), for
> completeness.
>
> But anyway, I seem to recall former discussions about some disks with a
> broken LBA28 boundary and how to handle them - that is, they needed
> LBA48 addressing even for block LBA28-boundary - 1. In this case you
> should see the problem also for smaller transfers, as soon as block
> 0xfffffff is touched. Can you try that please?
>
> (Oh, and some real LBA28-1-problem-expert, please speak up!)
>
> Regards,
> 	-is
>
>
>   
I initially reproduced it using the non-buffered interfaces (rwd1g):

dd if=/dev/zero of=/dev/rwd1g seek=105600 bs=1024k count=1000


I then tried to reproduce it using rwd1d, which initially didn't cause
the problem:

dd if=/dev/zero of=/dev/rwd1d seek=131000 bs=1024k count=1000


But then I modified the block size (in case it was a boundary condition)
and the problem occurred again:

dd if=/dev/zero of=/dev/rwd1d seek=134200 bs=1000k count=1000
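
The sector arithmetic behind the three dd commands above can be sketched
roughly as follows. This is only an illustration under stated
assumptions: 512-byte sectors, rwd1d starting at absolute sector 0, and
a wd1g partition offset of 52066367 sectors, inferred from the log line
quoted above (wd1 bn 268435391 minus fsbn 216369024). The function name
straddling_blocks is made up for this example; it says nothing about
what the driver or the drive actually does.

```python
SECTOR = 512              # bytes per sector (assumed)
LBA28_LAST = 0x0FFFFFFF   # 268435455, the last sector addressable with LBA28


def straddling_blocks(seek, bs, count, part_offset=0):
    """Return (first, last) absolute sector ranges of the dd blocks that
    start at or below LBA28_LAST and end beyond it.

    seek and count are in units of bs bytes, as with dd; part_offset is
    the partition start in sectors (hypothetical parameter).
    """
    per_block = bs // SECTOR
    hits = []
    for i in range(count):
        first = part_offset + (seek + i) * per_block
        last = first + per_block - 1
        if first <= LBA28_LAST < last:
            hits.append((first, last))
    return hits


# The three runs from this mail ("k" in dd means 1024 bytes).
WD1G_OFFSET = 268435391 - 216369024   # assumed: derived from the log line

print("rwd1g bs=1024k:", straddling_blocks(105600, 1024 * 1024, 1000, WD1G_OFFSET))
print("rwd1d bs=1024k:", straddling_blocks(131000, 1024 * 1024, 1000))
print("rwd1d bs=1000k:", straddling_blocks(134200, 1000 * 1024, 1000))
```

Under those assumptions, the bs=1024k run on rwd1d has no block that
straddles the boundary (2^28 sectors is a multiple of 2048), the
bs=1000k run has one block covering sectors 268434000-268435999, and the
rwd1g run has one covering 268433471-268435518, which ends on the same
sector as the failing transfer in the log (268435391 + 127).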