Subject: Re: Soft error on disk write corrupted drive
To: <>
From: Stuart Brooks <stuartb@cat.co.za>
List: port-i386
Date: 08/31/2007 12:08:09
Stuart Brooks wrote:
> Manuel Bouyer wrote:
>> On Thu, Aug 30, 2007 at 08:53:59PM +0100, David Laight wrote:
>>
>>> On Thu, Aug 30, 2007 at 09:27:43PM +0200, Manuel Bouyer wrote:
>>>
>>>>> Aug 18 14:56:00 Connswater1 /netbsd: wd0g: error writing fsbn
>>>>> 216369084 of 216369084-216369211 (wd0 bn 268435451; cn 266305 tn 0
>>>>> sn 11), retrying
>>>>> Aug 18 14:56:00 Connswater1 /netbsd: wd0: (id not found)
>>>>> Aug 18 14:56:01 Connswater1 /netbsd: wd0: soft error (corrected)
>>>>>
>>>> Hum, 268435451 = 0xffffffb. This looks like LBA48 lossage.
>>>> Maybe this drive doesn't handle properly LBA48 PIO transfers.
>>>>
>>> Is this a case where we are doing LBA28 transfers of multiple sectors
>>> that cross the boundary ?
>>>
>>
>> I suspect it is, yes. But the controller may be at fault too here.
>>
>>
> Thanks for all the posts. Some more information has come to light
> which may be of interest. I have just experienced exactly the same
> problem on another disk and the logs indicate an error within 12
> sectors of the original error:
>
I managed to obtain a clean disk (identical model, WDC WD5000AAJS-22TKA0,
Rev: 12.01C01) and could reproduce the problem by doing a dd of 1MB
blocks across the suspect sector (268435451):
Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1g: error writing fsbn 216369024
of 216369024-216369151 (wd1 bn 268435391; cn 266304 tn 15 sn 14), retrying
Aug 31 11:59:50 30_DEMO_697 /netbsd: wd1: (id not found)
Aug 31 11:59:51 30_DEMO_697 /netbsd: wd1: soft error (corrected)
With a block size of 512 bytes the error didn't manifest. Where to from here?
At least it's reproducible...
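
For what it's worth, here is a rough check of the arithmetic behind
David's boundary theory. It is only a sketch: the helper name
crosses_lba28 is made up for illustration, the sector numbers are copied
from the two log excerpts, and it assumes the "wd bn" value is the first
sector of each transfer and that LBA28 tops out at sector 0xfffffff.

```python
# Hypothetical helper; sector numbers are taken from the logs in this thread.
LBA28_LIMIT = 1 << 28  # LBA28 can address sectors 0 .. 2**28 - 1 (0xfffffff)


def crosses_lba28(start_sector, sector_count):
    """True if a transfer starts below the LBA28 limit but ends at or past it."""
    last_sector = start_sector + sector_count - 1
    return start_sector < LBA28_LIMIT <= last_sector


# (label, first sector as logged in "wd bn", number of sectors in the transfer)
transfers = [
    ("wd0, Aug 18", 268435451, 216369211 - 216369084 + 1),
    ("wd1, Aug 31", 268435391, 216369151 - 216369024 + 1),
    ("single 512-byte sector", 268435451, 1),
]

for label, start, count in transfers:
    print(f"{label}: {count} sectors from {start} (0x{start:x}), "
          f"crosses boundary: {crosses_lba28(start, count)}")
```

Both logged transfers are 128 sectors long and straddle sector 2^28,
while a single-sector write can never straddle it, which would fit with
the 512-byte dd not showing the error. This doesn't say whether the
driver, the controller or the drive is at fault, only that both failures
are consistent with the boundary-crossing idea.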
Stuart