Subject: Re: IDE SMART 'Hardware ECC Recovered' issues...
To: Tobias Nygren <tnn@netilium.org>
From: Timo Schöler <wanker4freedom@web.de>
List: netbsd-help
Date: 02/18/2005 16:13:50
>> hi list,
>>
>> i have a 2.0.1-RELEASE system running on a dual PIII machine (IBM
>> Intellistation which features ECC, might be important?) configured to
>> run a RAIDframe RAID1.
>>
>> this is the only x86 machine i have, all others are non-x86; i have a
>> few Sun Ultra 1E and Ultra 2E running in a similar config (also RAID1)
>> which do not show this or a similar problem.
>>
>> yesterday i tested the RAID config by fiddling around and making one of
>> the HDs unavailable to the system. all went okay.
>>
>> so i had to rebuild the parity of the RAID set -- so far, so good. out
>> of curiosity, i enabled SMART on the HDs (which are both of this type
>> [1]) to check the drives' temperature(s)...
>>
>> i saw the following entry on wd1 while rebuilding the RAID:
>>
>> (...)
>> 195 100    0     no  online  positive    Hardware ECC Recovered 6171836
>> (...)
>>
>> while wd0 showed 0 (zero) errors.
>>
>> after a while (almost 4/5 of the rebuilding done), the first errors
>> occurred on wd0 as well:
>>
>> (...)
>> 195 100    0     no  online  positive    Hardware ECC Recovered 24580
>> (...)
>>
>> is this something to be worried about? i guess yes :(
>>
>> the drives are brand-new...
>>
>> help is very much appreciated -- tia!
>>
>> [1] -- dmesg output
>>
>> wd0 at atabus0 drive 0: <SAMSUNG SP0812N>
>> wd0: drive supports 16-sector PIO transfers, LBA48 addressing
>> wd0: 76351 MB, 155127 cyl, 16 head, 63 sec, 512 bytes/sect x 156368016 sectors
>> wd0: 32-bit data port
>> wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
>>
>> -- 
>> Timo Schoeler | http://macfinity.net/~tis
>> //macfinity -- finest IT services | http://macfinity.net
>>
>> There are 10 types of people in the world. Those who understand binary
>> and those who don't.
>>
>>
>
> Hi,
>
> Afaik this counter shows internal ECC recovery in the disk
> "as the data comes from the drive heads". Again afaik, this is normal
> to occur to some extent in modern GMR-head-based drives.

hi, to me that sounds as if the disk said 'huh, i wasn't able to decipher
what was on my platter, so i read it again, verified it and added 1 to the
counter'...? jeeeeez.
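
to make the quoted SMART row easier to read, here is a minimal python sketch that splits it into columns. the column order (id, value, thresh, crit, collect, reliability, description, raw) is an assumption based on the row quoted above, and the function names are made up for illustration. as far as i understand SMART, an attribute only counts as failing when the normalized value drops to or below the threshold; the raw number is vendor-specific and not directly comparable between drives.

```python
import re

# Assumed column layout, taken from the row quoted in this thread:
# id  value  thresh  crit  collect  reliability  description  raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(yes|no)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\d+)\s*$"
)


def parse_smart_row(line):
    """Split one SMART attribute row into a dict, or return None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    ident, value, thresh, crit, collect, reliability, desc, raw = m.groups()
    return {
        "id": int(ident),
        "value": int(value),    # normalized value reported by the drive
        "thresh": int(thresh),  # vendor threshold for this attribute
        "crit": crit == "yes",
        "collect": collect,
        "reliability": reliability,
        "description": desc,
        "raw": int(raw),        # vendor-specific raw counter
    }


def at_or_below_threshold(attr):
    """True when the normalized value has fallen to or below the threshold."""
    return attr["value"] <= attr["thresh"]


if __name__ == "__main__":
    row = "195 100    0     no  online  positive    Hardware ECC Recovered 6171836"
    attr = parse_smart_row(row)
    print(attr["description"], attr["raw"], at_or_below_threshold(attr))
```

for the row from wd1 the normalized value is 100 against a threshold of 0, so by this check the attribute is not flagged, whatever the raw counter says.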

>
> It has nothing to do with the fact that your computer has ECC main memory.

yip, that's clear to me. but i searched the mail archives before and
found a similar posting (from greywolf IIRC) where a connection was drawn
between this SMART counter and a problem with ECC-less memory.

> Cheers,
> -Tobias

cheers!

timo