Subject: Re: WD_SOFTBADSECT & WD_QUIRK_FORCE_LBA48 usage ...
To: None <davef1624@aol.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 10/05/2005 22:30:22
On Tue, Oct 04, 2005 at 09:06:49PM -0400, davef1624@aol.com wrote:
> Manuel - thanks for your previous answers to my WD_SOFTBADSECT 
> questions;
> I have a few more questions though for you, etc.
> (my previous email is attached at the very end) ...
> 
> My original question:
> >>We're currently using a fairly 'old' wd.c driver & 1.6 NetBSD
> >>kernel; Nov 1, 2002 to be exact.
> >>I'm wondering if there are any critical bug fixes (to either wd.c,
> >>ata*, pciide* drivers)
> >>that might impact disk driver/subsystem reliability and/or error
> >>recovery since this date?
> 
> Your reply:
> >Probably, but if you don't have problems, I'm not sure why you worry 
> :)
> 
> 
> Actually, we are seeing several apparent reliability issues with the 
> IDE drives we're using.
> Some of the drives experience a bad sector/block after only ~ 5,000 - 
> 10,000 hours of operation.

Ha, I'm not alone :)

> In addition, the IDE drive sometimes cannot spare out the bad block.

Note that an IDE drive will spare a block only on write. So if you get a
bad block during read, it will remain bad until you write something
to it.

> When we run the 'smartmon' diagnostics on the disk- they usually pass 
> the Health Check fine,
> but fail the extended diagnostics (usually because of repeated bad read 
> errors from the disk).
> 
> Also, fsck and other system processes will repeatedly retry reading 
> and/or writing these bad blocks:
> 
> >kernel: pciide0:1:0: device timeout, c_bcount=8192, c_skip0
> >kernel: pciide0 channel 1: reset failed for drive 0
> >kernel: wd0a: device timeout reading fsbn 8288336 of 8288336-8288351 (wd0 bn 8288336; cn 8222 tn 8 sn 56), retrying
> >kernel: pciide0:1:0: not ready, st=0x80, err=0x00
> >kernel: wd0a: device timeout reading fsbn 8288336 of 8288336-8288351 (wd0 bn 8288336; cn 8222 tn 8 sn 56), retrying
> >kernel: wd0: soft error (corrected)
> >kernel: pciide0:1:0: bus-master DMA error: missing interrupt, status=0x21
> >kernel: pciide0:1:0: device timeout, c_bcount=65536, c_skip0
> >kernel: wd0a: device timeout reading fsbn 8343104 of 8343104-8343231 (wd0 bn 8343104; cn 8276 tn 14 sn 14), retrying
> 
> 
> Therefore, I'm looking into any critical fixes that would improve our 
> system's resiliency to these kinds of errors;
> our system needs to be as robust as possible.
> 
> There appear to be several alternatives:
> 
> 1)  Use the WD_SOFTBADSECT 'automatic bad-sector list' fix - introduced 
> on Apr 15, 2003
>     (Revision 1.241 of wd.c).
>     My question concerns the following (taken from wd(4) man-page):
> 
>      > This feature does not interoperate well with the sector
>      > remapping features of modern disks.
>      > To let the disk remap a sector internally, the software bad
>      > sector list must be flushed or disabled before.
> 
>      Can you further explain this to me?

A bad sector will be remapped on write. But if it's in the bad block list,
the driver will return an I/O error on write without sending the
command to the disk. So bad sectors will never be remapped once they've
been recorded in the kernel bad sector list.
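
To make the interaction concrete, here is a minimal sketch of the behaviour described above. It is a user-space model, not the wd.c code: the struct and function names (soft_badsect, badsect_add, badsect_flush, submit_request) and the fixed list size are all invented for illustration. It shows a request that overlaps a recorded range failing with EIO before any command reaches the drive, and the same request going through once the list is flushed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one recorded bad range (inclusive sector numbers). */
struct bad_range {
	uint64_t first;
	uint64_t last;
};

/* Hypothetical per-disk state: a small fixed-size list of bad ranges. */
struct soft_badsect {
	struct bad_range ranges[16];
	size_t count;
};

/* Record a bad range; returns 0 on success, -1 if the list is full. */
static int
badsect_add(struct soft_badsect *bs, uint64_t first, uint64_t last)
{
	if (bs->count >= sizeof(bs->ranges) / sizeof(bs->ranges[0]))
		return -1;
	bs->ranges[bs->count].first = first;
	bs->ranges[bs->count].last = last;
	bs->count++;
	return 0;
}

/* Forget every recorded range. */
static void
badsect_flush(struct soft_badsect *bs)
{
	bs->count = 0;
}

/* True if the request [blkno, blkno + nblks) overlaps a recorded range. */
static bool
badsect_hit(const struct soft_badsect *bs, uint64_t blkno, uint64_t nblks)
{
	if (nblks == 0)
		return false;
	uint64_t end = blkno + nblks - 1;
	for (size_t i = 0; i < bs->count; i++) {
		if (blkno <= bs->ranges[i].last && end >= bs->ranges[i].first)
			return true;
	}
	return false;
}

/*
 * Model of request submission. Reads and writes are treated alike:
 * a hit returns EIO and *issued is left untouched, meaning no command
 * was sent to the drive. Otherwise *issued is incremented.
 */
static int
submit_request(const struct soft_badsect *bs, uint64_t blkno,
    uint64_t nblks, unsigned *issued)
{
	if (badsect_hit(bs, blkno, nblks))
		return EIO;
	(*issued)++;
	return 0;
}
```

Because the write never reaches the drive while the range is listed, the drive never gets the chance to remap the sector; that is the whole conflict.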

> How would I remap a bad 
> sector when using WD_SOFTBADSECT?

First flush the bad sector list with atactl, then do a write to this
sector.
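
A rough sketch of the "write to this sector" step, under several assumptions. The flush itself is a command-line step; as far as I can tell the dkctl(8) manual documents it as "badsector flush" (with "badsector list" to inspect the list), so check the man pages on your release. The helper name rewrite_sector is invented here, the sector size is assumed to be 512 bytes, and the caller is assumed to have opened the raw whole-disk device for writing. Note that this zero-fills the sector, so whatever was stored there is gone.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Overwrite one sector with zeros so the drive has a write on which
 * it can remap the sector. fd must be open for writing; secsize is
 * the sector size in bytes (512 assumed for these drives).
 * Returns 0 on success, -1 on allocation failure or a failed/short write.
 */
static int
rewrite_sector(int fd, uint64_t sector, size_t secsize)
{
	unsigned char *buf = calloc(1, secsize);	/* zero-filled buffer */
	if (buf == NULL)
		return -1;

	/* Byte offset of the sector on the device. */
	off_t off = (off_t)(sector * secsize);
	ssize_t n = pwrite(fd, buf, secsize, off);

	free(buf);
	return (n == (ssize_t)secsize) ? 0 : -1;
}
```

With the sector numbers from the log above, the call would look like rewrite_sector(fd, 8288336, 512), repeated for each sector in the failing range. The number to use is the whole-disk one ("wd0 bn"), not the partition-relative fsbn, if fd refers to the whole disk.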

>     I'd like to avoid having to reboot if possible.
> 
> 2)  Use the WD_QUIRK_FORCE_LBA48 feature.   Can you briefly explain 
> this feature to me as well?

It's only useful for large drives, and known to be useful only for
some Seagate drives. Recent Seagate drives have broken firmware that
rejects an I/O request to sector 0xfffffff (the last sector below 128GB)
when the request is made in LBA mode; the same request is accepted when
using LBA48 commands.
The workaround is to use LBA48 for sector 0xfffffff on these drives.
If your drives are not Seagates, or are smaller than 128GB, you won't need
this.
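
The decision can be sketched as follows. This is a model of the workaround as described here, not a copy of the condition in wd.c, which may be written differently; the function and macro names are invented for illustration. Without the quirk, LBA48 is needed only when a request extends past the highest 28-bit sector; with the quirk, a request whose last sector is exactly 0xfffffff is also sent as LBA48.

```c
#include <stdbool.h>
#include <stdint.h>

#define LBA28_MAX_SECTOR 0xfffffffULL	/* highest 28-bit sector number */
#define SECTOR_SIZE 512ULL		/* bytes per sector, assumed */

/*
 * Decide whether a request of nblks sectors starting at blkno should
 * use LBA48 commands. quirk_force_lba48 models the workaround for
 * drives that reject sector 0xfffffff in 28-bit LBA mode.
 */
static bool
use_lba48(uint64_t blkno, uint64_t nblks, bool drive_has_lba48,
    bool quirk_force_lba48)
{
	if (!drive_has_lba48 || nblks == 0)
		return false;

	uint64_t last = blkno + nblks - 1;	/* last sector touched */

	if (last > LBA28_MAX_SECTOR)
		return true;			/* beyond 28-bit range */

	/* Quirk: the boundary sector itself must also go out as LBA48. */
	return quirk_force_lba48 && last == LBA28_MAX_SECTOR;
}

/* Byte offset just past the given sector. */
static uint64_t
sector_end_offset(uint64_t sector)
{
	return (sector + 1) * SECTOR_SIZE;
}
```

As a check on the arithmetic, sector_end_offset(LBA28_MAX_SECTOR) is 2^28 * 512 = 2^37 bytes, which is the 128GB boundary, so a drive smaller than that never has a sector 0xfffffff to trip over.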

> 
> 3)  Use RAIDframe for data mirroring; we only have one physical drive 
> in the system though.
>      Is it possible to use RAID to perform data mirroring onto two 
> separate file-system partitions on the same drive?
>      This would help to protect us from bad disk blocks on an otherwise 
> working drive.

Yes, this should work, but you may find the performance awful.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--