Subject: Re: kern/14007: uncorrectable data error reading fsbn -- problems with IDE hard disk
To: None <sen@eccosys.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 09/19/2001 22:29:01
On Tue, Sep 18, 2001 at 11:10:24PM -0700, sen@eccosys.com wrote:
> 
> >Number:         14007
> >Category:       kern
> >Synopsis:       uncorrectable data error reading fsbn -- problems with IDE hard disk
> >Confidential:   no
> >Severity:       serious
> >Priority:       high
> >Responsible:    kern-bug-people
> >State:          open
> >Class:          sw-bug
> >Submitter-Id:   net
> >Arrival-Date:   Tue Sep 18 23:11:00 PDT 2001
> >Closed-Date:
> >Last-Modified:
> >Originator:     Sen Nagata
> >Release:        1.5.1
> >Organization:
> >Environment:
> NetBSD 1.5.1 (GENERIC_LAPTOP) #33: Mon Jul 2 15:56:09 CEST 2001 he@nsa.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC_LAPTOP i386
> >Description:
> While using 1.5.2, when trying to read some files, I heard my hard 
> disk make some unhealthy sounding noises.  Switching to a console, I 
> see something like:
> 
> wd0: transfer error, downgrading to Ultra-DMA mode 1
> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA data transfers)
> wd0e: uncorrectable data error reading fsbn 5489344 of 5489344-5489359 (wd0 bn 7894369; cn 8353 tn 12 sn 28), retrying
> ...
> 
> I've also noticed that sometimes when the disk problems occur, there
> are several mode downgrade attempts -- e.g. starting from some mode ->
> Ultra-DMA mode 1 PIO mode 4 -> DMA mode 2 PIO mode 4 -> PIO mode 4.

Yes, I need to fix this. Downgrading the transfer mode is the wrong
response to this kind of error.
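
A rough illustration of that distinction, as a hypothetical Python sketch
and not the actual wdc/pciide driver code: the ATA error register reports
an uncorrectable media error (UNC, bit 6) separately from an Ultra-DMA
interface CRC error (ICRC, bit 7), and only the latter says anything about
the transfer mode. The function name is made up for illustration.

```python
# ATA error register bits, as defined by the ATA/ATAPI specification.
ATA_ERR_UNC = 0x40   # uncorrectable data error (the medium itself is bad)
ATA_ERR_ICRC = 0x80  # interface CRC error during an Ultra-DMA transfer


def should_downgrade(error_reg: int) -> bool:
    """Hypothetical policy: lower the transfer mode only on interface errors.

    A media error (UNC) will recur at any transfer speed, so stepping down
    from Ultra-DMA to DMA to PIO gains nothing in that case.
    """
    return bool(error_reg & ATA_ERR_ICRC)
```

Under this sketch, an error register of 0x40 (UNC only) would leave the
mode alone, while 0x80 (ICRC) would trigger a downgrade.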

> 
> This happened to me a day or so after upgrading to 1.5.2, so I did a
> fresh install of 1.5.2 on to a different hard disk and a different
> machine, followed by transferring data (cp -pR).  Bulk transfers of
> data would fail occasionally, but transferring files individually 
> appeared to work (I interlaced copying with syncing the disk).
> 
> Today, the same thing started to occur on the new machine so I booted
> a 1.5.1 kernel hoping the problem would go away, but no help there 
> either.
> 
> I've experienced this on a custom compiled kernel (uses the default
> pciide and wd settings) as well as the GENERIC_LAPTOP kernel.
> 
> BTW, both hard disks are IBM hard disks and the machines I tried 
> this on are ThinkPads (600E and X20).
> 
> I hope someone else can reproduce this but searching the archives
> and pr forms didn't turn up anything for me.

Well, your disk is obviously dead.
Now, the problem is to find out why it died. Is it getting enough power?
Is it getting too hot?

It's quite possible that Windows won't push it that hard.
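
For reference, the numbers in the error line quoted above fit together as
sketched below. The geometry of 15 heads and 63 sectors per track is an
assumption: it is not reported anywhere in the PR, but it is consistent
with the quoted bn/cn/tn/sn values. The helper name is hypothetical.

```python
# Assumed geometry, inferred from the quoted numbers (not stated in the PR).
HEADS = 15
SECTORS_PER_TRACK = 63


def split_block(bn, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Split an absolute block number into (cylinder, track, sector).

    The sector number is zero-based here, which is what matches the
    values in the quoted message.
    """
    cn, rest = divmod(bn, heads * spt)
    tn, sn = divmod(rest, spt)
    return cn, tn, sn


fsbn = 5489344  # block number relative to the start of wd0e
bn = 7894369    # absolute block number on wd0

print(split_block(bn))         # (8353, 12, 28), matching "cn 8353 tn 12 sn 28"
print(bn - fsbn)               # 2405025, the implied start offset of wd0e
print(split_block(bn - fsbn))  # (2545, 0, 0), i.e. a cylinder boundary
```

The implied partition offset landing exactly on a cylinder boundary is
some evidence that the assumed geometry is the one in use.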

--
Manuel Bouyer <bouyer@antioche.eu.org>
--