Subject: Re: kern/14007: uncorrectable data error reading fsbn -- problems with IDE hard disk
To: None <sen@eccosys.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 09/20/2001 21:01:27
On Thu, Sep 20, 2001 at 10:18:22AM +0900, sen@eccosys.com wrote:
> > Well, your disk is obviously dead.
> 
> That may be true, but it's been two disks for me in rapid succession
> on two different machines with two different power supplies, and I'd
> like to figure out how to avoid toasting further disks -- not to
> mention getting a stable environment back ;-)
> 
> In case I presented the sequence of events poorly, here's another
> attempt:
> 
> 1) Problem occurs on ThinkPad X20 w/ an IBM hard disk.
> 
> 2) Suspecting the problem was with the hardware, I did a fresh install
>    of 1.5.2 on a ThinkPad 600E with a completely new hard disk (also
>    IBM) and a different power supply.
> 
> 3) After installation on the 600E, I put the old hard disk in the
>    extra bay to do a data transfer -- I transferred my /home directory
>    (which when done using bulk methods failed, but copying individual
>    files interlaced w/ syncs did work) and /usr/pkgsrc.
> 
> 4) Then I built and installed some packages on to the new hard disk.
> 
> 5) Some time afterwards, when trying to work on some packages in
>    /usr/pkgsrc, I started getting the problem on areas of the disk
>    which had been used to build and install packages earlier (but
>    also other areas, so far only located in /usr/pkgsrc).
> 
> BTW, both machines were sitting on desk surfaces when the problems
> were noticed and I almost always have the power supplies plugged in
> (and the batteries are nearly always close to fully charged).
> 
> > Now, the problem is to find why it died. 
> 
> Yes, I'm very anxious to do this.
> 
> > Does it get enough power?  Doesn't it get too hot?
> 
> How can I determine the answer to these questions meaningfully?  FWIW,
> I don't particularly notice the machines getting hot.

Can you check whether the drive itself gets hot?

Also, what kind of IDE controller do you have in these machines?

> 
> Should I disable Ultra DMA on any new disks that I use?  If so, is it
> enough to recompile a kernel w/ the following sorts of settings?
> 
>   wd* at pciide? channel ? drive ? flags 0x0fac

flags 0x0f00
would be enough (it disables Ultra-DMA and lets the driver find the right
PIO and DMA modes).
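
To make the difference between 0x0fac and 0x0f00 concrete, here is a small
sketch that decodes a wd flags value. It assumes the nibble layout as I
understand it from wd(4): lowest nibble is PIO, next is DMA, third is
Ultra-DMA; the top bit of a nibble must be set for the mode in the low three
bits to be used, and 0xf means "disabled" for DMA and Ultra-DMA. The function
name and the output strings are only illustrative, they are not part of any
NetBSD tool.

```python
def decode_wd_flags(flags):
    """Decode a wd 'flags' value into PIO / DMA / Ultra-DMA settings.

    Assumed layout (see wd(4)): bits 0-3 PIO, bits 4-7 DMA, bits 8-11 UDMA.
    """
    def nibble(value, allow_disable):
        # 0xf means "disable" for DMA and Ultra-DMA only.
        if allow_disable and value == 0xF:
            return "disabled"
        # The top bit enables the setting; the low 3 bits give the mode.
        if value & 0x8:
            return "mode %d" % (value & 0x7)
        # Top bit clear: the driver picks the best mode itself.
        return "auto"

    return {
        "pio": nibble(flags & 0xF, False),
        "dma": nibble((flags >> 4) & 0xF, True),
        "udma": nibble((flags >> 8) & 0xF, True),
    }


if __name__ == "__main__":
    for value in (0x0FAC, 0x0F00):
        print(hex(value), decode_wd_flags(value))
```

Under these assumptions, 0x0fac pins PIO and DMA to fixed modes as well as
disabling Ultra-DMA, while 0x0f00 only disables Ultra-DMA and leaves the other
two to the driver.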

> 
> [ Is there some way to change this setting dynamically?  Maybe using
> the kernel debugger? ]

It can be done with ddb, or possibly with gdb on the kernel binary.
You have to change the right entry in the cfdata array, which is generated
by config when a kernel is built.
The array is defined in ioconf.c in the kernel build directory.

> 
> > It's quite possible that Windows won't push it that hard.
> 
> That's possible.  Ah, you mean and that's why the problem may not have
> been noticed more widely?

Yes. Really, I don't believe a driver can damage a hard disk.

--
Manuel Bouyer <bouyer@antioche.eu.org>
--