Subject: Fried /usr partition while updating to -current
To: Douglas K. Rand <rand@Aero.UND.NoDak.Edu>
From: Dirk Steinberg <steinber@schoenfix.ert.rwth-aachen.de>
List: current-users
Date: 02/21/1994 11:25:30
>>>>> "Douglas" == Douglas K Rand <rand@Aero.UND.NoDak.Edu> writes:

    Douglas> My mother told me not to use anything but released
    Douglas> software! ;-) I was in the process of upgrading to a
    Douglas> -current system when the kernel (from -current, about
    Douglas> 2-13) started spouting:

    Douglas> wd1: lost interrupt - status d2, error 1
    Douglas> wdc0: busy too long, resetting
    Douglas> wdc0: failed to reset controller

    Douglas> over and over and over. So I reset the machine, and when
    Douglas> it booted back up /usr was toast. Right now
    Douglas> /usr/libexec/getty doesn't exist!

I reported the very same bug to this list about 2 weeks ago, and also
via send-pr. There was a moderate amount of discussion and follow-up,
and someone else reported seeing the same symptoms. Many people
suggested that my hardware is faulty (I have a Quantum LPS 240 AT),
but I'm quite sure it isn't. Nobody even came close to explaining the
bug, let alone offered a fix. The bug is specific to IDE/MFM/RLL
disks, so people with SCSI disks (like most (all?) of the principal
developers) will never see it. They are therefore unable to reproduce
and fix this bug, which seems to have been around for quite some time.

    Douglas> What I had accomplished was a complete build of both the
    Douglas> kernel and the executables with out shared libraries. I
    Douglas> was in the process of rebuilding the system with shared
    Douglas> libraries, when this happened.

    Douglas> Should I try again? I think I'll have to reload the
    Douglas> binaries (I remember something about -[semi]current
    Douglas> binaries somewhere) from 0.9 and upgrade again. Or should
    Douglas> I just jump out the window and get it over with?

You can certainly retry; it might happen again, or it might not.
This is definitely a heisenbug. All you can do is take backups
every so often. It happened to me 3 times within 3 weeks.

    Douglas> I'm using a Promise Technology DC4030 local bus caching
    Douglas> IDE controller with a Maxtor 540 disk drive.

I'm also quite annoyed by this bug, but I can't fix it myself. It
certainly makes NetBSD unusable as a real work platform with certain
IDE disks (there seem to be other mainboard/controller/disk
combinations that work flawlessly, but nobody knows why).

I assure you that I feel for you.

Greetings,

	Dirk

-----------------------------------------------------------------------------
Dirk W. Steinberg - RWTH Aachen - Internet email: steinber@ert.rwth-aachen.de
Aachen University of Technology / IS2-Integrated Systems in Signal Processing
Rhein.Westf.Tech.Hochsch. Aachen / Integrierte Systeme der Signalverarbeitung
Templergraben 55 / D-52056 Aachen / phone:+49 241 807879 / fax:+49 241 807631
Home address: Kleikstr. 63, D-52134 Herzogenrath,Germany/phone: +49 2406 7225

------------------------------------------------------------------------------