Subject: Re: wd.c crashes/hard errors
To: None <steinber@machtnix.ert.rwth-aachen.de>
From: - Greg Earle <earle@isolar.Tujunga.CA.US>
List: current-users
Date: 02/10/1994 15:04:15
Dirk Steinberg writes:
>    >> This error is persistent across reboots, power off, etc. Now
>    >>since I have an IDE disk I shouldn't get hard errors. I never had
>    >>any hard errors before, and my Linux partition still works
>    >>fine. So my NetBSD installation is hosed for now. I sure hope
>    >>this error goes away when I reinstall/re-mkfs. Is it actually
>    >>possible that the faulty wd.c caused damage to my disk, or that
>    >>it at least screwed up the low-level format on some track? If
>    >>so, how could I reformat a single track without reformatting
>    >>the entire disk? And how to format (low-level) an IDE disk in
>    >>the first place? I know how it works for MFM/RLL/ESDI and SCSI
>    >>disks and have done this many times before.  But IDE disks?
>
>    Douglas> 	I was able to restore my system by doing a disklabel,
>    Douglas> and putting a clean fs on the root partition, then
>    Douglas> reinstalling all the files that were on that partition.
>    Douglas> The disk errors did not re-appear till a few days later
>    Douglas> when the f..ken thing crashed again.  This time I removed
>    Douglas> the second drive and things have been fine.
>
>This makes me hope that my drive is not physically damaged (or low-level
>un-formatted).  The error message is really weird, though.  As I said, this is
>the third time this has happened to me, and the Quantum is my only drive!  So
>your workaround won't work for me ...
>
>I also wonder why the crashes are so bad that even fsck in manual mode cannot
>repair them.  On any other UNIX system that uses a BSD UFS/fsck I've seen, you
>lose at most a few files after a crash.  The kernel must be doing something
>really horrible when it crashes; just not syncing all buffers cannot be the
>cause.

Bwaa hah hah.  Never underestimate the power of the system to do anything it
wants.  In 1988 I was working for Sun Consulting and using an old Sun-3/160 at
home.  All of a sudden it started developing the tendency to get random
Watchdog Resets and the net result would be that upon reboot, "fsck" would find
300+ trashed files in /usr.  After about the 3rd time this happened (and after
the 4-5 hours I'd waste finding/restoring things, groveling through lost+found
et al.), I said "To Hell with this" and got the CPU board replaced.  Never had
that problem again.  Dunno why a Watchdog Reset would always cause the disk to
get so scrambled, but that's life.

	- Greg

------------------------------------------------------------------------------