Subject: Re: Hanging problems
To: Charles M. Hannum <mycroft@ai.mit.edu>
From: Duncan McEwan <duncan@comp.vuw.ac.nz>
List: current-users
Date: 05/10/1995 17:35:34
> The problems people noted with recent i386 kernels `hanging' during
> autoconfig should be fixed now.  If you're impatient, you can just add:
> 
> #define ICU_HARDWARE_MASK

If one of the people who noted this problem was me (in a recent message with
the subject "Recent kernels failing to boot on my Micronics MPower 4
motherboard") I'm sorry to say that this doesn't fix it for me.

However, a suggestion from Mike Long <mike.long@analog.com> to try turning off
the external cache *did* work ... sort of.  The system booted and seemed to
run OK, but it hung in the same way (i.e., with the high-pitched beeps I
described in my earlier message) when I rebooted (immediately after the
"rebooting..." message).

I've spent some time tracing through the autoconfiguration code and, using some
kernel printfs, found that the hang (with the cache turned on) occurred sometime
after returning from the first call to fdattach() but before returning from
fdcattach().

This seemed to indicate that the problem was to do with probing for floppy
drives after the first, so I changed the "for" loop in fd.c to only check for
one drive.  The resulting kernel booted fine (with the external cache enabled).

A slightly better fix seemed to be to change the config file to explicitly name
the fd device, rather than use wildcards, i.e.:

	fd0 at fdc0 drive 0

Again, this kernel booted fine.  As an experiment, I added explicitly configured
additional floppy drives one at a time, and found the machine hung when booting
a kernel with four floppy drives configured.  It also hung with fewer than four
drives when I didn't explicitly provide drive numbers, i.e.:

	fd0 at fdc0 drive ?
	fd1 at fdc0 drive ?
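
For comparison, the explicitly numbered form of the same two drives would
presumably look like the following.  This is only a sketch: the fd1 line is
extrapolated from the fd0 entry above, and its drive number is an assumption
rather than a line copied from a tested config file.

	fd0 at fdc0 drive 0
	fd1 at fdc0 drive 1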
	
I guess this indicates some kind of hardware bug, but I would appreciate any
comments on: 1) why turning the external cache off changes the behaviour; 2)
what the floppy probe code could be doing to cause this; and 3) whether there
is a nicer fix than explicitly naming the fd devices.

Thanks in advance.

Duncan