Subject: Re: CVS commit: src
To: Colin Wood <cwood@ichips.intel.com>
From: Bob Nestor <rnestor@metronet.com>
List: port-mac68k
Date: 03/27/1999 14:42:39

Colin Wood <cwood@ichips.intel.com> wrote:

>Bob Nestor wrote:
>> Paul Goyette <paul@whooppee.com> wrote:
>> 
>> >On Fri, 26 Mar 1999, Colin Wood wrote:
>> >
>> >> sort of.  it boots, and sysinst runs, but the multiple disk bug that we
>> >> seem to have prevents you from actually doing anything useful at the
>> >> moment.  we panic due to trying to close the raw partition of a disk
>> >> that's already closed...of course, it's not really closed, it's just that
>> >> we've managed to trash memory, i think.  ah well...it should work pretty
>> >> well if everything is compiled under 1.3.x, tho.
>> >
>> >I thought that Scott's recent interrupt clean-up took care of the
>> >"multiple disk" bug?  At least, it seems to have fixed it on my system!
>> 
>> It didn't fix the problem on one of my disks, and changed it a bit on one 
>> other disk I have.   There's still a serious problem in there someplace, 
>> but it's beyond me where it's hiding.
>
>i'm hoping that it might be due to spurious serial interrupts.  the reason
>why i say this is that i've got one panic that goes to intrhand() and then
>bombs out with a bad interrupt vector (i can only assume the stack has
>gotten corrupted at this point).  anyway, intrhand() is only called for
>interrupt levels 3-6.  of these, we only use level 4 on my se/30, which i
>think is the serial interrupt.  keep in mind that i may have absolutely
>no idea what i'm talking about here :-)  i was trying to compile a kernel
>w/o serial support to test this out when i managed to seriously hose my
>/usr filesystem (/usr/bin went *poof*).  of course, the installer is now
>being quite pissy as well...ugh!

Well, that kind of goes along with what I've seen too.  My feeling is that
the problem is caused by a spurious interrupt, although I had always
thought it was coming from the SCSI chain.  Once or twice I have seen a
spurious SCSI interrupt reported, but you might be onto something with
your attack on the serial interrupt code.  My system (Performa 550) is
very sensitive to the ADB interrupts.  In fact, John had a bitch of a
time getting the HWDIRECT code working on it, and I finally shipped the
whole system out to him for the summer so he could debug it.  Even now,
if the ADB Debug switch is turned on, my system won't complete booting.

Good luck, and let me know if there's anything I can do to help out.
-bob