Subject: Re: reproducible kernel panic w/ 2.0RC4MP
To: None <port-macppc@netbsd.org>
From: Tim Kelly <hockey@dialectronics.com>
List: port-macppc
Date: 12/03/2004 13:20:15
At 12:40 PM -0500 12/3/04, Tim Kelly wrote:
>In order to overcome this, I added code in do_pending_int to check, with MP,
>whether PSL_EE is off on CPU0 (us) and, if so, enable it but raise the spl
>level to IPL_IPI. I also recover what the interrupt mask was before raising to
>IPL_IPI, so that existing pending interrupts will still get processed, but
>all new interrupts will be masked as pending. This may introduce some other
>race condition, but my thinking was that we'd like to handle the existing
>interrupts without being interrupted by anything less than an IPI. If this
>patch tests out, I may allow the pending interrupt to handle the spl in the
>future, unless someone can explain what using IPL_IPI could prevent from
>occurring. One concern could be that a lower interrupt could require the
>services of a higher interrupt and that IPL_IPI could block this. Since the
>focus for solving this bug has been to identify why CPU0 is not responding
>to IPIs, I wanted to ensure IPIs get handled.

After thinking about this more, I suspect that this will introduce a
problem where an interrupt will come in at a higher level than the one
being processed in do_pending_int but will get marked pending instead of
being executed immediately. I will post a revised patch after testing (which
takes a while), but in the meantime, in extintr.c at

 #ifdef MULTIPROCESSOR
+	tmsr = emsr;
 	if (ci->ci_cpuid == 0) {
+
+		/* EE was already off */
+		/* we may have an IPI pending */
+		/* for SP, PSL_EE off is by design */
+		if (!(emsr & PSL_EE)) {
+			emsr |= PSL_EE;
+			/* already pending interrupts get processed */
+			/* because the mask is against earlier pcpl */
+			/* new interrupts get marked pending */
+			splraise(imask[IPL_IPI]);
+		}
+
+
 #endif

just comment out the

	splraise(imask[IPL_IPI]);

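line. This will maintain the interrupt mask as it was on entry, so an
interrupt above the level being processed is still taken immediately rather
than marked pending.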

tim