Subject: Re: Fix for SGImips interrupt nesting crashes
To: Stephen Ma <stephenm@employees.org>
From: Rafal Boni <rafal@attbi.com>
List: port-sgimips
Date: 04/23/2002 07:49:34
In message <15553.43086.616065.281311@khemlani.nowhere.org>, you write: 

-> 
-> Rafal> 	Below is a simple patch to the sgimips cpu_intr() routine to
-> Rafal> prevent the arbitrary levels of interrupt nesting that were
-> Rafal> possible before and lead to really strange panic()s.
-> 
-> Some quick comments/suggestions:
-> 
-> > @@ -235,7 +239,8 @@ ip22_intr(status, cause, pc, ipending)
-> >  		cause &= ~MIPS_INT_MASK_4;
-> >  	}
-> >  
-> > -	_splset((status & ~cause & MIPS_HARD_INT_MASK) | MIPS_SR_INT_IE);
-> > +	if (cause & MIPS_HARD_INT_MASK) 
-> > +		mips_spurint_evcnt.ev_count++;
-> >  }
-> 
-> I think you'll count masked interrupts as spurious interrupts here -
-> you probably want to check (cause & status & MIPS_HARD_INT_MASK).

Right you are...

-> In any case, I'm not sure that's going to be useful by itself - if
-> there are any unmasked interrupts not handled in ip22_intr(), then as
-> soon as the interrupt handler returns, the unhandled interrupt will
-> trigger again, causing the CPU to hang. What you really need to do is
-> to mask them off in the status register (unfortunately, it's a little
-> tricky to do this). Maybe a #ifdef DIAGNOSTIC panic() would be a
-> little more useful here.

After I corrected the accounting up above, I still see an occasional
spurious interrupt; I'm guessing the zs driver may cause an occasional
spurious interrupt due to the way it handles interrupts (process events
until there are no more left), which won't cause the appropriate cause
bit to be cleared.

Unless the cause bits always get cleared for INT0/INT1, I don't think 
a panic() (#ifdef DIAGNOSTIC or not) makes sense here, altough given
what you correctly point out above, it may make more sense to catch
the "unhandled interrupt" case vs. the "spurious interrupt" case.
OTOH, seeing as there *shouldn't* be anything wired to the other
cpu interrupt pins, masking them off in the SR *is* probably the
right thing to do, even if trickier.

-> Can this be simplified to just
-> 
->   if ((ipending & MIPS_SOFT_INT_MASK_1)
-> 	|| (ssir && (status & MIPS_SOFT_INT_MASK_1))) {
->     _splset(MIPS_SR_INT_IE | (status & ~ipending & MIPS_HARD_INT_MASK));
->     clrsoftintr(MIPS_SOFT_INT_MASK_1);
->     softintr_dispatch();
->   }
-> 
-> Each time cpu_intr() is recursively called, the status register will
-> cumulatively mask the pending interrupts, so it'll already mask out
-> whatever is contained in servicing_mask.

It sure looks like it can; I've gone ahead and made the change and ran
my tests overnight and everything still looks good... I'm trying to think
why I did the mask bit originally, but now I can't think of it (mmm, need
coffee 8-).

--rafal

----
Rafal Boni                                                     rafal@attbi.com
  We are all worms.  But I do believe I am a glowworm.  -- Winston Churchill