Subject: Re: kernel stack overflow due to deep interrupt nesting
To: None <port-mips@netbsd.org>
From: Toru Nishimura <locore32@gaea.ocn.ne.jp>
List: port-mips
Date: 04/06/2002 15:43:36
Rafal Boni <rafal@attbi.com> dug this out:

> I finally tracked down the problem to a kernel stack overflow due to
> too deep interrupt nesting... Here's a backtrace (the panic is a
> check I added to make tracking this down a bit easier):
>
> (ip22_intr() is observed being called in a deeply recursive fashion.)
>
> panic: cpu_intr: max_intr_depth too high: 16
>
> The problem is (and this can probably also happen on any other
> MIPS port that uses a platform-specific IO interrupt handler
> since many do the same thing) that interrupts are generally
> turned on in the platform-specific IO interrupt handler, which
> can cause it to be interrupted to service new interrupts, etc.
> etc.

The stack trace clearly shows that ip22_intr() ends up in "tail-recursion,"
re-entering itself each time a new interrupt arrives.  This must not
happen; it is a direct consequence of a programming error in
ip22_intr().  Increasing the kernel stack depth is therefore not an
option.

cpu_intr() should consume all pending interrupts in an ordered and
organized way, so as to implement the "spl" framework on which NetBSD
is built.
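
To make the point concrete, here is a minimal sketch written as a
user-space simulation, not as real locore or ip22 code.  Every name in
it (sim_cpu_intr, sim_handler, NLEVELS and so on) is hypothetical.  It
assumes a dispatcher that masks the level being serviced and everything
below it before calling the handler, so that re-enabling interrupts
inside the handler can only nest into a strictly higher level.

```c
#include <assert.h>
#include <stdint.h>

#define NLEVELS 8                 /* hypothetical number of priority levels */

uint32_t sim_pending;             /* bit n set: level n is asserted */
uint32_t sim_enabled;             /* bit n set: level n may be taken */
int sim_depth, sim_max_depth, sim_serviced;
int sim_storm;                    /* how many times a device re-asserts itself */
int sim_escalate;                 /* if set, each handler raises the next level */

void sim_cpu_intr(void);

void sim_reset(void)
{
    sim_pending = 0;
    sim_enabled = (1u << NLEVELS) - 1u;
    sim_depth = sim_max_depth = sim_serviced = 0;
    sim_storm = sim_escalate = 0;
}

void sim_raise(int lvl)
{
    sim_pending |= 1u << lvl;
}

/* Hypothetical device handler: new interrupts arrive while it runs. */
void sim_handler(int lvl)
{
    if (sim_storm > 0) {          /* same line fires again during service */
        sim_storm--;
        sim_raise(lvl);
    }
    if (sim_escalate && lvl + 1 < NLEVELS)
        sim_raise(lvl + 1);       /* a higher-priority line fires */
    sim_cpu_intr();               /* models "interrupts re-enabled in handler" */
}

/* Service everything pending, highest level first, with bounded nesting. */
void sim_cpu_intr(void)
{
    uint32_t ready;

    while ((ready = sim_pending & sim_enabled) != 0) {
        int lvl = NLEVELS - 1;
        while ((ready & (1u << lvl)) == 0)
            lvl--;

        uint32_t saved = sim_enabled;
        sim_enabled &= ~((2u << lvl) - 1u);  /* block lvl and all below */
        sim_pending &= ~(1u << lvl);

        sim_depth++;
        if (sim_depth > sim_max_depth)
            sim_max_depth = sim_depth;
        assert(sim_depth <= NLEVELS);        /* nesting cannot exceed levels */

        sim_handler(lvl);
        sim_serviced++;

        sim_depth--;
        sim_enabled = saved;                 /* lower levels become visible again */
    }
}
```

With sim_reset(), sim_storm = 100, sim_raise(0) and then
sim_cpu_intr(), the line is serviced 101 times while the nesting depth
stays at 1: the repeated interrupt is picked up by the loop after the
handler returns instead of being stacked on top of it.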

On the other hand, I admit that the current practice found in
locore_XXX is not optimally designed, and there is room to provide a
neatly standardized definition of how the processor's IM/IE/EXL bits
are set up when cpu_intr() is called.  Note that the significant
design differences between the R3000 and the R4000 derivatives must be
taken into account.
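
As an illustration of that difference, the sketch below models the low
bits of the Status register on the two families.  The macro and
function names are hypothetical and are not the ones used in the NetBSD
headers; only the bit positions follow the architecture.  On the R3000,
exception entry pushes a three-deep KU/IE stack and clears IEc; on the
R4000, entry sets EXL and leaves IE alone.  IM is untouched in both
cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions per the R3000/R4000 Status register. */
#define SR_IM_SHIFT     8             /* IM[7:0] sit in bits 15:8 on both */
#define SR_IM_MASK      0x0000ff00u

#define SR3K_IEC        0x00000001u   /* R3000: current interrupt enable */
#define SR3K_KUIE_STACK 0x0000003fu   /* R3000: KUo/IEo, KUp/IEp, KUc/IEc */

#define SR4K_IE         0x00000001u   /* R4000: global interrupt enable */
#define SR4K_EXL        0x00000002u   /* R4000: exception level */
#define SR4K_ERL        0x00000004u   /* R4000: error level */

/* Can interrupt line 0..7 be taken with this Status value? */
bool r3k_intr_open(uint32_t sr, unsigned line)
{
    return (sr & SR3K_IEC) != 0 &&
           (sr & (1u << (SR_IM_SHIFT + line))) != 0;
}

bool r4k_intr_open(uint32_t sr, unsigned line)
{
    return (sr & SR4K_IE) != 0 &&
           (sr & (SR4K_EXL | SR4K_ERL)) == 0 &&
           (sr & (1u << (SR_IM_SHIFT + line))) != 0;
}

/* What the hardware does to Status on exception entry. */
uint32_t r3k_exception_entry(uint32_t sr)
{
    /* previous -> old, current -> previous, current = kernel, IE off */
    return (sr & ~SR3K_KUIE_STACK) | ((sr << 2) & 0x3cu);
}

uint32_t r4k_exception_entry(uint32_t sr)
{
    return sr | SR4K_EXL;             /* IE itself is left unchanged */
}
```

So a handler that wants to reopen higher-priority interrupts has to do
different things on each family: set IEc again on the R3000, clear EXL
on the R4000, in both cases after narrowing IM to the lines that are
allowed to nest.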

Toru Nishimura/ALKYL Technology