Subject: Re: mutex fault
To: Andrew Doran <ad@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 12/09/2007 15:05:23
On Sun, Dec 09, 2007 at 11:24:59AM +0100, Manuel Bouyer wrote:
> On Sat, Dec 08, 2007 at 11:45:24PM +0000, Andrew Doran wrote:
> > > OK, it looks like it's because it can call softint_schedule() even if
> > > ci_ilevel is >= IPL_HIGH. I'll try to come up with a solution based on
> > > atomic ops only.
> > 
> > Ah, I forgot to reply - sorry. softint_schedule() is fine from any IPL and
> > at any point in the kernel.
> 
> I guess by "any IPL" you mean "at or below IPL_HIGH". The way the xenevt
> driver works means it interrupts above any IPL (only cli() can block its
> interrupts, really). Looking at softint_schedule(), I suspect this can corrupt
> the si_q queue, at least. I can't see how it can cause the lwp_lock/unlock
> problem though.

If the problem is from softint_schedule(), the attached patch avoids calling
it, and should be safe even when the current IPL is IPL_HIGH.
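
For illustration only (this is not the attached patch): a minimal user-space
C11 sketch of the general idea of recording an event with a single atomic
operation instead of touching a queue from the high-priority path. The names
pending_events, hi_intr_record() and lo_drain() are hypothetical and do not
exist in the NetBSD tree; the real driver would use the kernel's own atomic
primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/*
 * Hypothetical sketch, not kernel code.
 * Bitmask of event ports that fired and have not been processed yet.
 * Static storage, so it starts out as 0.
 */
static atomic_uint pending_events;

/*
 * Would be called from the interrupt path that can run above IPL_HIGH.
 * It performs one atomic read-modify-write and never walks or modifies
 * a list, so it cannot corrupt a queue that it happened to interrupt.
 */
static void
hi_intr_record(unsigned port)
{
	atomic_fetch_or(&pending_events, 1u << port);
}

/*
 * Would be called later from a context running at an ordinary IPL.
 * Atomically claims every pending bit and resets the mask to 0, so an
 * event recorded concurrently is either returned now or left for the
 * next call, never lost.
 */
static unsigned
lo_drain(void)
{
	return atomic_exchange(&pending_events, 0u);
}
```

The point of the design is that the high-priority side only ever sets bits,
and the low-priority side takes them all in one exchange, so no lock and no
linked-list update is needed on the path that cannot be blocked.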

Kazushi, can you try it too? I see the panic only about once every 2 days on
my systems...

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference