Subject: Re: mutex fault
To: Andrew Doran <ad@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 12/09/2007 15:06:05
--C7zPtVaVf+AK4Oqc
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Dec 09, 2007 at 03:05:22PM +0100, Manuel Bouyer wrote:
> On Sun, Dec 09, 2007 at 11:24:59AM +0100, Manuel Bouyer wrote:
> > On Sat, Dec 08, 2007 at 11:45:24PM +0000, Andrew Doran wrote:
> > > > OK, it looks like it's because it can call softint_schedule() even if
> > > > ci_ilevel is >= IPL_HIGH. I'll try to come up with a solution based on
> > > > atomic ops only.
> > > 
> > > Ah, I forgot to reply - sorry. softint_schedule() is fine from any IPL and
> > > at any point in the kernel.
> > 
> > I guess by "any IPL" you mean "at or below IPL_HIGH". The way the xenevt
> > driver works means it interrupts above any IPL (only cli() can block its
> > interrutps, really). Looking at softint_schedule(), I suspect this can corrupt
> > the si_q queue, at last. I can't see how it can cause the lwp_lock/unlock
> > problem though.
> 
> If the problem is from softint_schedule(), the attached patch avoids calling
> it, and should be safe even when the current IPL is IPL_HIGH.
> 
> Kazushi, can you try it too ? I see the panic only once in 2 days on my
> systems ...

here is the patch

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

--C7zPtVaVf+AK4Oqc
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff

Index: xen/xenevt.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/xen/xenevt.c,v
retrieving revision 1.16
diff -u -r1.16 xenevt.c
--- xen/xenevt.c	22 Nov 2007 16:17:10 -0000	1.16
+++ xen/xenevt.c	9 Dec 2007 14:02:18 -0000
@@ -55,6 +55,8 @@
 
 extern struct evcnt softxenevt_evtcnt;
 
+int softxenevt_setipending =  0;
+
 /*
  * Interface between the event channel and userland.
  * Each process with a xenevt device instance open can regiter events it
@@ -164,7 +166,13 @@
 			STAILQ_INSERT_TAIL(&devevent_pending, d, pendingq);
 			simple_unlock(&devevent_pending_lock);
 			d->pending = true;
-			softintr(SIR_XENEVT);
+			/* We can't call softintr() here because we may
+			 * already be at IPL_HIGH. There will be a splx()
+			 * anyway, so just set this IPL as pending, it'll be
+			 * enough to get the soft intr processed
+			 */
+			hypervisor_set_ipending(1 << IPL_SOFTXENEVT, 0, 0);
+			softxenevt_setipending++;
 		}
 	}
 }

--C7zPtVaVf+AK4Oqc--