Subject: Re: kern/25285: i386 MP panic: TLB IPI rendezvous failed (mask 1)
To: None <M.Drochner@fz-juelich.de>
From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
List: current-users
Date: 06/04/2004 22:49:01
hi,

> dokas@cs.umn.edu said:
> > Anyone know why this is happening?
>
> IPIs can get lost apparently.
> I don't fully understand how this can happen, but changing
> the code to be more conservative helped on my dual-Opteron.

because spllower() doesn't check ipending and update ilevel atomically,
interrupt priority inversion, which is a serious problem for ipis,
can happen.
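
to illustrate the window, here is a rough user-space model, not kernel code. every name in it (model_cpu, model_raise, model_lower_racy, ...) is made up for the illustration, and the order of the check and the update in the racy variant is an assumption chosen to make the window easy to see; it is not a copy of the pre-patch spllower(). the model also simply re-enables interrupts where the real code would restore the saved psl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one CPU; not the kernel's struct cpu_info. */
struct model_cpu {
	uint32_t ipending;	/* bit n set: level-n interrupt deferred */
	uint32_t latched;	/* raised while interrupts were disabled */
	int ilevel;		/* levels <= ilevel are blocked */
	int intr_enabled;	/* models the interrupt-enable flag */
	int handled;		/* number of handlers that actually ran */
};

/* Bits for every level strictly above nlevel (a stand-in for IUNMASK). */
static uint32_t
model_unmask(int nlevel)
{
	assert(nlevel >= 0 && nlevel < 31);
	return ~((UINT32_C(1) << (nlevel + 1)) - 1);
}

/* An interrupt at `level` arrives. */
static void
model_raise(struct model_cpu *c, int level)
{
	assert(level >= 0 && level < 32);
	if (!c->intr_enabled)
		c->latched |= UINT32_C(1) << level;	/* held by hardware */
	else if (level <= c->ilevel)
		c->ipending |= UINT32_C(1) << level;	/* deferred by software */
	else
		c->handled++;				/* handler runs now */
}

/* Re-enable interrupts and deliver whatever was latched meanwhile. */
static void
model_enable(struct model_cpu *c)
{
	uint32_t l = c->latched;
	int level;

	c->latched = 0;
	c->intr_enabled = 1;
	for (level = 0; level < 32; level++)
		if (l & (UINT32_C(1) << level))
			model_raise(c, level);
}

/* Stand-in for Xspllower(): lower the level, run deferred handlers, enable. */
static void
model_run_pending(struct model_cpu *c, int nlevel)
{
	uint32_t run = c->ipending & model_unmask(nlevel);

	c->ilevel = nlevel;
	c->ipending &= ~run;
	for (; run != 0; run &= run - 1)
		c->handled++;
	model_enable(c);
}

/* Check and update as two separate steps; `late` (or -1) arrives in between. */
static void
model_lower_racy(struct model_cpu *c, int nlevel, int late)
{
	if (c->ipending & model_unmask(nlevel)) {
		model_run_pending(c, nlevel);
		return;
	}
	if (late >= 0)
		model_raise(c, late);	/* deferred: ilevel is still high */
	c->ilevel = nlevel;		/* nobody looks at ipending again */
}

/* Same steps with interrupts disabled around them. */
static void
model_lower_atomic(struct model_cpu *c, int nlevel, int late)
{
	c->intr_enabled = 0;
	if (c->ipending & model_unmask(nlevel)) {
		model_run_pending(c, nlevel);
		return;
	}
	if (late >= 0)
		model_raise(c, late);	/* latched, not deferred */
	c->ilevel = nlevel;
	model_enable(c);		/* latched interrupt is handled here */
}
```

starting from ilevel 5 with a level-3 interrupt arriving in the window, model_lower_racy(&c, 0, 3) ends with bit 3 still set in ipending and handled == 0, so the interrupt sits there although level 0 no longer blocks it. model_lower_atomic(&c, 0, 3) ends with ipending == 0 and handled == 1.
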
the following is a simple fix, although i'm not sure it's the best one.

YAMAMOTO Takashi

Index: arch/x86/include/intr.h
===================================================================
--- arch/x86/include/intr.h (revision 599)
+++ arch/x86/include/intr.h (working copy)
@@ -158,16 +158,21 @@ static __inline void
 spllower(int nlevel)
 {
 	struct cpu_info *ci = curcpu();
+	u_int32_t imask;
+	u_long psl;
 
 	__splbarrier();
-	/*
-	 * Since this should only lower the interrupt level,
-	 * the XOR below should only show interrupts that
-	 * are being unmasked.
-	 */
-	ci->ci_ilevel = nlevel;
-	if (ci->ci_ipending & IUNMASK(ci,nlevel))
+
+	imask = IUNMASK(ci, nlevel);
+	psl = read_psl();
+	disable_intr();
+	if (ci->ci_ipending & imask) {
 		Xspllower(nlevel);
+		/* Xspllower does enable_intr() */
+	} else {
+		ci->ci_ilevel = nlevel;
+		write_psl(psl);
+	}
 }
 
 /*