tech-kern archive


Re: Kernel panic in "subr_xcall.c"

On Mon, Oct 19, 2009 at 07:28:26AM +0100, Matthias Scheler wrote:
> Here is the stack trace of the panic (copied off the console):
> xc_lowpri
> pool_cache_invalidate
> pmap_growkernel
> I wonder whether it is related to this change:

Ok, I think I found the problem:

1.) pool_cache_invalidate() calls xc_broadcast() with ci = NULL.
2.) xc_broadcast() calls xc_lowpri() with ci = NULL.
3.) xc_lowpri() iterates over all CPUs but doesn't find any
    running CPU and therefore doesn't schedule any cross calls.
4.) The KASSERT() at the end of the loop in xc_lowpri() triggers
    because "xc_tailp" and "xc_headp" are both zero.
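
The four steps above can be reproduced with a small user-space model.
This is only a sketch: the names model_cpu, model_headp, model_tailp and
model_xc_lowpri() are hypothetical stand-ins, not the code in
subr_xcall.c. It assumes that the loop queues one cross call per running
CPU and that the assertion compares the two counters afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU state the loop looks at. */
struct model_cpu {
	bool running;        /* models "this CPU is running" */
	bool xcall_pending;  /* models ci_data.cpu_xcall_pending */
};

static uint64_t model_headp;  /* models xc_headp */
static uint64_t model_tailp;  /* models xc_tailp */

/*
 * Model of the scheduling loop. With target == NULL (the broadcast
 * case) one cross call is queued for every running CPU. Returns
 * whether the original assertion, tailp < headp, would hold afterwards.
 */
static bool
model_xc_lowpri(struct model_cpu *cpus, size_t ncpu,
    const struct model_cpu *target)
{
	assert(cpus != NULL || ncpu == 0);

	for (size_t i = 0; i < ncpu; i++) {
		if (!cpus[i].running)
			continue;	/* skipped: nothing is queued */
		if (target != NULL && target != &cpus[i])
			continue;
		cpus[i].xcall_pending = true;
		model_headp++;
	}
	return model_tailp < model_headp;
}
```

With two CPUs that are both marked as not running and both counters at
zero, model_xc_lowpri(cpus, 2, NULL) returns false, which corresponds to
the KASSERT() firing. Once one CPU is marked as running it returns true.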

The following patch avoids the problem:

Index: subr_xcall.c
RCS file: /cvsroot/src/sys/kern/subr_xcall.c,v
retrieving revision 1.10
diff -u -r1.10 subr_xcall.c
--- subr_xcall.c        5 Mar 2009 13:18:51 -0000       1.10
+++ subr_xcall.c        19 Oct 2009 08:02:54 -0000
@@ -196,7 +196,7 @@
                ci->ci_data.cpu_xcall_pending = true;
-       KASSERT(xc_tailp < xc_headp);
+       KASSERT(xc_tailp <= xc_headp);
        where = xc_headp;

But I'm not convinced it is the right thing. The underlying problem
remains: no cross call has been issued, so the supplied function
hasn't been called at all.

What is the correct fix? Should pool_cache_invalidate() check whether
the current CPU is running and not use xc_broadcast() if it isn't?
It currently already checks for the number of CPUs ...

        if (ncpu < 2) {
        } else {
                where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);

... but that is apparently not good enough.
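
The alternative asked about above could look roughly like the following
user-space sketch. Everything in it is hypothetical (model_func_t,
model_cache_invalidate(), the nrunning parameter); it only illustrates
the idea of falling back to a direct local call when no CPU is marked as
running, and it models the broadcast as one call per running CPU. It is
not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*model_func_t)(void *);

enum model_path { MODEL_PATH_LOCAL, MODEL_PATH_BROADCAST };

/*
 * Hypothetical guard: take the direct path not only when there are
 * fewer than two CPUs, but also when none of them is running yet.
 */
static enum model_path
model_cache_invalidate(size_t ncpu, size_t nrunning, model_func_t func,
    void *arg)
{
	assert(func != NULL);

	if (ncpu < 2 || nrunning == 0) {
		func(arg);	/* call directly on the current CPU */
		return MODEL_PATH_LOCAL;
	}
	/* Stand-in for the broadcast: one call per running CPU. */
	for (size_t i = 0; i < nrunning; i++)
		func(arg);
	return MODEL_PATH_BROADCAST;
}

/* Helper for trying the sketch: counts how often it was called. */
static void
model_count_call(void *arg)
{
	(*(int *)arg)++;
}
```

In this model the function is always called at least once, even with
ncpu >= 2 and no running CPU, which is the property the relaxed
KASSERT() does not provide.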

        Kind regards

Matthias Scheler                        
