tech-kern archive
Re: Kernel panic in "subr_xcall.c"
On Mon, Oct 19, 2009 at 07:28:26AM +0100, Matthias Scheler wrote:
> Here is the stack trace of the panic (copied off the console):
>
> xc_lowpri
> pool_cache_invalidate
> pmap_growkernel
>
> I wonder whether it is related to this change:
>
> http://mail-index.netbsd.org/source-changes/2009/10/15/msg001938.html
Ok, I think I found the problem:
1.) pool_cache_invalidate() calls xc_broadcast() with ci = NULL.
2.) xc_broadcast() calls xc_lowpri() with ci = NULL.
3.) xc_lowpri() iterates over all CPUs but doesn't find any
running CPU and therefore doesn't schedule any cross calls.
4.) The KASSERT() at the end of the loop in xc_lowpri() triggers
because "xc_tailp" and "xc_headp" are both zero.
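The four steps above can be reproduced in a small user-space model. This is only a sketch, not the kernel code: "struct model_cpu", "model_headp", "model_tailp" and "model_lowpri()" are hypothetical stand-ins, and it assumes the loop bumps the head counter once per running CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpu_info; only two flags are modelled. */
struct model_cpu {
	bool running;		/* models the CPU being marked as running */
	bool xcall_pending;	/* models ci_data.cpu_xcall_pending */
};

/* Models of the two counters compared by the KASSERT(). */
static unsigned long model_headp, model_tailp;

/*
 * Model of the scheduling loop: for every running CPU, advance the head
 * counter and mark a cross call as pending.  Returns true if the original
 * strict check (tail < head) would hold afterwards.
 */
static bool
model_lowpri(struct model_cpu *cpus, size_t ncpus)
{
	assert(cpus != NULL);

	for (size_t i = 0; i < ncpus; i++) {
		if (!cpus[i].running)
			continue;	/* skipped: nothing gets scheduled */
		model_headp++;
		cpus[i].xcall_pending = true;
	}
	return model_tailp < model_headp;
}
```

With no CPU marked as running, the loop body never executes, both counters keep their initial value and the strict comparison is false, which is the condition the KASSERT() reports.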
The following patch avoids the problem:
Index: subr_xcall.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_xcall.c,v
retrieving revision 1.10
diff -u -r1.10 subr_xcall.c
--- subr_xcall.c 5 Mar 2009 13:18:51 -0000 1.10
+++ subr_xcall.c 19 Oct 2009 08:02:54 -0000
@@ -196,7 +196,7 @@
 		ci->ci_data.cpu_xcall_pending = true;
 		cv_signal(&ci->ci_data.cpu_xcall);
 	}
-	KASSERT(xc_tailp < xc_headp);
+	KASSERT(xc_tailp <= xc_headp);
 	where = xc_headp;
 	mutex_exit(&xc_lock);
But I'm not convinced it is the right fix. It only silences the
assertion: the underlying problem remains, because no cross call has
been issued and the supplied function hasn't been called at all.
What is the correct fix? Should pool_cache_invalidate() check whether
the current CPU is running and not use xc_broadcast() if it isn't?
It currently already checks for the number of CPUs ...
	if (ncpu < 2) {
		pool_cache_xcall(pc);
	} else {
		where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
		xc_wait(where);
	}
... but that is apparently not good enough.
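To make the question concrete, here is a user-space sketch of the alternative: fall back to a direct call when no CPU could service a cross call. It is an assumption-laden model, not a proposed patch. All "sketch_*" names are hypothetical, the per-CPU "running" flag stands in for whatever test the kernel would really use, and the broadcast is modelled as a plain loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state. */
struct sketch_cpu {
	bool running;
};

typedef void (*sketch_func_t)(void *);

/* Example callback: counts how often it was invoked. */
static void
sketch_count_call(void *arg)
{
	++*(unsigned *)arg;
}

/* Count the CPUs that could service a cross call. */
static size_t
sketch_running_cpus(const struct sketch_cpu *cpus, size_t ncpus)
{
	size_t n = 0;

	for (size_t i = 0; i < ncpus; i++)
		if (cpus[i].running)
			n++;
	return n;
}

/*
 * Call func(arg) directly when there is only one CPU or none is running;
 * otherwise call it once per running CPU (models the broadcast).
 * Returns the number of invocations, which is never zero.
 */
static size_t
sketch_invalidate(const struct sketch_cpu *cpus, size_t ncpus,
    sketch_func_t func, void *arg)
{
	size_t running;

	assert(func != NULL);
	running = sketch_running_cpus(cpus, ncpus);
	if (ncpus < 2 || running == 0) {
		func(arg);	/* direct call instead of a broadcast */
		return 1;
	}
	for (size_t i = 0; i < ncpus; i++)
		if (cpus[i].running)
			func(arg);
	return running;
}
```

The point of the extra "running == 0" test is that the supplied function always runs at least once, which is exactly what the relaxed KASSERT() alone does not guarantee.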
Kind regards
--
Matthias Scheler http://zhadum.org.uk/