tech-kern archive


Re: Kernel panic in "subr_xcall.c"



On Oct 19, 2009, at 1:23 AM, Matthias Scheler wrote:

> On Mon, Oct 19, 2009 at 09:05:43AM +0100, Matthias Scheler wrote:
>> Ok, I think I found the problem:
>> 
>> 1.) pool_cache_invalidate() calls xc_broadcast() with ci = NULL.
>> 2.) xc_broadcast() calls xc_lowpri() with ci = NULL.
>> 3.) xc_lowpri() iterates over all CPUs but doesn't find any
>>    running CPU and therefore doesn't schedule any cross calls.
>> 4.) The KASSERT() at the end of the loop in xc_lowpri() triggers
>>    because "xc_tailp" and "xc_headp" are both zero.
>> 
>> The following patch avoids the problem:
> 
> Here is a slightly better patch:

I like this patch, and considered this approach myself before Jean-Yves checked 
in the other one to subr_pool.c.  It has the nice effect of not requiring every 
consumer of cross-calls to be aware of the bootstrap issue.

> 
> Index: subr_xcall.c
> ===================================================================
> RCS file: /cvsroot/src/sys/kern/subr_xcall.c,v
> retrieving revision 1.10
> diff -u -r1.10 subr_xcall.c
> --- subr_xcall.c      5 Mar 2009 13:18:51 -0000       1.10
> +++ subr_xcall.c      19 Oct 2009 08:19:14 -0000
> @@ -172,9 +172,11 @@
> xc_lowpri(u_int flags, xcfunc_t func, void *arg1, void *arg2,
>         struct cpu_info *ci)
> {
> -     CPU_INFO_ITERATOR cii;
> +     bool call_direct;
>       uint64_t where;
> 
> +     call_direct = false;
> +
>       mutex_enter(&xc_lock);
>       while (xc_headp != xc_tailp)
>               cv_wait(&xc_busy, &xc_lock);
> @@ -182,10 +184,15 @@
>       xc_arg2 = arg2;
>       xc_func = func;
>       if (ci == NULL) {
> +             CPU_INFO_ITERATOR cii;
> +
>               xc_broadcast_ev.ev_count++;
>               for (CPU_INFO_FOREACH(cii, ci)) {
> -                     if ((ci->ci_schedstate.spc_flags & SPCF_RUNNING) == 0)
> +                     if (!(ci->ci_schedstate.spc_flags & SPCF_RUNNING)) {
> +                             if (curcpu() == ci)
> +                                     call_direct = true;
>                               continue;
> +                     }
>                       xc_headp += 1;
>                       ci->ci_data.cpu_xcall_pending = true;
>                       cv_signal(&ci->ci_data.cpu_xcall);
> @@ -196,10 +203,13 @@
>               ci->ci_data.cpu_xcall_pending = true;
>               cv_signal(&ci->ci_data.cpu_xcall);
>       }
> -     KASSERT(xc_tailp < xc_headp);
> +     KASSERT(xc_tailp < xc_headp || call_direct);
>       where = xc_headp;
>       mutex_exit(&xc_lock);
> 
> +     if (call_direct)
> +             (*func)(arg1, arg2);
> +
>       return where;
> }
> 
> That is of course assuming that it is a bug in the cross call subsystem.
> 
>       Kind regards
> 
> -- 
> Matthias Scheler                                  http://zhadum.org.uk/

-- thorpej


