Subject: deadlock with sched_lock in SA code
To: None <tech-kern@netbsd.org>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 08/28/2005 08:17:49

hi folks,

while investigating another problem, I ended up looking at this crash dump:

panic: kernel diagnostic assertion "_simple_lock_held((&sched_lock)) == 0" failed: file "/usr/src/sys/kern/kern_synch.c", line 680


(gdb) target kcore netbsd.4.core
panic: kernel %sassertion "%s" failed: file "%s", line %d
#0  0x1fe74000 in ?? ()
(gdb) bt
#0  0x1fe74000 in ?? ()
#1  0xc034f686 in cpu_reboot (howto=256, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:752
#2  0xc02b918c in panic (
    fmt=0xc04c58a0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d")
    at /usr/src/sys/kern/subr_prf.c:253
#3  0xc040c138 in __assert (t=0xc0451adf "diagnostic ", 
    f=0xc0499ae0 "/usr/src/sys/kern/kern_synch.c", l=680, 
    e=0xc0497560 "_simple_lock_held((&sched_lock)) == 0")
    at /usr/src/sys/lib/libkern/__assert.c:45
#4  0xc02aa6c7 in wakeup (ident=0xc051437c) at x86/intr.h:160
#5  0xc0341503 in uvm_pagealloc_strat (obj=<incomplete type>, off=0, 
    anon=<incomplete type>, flags=1, strat=0, free_list=0)
    at /usr/src/sys/uvm/uvm_page.c:1072
#6  0xc03356c5 in uvm_km_alloc_poolpage_cache (map=0xc04f8840, waitok=0)
    at /usr/src/sys/uvm/uvm_km.c:683
#7  0xc02b8b7b in pool_allocator_alloc (org=0xc05113e0, flags=0)
    at /usr/src/sys/kern/subr_pool.c:2185
#8  0xc02b6ead in pool_get (pp=0xc05113e0, flags=0)
    at /usr/src/sys/kern/subr_pool.c:899
#9  0xc02a013c in sadata_upcall_alloc (waitok=0)
    at /usr/src/sys/kern/kern_sa.c:114
#10 0xc02a10f0 in sa_switch (l=0xcc4c0218, type=2)
    at /usr/src/sys/kern/kern_sa.c:940
#11 0xc02aa30c in ltsleep (ident=0xc0512888, priority=280, 
    wmesg=0xc0471b29 "select", timo=201, interlock=0x0)
    at /usr/src/sys/kern/kern_synch.c:493
#12 0xc02bda1e in selcommon (l=0xcc4c0218, retval=0xcc4f3f5c, nd=0, u_in=0x0, 
    u_ou=0x0, u_ex=0x0, tv=0xcc4f3f2c, mask=0x0)
    at /usr/src/sys/kern/sys_generic.c:788
#13 0xc02bd773 in sys_select (l=0xcc4c0218, v=0xcc4f3f64, retval=0xcc4f3f5c)
    at /usr/src/sys/kern/sys_generic.c:713
#14 0xc03583d3 in syscall_plain (frame=0xcc4f3fa8)
    at /usr/src/sys/arch/i386/i386/syscall.c:160


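allocating pages from UVM can call wakeup(), which asserts that
sched_lock is not held, so we must avoid allocating while holding
sched_lock.  in the trace above, ltsleep() calls sa_switch() with
sched_lock held, and sa_switch() then allocates via
sadata_upcall_alloc().  one way to fix this would be to call
sadata_upcall_alloc() before acquiring sched_lock and pass the
resulting pointer to sa_switch(), instead of calling
sadata_upcall_alloc() in sa_switch() itself.  does anyone have any
better suggestions?  if not, I'll fix it that way.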
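
a rough userland sketch of that reordering, to make the idea concrete.
this is not kernel code: every model_* name, the struct, and the integer
flag standing in for sched_lock are made up for illustration, and the
real change to ltsleep()/sa_switch() may well look different.

```c
/* Userland model of "allocate before taking the lock"; not NetBSD code. */
#include <assert.h>
#include <stdlib.h>

static int sched_lock_held;	/* hypothetical stand-in for sched_lock */
static int wakeup_calls;	/* counts calls to the modelled wakeup() */

struct model_upcall {		/* hypothetical stand-in for the upcall data */
	int type;
};

/* Models the assertion that fired: the caller must not hold the lock. */
static void
model_wakeup(void)
{
	assert(sched_lock_held == 0);
	wakeup_calls++;
}

/* Models sadata_upcall_alloc(): allocating may end up calling wakeup(). */
static struct model_upcall *
model_upcall_alloc(void)
{
	model_wakeup();
	return calloc(1, sizeof(struct model_upcall));
}

/* Models sa_switch(): runs with the lock held and no longer allocates. */
static int
model_sa_switch(struct model_upcall *u, int type)
{
	assert(sched_lock_held == 1);
	if (u == NULL)
		return -1;	/* allocation failed earlier; nothing to queue */
	u->type = type;
	return 0;
}

/* Models the caller: allocate first, then lock, then switch. */
static int
model_ltsleep(int type)
{
	struct model_upcall *u = model_upcall_alloc();	/* lock not held yet */
	int error;

	sched_lock_held = 1;
	error = model_sa_switch(u, type);
	sched_lock_held = 0;

	free(u);	/* the model just discards it; free(NULL) is harmless */
	return error;
}
```

calling model_ltsleep(2) runs the allocation, and therefore the modelled
wakeup(), before the lock flag is set, so the assertion in model_wakeup()
cannot fire.  the cost of this approach is that the upcall structure is
allocated even on paths where sa_switch() ends up not needing it.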

-Chuck