tech-kern archive


Re: Need help looking at kernel dump for "/netbsd: panic: lock error"



On Wed, 9 Jan 2013, David Holland wrote:

On Wed, Jan 09, 2013 at 10:15:39AM -0800, Hisashi T Fujinaka wrote:
> NetBSD documentation just kind of pointed me at figuring out which
> processes are running and looking at the backtrace. Well, the backtrace
> just looks to me like it was printing out the error. I know I should
> look at other frames, but I'm not sure what I should be looking at.

> #8  0xffffffff803faab7 in trap (frame=0xfffffe8038bc2840) at /usr/src/sys/arch/amd64/amd64/trap.c:568

...execution entered the kernel because of a processor exception.
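
For reference, the path from the trap handler into UVM looks roughly
like this (a simplified sketch of the amd64 T_PAGEFLT case, not the
literal source):

    /*
     * Sketch: trap() works out which map the faulting address
     * belongs to and hands the fault to UVM.
     */
    case T_PAGEFLT: {
        vaddr_t va = trunc_page(rcr2());        /* faulting address (%cr2) */
        vm_prot_t ftype = (frame->tf_err & PGEX_W) ?
            VM_PROT_WRITE : VM_PROT_READ;
        struct vm_map *map = (va >= VM_MIN_KERNEL_ADDRESS) ?
            kernel_map : &curlwp->l_proc->p_vmspace->vm_map;

        error = uvm_fault(map, va, ftype);      /* -> frame #7 */
        ...
    }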

> #7  uvm_fault_internal (orig_map=<optimized out>, vaddr=<optimized out>, access_type=<optimized out>, fault_flag=<optimized out>) at /usr/src/sys/uvm/uvm_fault.c:877

...this is the entry point in the VM system for handling page faults.
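
Roughly speaking (a simplified sketch using the field names from
uvm_fault.h, not the exact code), uvm_fault_internal() describes the
fault in a struct uvm_faultinfo and then calls uvm_fault_check(),
which in turn calls uvmfault_lookup() to find the map entry covering
the faulting address:

    struct uvm_faultinfo ufi;

    ufi.orig_map = orig_map;                /* map the fault is against */
    ufi.orig_rvaddr = trunc_page(vaddr);    /* page-aligned fault address */
    ufi.orig_size = PAGE_SIZE;
    ...
    error = uvm_fault_check(...);           /* frame #6, which does the lookup */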

> #6  uvm_fault_check (maxprot=false, ranons=<optimized out>, flt=0xfffffe8038bc26e0, ufi=0xfffffe8038bc26a0) at /usr/src/sys/uvm/uvm_fault.c:957
> #5  0xffffffff8043a933 in uvmfault_lookup (write_lock=false, ufi=0xfffffe8038bc26a0) at /usr/src/sys/uvm/uvm_fault_i.h:126

UVM goes to process the fault...
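
Frame #5 is this part of uvmfault_lookup() (a simplified sketch of
uvm_fault_i.h, not the literal code):

    /*
     * uvmfault_lookup: look the fault address up in the map.
     * It has to lock the map before it can walk it.
     */
    ufi->map = ufi->orig_map;
    ...
    if (write_lock)
        vm_map_lock(ufi->map);
    else
        vm_map_lock_read(ufi->map);     /* <- line 126, frame #5 */

    /*
     * vm_map_lock_read() is essentially
     *     rw_enter(&map->lock, RW_READER);
     * which is how we end up in rw_vector_enter() in frame #4.
     */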

> #4  0xffffffff8027d62f in rw_vector_enter (rw=0xfffffe8076e20198, op=RW_READER) at /usr/src/sys/kern/kern_rwlock.c:341

takes a lock...
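
...and the check that fired looks roughly like this (a simplified
sketch of kern_rwlock.c, not the exact source):

    /*
     * rw_vector_enter: if the lock is already write-held by the
     * very LWP that is asking for it again, sleeping here would
     * deadlock forever, so with LOCKDEBUG we abort instead.
     */
    curthread = (uintptr_t)curlwp;
    owner = rw->rw_owner;
    if ((owner & RW_WRITE_LOCKED) != 0 &&
        (owner & RW_THREAD) == curthread)
        rw_abort(rw, __func__, "locking against myself");

    /*
     * rw_abort() ends up in lockdebug_abort() (frame #3), which
     * panics with "lock error" (frames #2 and #1).
     */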

> #3  0xffffffff803bba6a in lockdebug_abort (lock=0xfffffe8076e20198, ops=0xffffffff806ecaa0, func=0xffffffff804d73c0 "rw_vector_enter", msg=0xffffffff8051a7a7 "locking against myself") at /usr/src/sys/kern/subr_lockdebug.c:858
> #2  0xffffffff803c2159 in panic (fmt=<unavailable>) at /usr/src/sys/kern/subr_prf.c:205
> #1  0xffffffff803c2084 in vpanic (fmt=0xffffffff8053631d "lock error", ap=0xfffffe8038bc2410) at /usr/src/sys/kern/subr_prf.c:308

and lockdebug alerts us to the fact that the same kernel thread (lwp)
is holding this lock already.

Further analysis requires wading into uvmfault_lookup to try to figure
out exactly how this happened. Since you appear to have debug info
this probably won't be that hard.
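
A reasonable first step would be to look at the map's lock from
frame 5 and see who owns it, something along these lines (the exact
expressions may need adjusting for your kernel):

    (gdb) frame 5
    (gdb) print ufi->map->lock
    (gdb) print/x ufi->map->lock.rw_owner

The interesting question is whether the owner bits of rw_owner point
at the same lwp that took this fault.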

So far I'm just kind of flailing around. I don't know enough about the VM
system to get much further on my own.

(gdb) f 5
#5  0xffffffff8043a933 in uvmfault_lookup (write_lock=false, ufi=0xfffffe8038bc26a0) at /usr/src/sys/uvm/uvm_fault_i.h:126
126                             vm_map_lock_read(ufi->map);


(gdb) print *(ufi->map)
$4 = {pmap = 0xfffffe8078a4f1a0, lock = {rw_owner = 18446742426536628260}, busy = 0x0, mutex = {u = {mtxa_owner = 0}},
  misc_lock = {u = {mtxa_owner = 0}}, cv = {cv_opaque = {0x0, 0xfffffe8076e201b8, 0xffffffff8053d7f0}}, flags = 65, rb_tree = {
    rbt_root = 0xfffffe8009af2480, rbt_ops = 0xffffffff804f6860, rbt_minmax = {0xfffffe8079744a20, 0xfffffe800b7337e0}}, header = {
    rb_node = {rb_nodes = {0x0, 0x0}, rb_info = 0}, gap = 0, maxgap = 0, prev = 0xfffffe800b7337e0, next = 0xfffffe8079744a20,
    start = 140187732541440, end = 0, object = {uvm_obj = 0x0, sub_map = 0x0}, offset = 0, etype = 0, protection = 0,
    max_protection = 0, inheritance = 0, wired_count = 0, aref = {ar_pageoff = 0, ar_amap = 0x0}, advice = 0, map_attrib = 0,
    flags = 0 '\000'}, nentries = 236, size = 303292416, ref_count = 1, hint = 0xfffffe800cf75d88,
  first_free = 0xfffffe8076e201f8, timestamp = 28}
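
One thing I did notice: that rw_owner value, 18446742426536628260, is
0xfffffe807cd7d024 in hex. If I'm reading sys/rwlock.h right, the low
bits of rw_owner are flag bits (the 0x04 here would be RW_WRITE_LOCKED)
and the rest is the owning lwp, so masking them off:

    (gdb) print/x ufi->map->lock.rw_owner & ~0xf

should give 0xfffffe807cd7d020, i.e. the lwp that already has the map
write-locked. I guess the next step is to compare that against the lwp
that took this fault, but I'm not sure how to find that in the dump.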


--
Hisashi T Fujinaka - htodd%twofifty.com@localhost
BSEE(6/86) + BSChem(3/95) + BAEnglish(8/95) + MSCS(8/03) + $2.50 = latte

