Subject: frequent panics in sysctl_doeproc
To: None <current-users@NetBSD.org>
From: Tobias Nygren <tnn@NetBSD.org>
List: current-users
Date: 07/30/2007 19:03:33
Hi everyone,

I could really use some help analysing this crash.
My sparc64 box has been panicking a lot recently when running top(1).
The backtrace is rather odd. My current theory is that we assume
that some page is wired when it actually isn't. Is there any other
possible reason for taking a fault when attempting to lock a mutex?

> top
Mutex error: lockdebug_barrier: spin lock held

lock address : 0x0000000028895210 type     :               spin
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0x00000000288a3160 last held: 0x00000000288a3160
last locked  : 0x000000000154c7d4 unlocked : 0x00000000015903f4
owner field  : 0x00ff0a0000000000 wait/spin:                0/1

panic: LOCKDEBUG
cpu0: kdb breakpoint at 18285cc
Stopped in pid 266.1 (top) at   netbsd:cpu_Debugger+0x8:        nop
db> bt
panic(19b0c78, 15b9b44, 19b0af0, 19b0b08, ffffffffffffffe8, 5445098)
  +0x1a0
lockdebug_abort1(288e9c70, 1cc2bc0, 19b0af0, 19b0b08, 5445000, 1000000)
  +0x74
lockdebug_barrier(1cad8f0, 1, 1, 0, 0, 0) +0x134
rw_vector_enter(1c79880, 0, 226000, 288fdc58, 0, 0) +0xd8
vm_map_lock_read(1c79878, 6, 226000, 288fdc58, 0, 0) +0xf067c
uvmfault_lookup(288fd270, 0, 288fdd88, 288fddd8, 288fdd98, 288fdd84)
   +0x94
uvm_fault_internal(1c79878, 1019b2000, 1, 0, 0, 1c05800) +0xbc
data_access_fault(288fd5c0, 30, 1924ad8, 1019b2000, 1019b38a0, 800809)
   +0x5c8
?(1cb2300, 154c7d4, 0, 5445098, ffffffffffffffe8, 5445098) 0x1008bb4
lwp_lock(5445188, 1019b38a0, 8, 1, 5445000, 1000000) 0x1549a74
fill_kproc2(288951e0, 5445000, 288951e0, 0, 0, 0) +0xa84
sysctl_doeproc(288fdc68, 4, 226000, 288fdc58, 0, 0) +0x618
sysctl_dispatch(288fdc60, 6, 226000, 288fdc58, 0, 0) +0x270
sys___sysctl(288a3160, 288fdd98, 288fdd88, 288fddd8, 288fdd98, 288fdd84)
   +0x1ec
syscall_plain(288fded0, 8ca, 40b3a86c, 40b3a870, 0, badcafe) +0x194
?(ffffffffffffc658, 6, 226000, ffffffffffffc670, 0, 0) at 0x10093e4
db>

Here's the current kernel configuration:

include "arch/sparc64/conf/GENERIC"
options NMBCLUSTERS=4096
options NKMEMPAGES=65536
makeoptions    DEBUG="-g"
makeoptions     COPTS="-O0"
options DEBUG
options DIAGNOSTIC
options LOCKDEBUG
options INSECURE
pseudo-device pf
pseudo-device pflog
no pseudo-device veriexec
no options FILEASSOC

I've uploaded "nm -n" and "objdump -d" output here:

http://www.netilium.org/~tnn/20070730/
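
In case it helps with the unnamed "?" frames in the backtrace (0x1008bb4
and 0x10093e4), here is a rough sketch of how an address could be matched
against the "nm -n" listing. It assumes the usual three-column nm format
("address type name"); the helper names parse_nm and resolve, and the
symbol names in the sample, are made up for illustration and are not taken
from the real kernel.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n` style lines ("<hex addr> <type> <name>") into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skips blank lines and undefined symbols (no address column)
        addr, _type, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue  # first column was not a hex address
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


if __name__ == "__main__":
    # Hypothetical sample data, NOT the real symbol table from the URL above.
    sample = """
0000000001008000 T hypothetical_trap_entry
0000000001009000 T hypothetical_syscall_entry
                 U hypothetical_undefined
"""
    hit = resolve(parse_nm(sample), 0x1008BB4)
    if hit is not None:
        print("%s+0x%x" % hit)
```

Feeding it the real "nm -n" output instead of the sample should give the
symbol+offset for each unnamed frame, which can then be looked up in the
"objdump -d" listing.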

-Tobias