Subject: mutex fault
To: None <port-xen@NetBSD.org>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: port-xen
Date: 11/25/2007 00:11:19
Hi,

I have been getting mutex errors for a long while on recent NetBSD
with Xen3.  Specifically, I have had the problem from 4.99.19
(about May) through 4.99.37 (Nov 22 kernel).

I was hoping the problem would be fixed some day, but...

Here is a trace dump from the 4.99.37 kernel and Xen3.1.0nb2.
Please let me know if there is anything I can try.
Thanks.

----
Mutex error: mutex_vector_exit: exiting unheld spin mutex

lock address : 0x00000000c098dda0
current cpu  :                  0
current lwp  : 0x00000000ca7a5e00
owner field  : 0x0000000000000b00 wait/spin:                0/1

panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c04ca529 cs 9 eflags 246 cr2 0 ilevel b
Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x1:  ret
db> trace
breakpoint(c0879e4d,c0876a65,c0701530,c0876a01,c098dda0) at netbsd:breakpoint+0x1
lockdebug_abort(c098dda0,c08e726c,c0701530,c0876a01,cd3b6eac) at netbsd:lockdebug_abort+0x61
mutex_abort(c098dda0,c0701530,c0876a01,c04db1d7,cd3b6eac) at netbsd:mutex_abort+0x34
mutex_vector_exit(c098dda0,ca003b9a,caf1ef78,c04db275,c092a520) at netbsd:mutex_vector_exit+0xcf
callout_softclock(0,0,caf1efb8,c04dbf26,0) at netbsd:callout_softclock+0x24d
softintr_dispatch(0,caf1efa8,c0ae800f,f,fffff000) at netbsd:softintr_dispatch+0x3f
DDB lost frame for netbsd:Xsoftclock+0x3c, trying 0xcaf1efa0
Xsoftclock() at netbsd:Xsoftclock+0x3c
--- interrupt ---
--- switch to interrupt stack ---
?(31,ca000011,ca790011,0,1f44) at 0x1
db> reboot
syncing disks... Mutex error: mutex_vector_enter: locking against myself

lock address : 0x00000000c098cde4
current cpu  :                  0
current lwp  : 0x00000000ca7a5e00
owner field  : 0x0000000000010b00 wait/spin:                0/1

panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c04ca529 cs 9 eflags 246 cr2 0 ilevel c
Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x1:  ret
db> trace
breakpoint(c0879e4d,c0876a65,c0701542,c0864d11,c098cde4) at netbsd:breakpoint+0x1
lockdebug_abort(c098cde4,c08e726c,c0701542,c0864d11,8) at netbsd:lockdebug_abort+0x61
mutex_abort(c098cde4,c0701542,c0864d11,c04db1d7,0) at netbsd:mutex_abort+0x34
mutex_vector_enter(c098cde4,3e,4,ca7a5e00,0) at netbsd:mutex_vector_enter+0x110
suspendsched(c087e487,c08ded7c,caf1ebe0,c01c15f0,20) at netbsd:suspendsched+0xc3
vfs_shutdown(20,caf1ebf0,c08ded80,ffffffff,0) at netbsd:vfs_shutdown+0x28
cpu_reboot(0,0,caf1ecb0,c08ded80,caf1ecb0) at netbsd:cpu_reboot+0x106
db_reboot_cmd(c0939507,0,c0939500,caf1ec30,1) at netbsd:db_reboot_cmd+0x48
db_command(c0844e10,c0845032,c0a3df6e,c04ca529,0) at netbsd:db_command+0xc8
db_command_loop(c04ca529,0,2,c08e754d,8f00) at netbsd:db_command_loop+0xd2
db_trap(1,0,58,ffffffff,f1ed88) at netbsd:db_trap+0xdc
kdb_trap(1,0,caf1ee5c,c04ca529,9) at netbsd:kdb_trap+0xe1
trap() at netbsd:trap+0x17c
--- trap (number 1) ---
breakpoint(c0879e4d,c0876a65,c0701530,c0876a01,c098dda0) at netbsd:breakpoint+0x1
lockdebug_abort(c098dda0,c08e726c,c0701530,c0876a01,cd3b6eac) at netbsd:lockdebug_abort+0x61
mutex_abort(c098dda0,c0701530,c0876a01,c04db1d7,cd3b6eac) at netbsd:mutex_abort+0x34
mutex_vector_exit(c098dda0,ca003b9a,caf1ef78,c04db275,c092a520) at netbsd:mutex_vector_exit+0xcf
callout_softclock(0,0,caf1efb8,c04dbf26,0) at netbsd:callout_softclock+0x24d
softintr_dispatch(0,caf1efa8,c0ae800f,f,fffff000) at netbsd:softintr_dispatch+0x3f
DDB lost frame for netbsd:Xsoftclock+0x3c, trying 0xcaf1efa0
Xsoftclock() at netbsd:Xsoftclock+0x3c
--- interrupt ---
--- switch to interrupt stack ---
?(31,ca000011,ca790011,0,1f44) at 0x1
db> reboot
Mutex error: mutex_vector_enter: locking against myself

lock address : 0x00000000c098cde4
current cpu  :                  0
current lwp  : 0x00000000ca7a5e00
owner field  : 0x0000000000010b00 wait/spin:                0/1

panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c04ca529 cs 9 eflags 246 cr2 0 ilevel b
Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x1:  ret
db> trace
breakpoint(c0879e4d,c0876a65,c0701542,c0864d11,c098cde4) at netbsd:breakpoint+0x1
lockdebug_abort(c098cde4,c08e726c,c0701542,c0864d11,0) at netbsd:lockdebug_abort+0x61
mutex_abort(c098cde4,c0701542,c0864d11,5,caf1e308) at netbsd:mutex_abort+0x34
mutex_vector_enter(c098cde4,2,2,0,c1584000) at netbsd:mutex_vector_enter+0x110
selwakeup(c16f486c,203,0,c16f486c,c16f486c) at netbsd:selwakeup+0x68
selnotify(c16f486c,0,10,0,caf1e394) at netbsd:selnotify+0x1a
sowakeup(c16f47e8,c16f486c,1,c17d4f00,c16e9a00) at netbsd:sowakeup+0x26
udp4_sendup(c16f47e8,c15e004c,caf1e3fc,c04db275,caf1e42c) at netbsd:udp4_sendup+0xd8
udp_input(c16e9a00,14,11,1,2) at netbsd:udp_input+0x211
ip_input(c16e9a00,0,c15cd000,c0104417,caf1e4d8) at netbsd:ip_input+0x5ba
ipintr(caf1e4d8,ca7b0011,31,11,11) at netbsd:ipintr+0x48
Xsoftnet() at netbsd:Xsoftnet+0x57
--- interrupt ---
0x9:
db> reboot
panic: req id: 54 buf found twice
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c04ca529 cs 9 eflags 246 cr2 0 ilevel 9
Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x1:  ret
db> trace
breakpoint(c08a575f,36,1,15012366,c15f5f30) at netbsd:breakpoint+0x1
twa_check_response_q(0,a400,c,4,3) at netbsd:twa_check_response_q+0xb9
twa_done(4,a400,4,4,3) at netbsd:twa_done+0xdf
twa_intr(c15f4000,caf1de08,cae34822,800,cae34822) at netbsd:twa_intr+0xb5
evtchn_do_event(6,caf1de08,caf1ddc0,c04cdeea,6) at netbsd:evtchn_do_event+0xce
call_evtchn_do_event(caf1de08,2,c1400011,caf10031,c0640011) at netbsd:call_evtchn_do_event+0x1e
hypervisor_callback(0,0,c0102500,10,0) at netbsd:hypervisor_callback+0x65
arpintr(caf1de84,11,31,11,c0a50011) at netbsd:arpintr+0x7d
Xsoftnet() at netbsd:Xsoftnet+0x4a
--- interrupt ---
0x9:
db> reboot
rebooting...
(XEN) Domain 0 shutdown: rebooting machine.

-- Kazushi
I have learned
To spell hors d'oeuvres
Which still grates on
Some people's n'oeuvres.
		-- Warren Knox