Subject: Re: mutex fault and no symbol table on recent Dom0 and Xen
To: None <port-xen@NetBSD.org>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: port-xen
Date: 07/30/2007 00:13:15
   On Jul 3,  9:45, Kazushi (Jam) Marukawa wrote:
   > Subject: mutex fault and no symbol table on recent Dom0 and Xen
   > Hi,
   > 
   > I'm having difficulties with recent Dom0 kernel.  It crashes
   > with following message.
   > 
   >    Mutex error: mutex_vector_exit: exiting unheld spin mutex
   > 
   > I tried Dom0 from both 20070627 and 20070701.  Both crashed
   > because of mutex error.  Here is a crash log from 0627.  Not
   > sure why, but db didn't show symbol table.

I tried a recent kernel, built from the 2007/07/26 sources,
and got the same problem.

This time I know db cannot read the symbol table on Xen 3.1,
so I translated the addresses to symbol names by hand.  Can
someone look into this problem?  Thanks.

-----
Mutex error: mutex_vector_exit: exiting unheld spin mutex

lock address : 0x00000000c0978280
current cpu  :                  0
current lwp  : 0x00000000cae21e00
owner field  : 0x0000000000000b00 wait/spin: 0/1

panic: lock error
Stopped in pid 0.2 (system) at  0xc04bbe14:     popl    %ebp
db> trace
?(c085f60d,cae4eee4,cae4eed8,c0439717,b) at 0xc04bbe14
?(c083d699,c083c34b,c06ef970,c083c2e7,c0978280) at 0xc04396a5
?(0,c0978280,c08d198c,c06ef970,c083c2e7) at 0xc0434d21
?(c0978280,c06ef970,c083c2e7,0,0) at 0xc0414c2c
?(c0978280,ca003b9a,cae4ef78,e1894609,c0914520) at 0xc0414e0f
?(0,0,cae4efb8,c04cc0b6,0) at 0xc042b8e0
?(0,cae4efa8,0,f,fffff000) at 0xc04f42af
?(0,cad9cbdc,3,1,c04f42bb) at 0xc0103dfd
?(f,cad9cbdc,cad9cb94,0,0) at 0xc04cc6b7
--- switch to interrupt stack ---
?(cad9cbdc,0,11,31,ca000011) at 0xc0102550
?(c040bd50,cae21e00,cad9cc6c,c040bdfe,cae21e00) at 0xc0104323
?(cae21e00,cad9cc68,c0413b6e,0,cae21e00) at 0xc04be545
?(cae21e00,0,c01001e7,c01001df,c01001e7) at 0xc040bdfe
db>
-----

NetBSD's db cannot read the symbol table on Xen 3.1.
Therefore, I translated the addresses in the trace above to
symbol names by hand.
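
For anyone who wants to repeat the translation without doing
it by hand, here is a rough sketch of the lookup.  It assumes
a symbol listing in the three-column "address type name" form
that nm prints for the kernel image; the sample listing is
hypothetical, with base addresses obtained by subtracting the
offsets from the addresses in this trace, and the 'T' type
letters are an assumption.  It is not a replacement for a
working symbol table in db.

```python
import bisect

# Hypothetical sample listing.  The base addresses are the trace
# addresses minus the offsets shown in the translated trace; the
# 'T' (text) type letters are assumed.
SAMPLE_NM = """\
c0414bf0 T mutex_abort
c0414d40 T mutex_vector_exit
c0434cc0 T lockdebug_abort
c0439550 T panic
c04bbe10 T cpu_Debugger
"""


def parse_nm(text):
    """Return a sorted list of (address, name) for text symbols."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip undefined symbols and blank lines
        addr, kind, name = parts
        if kind.lower() != "t":
            continue  # keep only code symbols
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Map addr to 'name+0xoff' using the nearest symbol at or below it."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)  # address lies below every known symbol
    base, name = symbols[i]
    return f"{name}+{addr - base:#x}"


symbols = parse_nm(SAMPLE_NM)
for pc in (0xc04bbe14, 0xc04396a5, 0xc0434d21, 0xc0414c2c, 0xc0414e0f):
    print(f"{pc:#x} -> {resolve(symbols, pc)}")
```

The lookup only picks the closest preceding symbol, so an
address past the end of the last function in the listing would
still be attributed to that function.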

-----
Mutex error: mutex_vector_exit: exiting unheld spin mutex

lock address : 0x00000000c0978280 (sched_mutex)
current cpu  :                  0
current lwp  : 0x00000000cae21e00
owner field  : 0x0000000000000b00 wait/spin: 0/1

panic: lock error
Stopped in pid 0.2 (system) at  cpu_Debugger+0x4:     popl    %ebp
db> trace
cpu_Debugger(c085f60d,cae4eee4,cae4eed8,c0439717,b) at cpu_Debugger+0x4
panic(c083d699,c083c34b,c06ef970,c083c2e7,c0978280) at panic+0x155
lockdebug_abort(0,c0978280,c08d198c,c06ef970,c083c2e7) at lockdebug_abort+0x61
mutex_abort(c0978280,c06ef970,c083c2e7,0,0) at mutex_abort+0x3c
mutex_vector_exit(c0978280,ca003b9a,cae4ef78,e1894609,c0914520) at mutex_vector_exit+0xcf
callout_softclock(0,0,cae4efb8,c04cc0b6,0) at callout_softclock+0x240
softintr_dispatch(0,cae4efa8,0,f,fffff000) at softintr_dispatch+0x3f
Xsoftclock(0,cad9cbdc,3,1,c04f42bb) at Xsoftclock+0x39
evtchn_do_event(f,cad9cbdc,cad9cb94,0,0) at evtchn_do_event+0xf7
--- switch to interrupt stack ---
call_evtchn_do_event(cad9cbdc,0,11,31,ca000011) at call_evtchn_do_event+0x1c
hypervisor_callback(c040bd50,cae21e00,cad9cc6c,c040bdfe,cae21e00) at hypervisor_callback+0x63
cpu_idle(cae21e00,cad9cc68,c0413b6e,0,cae21e00) at cpu_idle+0x25
idle_loop(cae21e00,0,c01001e7,c01001df,c01001e7) at idle_loop+0xae
db>
-----

-- Kazushi
This process can check if this value is zero, and if it is, it does
something child-like.
		-- Forbes Burkowski, Computer Science 454