Subject: kern/36844: getting mutex error on NetBSD kernel 4.99.20 and later
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <jam@pobox.com>
List: netbsd-bugs
Date: 08/26/2007 15:10:00
>Number:         36844
>Category:       kern
>Synopsis:       getting mutex error on NetBSD kernel 4.99.20 and later
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 26 15:10:00 +0000 2007
>Originator:     Kazushi (Jam) Marukawa
>Release:        NetBSD 4.99.20 and later.
>Organization:
>Environment:
System: NetBSD fs 4.99.29 NetBSD 4.99.29 (XEN3_DOM0) #0: Tue Aug 21 20:03:29 JST 2007 jam@fs:/mnt/raid/netbsd/20070821/src/sys/arch/i386/compile/XEN3_DOM0 i386
Architecture: i386
Machine: i386
>Description:
	NetBSD crashes with the following log.

-----
Mutex error: mutex_vector_exit: exiting unheld spin mutex

lock address : 0x00000000c0983440
current cpu  :                  0
current lwp  : 0x00000000caf01e00
owner field  : 0x0000000000000b00 wait/spin:                0/1

panic: lock error
Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        popl    %ebp
db> trace
cpu_Debugger(c086a9c0,caef9ee4,caef9ed8,c0440857,b) at netbsd:cpu_Debugger+0x4
panic(c0848599,c0847257,c06fa470,c08471f3,c0983440) at netbsd:panic+0x155
lockdebug_abort(0,c0983440,c08dd78c,c06fa470,c08471f3) at netbsd:lockdebug_abort+0x61
mutex_abort(c0983440,c06fa470,c08471f3,c15c5d60,987a20) at netbsd:mutex_abort+0x3c
mutex_vector_exit(c0983440,ca003b9a,caef9f78,a242eb5f,c0920320) at netbsd:mutex_vector_exit+0xcf
callout_softclock(0,0,caef9fb8,c04d54b6,0) at netbsd:callout_softclock+0x24d
softintr_dispatch(0,caef9fa8,0,f,fffff000) at netbsd:softintr_dispatch+0x3f
DDB lost frame for netbsd:Xsoftclock+0x3c, trying 0xcaef9fa0
Xsoftclock() at netbsd:Xsoftclock+0x3c
--- interrupt ---
--- switch to interrupt stack ---
?(31,ca000011,cae60011,0,7b48) at 0x1
db> reboot
syncing disks... Mutex error: mutex_vector_enter: locking against myself

lock address : 0x00000000c0981f80
current cpu  :                  0
current lwp  : 0x00000000caf01e00
owner field  : 0x0000000000010b00 wait/spin:                0/1

panic: lock error
Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        popl    %ebp
db> trace
cpu_Debugger(c086a9c0,caef9af8,caef9aec,c0440857,c) at netbsd:cpu_Debugger+0x4
panic(c0848599,c0847257,c06fa482,c084041a,c0981f80) at netbsd:panic+0x155
lockdebug_abort(0,c0981f80,c08dd78c,c06fa482,c084041a) at netbsd:lockdebug_abort+0x61
mutex_abort(c0981f80,c06fa482,c084041a,c,0) at netbsd:mutex_abort+0x3c
mutex_vector_enter(c0981f80,3e,4,caf01e00,0) at netbsd:mutex_vector_enter+0x110
suspendsched(c084a3e6,72,caef9bcc,c01bfc30,0) at netbsd:suspendsched+0xa3
vfs_shutdown(20,74,caef9bec,c01be8fc,0) at netbsd:vfs_shutdown+0x28
cpu_reboot(0,0,caef9c6c,c043f59c,c06e2f60) at netbsd:cpu_reboot+0x106
db_reboot_cmd(c04c4cc4,0,c092f387,caef9c20,8) at netbsd:db_reboot_cmd+0x48
db_command(c0834230,c0834429,c0a28df6,0,0) at netbsd:db_command+0xb0
db_command_loop(c04c4cc4,0,2,c098131d,8f00) at netbsd:db_command_loop+0xd8
db_trap(1,0,58,562140,0) at netbsd:db_trap+0xdf
kdb_trap(1,0,caef9e58,def9df0,a) at netbsd:kdb_trap+0xe1
trap() at netbsd:trap+0xd5
--- trap (number 1) ---
cpu_Debugger(c086a9c0,caef9ee4,caef9ed8,c0440857,b) at netbsd:cpu_Debugger+0x4
panic(c0848599,c0847257,c06fa470,c08471f3,c0983440) at netbsd:panic+0x155
lockdebug_abort(0,c0983440,c08dd78c,c06fa470,c08471f3) at netbsd:lockdebug_abort+0x61
mutex_abort(c0983440,c06fa470,c08471f3,c15c5d60,987a20) at netbsd:mutex_abort+0x3c
mutex_vector_exit(c0983440,ca003b9a,caef9f78,a242eb5f,c0920320) at netbsd:mutex_vector_exit+0xcf
callout_softclock(0,0,caef9fb8,c04d54b6,0) at netbsd:callout_softclock+0x24d
softintr_dispatch(0,caef9fa8,0,f,fffff000) at netbsd:softintr_dispatch+0x3f
DDB lost frame for netbsd:Xsoftclock+0x3c, trying 0xcaef9fa0
Xsoftclock() at netbsd:Xsoftclock+0x3c
--- interrupt ---
--- switch to interrupt stack ---
?(31,ca000011,cae60011,0,7b48) at 0x1
-----

I pressed the reboot button after the log above.

>How-To-Repeat:
	I'm running NetBSD 4.99.29 as Dom0 with Xen 3.1.
	I'm also running Windows XP as its DomU-HVM.
	It runs, but eventually crashes within a few days.

	I'm not sure, but it seems that NetBSD crashes if and only if Windows XP
	is running as DomU-HVM.
>Fix:
	Not sure.  Please investigate this problem.

>Unformatted:
 	The stack trace was taken from NetBSD 4.99.29.
 	Its source code is from Aug 21 00:00:00 UTC 2007.