Subject: recursive locking in uvm?
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 03/26/2002 12:48:17
I'm working with a company using a proprietary PowerPC port, and
they're having trouble.  Most of it is neither here nor there as far as
NetBSD at large is concerned.  But last night, I found something that
looks to me like a real bug.

I got this output while rebooting, with a LOCKDEBUG kernel:

/dev/rwd0a: 1137 files, 460468 used, 1174888 free (320 frags, 146821 blocks, 0.0% fragmentation)
/dev/rwd0a: MARKING FILE SYSTEM CLEAN
[0x3852008:10] simple_lock: lock held
[0x3852008:10] lock: 0x1750e0, currently at: ../../../../uvm/uvm_km.c:364
[0x3852008:10] last locked: ../../../../uvm/uvm_vnode.c:493 [65412289]
[0x3852008:10] last unlocked: ../../../../uvm/uvm_vnode.c:755 [65346014]
[0x3852008:10] simple_unlock: lock not held
[0x3852008:10] lock: 0x1750e0, currently at: ../../../../uvm/uvm_vnode.c:755
[0x3852008:10] last locked: ../../../../uvm/uvm_km.c:364 [65412473]
[0x3852008:10] last unlocked: ../../../../uvm/uvm_km.c:366 [65412478]
Check /dev/wd0e
/dev/rwd0e: 21 files, 619 used, 200052 free (52 frags, 25000 blocks, 0.0% fragmentation)
/dev/rwd0e: MARKING FILE SYSTEM CLEAN

(The stuff in [ ] in the lock lines is debugging output I added.)  On
further investigation, this proved to come from UVM locking the page
queues (uvm_lock_pageq), which is a simple_lock, and then, with that
lock still held, ultimately calling another routine that also tries to
lock the page queues.  Simple locks are not recursive, so LOCKDEBUG
complains twice: "lock held" at the inner lock, then "lock not held"
when the outer code unlocks.  The call stack (innermost frame first) is

	simple_lock
	uvm_km_pgremove, uvm_km.c 364
	uvm_unmap_remove, uvm_map.c 1103
	uvm_unmap, uvm_map_i.h 177
	uvm_km_free, uvm_km.c 600
	pmap_free_pv, pmap.c 802
	pmap_remove_pv, pmap.c 947
	pmap_page_protect, pmap.c 1360
	uvn_flush, uvm_vnode.c 570
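
To make the pattern concrete, here is a toy, single-threaded user-space
sketch of it.  It is not the kernel's simple_lock code: toy_lock,
toy_lock_acquire, toy_lock_release, inner_remove and
demo_recursive_pageq_lock are names I made up for illustration, and the
"file:line" strings are just labels copied from the output above.  It
assumes a debug lock that complains and carries on instead of spinning.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a non-recursive lock with debug bookkeeping. */
struct toy_lock {
	int held;			/* 1 while the lock is held */
	const char *locked_at;		/* label of the last acquire */
	const char *unlocked_at;	/* label of the last release */
	int complaints;			/* number of diagnostics printed */
};

static struct toy_lock pageq;

static void
toy_lock_acquire(struct toy_lock *l, const char *where)
{
	if (l->held) {
		/* Second acquire while held: report it, then carry on. */
		printf("toy_lock: lock held\n");
		printf("  currently at: %s\n", where);
		printf("  last locked: %s\n", l->locked_at);
		l->complaints++;
	}
	l->held = 1;
	l->locked_at = where;
}

static void
toy_lock_release(struct toy_lock *l, const char *where)
{
	if (!l->held) {
		/* The inner release already dropped the lock. */
		printf("toy_lock: lock not held\n");
		printf("  currently at: %s\n", where);
		printf("  last locked: %s\n", l->locked_at);
		printf("  last unlocked: %s\n", l->unlocked_at);
		l->complaints++;
	}
	l->held = 0;
	l->unlocked_at = where;
}

/* Plays the role of the inner routine that locks the queues itself. */
static void
inner_remove(void)
{
	toy_lock_acquire(&pageq, "uvm_km.c:364");
	toy_lock_release(&pageq, "uvm_km.c:366");
}

/* Plays the role of the outer caller; returns the diagnostic count. */
int
demo_recursive_pageq_lock(void)
{
	pageq.held = 0;
	pageq.locked_at = "(never)";
	pageq.unlocked_at = "(never)";
	pageq.complaints = 0;

	toy_lock_acquire(&pageq, "uvm_vnode.c:493");
	inner_remove();		/* reached indirectly in the real stack */
	toy_lock_release(&pageq, "uvm_vnode.c:755");

	assert(pageq.held == 0);
	return pageq.complaints;
}
```

Calling demo_recursive_pageq_lock() from a main() prints one "lock held"
and one "lock not held" complaint, the same pairing as in the output
above.  A plain spin lock without the debug checks would presumably just
spin forever at the inner acquire instead.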

Is this a real bug?  It sure looks like it to me: either uvm's paradigm
for pageq locking is broken or the pmap in question is misusing a uvm
interface.  The pmap.c in question is arch/powerpc/powerpc/pmap.c
version 1.44; the UVM file versions in question are

/*	$NetBSD: uvm_km.c,v 1.49 2001/06/02 18:09:26 chs Exp $	*/
/*	$NetBSD: uvm_map.c,v 1.99 2001/06/02 18:09:26 chs Exp $	*/
/*	$NetBSD: uvm_map_i.h,v 1.21 2001/06/02 18:09:27 chs Exp $	*/
/*	$NetBSD: uvm_vnode.c,v 1.50 2001/05/26 21:27:21 chs Exp $	*/

Would this go away with NEW_PMAP, maybe?  Worth a PR?

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B