Subject: lock bug in getnewvnode, or uvm_km_kmemalloc/uvm_map ?
To: None <tech-kern@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 11/19/2002 15:46:41
--4Ckj6UjgE2iN1+kY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,
trying to backport the raidrame lock fixes to 1.6, I ran into a panic
at raid autoconfig time, because pool_get(PR_WAITOK) is called with a
simple_lock held. 

The call trace is:
getnewvnode() (the simple_lock is grabbed in getnewvnode)
pool_get(PR_NOWAIT)
pool_allocator_alloc()
pool_page_alloc_nointr()
uvm_km_alloc_poolpage1()
uvm_km_kmemalloc()
uvm_map()
pool_get(PR_WAITOK)

The bug is, although pool_get() was initially called PR_NOWAIT, the pool
memory allocator will call pool_get(PR_WAITOK). This is triggered by
raidframe and getnewvnode(), but I think it's not specific to it, and
I think it's still present in current.

uvm_km_kmemalloc() is properly called with UVM_KMF_NOWAIT.
But uvm_map() is called with flags stripped to UVM_KMF_TRYLOCK, so
UVM_KMF_NOWAIT is lost here.
Second problem, uvm_mapent_alloc() called from uvm_map() will call
pool_get(PR_WAITOK) unconditionally. It seems to me that uvm_map() was not
designed to be called from a context where we can sleep, but I may be
wrong.

So there is a problem for the PR_NOWAIT case, because we may, from
here, call uvm_map() and a pool_get(PR_WAITOK). Or maybe PR_NOWAIT should
only be used for pools with special memory allocators ?
In this case it would really be a bug in getnewvnode() ...

I hacked getnewvnode() to unlock the lock before calling pool_get() (instead
of unlocking it just after). Not really nice (we may have to relock just after
), but it avoids this panic ...

--
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
     NetBSD: 23 ans d'experience feront toujours la difference
--

--4Ckj6UjgE2iN1+kY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff

Index: vfs_subr.c
===================================================================
RCS file: /cvsroot/syssrc/sys/kern/vfs_subr.c,v
retrieving revision 1.174.2.1
diff -u -r1.174.2.1 vfs_subr.c
--- vfs_subr.c	2002/06/02 15:29:56	1.174.2.1
+++ vfs_subr.c	2002/11/19 14:44:15
@@ -467,9 +467,9 @@
 	    (TAILQ_FIRST(&vnode_free_list) == NULL &&
 	     (TAILQ_FIRST(&vnode_hold_list) == NULL || toggle));
 
+	simple_unlock(&vnode_free_list_slock);
 	if (tryalloc &&
 	    (vp = pool_get(&vnode_pool, PR_NOWAIT)) != NULL) {
-		simple_unlock(&vnode_free_list_slock);
 		memset(vp, 0, sizeof(*vp));
 		simple_lock_init(&vp->v_interlock);
 		uobj = &vp->v_uobj;
@@ -478,6 +478,7 @@
 		TAILQ_INIT(&uobj->memq);
 		numvnodes++;
 	} else {
+		simple_lock(&vnode_free_list_slock);
 		if ((vp = TAILQ_FIRST(listhd = &vnode_free_list)) == NULL)
 			vp = TAILQ_FIRST(listhd = &vnode_hold_list);
 		for (; vp != NULL; vp = TAILQ_NEXT(vp, v_freelist)) {

--4Ckj6UjgE2iN1+kY--