Subject: kern/766: union mounts can get UFS locking panics or other deadlocks
To: None <gnats-admin@NetBSD.ORG>
From: John Kohl <jtk@kolvir.blrc.ma.us>
List: netbsd-bugs
Date: 01/29/1995 19:05:06
>Number:         766
>Category:       kern
>Synopsis:       union mounts can get UFS locking panics or other deadlocks
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jan 29 19:05:04 1995
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        1.0A
>Environment:
	
System: NetBSD kolvir 1.0A NetBSD 1.0A (KOLVIR) #21: Sun Jan 29 20:59:56 EST 1995 jtk@kolvir:/u1/NetBSD-current/src/sys/arch/i386/compile/KOLVIR i386


>Description:
The latest union mount code in -current has a locking misordering which
can lead to panics like the one from ufs_lock():
	panic("locking against myself");

The basic problem is a misordered flag manipulation in union_lock().

>How-To-Repeat:

Compile a DIAGNOSTIC kernel.
Mount something with unions.  Make sure the top level doesn't have a
shadow directory tree yet.

cd to the union mount, then run two copies of 'find . -ls >/dev/null'.
Watch the kernel panic in short order.

>Fix:

Here's a patch which adds some comments of what's going on in a few
cases that confused me, plus a patch for the bug in ufs_lock(), plus a
patch to the ordering of some flag manipulation in
union_removed_upper():  the CACHED flags really should be manipulated
before we drop the uppervp lock.

===================================================================
RCS file: miscfs/union/RCS/union_vnops.c,v
retrieving revision 1.1
diff -ubw -r1.1 miscfs/union/union_vnops.c
--- 1.1	1995/01/29 20:26:30
+++ miscfs/union/union_vnops.c	1995/01/30 01:54:18
@@ -319,6 +319,11 @@
 	/* case 2. */
 	if (uerror != 0 /* && (lerror == 0) */ ) {
 		if (lowervp->v_type == VDIR) { /* case 2b. */
+			/*
+			 * We may be racing another process to make the
+			 * upper-level shadow directory.  Be careful with
+			 * locks/etc!
+			 */
 			dun->un_flags &= ~UN_ULOCK;
 			VOP_UNLOCK(upperdvp);
 			uerror = union_mkshadow(um, upperdvp, cnp, &uppervp);
@@ -1353,12 +1358,20 @@
 	if (un->un_uppervp != NULLVP) {
 		if (((un->un_flags & UN_ULOCK) == 0) &&
 		    (vp->v_usecount != 0)) {
-			un->un_flags |= UN_ULOCK;
+			/*
+			 * We MUST always use the order of: take upper
+			 * vp lock, manipulate union node flags, drop
+			 * upper vp lock.  This code must not be an
+			 * exception.
+			 */
 			VOP_LOCK(un->un_uppervp);
+			un->un_flags |= UN_ULOCK;
 		}
 #ifdef DIAGNOSTIC
-		if (un->un_flags & UN_KLOCK)
-			panic("union: dangling upper lock");
+		if (un->un_flags & UN_KLOCK) {
+			vprint("dangling klock", vp);
+			panic("union: dangling upper klock (%lx)", vp);
+		}
 #endif
 	}
 
@@ -1384,6 +1397,18 @@
 	return (0);
 }
 
+/*
+ * When operations want to vput() a union node yet retain a lock on
+ * the upper VP (say, to do some further operations like link(),
+ * mkdir(), ...), they set UN_KLOCK on the union node, then call
+ * vput() which calls VOP_UNLOCK() and comes here.  union_unlock()
+ * unlocks the union node (leaving the upper VP alone), clears the
+ * KLOCK flag, and then returns to vput().  The caller then does whatever
+ * is left to do with the upper VP, and insures that it gets unlocked.
+ *
+ * If UN_KLOCK isn't set, then the upper VP is unlocked here.
+ */
+
 int
 union_unlock(ap)
 	struct vop_lock_args *ap;
@@ -1453,6 +1478,10 @@
 
 	printf("\ttag VT_UNION, vp=%x, uppervp=%x, lowervp=%x\n",
 			vp, UPPERVP(vp), LOWERVP(vp));
+	if (UPPERVP(vp))
+	    vprint("union_upper", UPPERVP(vp));
+	if (LOWERVP(vp))
+	    vprint("union_lower", LOWERVP(vp));
 	return (0);
 }
 
===================================================================
--- 1.1	1995/01/29 20:32:22
+++ miscfs/union/union_subr.c	1995/01/30 02:44:36
@@ -949,15 +949,14 @@
 union_removed_upper(un)
 	struct union_node *un;
 {
+	if (un->un_flags & UN_CACHED) {
+		un->un_flags &= ~UN_CACHED;
+		LIST_REMOVE(un, un_cache);
+	}
 
 	if (un->un_flags & UN_ULOCK) {
 		un->un_flags &= ~UN_ULOCK;
 		VOP_UNLOCK(un->un_uppervp);
-	}
-
-	if (un->un_flags & UN_CACHED) {
-		un->un_flags &= ~UN_CACHED;
-		LIST_REMOVE(un, un_cache);
 	}
 }
 
>Audit-Trail:
>Unformatted: