Subject: kern/1124: more union FS bugs (kernel traps on null VP references)
To: None <gnats-admin@sun-lamp.cs.berkeley.edu>
From: John Kohl <jtk@kolvir.blrc.ma.us>
List: netbsd-bugs
Date: 06/07/1995 18:35:03
>Number:         1124
>Category:       kern
>Synopsis:       union FS can deref null vnode pointer
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jun  7 18:35:02 1995
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        -current, 06 June 1995
>Environment:
	
System: NetBSD kolvir 1.0A NetBSD 1.0A (KOLVIR) #465: Wed Jun 7 19:51:11 EDT 1995 jtk@kolvir:/u1/NetBSD-current/src/sys/arch/i386/compile/KOLVIR i386

>Description:
The union file system can dereference a null vnode pointer.  It happens
when there is no lower layer vnode (usually when the top file layer is
the only one with that file, e.g. an object file), and that file is
removed while another process has the file open.

>How-To-Repeat:
0) mount a union FS.  Go to a place with a file that exists only in the
upper layer.

choose 1 or 2:

1) cat the file, suspend the cat.  remove the file.  kill the cat.
watch the kernel die in union_close().

2) cat the file, put it in the background.  remove the file.  watch the
kernel die in union_read().

>Fix:
This seems to do the trick.  I'm not happy with the need to forcibly
remove the vnode out from under a process--this means that the 'cat'
command gets an error back from read rather than keeping a reference to
the real vnode.

ALSO, note there's a bug fix in here.  As currently coded,
union_removed_upper() will trap on a null deref if UN_ULOCK is set on
the union node passed to it, because it removes the upper vnode before
calling the unlock routine.  In the diffs below I just hold onto the
vnode separately and vput() it when UN_ULOCK is set.

Maybe instead of these diffs the union layer should just uncache the
union node and hold the uppervp referenced until the vnode is closed?
(Can it set a reclaim-on-inactivation flag to force a reclaim when it's
done being active?)

===================================================================
RCS file: RCS/union_subr.c,v
retrieving revision 1.1
diff -c -r1.1 union_subr.c
*** union_subr.c	1995/06/07 23:18:03	1.1
--- union_subr.c	1995/06/07 23:49:54
***************
*** 941,950 ****
--- 941,976 ----
  	return (VOP_CLOSE(vp, fmode, cred, p));
  }
  
+ /*
+  * You can't rely on the union node existing after this call, because
+  * it may be a pair of NULLVPs and thus disappear.
+  *
+  * All the callers in union_vnops.c (so far) do not reference it after their
+  * calls here.
+  */
  void
  union_removed_upper(un)
  	struct union_node *un;
  {
+ 	struct vnode *uvp = un->un_uppervp;
+ 
+ 	/*
+ 	 * forcibly revoke any users of this node if we have no hope of
+ 	 * doing anything useful with it (i.e. both vnodes would be NULLVP).
+ 	 * We must do this without setting uppervp to NULLVP, otherwise
+ 	 * we might block waiting for someone else to finish using the
+ 	 * node and they'd trip over the doubly-null union node.
+ 	 */
+ 	if (un->un_lowervp == NULLVP) {
+ 	    if (un->un_flags & UN_ULOCK) {
+ 		un->un_flags &= ~UN_ULOCK;
+ 		VOP_UNLOCK(uvp);
+ 	    }
+ 	    vgone(UNIONTOV(un));
+ 	    return;
+ 	}
+ 
+ 	VREF(uvp);			/* hold in case needed for unlock */
  
  	union_newupper(un, NULLVP);
  	union_diruncache(un);
***************
*** 956,963 ****
  
  	if (un->un_flags & UN_ULOCK) {
  		un->un_flags &= ~UN_ULOCK;
! 		VOP_UNLOCK(un->un_uppervp);
! 	}
  }
  
  #if 0
--- 982,991 ----
  
  	if (un->un_flags & UN_ULOCK) {
  		un->un_flags &= ~UN_ULOCK;
! 		vput(uvp);
! 	} else
! 	    vrele(uvp);
! 
  }
  
  #if 0
===================================================================
RCS file: RCS/union_vnops.c,v
retrieving revision 1.8
diff -c -r1.8 union_vnops.c
*** union_vnops.c	1995/06/01 23:01:52	1.8
--- union_vnops.c	1995/06/07 23:24:50
***************
*** 580,585 ****
--- 580,591 ----
  		--un->un_openl;
  		vp = un->un_lowervp;
  	}
+ 	/*
+ 	 * A stranded union node may end up here with both vnodes NULL,
+ 	 * in which case we don't do anything.
+ 	 */
+ 	if (vp == NULLVP)
+ 	    return 0;
  
  	ap->a_vp = vp;
  	return (VCALL(vp, VOFFSET(vop_close), ap));
>Audit-Trail:
>Unformatted: