Subject: kern/7071: vfs_syscalls.c,v 1.128 breaks union mounts
To: None <gnats-bugs@gnats.netbsd.org>
From: None <John.P.Darrow@wheaton.edu>
List: netbsd-bugs
Date: 03/02/1999 00:59:57
>Number:         7071
>Category:       kern
>Synopsis:       vfs_syscalls.c,v 1.128 breaks union mounts
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar  1 23:05:01 1999
>Last-Modified:
>Originator:     John Darrow
>Organization:
	Computing Services
	Wheaton College, Wheaton, IL
>Release:        NetBSD-current 19990301
>Environment:
System: NetBSD jdarrow.wheaton.edu 1.3J NetBSD 1.3J (JDARROW) #10: Mon Mar 1 23:39:59 CST 1999 jdarrow@jdarrow.wheaton.edu:/var/src/sys/arch/i386/compile/JDARROW i386


>Description:
>From source-changes:
->Module Name:	src
->Committed By:	fvdl
->Date:		Sun Feb 28 14:12:54 UTC 1999
->
->Modified Files:
->	src/sys/kern: vfs_syscalls.c
->Log Message:
->Use a SETRECURSE lock before calling VFS_MOUNT in the mount() system call,
->since the lock may be taken again. This was the intention of the CANRECURSE
->lock already there, but didn't work.
->
->Only fill in the vnode<->mountpoint links (mountedhere and vnodecovered)
->after VFS_MOUNT returned succesfully. It might happen that something called
->from VFS_MOUNT mistook the vnode for an already successfully mounted on
->one because of this.
->

The second part of this change breaks union mounts completely (I know they're
"experimental", but thus far they've been good enough to mount a local
directory over an ro nfs-mounted source dir and have local edits and builds
"just work").  Apparently, with the change in place,

sys_mount calls VFS_MOUNT (sys/kern/vfs_syscalls.c, line 306), which for
union mounts is union_mount, which sets lowerrootvp to mp->mnt_vnodecovered
and then VREF's it (sys/miscfs/union/union_vfsops.c, lines 122-123).
Neither VREF nor union_mount do any sanity checking for
mp->mnt_vnodecovered == NULLVP, resulting in a kernel page fault trap when
VREF dereferences the null pointer.

The following patch (which reverts part of vfs_syscalls.c,v 1.127->1.128) 
allows union mounts to work as well as they did before.  I'm not sure what
the proper solution is (e.g. whether sys_mount or union_mount is actually
"broken").  I'll also send-pr this, as someone with more understanding of
the file system code should take a look at this...

>How-To-Repeat:
Boot using kernel with filesystem UNION, mount -t union, drop into DDB with
'kernel page fault trap' in union_mount with a mov instruction which
includes %edx(some number here) with edx == 0.

>Fix:
Index: sys/kern/vfs_syscalls.c
===================================================================
RCS file: /source/cvs/netbsd/current/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.1.1.3
retrieving revision 1.2
diff -c -r1.1.1.3 -r1.2
*** vfs_syscalls.c	1999/03/01 19:06:25	1.1.1.3
--- vfs_syscalls.c	1999/03/02 05:56:00	1.2
***************
*** 285,290 ****
--- 285,291 ----
  	(void)vfs_busy(mp, LK_NOWAIT, 0);
  	mp->mnt_op = vfs;
  	vfs->vfs_refcount++;
+ 	mp->mnt_vnodecovered = vp;
  	mp->mnt_stat.f_owner = p->p_ucred->cr_uid;
  update:
  	/*
***************
*** 321,327 ****
  	cache_purge(vp);
  	if (!error) {
  		vp->v_mountedhere = mp;
- 		mp->mnt_vnodecovered = vp;
  		simple_lock(&mountlist_slock);
  		CIRCLEQ_INSERT_TAIL(&mountlist, mp, mnt_list);
  		simple_unlock(&mountlist_slock);
--- 322,327 ----

>Audit-Trail:
>Unformatted: