Subject: Re: [solution] vnode refcount panic, perhaps due to kern/vfs_lookup.c:lookup()
To: Greg Troxel <gdt@ir.bbn.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 03/28/2003 15:12:44
On Fri, 28 Mar 2003, Greg Troxel wrote:

> Sorry for the delay in checking your proposed alternative change (I
> had to loan out my crash/kgdb box to someone else at work).
>
> I looked at your patch, and concluded that it was incorrect to unlock
> tdvp, since VOP_LOOKUP requires that dvp is the locked vnode of the
> directory to search.  So, I started with your patch minus the

True.

> VOP_UNLOCK.  It is a good thing that UNLOCK is inappropriate, because
> now we hold a lock to the parent dir the whole time the symlink
> creation operation is in progress.  sys_symlink promptly discards the
> returned value, but the system call would return an error if
> e.g. someone snuck an unlink in between the venus symlink creation and
> looking up the symlink.
>
> On running with just VOP_LOOKUP, I got a locking against myself panic
> in namei from a stat operation following the symlink.  After running
> gdb, I noticed LOCKPARENT set in the componentname structure passed to
> VOP_LOOKUP.  After reading a lot of code/man pages, I found that
> sys_symlink sets LOCKPARENT in nameidata to get the locked directory
> where the symlink is to be placed.  This makes sense, because
> VOP_SYMLINK requires a locked directory vnode. But:
>
> The componentname struct from nd, still containing the LOCKPARENT
> flag, is passed to VOP_SYMLINK.  VOP_SYMLINK is not defined to pay
> attention to this.  But when calling VOP_LOOKUP with the same cnp, the
> parent remains locked, and we lose on the next access with locking
> against myself.

Ok, this I don't understand. If the parent is locked, and you're calling
lookup, what is the problem? You WANT the parent to be locked, no?

Or is it that when you return you return with the parent locked, which is
wrong?

> Also, it looks like if venus_symlink fails that tdvp will still be
> locked.  vnode_if.src indicates that dvp should be unlocked on error.
> So I added a vput in that case.  I think most coda problems are found
> in the namei, and once we get that far venus is likely to perform the
> symlink.
>
> So, I think sys_symlink is buggy, and should clean up the LOCKPARENT
> flag in the struct componentname before reusing it.  At a minimum,
> having extra flags that don't have the obvious semantics on some calls
> seems unclean and confusing.  (All diffs are against very recent
> netbsd-1-6.)

No. It's coda's issue. From another part of vnode_if.src:

# For operations other than VOP_LOOKUP which require a component name
# parameter, the flags required for the initial namei() call are listed.
# Additional flags may be added to the namei() call, but these are required.

which I take to mean you may get other flags. If you don't want them, be
defensive.

> --- vfs_syscalls.c.~1.1.1.3.~	Fri May 10 20:45:06 2002
> +++ vfs_syscalls.c	Fri Mar 28 14:35:16 2003
> @@ -1518,6 +1518,8 @@
>  	VATTR_NULL(&vattr);
>  	vattr.va_mode = ACCESSPERMS &~ p->p_cwdi->cwdi_cmask;
>  	VOP_LEASE(nd.ni_dvp, p, p->p_ucred, LEASE_WRITE);
> +	/* VOP_SYMLINK might call LOOKUP (e.g. coda), so LOCKPARENT isn't safe */
> +	nd.ni_cnd.cn_flags &= ~LOCKPARENT;
>  	error = VOP_SYMLINK(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr, path);
>  	if (error == 0)
>  		vput(nd.ni_vp);
>
>
> Here is the patch to coda_vnops that works for me (thousands of
> create/delete symlink operations with emacs/cfs/coda were ok),
> followed by the snippet of my current code.  This patch works even
> without the sys_symlink patch above.

Cool! Then this is probably the way to go!

> Index: coda_vnops.c
> ===================================================================
> RCS file: /FOO-CVS/netbsd/src/sys/coda/coda_vnops.c,v
> retrieving revision 1.1.1.2
> diff -u -u -r1.1.1.2 coda_vnops.c
> --- coda_vnops.c	2001/12/06 04:27:40	1.1.1.2
> +++ coda_vnops.c	2003/03/28 19:28:46
> @@ -1642,26 +1642,17 @@
>      /* Invalidate the parent's attr cache, the modification time has changed */
>      tdcp->c_flags &= ~C_VATTR;
>
> -    if (!error)
> -    {
> -	struct nameidata nd;
> -	NDINIT(&nd, LOOKUP, FOLLOW|LOCKLEAF, UIO_SYSSPACE, nm, p);
> -	nd.ni_cnd.cn_cred = cred;
> -	nd.ni_loopcnt = 0;
> -	nd.ni_startdir = tdvp;
> -	nd.ni_cnd.cn_pnbuf = (char *)nm;
> -	nd.ni_cnd.cn_nameptr = nd.ni_cnd.cn_pnbuf;
> -	nd.ni_pathlen = len;
> -	vput(tdvp);
> -	error = lookup(&nd);
> -	*ap->a_vpp = nd.ni_vp;
> -    }
> -
> -    /*
> -     * Free the name buffer
> -     */
> -    if ((cnp->cn_flags & SAVESTART) == 0) {
> -	PNBUF_PUT(cnp->cn_pnbuf);
> +    if (!error) {
> +	/*
> +	 * We don't want to keep the parent locked, but sys_symlink left
> +	 * flag turds from an earlier namei call.
> +	 */
> +        cnp->cn_flags &= ~LOCKPARENT;
> +	error = VOP_LOOKUP(tdvp, ap->a_vpp, cnp);
> +	/* Either an error occurs, or ap->a_vpp is locked. */
> +    } else {
> +	/* error, so unlock and dereference parent */
> +        vput(tdvp);
>      }
>
>   exit:

Take care,

Bill