Subject: Re: [solution] vnode refcount panic, perhaps due to kern/vfs_lookup.c:lookup()
To: Bill Studenmund <wrstuden@netbsd.org>
From: Greg Troxel <gdt@ir.bbn.com>
List: tech-kern
Date: 03/28/2003 20:30:23
  > The componentname struct from nd, still containing the LOCKPARENT
  > flag, is passed to VOP_SYMLINK.  VOP_SYMLINK is not defined to pay
  > attention to this.  But when calling VOP_LOOKUP with the same cnp, the
  > parent remains locked, and we lose on the next access with locking
  > against myself.

  Ok, this I don't understand. If the parent is locked, and you're calling
  lookup, what is the problem? You WANT the parent to be locked, no?

  Or is it that when you return you return with the parent locked, which is
  wrong?

The latter.  Calling VOP_LOOKUP, it's locked, because it was locked
when coda_symlink was called, and all is well.  The problem is that
VOP_LOOKUP respects the LOCKPARENT flag and returns the parent locked,
and we want to leave the parent unlocked, as VOP_SYMLINK requires.

  No. It's coda's issue. From another part of vnode_if.src:

  # For operations other than VOP_LOOKUP which require a component name
  # parameter, the flags required for the initial namei() call are listed.
  # Additional flags may be added to the namei() call, but these are required.

  which I take to mean you may get other flags. If you don't want them, be
  defensive.

I read it otherwise, saying that you have to call namei with some
flags, but you can put other flags on the namei call if you have some
reason.  The issue I'm talking about is whether it is ok to have
random flags, such as LOCKPARENT, set on random vnodeops .  And if so,
whether they have the described semantics (the notion of LOCKPARENT
makes sense for VOP_SYMLINK), or are _required to be ignored_ by the
vnodeop.  Currently, flags like LOCKPARENT appear to be ignored
(leaving VOP_LOOKUP aside as a special case).

ufs_symlink, for example, does not respect the LOCKPARENT flag; the
vput is buried deep in ufs_makeinode.  ufs_mkdir is similar.

     The top half of the structure is used exclusively for the
     pathname lookups using VOP_LOOKUP() and is initialised by the
     caller.

This is the place where it says that operations other than lookup have
to ignore the flags.

In the coda_symlink case, LOCKPARENT was on the namei call before
VOP_SYMLINK, as required.  coda_symlink should ignore LOCKPARENT, and
thus can't just use the componentname to do a lookup.  The bug is
arguably in the new coda_symlink code, since VOP_LOOKUP is now called
with the same componentname structure passed to VOP_SYMLINK.  

The unintended consequences (leaving the parent locked) of reusing
componentname like this violates the Principle of Least Astonishment,
and I think it is unclean.  Flags that are defined to be ignored
shouldn't be set, and perhaps a #ifdef DIAGNOSTIC_EXTRA_PARANOID in
vnode_if.c to check for them.  This issue is quite common in
kern/vfs_syscalls.c, though.  Arguably it is just plain wrong to call
lookup reusing a componentname struct one was passed.

The LOCKPARENT issue is purely stylistic; with the &= ~LOCKPARENT in
coda_symlink I am not aware of any incorrect behavior.

  Cool! Then this is probably the way to go!

I think you are saying that you like the patch to coda_symlink, and
that sys_symlink is correct.  Committing the coda_symlink patch would
be nice; the current code is certainly incorrect, although without the
complication of cfs the bug seems not to be triggered.