tech-kern archive


Re: Vnode API change: VOP_LOCK



   Date: Tue, 25 Feb 2014 15:11:04 +0100
   From: J. Hannken-Illjes <hannken@eis.cs.tu-bs.de>

   I'm quite sure we cannot kill ALL instances of LK_RETRY as

OK, perhaps not yet.  But:

   - at least the union file system needs the unlock / relock when
     looking up DOTDOT, and the relock has to succeed because the
     protocol wants the node locked on return.  I suppose there are
     more such cases inside the file systems.

I'm not familiar enough with unionfs to say, but in tmpfs and ufs at
least, and probably in all the other normal file systems, the only
unlock / lock-`..' / relock dances are immediately followed by
unlocking the `..' vnode again, because VOP_LOOKUP now returns the
vnode unlocked.  So we could get rid of the relock-and-then-unlock
altogether in those cases.
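
For concreteness, the dance I mean has roughly this shape -- a
sketch, not the literal ufs/tmpfs code, and assuming VFS_VGET still
hands the vnode back locked:

#include <sys/param.h>
#include <sys/mount.h>
#include <sys/vnode.h>

/*
 * Sketch of the `..' case in VOP_LOOKUP.  dvp is locked on entry and
 * must be locked on return; *vpp is now returned unlocked.
 * dotdot_ino stands in for however the file system finds the
 * parent's inode number.
 */
static int
lookup_dotdot(struct vnode *dvp, ino_t dotdot_ino, struct vnode **vpp)
{
	struct vnode *pdp;
	int error;

	VOP_UNLOCK(dvp);	/* lock order: parent before child */
	error = VFS_VGET(dvp->v_mount, dotdot_ino, &pdp);
	vn_lock(dvp, LK_EXCLUSIVE | LK_RETRY);	/* the relock that must not fail */
	if (error)
		return error;
	VOP_UNLOCK(pdp);	/* VFS_VGET returned pdp locked; unlock it again */
	*vpp = pdp;
	return 0;
}

If the `..' vnode didn't have to come back locked, there would be no
reason to unlock dvp in the first place, and the infallible relock
would go away with it.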

There's a different dance for the genealogy analysis in rename, but
that just iterates unlock/lock .. -- there's no relocking, and it can
bail safely on failure.
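
Roughly this shape, as a sketch -- lookup_parent() is a made-up
stand-in for the `..' lookup, and the mount and tdvp == fdvp checks
are elided:

/*
 * Refuse the rename if fdvp is an ancestor of tdvp, walking `..'
 * upward from tdvp.  Each step is lock / look up / unlock; no relock
 * ever has to succeed, so any failure can simply abort the rename.
 * lookup_parent() is hypothetical: it returns the parent referenced
 * and unlocked.
 */
static int
check_genealogy(struct vnode *fdvp, struct vnode *tdvp,
    struct vnode *rootvp)
{
	struct vnode *vp, *dvp;
	int error;

	vp = tdvp;
	vref(vp);
	while (vp != rootvp) {
		if (vp == fdvp) {
			vrele(vp);
			return EINVAL;	/* fdvp is an ancestor: refuse */
		}
		error = vn_lock(vp, LK_EXCLUSIVE);
		if (error == 0) {
			error = lookup_parent(vp, &dvp);
			VOP_UNLOCK(vp);
		}
		vrele(vp);
		if (error)
			return error;	/* bail safely */
		vp = dvp;
	}
	vrele(vp);
	return 0;
}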

As for unionfs, it might be helpful if the lock order were written
down somewhere.  It looks like the order is generally upper-to-lower,
but I'm not sure.  Does union_lookup ever need to lock the node it's
looking up, at any level?  It seems to me that it should be necessary
only to hold the directory locked.

   - this would break our revoke(2) semantics.  It must be possible to read
     and close a revoked character device.

Hmm, OK.  But is there any use for vn_lock(LK_RETRY) in a vop in a
file system, rather than in a syscall that is using VOP_READ or
VOP_CLOSE?
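
That is, I'd expect the LK_RETRY to live only at the level of vn_read
and friends in kern/vfs_vnops.c, along these lines (simplified
sketch, not the verbatim code):

	/*
	 * Syscall level: the vnode may have been revoked since the
	 * file was opened.  LK_RETRY makes the lock succeed anyway,
	 * and VOP_READ on the now-dead vnode is serviced by deadfs.
	 */
	vn_lock(vp, LK_SHARED | LK_RETRY);
	error = VOP_READ(vp, uio, ioflag, cred);
	VOP_UNLOCK(vp);

Inside a file system's own vops, the vnode arrives already locked by
the caller, so it's not obvious to me where the file system itself
would need LK_RETRY.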

   - vrelel() and vclean() need it to work on layered vnodes.  When the
     lower node of a layer node gets revoked it must still be possible
     to lock the layer node to inactivate and reclaim it.

     The change of VOP_LOCK() to vn_lock(..., LK_RETRY) for vrelel() and
     vclean() was missing from the diff.

I will have to think harder about this.
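
If I understand the case, the shape in question is roughly this
fragment -- my reading of it, not a quote of the actual vfs_subr.c
code:

	/*
	 * vrelel/vclean: the vnode lock must be taken to inactivate
	 * or reclaim.  A layer vnode's lock involves the lower vnode;
	 * once that has been revoked, a plain vn_lock() fails with
	 * ENOENT, so LK_RETRY is needed for the lock to be
	 * guaranteed.
	 */
	bool recycle;

	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);	/* cannot fail with LK_RETRY */
	VOP_INACTIVE(vp, &recycle);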

   > The LK_INTERLOCK flag doesn't seem right.  It looks like the only
   > place you use it is a place where you recently removed some code to
   > drop the interlock before taking the vnode lock.  We ought to be
   > removing superfluous extra states, not adding more lock order
   > workarounds, especially ones that involve asking more of the
   > particular file systems by passing them new flags.

   OK.

   New diff at http://www.netbsd.org/~hannken/vnode-pass3-2.diff

That looks better, thanks.

   > A little more broadly, do you have a plan for what the vnode life
   > cycle should look like?  We ought to have a well-defined set of states
   > that a vnode can be in (something like embryonic, active, cached,
   > dead) and a clear set of transitions with invariants and lock order
   > and a clear protocol for file systems to implement.  Without that, I'm
   > sceptical of churn in vrelel, vclean, &c.

   - outside of vfs a vnode is in one of the states non-existent,
     inactive, active, or dead.

When you say `outside of vfs', do you mean `everywhere except locked
logic in kern/vfs_*', or `everywhere except locked logic in kern/vfs_*
and logic in the file systems'?

   - vget() changes from inactive to active.
   - vrele() changes from active to inactive.
   - vrecycle() changes from inactive to dead.
   - vgone() and vrevoke() change from either active or inactive to dead.

   These transitions are mutually exclusive, protected by VI_CHANGING.

OK, that's a good start.  There are a lot of details to fill in,
though, aside from the parts you already mentioned as missing.

- How is the state of the vnode recorded so we can write kasserts
about it?

- What lock covers the state of the vnode -- or, why do we need
VI_CHANGING in addition to the interlock?

- How does it fit into file systems' vop implementations?  That is,
for each vop, when it gets a locked vnode, what state is the vnode in
and what state is it expected to be in when the vop returns?  When a
vop gets an unlocked vnode, what can it rely on about that vnode?
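
To make the first two questions concrete, here is the sort of thing I
have in mind -- purely illustrative, the v_state field and the names
are made up, not proposed code:

/* An explicit life-cycle state, protected by the vnode interlock. */
enum vnode_state {
	VS_EMBRYONIC,	/* being created by the file system */
	VS_ACTIVE,	/* v_usecount > 0 */
	VS_INACTIVE,	/* v_usecount == 0, cached for reuse */
	VS_DEAD,	/* reclaimed or revoked */
};

static void
vstate_change(struct vnode *vp, enum vnode_state from,
    enum vnode_state to)
{
	KASSERT(mutex_owned(vp->v_interlock));
	KASSERTMSG(vp->v_state == from, "vnode %p: state %d, expected %d",
	    vp, (int)vp->v_state, (int)from);
	vp->v_state = to;
}

Then vget() would assert inactive -> active under the interlock,
vrele() the reverse, vrecycle()/vgone()/vrevoke() the transitions to
dead, and a transition in progress could perhaps be a state of its
own rather than a VI_CHANGING bit on the side.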

