Re: Vnode API change: VOP_LOCK

To: Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost>
Subject: Re: Vnode API change: VOP_LOCK
From: J. Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost>
Date: Wed, 26 Feb 2014 09:49:00 +0100

On Feb 25, 2014, at 10:28 PM, Taylor R Campbell 
<campbell+netbsd-tech-kern%mumble.net@localhost> wrote:

>   Date: Tue, 25 Feb 2014 15:11:04 +0100
>   From: J. Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost>
> 
>   I'm quite sure we cannot kill ALL instances of LK_RETRY as
> 
> OK, perhaps not yet.  But:
> 
>   - at least union file system needs the unlock / relock when looking
>     up DOTDOT and the relock has to succeed as the protocol wants
>     the node locked on return.  I suppose there are more such cases
>     inside the file systems.
> 
> I'm not familiar enough with unionfs to say, but in tmpfs and ufs, at
> least, and probably all the other normal file systems, the only
> unlock/lock ../relock dances are immediately followed by unlock again
> because VOP_LOOKUP now returns the vnode unlocked.  So we could get
> rid of the relock-and-then-unlock altogether in those cases.

Union file system operates on namespace and has to do these dances
independent of VOP_LOOKUP returning locked or unlocked vnodes.

It is always fun to keep layered and union file system working ...

> There's a different dance for the genealogy analysis in rename, but
> that just iterates unlock/lock .. -- there's no relocking, and it can
> bail safely on failure.
> 
> As for unionfs, it might be helpful if the lock order were written
> down somewhere.  It looks like the order is generally upper-to-lower,
> but I'm not sure.  Does union_lookup ever need to lock the node it's
> looking up, at any level?  It seems to me that it should be necessary
> only to hold the directory locked.
> 
>   - this would break our revoke(2) semantics.  It must be possible to read
>     and close a revoked character device.
> 
> Hmm, OK.  But is there any use for vn_lock(LK_RETRY) in a vop in a
> file system, rather than in a syscall that is using VOP_READ or
> VOP_CLOSE?

VOP_READ demands the vnode locked on entry so it has to be locked
with LK_RETRY before VOP_READ may be called.


>   - vrelel() and vclean() need it to work on layered vnodes.  When the
>     lower node of a layer node gets revoked it must still be possible
>     to lock the layer node to inactivate and reclaim it.
> 
>     The change of VOP_LOCK() to vn_lock(..., LK_RETRY) for vrelel() and
>     vclean() was missing from the diff.
> 
> I will have to think harder about this.
> 
>> The LK_INTERLOCK flag doesn't seem right.  It looks like the only
>> place you use it is a place where you recently removed some code to
>> drop the interlock before taking the vnode lock.  We ought to be
>> removing superfluous extra states, not adding more lock order
>> workarounds, especially ones that involve asking more of the
>> particular file systems by passing them new flags.
> 
>   OK.
> 
>   New diff at http://www.netbsd.org/~hannken/vnode-pass3-2.diff
> 
> That looks better, thanks.
> 
>> A little more broadly, do you have a plan for what the vnode life
>> cycle should look like?  We ought to have a well-defined set of states
>> that a vnode can be in (something like embryonic, active, cached,
>> dead) and a clear set of transitions with invariants and lock order
>> and a clear protocol for file systems to implement.  Without that, I'm
>> sceptical of churn in vrelel, vclean, &c.
> 
>   - outside of vfs a vnode is one of non-existant, inactive, active or dead.
> 
> When you say `outside of vfs', do you mean `everywhere except locked
> logic in kern/vfs_*', or `everywhere except locked logic in kern/vfs_*
> and logic in the file systems'?

I meant kern/vfs_*.

>   - vget() changes from inactive to active.
>   - vrele() changes from active to inactive.
>   - vrecycle() changes from inactive to dead.
>   - vgone() and vrevoke() changes from either active or inactive to dead.
> 
>   These transitions are mutual exclusive, protected by VI_CHANGING.
> 
> OK, that's a good start.  There are a lot of details to fill in,
> though, aside from the parts you already mentioned as missing.
> 
> - How is the state of the vnode recorded so we can write kasserts
> about it?

Not yet but it will come with the cleanup of VI_CLEAN/VI_XLOCK.

> - What lock covers the state of the vnode -- or, why do we need
> VI_CHANGING in addition to the interlock?

No lock but the internal flag VI_CHANGING protected by v_interlock.

> - How does it fit into file systems' vop implementations?  That is,
> for each vop, when it gets a locked vnode, what state is the vnode in
> and what state is it expected to be in when the vop returns?  When a
> vop gets an unlocked vnode, what can it rely on about that vnode?


Vnode operations get active (referenced) vnodes.  If the vnode is locked
it cannot change state to inactive or dead.  Unlocked vnodes may change
to dead as a result of revoke (or forced unmount).

Same holds for return, locked vnodes are still active, unlocked vnodes
are either active or dead and vrele'd vnodes may be inactive if we
dropped the last reference.

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig 
(Germany)

References:
- Re: Vnode API change: VOP_LOCK
  - From: Taylor R Campbell

Prev by Date: Re: ci_mtx_count issue on NetBSD/Milkymist
Next by Date: Re: ci_mtx_count issue on NetBSD/Milkymist
Previous by Thread: Re: Vnode API change: VOP_LOCK
Next by Thread: pmap_kenter_pa & pmap_kremove
Indexes:

Home | Main Index | Thread Index | Old Index