tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Blocking vcache_tryvget() across VOP_INACTIVE() - unneeded



On Thu, Jan 16, 2020 at 04:51:44AM +0100, Mateusz Guzik wrote:

> On 1/15/20, Andrew Doran <ad%netbsd.org@localhost> wrote:
> > I don't understand why we do this.  I don't think it's needed at all
> > because
> > if the file really is deleted and the inode is getting cleaned out, then it
> > shouldn't be getting new refs in any of the usual ways and we don't need to
> > worry about it.  And in any case the vnode lock prevents anyone from
> > messing
> > with the inode during VOP_INACTIVE().
> >
> > I want to change this because it causes nasty problems on MP.  For example
> > I
> > see threads very regulary thrown out of the namecache by vcache_vtryget(),
> > and when they finally get the vnode ref'd and locked they do it by digging
> > through the file system metadata (when everything is already in cache).
> >
> >       http://www.netbsd.org/~ad/inactive.diff
> >
> 
> I think looking at a bigger picture suggests a different solution, which
> will also keep the current blocked <-> loaded transition for consideration
> later.
> 
> I'm assuming the goal for the foreseeable future is to achieve path lookup
> with shared vnode locks.

Good guess.  There is a prototype of LK_SHARED lookup on the ad-namecache
branch, along with lookup using namecache only that takes as few vnode locks
and refcounts as possible.  In the easiest case only the root vnode and the
leaf vnode are touched (although whether the root vnode ever actually needs
a reference is another question).  There are many "non-easy" cases though.

> Since ->v_usecont transition to 0 happens all the time and VOP_INACTIVE
> processing is almost never needed, the current requirement to hold an
> exclusive lock just to find out there is nothing to do should be lifted.
> Apart from getting away with only shared lock (or no locks) this would
> also naturally eliminate vnode state changes for the common case.
> 
> Preferably vfs would provide a method for filesystems to call to signify
> they want VOP_INACTIVE to be called for given vnode, then vrelel could
> just test a bit. However, this is probably not worth the effort right now.
> 
> Said locking was also a problem in FreeBSD. It got temporarily sorted
> out with introduction of VOP_NEED_INACTIVE -- by default it returns 1,
> but filesystems which provide their own method easily avoid the problem.
> I think starting with requiring at least shared lock would do the trick
> just fine while providing race-free methods at least for ufs and tmpfs.
> 
> Then vrelel could stick to:
> diff --git a/sys/kern/vfs_vnode.c b/sys/kern/vfs_vnode.c
> index d05d5180f730..45d80dab4a2c 100644
> --- a/sys/kern/vfs_vnode.c
> +++ b/sys/kern/vfs_vnode.c
> @@ -774,7 +774,7 @@ vrelel(vnode_t *vp, int flags, int lktype)
>          * If not clean, deactivate the vnode, but preserve
>          * our reference across the call to VOP_INACTIVE().
>          */
> -       if (VSTATE_GET(vp) == VS_RECLAIMED) {
> +       if (VSTATE_GET(vp) == VS_RECLAIMED || !VOP_NEED_INACTIVE(vp))
>                 VOP_UNLOCK(vp);
>         } else {
>                 VSTATE_CHANGE(vp, VS_LOADED, VS_BLOCKED);
> 
> which even with exclusive locks already avoids the blocked state problem.
> 
> Another option is to move some of the current body into smaller halpers
> and then let filesystem piece together their own VOP_VRELE.

Interesting.  I had tried another approach, a VOP_INACTIVE() that could be
called with an LK_SHARED lock.  It returned ENOLCK if there was work to do
that needed an exclusive lock, but it was ugly.  I will take a look at the
FreeBSD solution.

Thanks,
Andrew


Home | Main Index | Thread Index | Old Index