tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Vnode API change: add global vnode cache

On Tue, Apr 15, 2014 at 04:11:58PM +0000, Taylor R Campbell wrote:
 >    New diff at
 >    Plan to commit early wednesday ...
 > I still don't think this approach is right.  It makes a long-term copy
 > of logic in getnewvnode (because this vcache(9) will not be applicable
 > everywhere), there's no reason to use kpause or any new condvars when
 > we already have one inside each vnode which we'll be using anyway in
 > vget, and it still increases the overhead of every vnode using it.
 > I don't object to the principle of VFS_LOAD_NODE, or VOP_LOAD, but at
 > the moment I think it will cause greater divergence between file
 > systems and maintenance headaches as a result.
 > As an alternative, I propose the set of patches at
 > <>, to do the
 > following:
 > [...]

Wading into this after the fact, I see the following points:

 - Duplicating the getnewvnode logic is definitely bad; we have enough
cruft without adding new redundant but not quite equivalent code

 - It seems to me that in the long run, all of this baloney should be
hidden away inside the vfs layer; filesystems that use the vnode cache
should only need to call vcache_get, and the only thing that should
ever see a partially initialized vnode is VOP_LOAD and nothing outside
the vnode cache should ever see a vnode with refcount zero. This in
particular means that all the various forms of vget() should be hidden
away, as should getnewvnode. It seems to me like both of these patch
sets are steps in this direction, but we're far enough away from the
destination that it's hard to choose between them.

 - A while back (one or two rounds ago) I floated an idea that had
separate vnode key objects for the same reason they appear here: to
make insertion atomic. Riastradh convinced me at the time that this
wasn't necessary, so I dropped it and never got around to posting it
on tech-kern. However, it had a characteristic that I don't see here:
locks in the vnode keys. The idea of that was to introduce a level of
indirection so that nothing like VI_CHANGING was needed: while loading
a vnode, you unlock the table but leave the key locked, and that takes
care, in an orderly way, of making sure nobody else sees it. If we're
going to end up with vnode key objects, I think they should contain
locks in favor of having exposed or semi-exposed state flags.

 - There are still filesystems that do not use the vnode cache. Half
the problem with the legacy scheme we have is that it tries to provide
the expire half of the cache to all filesystems, even those that don't
use or need the lookup half of the cache. This is fundamentally wrong.
I think rather than trying to continue with this folly we should do
something different that keeps these filesystems separate.

 - Question 1: How much overhead would it really cause if we didn't
bother to cycle tmpfs vnodes "in" and "out" but just materialized them
along with the tmpfs_node and kept them around until someone destroys
the tmpfs_node? vnodes have a fair amount of slop in them, but they
aren't that big and if you're using tmpfs you presumably have enough
memory to keep whole files in RAM anyway. In this case we could just
provide a constructor and destructor for non-cache vnodes, decouple
that from the lifecycle logic (and firewall it off with plenty of
assertions and such) and a lot of the silly goes away.

 - Question 2: The other obvious candidates for not using the vnode
cache are virtual constructs like kernfs and procfs. ISTM that
"loading" and "unloading" their vnodes is equally pointless and that
the same solution applies. The question is: are there any other fses
that wouldn't use the vnode cache for some other reason, such that
this reasoning wouldn't apply? I can't think of any, and it seems like
anything where there are genuine load and unload operations for vnodes
vnodes should be using the cache, but that doesn't prove anything.

Also, it should be vcache_get(), not intern; with all "intern"
operations I can think of, you pass in the object to be interned,
not just a key for it. If we had vcache_intern(), the model would be
that you consed up a vnode, then passed it (along with the key) to
vcache_intern(), which would then either insert it into the cache and
return it, or destroy it and return the existing equivalent one. This
is not really what we want.


David A. Holland

Home | Main Index | Thread Index | Old Index