tech-kern archive


Re: Vnode API change: add global vnode cache

On 26 Apr 2014, at 21:22, David Holland <> wrote:

> On Tue, Apr 15, 2014 at 04:11:58PM +0000, Taylor R Campbell wrote:
>>   New diff at
>>   Plan to commit early Wednesday ...
>> I still don't think this approach is right.  It makes a long-term copy
>> of logic in getnewvnode (because this vcache(9) will not be applicable
>> everywhere), there's no reason to use kpause or any new condvars when
>> we already have one inside each vnode which we'll be using anyway in
>> vget, and it still increases the overhead of every vnode using it.
>> I don't object to the principle of VFS_LOAD_NODE, or VOP_LOAD, but at
>> the moment I think it will cause greater divergence between file
>> systems and maintenance headaches as a result.
>> As an alternative, I propose the set of patches at
>> <>, to do the
>> following:
>> [...]
> Wading into this after the fact, I see the following points:
> - Duplicating the getnewvnode logic is definitely bad; we have enough
> cruft without adding new redundant but not quite equivalent code
> paths.
> - It seems to me that in the long run, all of this baloney should be
> hidden away inside the vfs layer; filesystems that use the vnode cache
> should only need to call vcache_get, and the only thing that should
> ever see a partially initialized vnode is VOP_LOAD and nothing outside
> the vnode cache should ever see a vnode with refcount zero.
> This in
> particular means that all the various forms of vget() should be hidden
> away, as should getnewvnode. It seems to me like both of these patch
> sets are steps in this direction, but we're far enough away from the
> destination that it's hard to choose between them.
> - A while back (one or two rounds ago) I floated an idea that had
> separate vnode key objects for the same reason they appear here: to
> make insertion atomic. Riastradh convinced me at the time that this
> wasn't necessary, so I dropped it and never got around to posting it
> on tech-kern. However, it had a characteristic that I don't see here:
> locks in the vnode keys. The idea of that was to introduce a level of
> indirection so that nothing like VI_CHANGING was needed: while loading
> a vnode, you unlock the table but leave the key locked, and that takes
> care, in an orderly way, of making sure nobody else sees it. If we're
> going to end up with vnode key objects, I think they should contain
> locks in favor of having exposed or semi-exposed state flags.

While I agree that atomic insertion is a mandatory property, I
don't see the need for a per-key lock.  As collisions are very rare,
waiting and retrying on the hash table lock with key->vnode being
NULL, or on a per-table cv as I did before, is sufficient (at least
for now).
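To make the wait-and-retry scheme concrete, here is a user-space
sketch using pthreads.  All names and types are simplified stand-ins,
not the real vcache(9) code: a load in progress is marked by inserting
the key with a NULL vnode pointer; colliders wait on a per-table cv
and retry, so no per-key lock is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types -- not the real kernel structures. */
struct vnode { int v_data; };

struct vcache_key {
	const char *vk_key;
	struct vnode *vk_vnode;		/* NULL while the vnode is loading */
	struct vcache_key *vk_next;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t table_cv = PTHREAD_COND_INITIALIZER;
static struct vcache_key *table_head;

static struct vcache_key *
lookup_locked(const char *key)
{
	struct vcache_key *vk;

	for (vk = table_head; vk != NULL; vk = vk->vk_next)
		if (strcmp(vk->vk_key, key) == 0)
			return vk;
	return NULL;
}

/*
 * Look up a vnode by key, loading it on a miss.  A collision with a
 * load in progress is detected by vk_vnode being NULL; we wait on the
 * per-table cv and retry -- no per-key lock.
 */
static struct vnode *
vcache_get(const char *key)
{
	struct vcache_key *vk;
	struct vnode *vp;

	pthread_mutex_lock(&table_lock);
	while ((vk = lookup_locked(key)) != NULL) {
		if (vk->vk_vnode != NULL) {	/* hit: fully loaded */
			vp = vk->vk_vnode;
			pthread_mutex_unlock(&table_lock);
			return vp;
		}
		/* Load in progress elsewhere: wait and retry. */
		pthread_cond_wait(&table_cv, &table_lock);
	}
	/* Miss: insert the key with a NULL vnode and load it ourselves. */
	vk = malloc(sizeof(*vk));
	vk->vk_key = key;
	vk->vk_vnode = NULL;
	vk->vk_next = table_head;
	table_head = vk;
	pthread_mutex_unlock(&table_lock);

	vp = malloc(sizeof(*vp));	/* "load" with the table unlocked */
	vp->v_data = 42;

	pthread_mutex_lock(&table_lock);
	vk->vk_vnode = vp;
	pthread_cond_broadcast(&table_cv);	/* wake waiters to retry */
	pthread_mutex_unlock(&table_lock);
	return vp;
}
```

Note that only the table is ever locked; nothing outside the loading
thread can see the vnode before vk_vnode is set, which is exactly the
property a per-key lock would otherwise provide.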

> - There are still filesystems that do not use the vnode cache. Half
> the problem with the legacy scheme we have is that it tries to provide
> the expire half of the cache to all filesystems, even those that don't
> use or need the lookup half of the cache. This is fundamentally wrong.
> I think rather than trying to continue with this folly we should do
> something different that keeps these filesystems separate.
> - Question 1: How much overhead would it really cause if we didn't
> bother to cycle tmpfs vnodes "in" and "out" but just materialized them
> along with the tmpfs_node and kept them around until someone destroys
> the tmpfs_node? vnodes have a fair amount of slop in them, but they
> aren't that big and if you're using tmpfs you presumably have enough
> memory to keep whole files in RAM anyway. In this case we could just
> provide a constructor and destructor for non-cache vnodes, decouple
> that from the lifecycle logic (and firewall it off with plenty of
> assertions and such) and a lot of the silly goes away.

Once we reach the point where this decision has to be made, it may
be better to use the vnode cache for tmpfs than to make tmpfs the
only fs that is different.

> - Question 2: The other obvious candidates for not using the vnode
> cache are virtual constructs like kernfs and procfs. ISTM that
> "loading" and "unloading" their vnodes is equally pointless and that
> the same solution applies. The question is: are there any other fses
> that wouldn't use the vnode cache for some other reason, such that
> this reasoning wouldn't apply? I can't think of any, and it seems like
> anything where there are genuine load and unload operations for vnodes
> should be using the cache, but that doesn't prove anything.

Kernfs and procfs already use a hash table and are candidates for
the vnode cache.

> Also, it should be vcache_get(), not intern; with all "intern"
> operations I can think of, you pass in the object to be interned,
> not just a key for it. If we had vcache_intern(), the model would be
> that you consed up a vnode, then passed it (along with the key) to
> vcache_intern(), which would then either insert it into the cache and
> return it, or destroy it and return the existing equivalent one. This
> is not really what we want.

Agreed.  And we may also need a vcache_lookup() that looks up, but
does not load, a node.

> and FWIW, I favor VOP_LOAD over VFS_LOADVNODE.

I prefer a fs operation over a vnode operation to initialise a
vnode.  With a vnode operation we would have to either pass the
vnode operations vector to vcache_get(), which is ugly, or set it
from "*mp->mnt_op->vfs_opv_descs[0]->opv_desc_vector_p" before
calling VOP_LOAD().  To me it looks better to have the fs set up the
operations vector (which may not be the default one).

Could we finally agree on this interface:

vcache_get(mp, key, key_len, vpp) to lookup and possibly load a vnode.
vcache_lookup(mp, key, key_len, vpp) to lookup a vnode.
vcache_remove(mp, key, key_len) to remove a vnode from the cache.
VFS_LOAD_NODE(mp, vp, key, key_len, new_key) to initialise a vnode.
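To sketch how the pieces might fit together, here is a toy user-space
model of that interface.  Everything is hypothetical and heavily
simplified (a one-entry cache, no locking, the new_key argument of
VFS_LOAD_NODE omitted); the point is only the division of labour: the
cache allocates the vnode and calls the fs operation, and the fs, not
the cache, sets the (possibly non-default) operations vector.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures. */
struct mount;
struct vnodeops { const char *vo_name; };

struct vnode {
	struct mount *v_mount;
	const struct vnodeops *v_op;	/* set by the fs, not by vcache */
	void *v_data;
};

struct mount {
	/* Hypothetical fs operation: initialise vp from key. */
	int (*mnt_load_node)(struct mount *, struct vnode *,
	    const void *, size_t);
};

#define VFS_LOAD_NODE(mp, vp, key, key_len) \
	((mp)->mnt_load_node((mp), (vp), (key), (key_len)))

/* Toy cache: a single entry is enough for the sketch. */
static struct { char key[64]; size_t len; struct vnode *vp; } cache_entry;

/* vcache_lookup(): find a cached vnode, never load one. */
static int
vcache_lookup(struct mount *mp, const void *key, size_t key_len,
    struct vnode **vpp)
{
	(void)mp;
	if (cache_entry.vp != NULL && cache_entry.len == key_len &&
	    memcmp(cache_entry.key, key, key_len) == 0) {
		*vpp = cache_entry.vp;
		return 0;
	}
	return -1;	/* ENOENT stand-in */
}

/* vcache_get(): look up, loading via the fs operation on a miss. */
static int
vcache_get(struct mount *mp, const void *key, size_t key_len,
    struct vnode **vpp)
{
	struct vnode *vp;

	if (vcache_lookup(mp, key, key_len, vpp) == 0)
		return 0;
	vp = calloc(1, sizeof(*vp));
	vp->v_mount = mp;
	if (VFS_LOAD_NODE(mp, vp, key, key_len) != 0) {
		free(vp);
		return -1;
	}
	memcpy(cache_entry.key, key, key_len);
	cache_entry.len = key_len;
	cache_entry.vp = vp;
	*vpp = vp;
	return 0;
}

/* vcache_remove(): drop a vnode from the cache. */
static void
vcache_remove(struct mount *mp, const void *key, size_t key_len)
{
	(void)mp;
	if (cache_entry.vp != NULL && cache_entry.len == key_len &&
	    memcmp(cache_entry.key, key, key_len) == 0)
		cache_entry.vp = NULL;
}

/* Example fs: its load op picks the operations vector itself. */
static const struct vnodeops examplefs_vnodeops = { "examplefs" };

static int
examplefs_load_node(struct mount *mp, struct vnode *vp,
    const void *key, size_t key_len)
{
	(void)mp; (void)key; (void)key_len;
	vp->v_op = &examplefs_vnodeops;
	vp->v_data = NULL;
	return 0;
}
```

With this split, only VFS_LOAD_NODE ever sees a partially initialised
vnode, and the fs keeps full control of its operations vector.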

J. Hannken-Illjes - - TU Braunschweig 
