tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Vnode API change: add global vnode cache

On Thu, May 01, 2014 at 06:49:41PM +0200, J. Hannken-Illjes wrote:
 > > ok, in that case I think the only issue with the patch I looked at
 > > Sunday (was it that long ago already?) is that it does duplicate the
 > > getnewvnode logic -- if you've since fixed that I don't think I have
 > > further substantive objections.
 > What do you mean with "duplicate the getnewvnode logic" here?
 > Vfs_busy()/vfs_unbusy() and vfs_insmntque()?

ultimately, what Taylor was objecting to -- he convinced me last
weekend that it was a valid objection, but I don't have time to look
at the details right now :-/

 > > Having a lock and waiting on a cv and state bit are exactly
 > > equivalent, except that if you have a lock,
 > >  - neither the state nor the mechanism is exposed.
 > What state is exposed by "wait and retry"?

The state you're waiting on. If the state in question is that the
vnode pointer inside the table is null, I suppose then it's decently
hidden away.

 > > I would consider that an advantage, but it's not a killer.
 > > 
 > > I don't think holding the table lock (or even a lock on part of the
 > > table, like one hash chain) while loading a vnode is a good idea,
 > > because a layer vnode is likely to come back to the table while
 > > loading.
 > I never presented a patch where the node got loaded with the table lock
 > held.

ok good, I wasn't sure exactly what you meant and wanted to make sure
I'd addressed all the cases.

 > > Hmm. I think one of the reasons I had a separate lock (and a refcount
 > > on the key structure) in my scheme rather than just setting the vnode
 > > pointer to null is to avoid a race on reclaim or revoke: to reclaim
 > > you want to lock the key, unlock the table, do reclaim, and only then
 > > null the vnode pointer and unlock the key. (Then you relock the table,
 > > look up the key again, and take it out if the vnode pointer is still
 > > null.) If you only null the vnode pointer before running reclaim you
 > > either have to hold the table lock during reclaim (almost certainly
 > > not a good idea -- just yesterday in another context I was dealing
 > > with a fs that needed to load a vnode when reclaiming another) or you
 > > have a race where the vnode can be reloaded while reclaim is still in
 > > progress.
 > What is the difference between a locked cache node and a cache node
 > with NULL vnode pointer (where the vnode pointer gets examined or
 > changed with the table lock held)?

My scheme was trying to address the fact that you need to hold the
vnode lock to call VOP_RECLAIM and it isn't a good idea to let the
table lock couple to the vnode lock.

[various rambling deleted]

Your scheme does this too, and on further reflection, I think your
scheme is better.

IIUC your scheme is: lock the table, look up the key, sleep while it's
present with a null vnode pointer; then either enter a key with a null
vnode pointer, unlock the table, load, relock the table, relookup, set
the vnode pointer, broadcast, unlock; or null the vnode pointer,
unlock the table, reclaim, relock the table, relookup, remove the key,
broadcast, unlock.

(This is by deduction rather than by reading the code, so maybe I've
got it all wrong. But this is the scheme I think is better than mine.)

 > > I don't think that's necessary. For vnodes belonging to virtual
 > > objects, there's very little to be gained by counting them under
 > > maxvnodes or cycling them "out of memory". The actual underlying
 > > objects can't go away and the vnode itself is just a fairly small
 > > administrative structure.
 > Sounds reasonable -- but we are not at a point to make this decision.

I think we are; we aren't at the point to *do* it, but knowing more or
less where we want to go I think we can make the design decision.

 > >> Agreed.  And we may need vcache_lookup to lookup but not load a
 > >> node too.
 > > 
 > > When would one do that? I suppose there are probably reasons, but I
 > > can't think of any offhand. It's going to inherently be a race, also,
 > > since if it returns NULL or ENOENT or whatever it does if the vnode
 > > isn't loaded, the vnode could get loaded by the time the caller sees
 > > this. Consequently I suspect that most of the uses that someone might
 > > think they want would actually be wrong. :-/
 > At least union will need something like this.  The key of a union
 > node may change and there must be a way to "test" a key.  The file
 > system will have to serialise to prevent races.

Ew. It's been a long time since I looked at onionfs and I've blocked
most of it out. How does this happen? It probably needs to be able to
look up its own vnodes by the keys or identities of either the upper
or lower vnodes under it, but I'd think it'd be better served to
manufacture its own keys and have its own table for the layer
translation. That or we could have it enter both and allow the vnode
cache to point to multiple references of the same vnode, but that
seems like a can of worms.

How does onionfs assign its inode numbers?

On the other hand, as Taylor noted, msdosfs needs to be able to change
the key of a vnode, so maybe we should just expose such a function. It
should be relatively straightforward as it doesn't need to do anything
besides lock and shuffle the table.

 > But it may be better to not expose vcache_lookup() before we have
 > a need for it.


 > > I would prefer VFS_LOADVNODE rather than VFS_LOAD_NODE, [...]
 > Ok.


 > > What calls vcache_remove? Is this the external interface for revoke?
 > > (Reclaim should be triggered from inside the cache, at least in the
 > > long run.) We need a way for the fs to mark vnodes as not worth
 > > caching any more (e.g. because they've been unlinked), so they get
 > > reclaimed as soon as the refcount hits zero, but I don't think
 > > "remove" gives the right semantics.
 > The cache in its current form does not change the vnode life cycle.
 > So vcache_remove() is called from the file systems reclaim operation.

Right, ok.

 > This will likely change once the vnode cache becomes responsible for
 > the life cycle -- but one step after the other please.

Yes indeed!

David A. Holland

Home | Main Index | Thread Index | Old Index