tech-kern archive
Re: rototilling the vnode life cycle
Date: Fri, 28 Feb 2014 17:22:11 +0100
From: "J. Hannken-Illjes" <hannken%eis.cs.tu-bs.de@localhost>
On Feb 28, 2014, at 4:41 PM, Taylor R Campbell
<campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
> Actually, I think I may want to change that, so that vget can be done
> under a pserialized read -- i.e., atomics only, no locks or waits. I
> took a look at dholland's message from September and started to write
> a generic inode->vnode cache with pserialize, and realized vget needs
> to happen under it. But at least the explicit states will help to
> clarify the model for now, and will be a good fallback if I give up
> before making a pserialized inode->vnode cache.
I don't see a many reader / rare writer case here. Pserialize is
very expensive for the writer case (at the moment you even get a minimum
of 2 HZ for a pserialize_perform -- but this is a bug).
The many-reader case is lookups of cached vnodes; the writer case is
fetching a vnode you don't have already and must read from disk. It
would be nice to make the whole namei path for lookups lock-free.
Currently we can't do that because VOP_LOOKUP requires the vnode lock,
but we can dream!
Anyway, as I said, this pserialization might be a failed experiment.
We'll see. So far I think I can simplify the draft further by forcing
it to work with pserialize, though.
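For concreteness, here is a rough sketch of what such a lock-free
lookup path could look like, assuming a per-mount inode->vnode hash
table whose entries carry their own reference count.  All the names
here (vcache_*, vn_*) are made up for illustration; the only point is
that the reader side uses pserialize(9) plus atomics and never blocks:

#include <sys/types.h>
#include <sys/atomic.h>
#include <sys/pserialize.h>
#include <sys/queue.h>
#include <sys/vnode.h>

struct vcache_node {
	SLIST_ENTRY(vcache_node) vn_hash;  /* hash chain, published with pserialize */
	ino_t			 vn_ino;    /* key */
	volatile u_int		 vn_refcnt; /* 0 means the node is being destroyed */
	struct vnode		*vn_vnode;  /* the cached vnode */
};

SLIST_HEAD(vcache_head, vcache_node);
extern struct vcache_head vcache_hash[];   /* assumed hash table */
extern u_int vcache_mask;

/* Return a referenced vnode for ino, or NULL if it has to be loaded. */
struct vnode *
vcache_lookup(ino_t ino)
{
	struct vcache_head *head = &vcache_hash[ino & vcache_mask];
	struct vcache_node *node;
	struct vnode *vp = NULL;
	u_int refcnt;
	int s;

	s = pserialize_read_enter();
	SLIST_FOREACH(node, head, vn_hash) {
		if (node->vn_ino != ino)
			continue;
		/* Grab a reference with atomics only -- no locks, no sleep. */
		do {
			refcnt = node->vn_refcnt;
			if (refcnt == 0)
				break;	/* dying; caller creates a fresh vnode */
		} while (atomic_cas_uint(&node->vn_refcnt, refcnt,
		    refcnt + 1) != refcnt);
		if (refcnt != 0)
			vp = node->vn_vnode;
		break;
	}
	pserialize_read_exit(s);

	return vp;
}

The writer side -- inserting a fresh node and retiring a dying one --
would still take a lock and pay for pserialize_perform(), which is
fine as long as cache misses really are the rare case.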
> I think this is actually OK. If you're a file system, and you're
> trying to grab one of your own vnodes, it's no good to get one whose
> identity will soon be changed to deadfs -- you need to make a new
> vnode for the inode you were looking for. Nor is it any good to try
> to work with one that vfs has chosen to destroy -- it's too late to
> recover that one. So in both cases immediately returning ENOENT is
> the right thing.
It is better not to have two vnodes (one dying, one being created)
representing the same inode in parallel.
Hmm, good point. I wonder whether we can fix this by creating a
concept of inode number reservations, which I think we'll want anyway
to avoid the unlock/lookup ../lock race. tmpfs has this already: it's
the tmpfs_node tn_vlock. The idea is that you take this lock when you
want to get or create a vnode for a given `inode' (tmpfs node), and
that excludes anyone else from getting that.
Of course, for inode numbers, we'd need a somewhat hairier data
structure, because we don't have a nice struct to store a mutex into.
But if we had that, perhaps we could make vnode destruction reserve
the inode number until VOP_RECLAIM is done and it is safe to create a
new vnode for that inode number.
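A minimal sketch of such a reservation table, generalizing tmpfs's
tn_vlock idea to (mount, inode number) pairs -- every name here
(inoresv_*) is invented for illustration, and a real version would
want to be hashed and probably per-mount rather than one global list:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/condvar.h>
#include <sys/kmem.h>
#include <sys/mount.h>
#include <sys/mutex.h>
#include <sys/queue.h>

struct inoresv {
	LIST_ENTRY(inoresv)	 ir_list;
	struct mount		*ir_mp;
	ino_t			 ir_ino;
};

/* inoresv_lock and inoresv_cv are assumed to be set up once at boot. */
static LIST_HEAD(, inoresv) inoresv_list =
    LIST_HEAD_INITIALIZER(inoresv_list);
static kmutex_t inoresv_lock;
static kcondvar_t inoresv_cv;

/* Reserve (mp, ino), waiting until any existing reservation is gone. */
void
inoresv_enter(struct mount *mp, ino_t ino)
{
	struct inoresv *ir, *new;

	new = kmem_alloc(sizeof(*new), KM_SLEEP);
	new->ir_mp = mp;
	new->ir_ino = ino;

	mutex_enter(&inoresv_lock);
retry:
	LIST_FOREACH(ir, &inoresv_list, ir_list) {
		if (ir->ir_mp == mp && ir->ir_ino == ino) {
			cv_wait(&inoresv_cv, &inoresv_lock);
			goto retry;
		}
	}
	LIST_INSERT_HEAD(&inoresv_list, new, ir_list);
	mutex_exit(&inoresv_lock);
}

/* Drop the reservation, e.g. once VOP_RECLAIM has finished. */
void
inoresv_exit(struct mount *mp, ino_t ino)
{
	struct inoresv *ir;

	mutex_enter(&inoresv_lock);
	LIST_FOREACH(ir, &inoresv_list, ir_list) {
		if (ir->ir_mp == mp && ir->ir_ino == ino)
			break;
	}
	KASSERT(ir != NULL);
	LIST_REMOVE(ir, ir_list);
	cv_broadcast(&inoresv_cv);
	mutex_exit(&inoresv_lock);
	kmem_free(ir, sizeof(*ir));
}

Vnode creation would take the reservation before publishing the new
vnode, and vnode destruction would hold it across VOP_RECLAIM, so a
looker-up can never see a half-dead and a half-born vnode for the
same inode at the same time.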
> Hmm. This vn_lock code may be broken. But revocation is currently
> also broken because it doesn't exclude other vnode operations, and if
> we fixed it in the obvious way so that VOP_REVOKE or vrevoke took the
> vnode lock, then xyzfs and deadfs would have to use genfs_lock anyway
> in order for it to all work out. So even if we use the LK_RETRY hack
> I believe there's more that we'll need to fix to enable revocation to
> mix with non-genfs_lock VOP_LOCKs.
Remember, VOP_LOCK is a file system operation. Every file system is
free to use its own locking -- all file systems using genfs_lock()
is just the status quo.
Right, that was my point. Even if we use the LK_RETRY hack, there is
more we'll need to do to get non-genfs_lock VOP_LOCKs working with
revoke -- if for no other reason than that revoke must exclude other
vnode operations, that VOP_LOCK is the only way we have to do that,
and that deadfs's VOP_UNLOCK then has to wake the VOP_LOCK waiters
for that vnode.
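To make that concrete, the `obvious fix' would be a sequence roughly
like the following -- this is only a sketch of the ordering, not the
current code, and the buffer flushing and reclaim steps are elided:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/vnode.h>

extern int (**dead_vnodeop_p)(void *);	/* deadfs operations vector */

/* Sketch: revoke vp by excluding other VOPs with the vnode lock. */
static void
revoke_sketch(struct vnode *vp)
{
	/* Exclude every other vnode operation on vp. */
	VOP_LOCK(vp, LK_EXCLUSIVE);

	/* ... flush buffers, VOP_CLOSE, VOP_INACTIVE, VOP_RECLAIM ... */

	/* Change identity: from now on every VOP goes to deadfs. */
	vp->v_op = dead_vnodeop_p;

	/*
	 * This unlock is already a deadfs VOP_UNLOCK.  It still has to
	 * wake the threads that went to sleep in the old file system's
	 * VOP_LOCK, which only works if both implementations share the
	 * same underlying lock -- today, genfs_lock() on vp->v_lock.
	 */
	VOP_UNLOCK(vp);
}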
> Vrecycle() has to change -- I'm currently working on it.
>
> What is it supposed to do? I don't understand from the man page
> description or the uses in-tree.
While the name is bad, it gets rid of inactive vnodes when cleaning
up a file system during unmount or when reloading after fsck.
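Stripped down, its contract amounts to something like the following --
a simplified illustration only, not the real vfs_vnode.c code, whose
interface is exactly what is being reworked:

#include <sys/param.h>
#include <sys/mutex.h>
#include <sys/vnode.h>

/*
 * Simplified illustration of vrecycle() semantics: if nobody else
 * references the vnode, destroy it now and report success; otherwise
 * leave it alone so unmount/reload can deal with the active ones.
 */
static bool
vrecycle_sketch(struct vnode *vp)
{
	mutex_enter(vp->v_interlock);
	if (vp->v_usecount != 0) {
		/* Still referenced; the caller cannot get rid of it. */
		mutex_exit(vp->v_interlock);
		return false;
	}
	/*
	 * No users.  The real code now takes the vnode off the free
	 * list, gains a reference and runs the cleaning path
	 * (VOP_INACTIVE, VOP_RECLAIM, then the deadfs ops) before it
	 * finally lets go of the vnode.
	 */
	mutex_exit(vp->v_interlock);
	return true;
}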
Why do we do vrecycle and not just vrevoke?