Subject: Re: CVS commit: syssrc/sys/miscfs/nullfs
To: enami tsugutomo <enami@sm.sony.co.jp>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 03/19/2002 10:48:37
On 19 Mar 2002, enami tsugutomo wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
>
> > There are some issues I'm not sure of about exact details, but this very
> > issue is why I/we came up with the idea. :-)
>
> For example, ufs_inactive does VOP_UPDATE().  This means the time
> recorded on disk becomes the time when upper vnode is reclaimed.  How
> do you fix this?

We don't. It's not a problem. :-) The time that gets stored is in the
ctime field. The atime and mtime were already set when the last read &
write happened.

Think about the cases that could have happened even w/o layers. That inode
could have sat for minutes on the free list, if the system were idle. If
the system were really idle (or even a suspended laptop) it could have sat
on the freelist for hours. So exactly how long the inode sat on the free
list really isn't interesting.

> > Depends on how we do inactivation and reactivation. It either uses the
> > lower vnode's lock, or we have to swap locks around. And we will likely
> > get in a case where we have two locked locks and we have to merge them.
> > That sounds like a real mess to me.
>
> Keeping reference while it is really not used is also sounds mess.
> So, which one is bigger?

How is keeping the upper to lower reference even when the upper one is on
the free list a mess? Yes, it has problems such as the ones this thread
has pointed out (deleting a file takes a while to really happen, and
vcount gives the wrong answer). But it doesn't strike me as messy.

To get vcount right, we *must* talk to vnodes layered on top of the
current one. Doesn't matter if we do the inactivate/reactivate you suggest
or don't (as I suggest) because we will need to add up the counts of upper
active vnodes. To do that, we have to have a way for a vnode to know
what's on top of it. A linked list sounds about the right thing, but
exactly how this happens can be discussed later.

Both what you propose and what I propose doing will get the count right.
The number of non-layer references to a vnode is its active count minus
the number of references from upper vnodes. vcount returns the sum of
that, iterated over all vnodes layered above.
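
To make the counting concrete, here is a toy sketch in plain C. It is not
kernel code: struct toy_vnode, toy_layer_attach() and toy_vcount() are
made-up names, and it assumes each upper vnode holds exactly one reference
on the vnode below it and has only one lower vnode (so no unionfs).

```c
#include <stddef.h>

/* Toy stand-in for a vnode; every name here is made up for illustration. */
struct toy_vnode {
    int usecount;                  /* active references, including ones held by upper layers */
    struct toy_vnode *upper_head;  /* first vnode layered directly on top of this one */
    struct toy_vnode *upper_next;  /* next entry in the lower vnode's layer list */
};

/* Layer `upper` on `lower`: link it into the list and take one reference. */
static void toy_layer_attach(struct toy_vnode *lower, struct toy_vnode *upper)
{
    upper->upper_next = lower->upper_head;
    lower->upper_head = upper;
    lower->usecount++;
}

/* Non-layer references to vp, summed with those of everything layered above it. */
static int toy_vcount(const struct toy_vnode *vp)
{
    int upper_refs = 0;
    int above = 0;
    const struct toy_vnode *up;

    for (up = vp->upper_head; up != NULL; up = up->upper_next) {
        upper_refs++;              /* assumes one reference per upper vnode */
        above += toy_vcount(up);
    }
    return (vp->usecount - upper_refs) + above;
}
```

In this sketch an upper vnode sitting on the free list (usecount 0) still
pins the lower vnode, but its reference is subtracted out again, so it
adds nothing to the total.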

Doesn't matter if we leave the upper->lower references when the upper
vnode is on the freelist or not, the count will be correct. Leaving the
reference will, however, save taking the lower vnode on & off the free
list and adding & removing the upper vnode from the layer list in the
lower vnode each time we want to start & stop using the upper one.
inactive -> active transitions are one of the more common things to happen
to vnodes (especially things like directories), so keeping the reference
saves common-path work.

Then there's handling unlinking the leaf (lower layer) file. We need to
get this right, but it is an uncommon occurrence in most environments
(after all, it only happens to any given file once :-). It also happens at
very well-determined times. With what you suggest, the behavior patterns
just make cleanup happen. With what I suggest, we add a new VOP which we
fire on all of the vnodes layered above the leaf (and all the ones layered
above them) to tell them to be more aggressive when they inactivate the
upper vnode. So when they inactivate, they will do exactly what you (and
I) want them to: make the reference on the lower vnode disappear quickly.
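
A rough sketch of that notification, again in toy C rather than kernel
code. All names are hypothetical: demo_lower_removed() stands in for the
new VOP (which has no name or signature yet), and demo_inactive() stands
in for the layer's inactive routine.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy layered vnode; all names are hypothetical. */
struct demo_vnode {
    int usecount;                   /* active references, including ones from upper layers */
    bool release_lower_eagerly;     /* set once the leaf file has been unlinked */
    struct demo_vnode *lower;       /* vnode this one is layered on, or NULL */
    struct demo_vnode *upper_head;  /* first vnode layered on top of this one */
    struct demo_vnode *upper_next;  /* next entry in the lower vnode's layer list */
};

/* Layer `upper` on `lower`, taking one reference on `lower`. */
static void demo_attach(struct demo_vnode *lower, struct demo_vnode *upper)
{
    upper->lower = lower;
    upper->upper_next = lower->upper_head;
    lower->upper_head = upper;
    lower->usecount++;
}

/* Stand-in for the new VOP: tell everything layered above vp, transitively. */
static void demo_lower_removed(struct demo_vnode *vp)
{
    struct demo_vnode *up;

    for (up = vp->upper_head; up != NULL; up = up->upper_next) {
        up->release_lower_eagerly = true;
        demo_lower_removed(up);
    }
}

/* Unlink `upper` from its lower vnode's layer list and drop its reference. */
static void demo_detach(struct demo_vnode *upper)
{
    struct demo_vnode *lower = upper->lower;
    struct demo_vnode **pp = &lower->upper_head;

    while (*pp != NULL && *pp != upper)
        pp = &(*pp)->upper_next;
    if (*pp == upper)
        *pp = upper->upper_next;
    upper->upper_next = NULL;
    upper->lower = NULL;
    lower->usecount--;
}

/* Called when vp's last active reference goes away. */
static void demo_inactive(struct demo_vnode *vp)
{
    struct demo_vnode *lower = vp->lower;

    if (!vp->release_lower_eagerly || lower == NULL)
        return;                     /* common case: keep the lower reference */
    demo_detach(vp);
    if (lower->usecount == 0)
        demo_inactive(lower);       /* let the release cascade downward */
}
```

The point of the design is that the common path (the early return in
demo_inactive()) does no list or reference work at all; only vnodes that
were told about the unlink pay for the cleanup.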

> > > Of course, in upper vnode's vget(), it should vget() the cached lower
> > > vnode.
> >
> > vget() doesn't call the vnode's file system.
>
> Please rephrase me that when upper vnode is vget()'ed.

Right. But only the file system knows that there is a lower vnode that we
need to reactivate & verify. So vget() will have to call the file system
to make sure that reactivation is ok. We must do that since only the file
system knows how many vnodes need to be verified (0 for most, 1 for most
layers, and 2 for unionfs) and what they are.

So that means adding a vop which is called every time a vnode is taken off
of the free list. That sounds really inefficient. Plus for layered file
systems, the most common caller of that vget() is the layer
itself. So we will have the layer calling vget(), which in turn calls the
layer. That is even grosser.
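
For the sake of discussion, the shape of such a hook might look like the
toy C below. Nothing here is a real interface: sk_vget() only models the
"taken off the free list" step, and the reactivate callback is a
hypothetical per-file-system vop that nobody has defined.

```c
#include <stddef.h>

#define SK_MAXLOWER 2               /* 0 for most fs, 1 for most layers, 2 for unionfs */

struct sk_vnode;
typedef int (*sk_reactivate_fn)(struct sk_vnode *);

/* Toy vnode; all names are hypothetical. */
struct sk_vnode {
    int usecount;                   /* 0 means "on the free list" in this model */
    int nlower;                     /* how many lower vnodes must be verified */
    struct sk_vnode *lower[SK_MAXLOWER];
    sk_reactivate_fn reactivate;    /* hypothetical hook; NULL for non-layered fs */
};

static int sk_hook_calls;           /* counts how often vget had to call the fs */

/* Model of vget(): coming off the free list now means calling the fs. */
static int sk_vget(struct sk_vnode *vp)
{
    if (vp->usecount == 0 && vp->reactivate != NULL) {
        int error;

        sk_hook_calls++;
        error = vp->reactivate(vp);
        if (error != 0)
            return error;
    }
    vp->usecount++;
    return 0;
}

/* A layer's hook: re-reference every lower vnode it knows about. */
static int sk_layer_reactivate(struct sk_vnode *vp)
{
    int i;

    for (i = 0; i < vp->nlower; i++) {
        int error = sk_vget(vp->lower[i]);

        if (error != 0) {
            while (i-- > 0)         /* undo the references already taken */
                vp->lower[i]->usecount--;
            return error;
        }
    }
    return 0;
}
```

When the caller of sk_vget() on an upper vnode is the layer's own lookup
code, the call chain is layer -> sk_vget() -> layer hook -> sk_vget(), and
it happens on every inactive -> active transition.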

Thoughts?

Take care,

Bill