Subject: Re: Possible VOP_LOOKUP() interface change
To: Charles M. Hannum <root@ihack.net>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 09/03/1999 14:00:54
On Fri, 3 Sep 1999, Charles M. Hannum wrote:

> One of the things that has been a repeated pain in the arse when
> hacking on VOP_LOOKUP() routines is the semantics of ISLASTCN,
> {LOCK,WANT}PARENT and cn_nameiop.  To DTRT, it is necessary for every
> usage of cn_nameiop, LOCKPARENT and WANTPARENT to also test ISLASTCN
> religiously.
> 
> This is pretty silly, in that it would be trivial to set cn_nameiop to
> LOOKUP, and turn off the *PARENT bits, in the flags passed to the file
> system lookup function.  This would simplify the code and generally
> make it less error-prone.

Ok, but how would we impliment the functionality these bits currently give
us? Namely they let the namei caller request the parent dir be returned
unlocked or locked. ?? For CREATE lookups, we really don't want the parent
dir unlocked between the lookup and the VOP_CREATE.

> Note that such a change will not break existing file systems; they
> will continue to do the extra tests, but they will be no-ops.
> 
> However, this change will break file systems that hack around with the
> path name behind namei().  The only one that does this is portal, and
> I suggest that breaking such a kluge is not only unimportant, but
> perhaps a service to the community, as it will force someone who cares
> to come up with a better abstraction.  B-)

I don't think portal will suffer too much with this, though I can't tell
without seeing what you have in mind. The only use in portal I found of
these bits is to cover the fact that portal_lookup() decided that this was
the last component when lookup() thought it wasn't. Depending on what you
do, portal could easily be fine.

Oh, portal does look at DELETE and RENAME & disalow them, but that's more
of an optimization.

> Any thoughts on this?

More details would be nice but the idea sounds fine.

I would ask that you keep the main functionality of PDIRUNLOCK around
(though in a different form is fine). It's there for layered file systems,
so that they can keep track of locking changes on the parent dir. The main
reason it's there is that there's a defect in the lookup locking protocol.
If we return an error, the parent dir's supposed to be locked. But if the
error we return results from re-locking the parent dir for some reason, we
loose - we're returning an error but the parrent's unlocked.

One other idea I'll throw out while we're changing lookup is one Bill
Sommerfeld's been advocating - doing all the intermediate directory
lookups using shared locks.

The lookup routine needs to know if it should share or exclusive lock the
returned vnode, and (for the .. case where it has to re-lock) what kind of
lock is on the parent. We either could define this by the state of
ISLASTCN (exclusive locks if set, shared otherwise), or pass explicit
setings (flags, etc.). Then when lookup notices that we're at the last
component, it can do the right thing (like upgrading the parent's lock
from shared to exclusive if needed).

This latter suggestion is orthogonal to yours, but if we're changing
VOP_LOOKUP, we might as well do it all at once. :-)

Take care,

Bill