Subject: Re: ufs-ism in lookup(9)
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 03/21/2004 19:15:23
--KFztAG8eRSV9hGtP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Mar 22, 2004 at 09:43:30AM +0900, YAMAMOTO Takashi wrote:
> hi,

Hello.

> > > actually, cache_lookup() doesn't return the cached entry
> > > if MAKEENTRY is not set.  and ufs code depends on this behaviour.
> > > it's what i called ufs-ism.
> >=20
> > Well, given that you have patches to every other file system, it does n=
ot=20
> > seem at ALL ufs-centric. Thus "ufs-ism" is not appropriate.
>=20
> i guess that all of them are inherited from ufs.

The "guess" aspect of this thread is what's irritating me. And what makes=
=20
me nervous about your changes. If you're "guessing" about the code, you=20
really shouldn't be changing the lookup path. At all.

> > > > After your change we will always blow the
> > > > cache entry away and then do a to-disk lookup.
> > >=20
> > > it's how our lookup(9) and vnode ops works *currently*.
> > > ufs needs it, but nfs doesn't.
> >=20
> > While I agree you are correct that that's how the code works now (it is=
=20
> > quite twisty in places), it is not at all clear to me that ufs "needs" =
it.=20
>=20
> ufs needs this because ufs_remove relies upon the info stored in
> in-core inode by ufs_lookup is valid.  eg. i_offset.

Indeed. I had forgotten that part. However given your gentle reminder,
don't you see why this behavior is correct for ANY local file system that
supports hard links to files (given our name cache setup), and may be true
for other local file systems? It's a fundamental behavior, given our name=
=20
cache setup.

> > Let's start over. What exactly are you trying to do? You said something
> > about NFS doing inefficient lookups. Ok, how is the current code causin=
g=20
> > that? Let's look at all the different way's to get that.
>=20
> currently, when you do "rm somefileonnfs",
> our nfs client always issues a LOOKUP rpc before a REMOVE rpc even if
> the "somefileonnfs" is on the namecache.  it's what i'd like to change.
>=20
> as i said in the first mail, it could be achieved in nfs code.
> however, i think that it's cleaner to push what i called ufs-ism into
> ufs-like filesystems themselves.

You missed my point. You have skipped from "step A" to "step E" of your
thought process. I wanted you to work through the steps. Mainly since I
_think_ that at "step C" we can do things differently, and make both you
and me happy. However since you aren't stepping through things, it's hard
to tell.

Now that I understand the problem you're trying to address, I agree we=20
should change how things work. There is a fundamental difference between=20
local leaf file systems and remote leaf file systems, and we should push=20
this change out to them.

The part of your change I object to the most is your change to=20
vfs_lookup.c. From reading the comments, the code to disable caching in=20
the delete case is exactly correct, and the not-CREATE case seems ok=20
(though perhaps a bit questionable). docache's definition is, after all,=20
/* =3D=3D 0 do not cache last component */.

I think an issue you have implicitly pointed out is that we are being
sloppy about our use of MAKEENTRY. We not only use it as it is defined,
"/* entry is to be added to name cache */", we also use its absence to
indicate we should remove an entry from the cache while we're in
cache_lookup(). I think if we remove that behavior and retain your changes
to the local file systems then we should be fine.

Specifically, I think if we remove this part:

        if ((cnp->cn_flags & MAKEENTRY) =3D=3D 0) {
                nchstats.ncs_badhits++;
                goto remove;=20
	}

from cache_lookup(), then nfs (and other non-local leaf file systems) will=
=20
be able to use the name cache for delete & rename ops. Oh, we'll also need=
=20
to change the comment following "Name caching works as follows" in=20
vfs_cache.c.

Your other changes will still be correct, except you won't need the=20
"cnp->cn_flags &=3D ~MAKEENTRY;" parts to each one.

Thoughts?

Take care,

Bill



--KFztAG8eRSV9hGtP
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (NetBSD)

iD8DBQFAXlpKWz+3JHUci9cRAsQMAJ9XwFAFG4R4CnKHr2hisAzlE+NlSQCeJx2P
WcTPIslrpOnAkhX0v+f6FFU=
=mKBV
-----END PGP SIGNATURE-----

--KFztAG8eRSV9hGtP--