Subject: Re: ufs-ism in lookup(9)
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 04/05/2004 17:10:50
--AsxXAMtlQ5JHofzM
Content-Type: multipart/mixed; boundary="RhUH2Ysw6aD5utA4"
Content-Disposition: inline


--RhUH2Ysw6aD5utA4
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Apr 05, 2004 at 10:15:56PM +0900, YAMAMOTO Takashi wrote:
> > My thought is that if we make your changes and then latter add common=
=20
> > routines, we won't be that far from where we are now. It will be like w=
e=20
> > take eight steps north then seven steps south. So why not just make the=
=20
> > one step now and save a lot of grief?
>=20
> i don't understand why you're objecting so much against details
> while the result should be similar as you say...

I have a number of reasons.

1) Everything I understand we need for "common routines" can be achieved
with our current name cache with minor modifications. So if we make the
change you propose, when we later move everything back to common code, we
will be reverting most of this change.

2) Making each fs do the work individually makes our code less=20
maintainable. If it comes to pass we _need_ to do that, then we shall. But=
=20
I don't see (yet) that we do.

3) You are proposing changes to code that I think is exceptionally key,=20
and you aren't making good arguements for the changes. You made a good=20
case for needing better semantics for remote file systems, but the cases=20
for the other changes haven't been as strong.

> as i said first, what i'd like do is pushing fs-dependent things
> into their own code.  pushing the code into them, and then introduce
> common subroutines if needed, seems like a straightforward way to me.

But they aren't fs-dependent. We have 3 local and 2 remote file systems
(well lookup routines, lfs and ffs use the same code here) that can remove
files and use our name cache. We have two or three local and one remote fs
in the wings (hfs, ntfs, UDF, and afs respectively). Also, someone could
teach coda to use our name cache. In total, that's from five to ten file
systems. I do not think we have five to ten cache behaviors.

I think we have two, remote and local.

I agree that which class a file system falls into is its own choice.=20
However I think each fs will choose either remote semantics or local=20
semantics.

Since we have two classes of behavior, I think that common code can be=20
adapted now to handle both behaviors.


Let me phrase the question differently. How do you think the caching needs
of msdosfs, ext2fs, and lfs/ffs differ? How do you think the caching needs=
=20
of nfs, smbfs, coda, and afs differ in this matter (specifically name=20
caching)?

If we really do come up with four or six or more caching behaviors, then I
will agree we should farm it out. But I really haven't seen more than
remote vs local so far.

> > I mean what really is wrong with what we have now? The fact that=20
> > cache_lookup() will remove entries if MAKEENTRY is not set is wrong. We=
=20
> > want to remove it for remote file systems, and if we do we want to twea=
k=20
> > local file systems as you did in your initial patch. Since we will now =
get=20
> > into the negative-cache-entry case for DELETE and RENAME operations, we=
=20
> > need to define what we want to have happen there. We also need the chan=
ges=20
> > you proposed to nfs and smbfs.

For now I have decided to let the DELETE and RENAME cases return the=20
negative cache entry's presence. The file system can decide what it wants=
=20
to do with the entry.

> > Other than that, what do we need to do? What semantics do we need to ad=
d=20
> > or change?
>=20
> don't forget about "CREATE on negative cache" case.

Ok. Right now, the in-tree cache_lookup() removes a negative entry on the
CREATE case.  I think that is the correct behavior. Do you think we need=20
to do something different?

> > Put another way, I think we can change our current name cache routines =
to=20
> > support remote file system semantics in addition to supporting local fi=
le=20
> > system semantics. I think these changes will be smaller than the change=
s=20
> > you just proposed.
>=20
> how in particular?

I'll attach a diff. To be honest, it's your initial diff with the one
change I wanted made to it. It should fix everything I understand we have
wrong. What behaviors do you want to change about it?

Take care,

Bill

--RhUH2Ysw6aD5utA4
Content-Type: text/plain; charset=us-ascii
Content-Description: Proposed patch
Content-Disposition: attachment; filename=diffie
Content-Transfer-Encoding: quoted-printable

Index: fs/msdosfs/msdosfs_lookup.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/fs/msdosfs/msdosfs_lookup.c,v
retrieving revision 1.3
diff -u -r1.3 msdosfs_lookup.c
--- fs/msdosfs/msdosfs_lookup.c	29 Jun 2003 22:31:09 -0000	1.3
+++ fs/msdosfs/msdosfs_lookup.c	6 Apr 2004 00:01:49 -0000
@@ -116,6 +116,7 @@
 	int wincnt =3D 1;
 	int chksum =3D -1, chksum_ok;
 	int olddos =3D 1;
+	boolean_t toremove;
=20
 	cnp->cn_flags &=3D ~PDIRUNLOCK; /* XXX why this ?? */
 	flags =3D cnp->cn_flags;
@@ -140,8 +141,9 @@
 	if ((error =3D VOP_ACCESS(vdp, VEXEC, cnp->cn_cred, cnp->cn_proc)) !=3D 0)
 		return (error);
=20
-	if ((flags & ISLASTCN) && (vdp->v_mount->mnt_flag & MNT_RDONLY) &&
-	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME))
+	toremove =3D (flags & ISLASTCN) &&
+	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME);
+	if (toremove && (vdp->v_mount->mnt_flag & MNT_RDONLY))
 		return (EROFS);
=20
 	/*
@@ -150,9 +152,18 @@
 	 * Before tediously performing a linear scan of the directory,
 	 * check the name cache to see if the directory/name pair
 	 * we are looking for is known already.
-	 */
-	if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
-		return (error);
+	 *
+	 * Unless of course we're going to remove the entry. Then we
+	 * need to do the linear scan anyway. Take the opportunity
+	 * to clean up the cache entries.
+	 */
+	if (toremove) {
+		cache_purge1(vdp, cnp, 0);
+		cnp->cn_flags &=3D ~MAKEENTRY;
+	} else {
+		if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
+			return (error);
+	}
=20
 	/*
 	 * If they are going after the . or .. entry in the root directory,
Index: fs/smbfs/smbfs_vnops.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/fs/smbfs/smbfs_vnops.c,v
retrieving revision 1.42
diff -u -r1.42 smbfs_vnops.c
--- fs/smbfs/smbfs_vnops.c	22 Mar 2004 16:40:48 -0000	1.42
+++ fs/smbfs/smbfs_vnops.c	6 Apr 2004 00:01:50 -0000
@@ -622,6 +622,8 @@
=20
 	VN_KNOTE(ap->a_vp, NOTE_DELETE);
 	VN_KNOTE(ap->a_dvp, NOTE_WRITE);
+	if (!error)
+		cache_purge1(dvp, cnp, 0);
 	if (dvp =3D=3D vp)
 		vrele(vp);
 	else
@@ -650,7 +652,7 @@
 	struct vnode *fdvp =3D ap->a_fdvp;
 	struct vnode *tdvp =3D ap->a_tdvp;
 	struct componentname *tcnp =3D ap->a_tcnp;
-/*	struct componentname *fcnp =3D ap->a_fcnp;*/
+	struct componentname *fcnp =3D ap->a_fcnp;
 	struct smb_cred scred;
 #ifdef notyet
 	u_int16_t flags =3D 6;
@@ -707,11 +709,11 @@
 		VN_KNOTE(fvp, NOTE_RENAME);
 	}
=20
-	if (fvp->v_type =3D=3D VDIR) {
-		if (tvp !=3D NULL && tvp->v_type =3D=3D VDIR)
-			cache_purge(tdvp);
-		cache_purge(fdvp);
-	}
+	if (fvp->v_type =3D=3D VDIR)
+		cache_purge(fdvp); /* we need to flush DOTDOT entry */
+	else
+		cache_purge1(fdvp, fcnp, 0);
+	cache_purge1(tdvp, tcnp, 0);
=20
 	smbfs_attr_cacheremove(fdvp);
 	smbfs_attr_cacheremove(tdvp);
Index: kern/vfs_cache.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/kern/vfs_cache.c,v
retrieving revision 1.55
diff -u -r1.55 vfs_cache.c
--- kern/vfs_cache.c	5 Apr 2004 10:20:52 -0000	1.55
+++ kern/vfs_cache.c	6 Apr 2004 00:01:51 -0000
@@ -170,6 +170,11 @@
  * If the lookup determines that the name does not exist (negative cachein=
g),
  * a status of ENOENT is returned. If the lookup fails, a status of -1
  * is returned.
+ *
+ * Note that local file systems (ffs, lfs, ext2fs, msdosfs, etc.) typically
+ * don't call us for DELETE and RENAME operations on the last path
+ * component as they use the directory lookup to also gather information
+ * used for the delete or rename operation.
  */
 int
 cache_lookup(struct vnode *dvp, struct vnode **vpp, struct componentname *=
cnp)
@@ -196,10 +201,7 @@
 		nchstats.ncs_miss++;
 		goto fail_wlock;
 	}
-	if ((cnp->cn_flags & MAKEENTRY) =3D=3D 0) {
-		nchstats.ncs_badhits++;
-		goto remove;
-	} else if (ncp->nc_vp =3D=3D NULL) {
+	if (ncp->nc_vp =3D=3D NULL) {
 		/*
 		 * Restore the ISWHITEOUT flag saved earlier.
 		 */
Index: nfs/nfs_vnops.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/nfs/nfs_vnops.c,v
retrieving revision 1.191
diff -u -r1.191 nfs_vnops.c
--- nfs/nfs_vnops.c	5 Apr 2004 10:44:09 -0000	1.191
+++ nfs/nfs_vnops.c	6 Apr 2004 00:01:55 -0000
@@ -1776,6 +1776,7 @@
 	PNBUF_PUT(cnp->cn_pnbuf);
 	if (!error && nfs_getattrcache(vp, &vattr) =3D=3D 0 &&
 	    vattr.va_nlink =3D=3D 1) {
+		cache_purge1(dvp, cnp, 0);
 		np->n_flag |=3D NREMOVED;
 	}
 	NFS_INVALIDATE_ATTRCACHE(np);
@@ -1888,11 +1889,11 @@
=20
 	VN_KNOTE(fdvp, NOTE_WRITE);
 	VN_KNOTE(tdvp, NOTE_WRITE);
-	if (fvp->v_type =3D=3D VDIR) {
-		if (tvp !=3D NULL && tvp->v_type =3D=3D VDIR)
-			cache_purge(tdvp);
-		cache_purge(fdvp);
-	}
+	if (fvp->v_type =3D=3D VDIR)
+		cache_purge(fvp); /* we need to flush DOTDOT entry */
+	else
+		cache_purge1(fdvp, fcnp, 0);
+	cache_purge(tdvp, tcnp, 0);
 out:
 	if (tdvp =3D=3D tvp)
 		vrele(tdvp);
Index: ufs/ext2fs/ext2fs_lookup.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/ufs/ext2fs/ext2fs_lookup.c,v
retrieving revision 1.28
diff -u -r1.28 ext2fs_lookup.c
--- ufs/ext2fs/ext2fs_lookup.c	7 Aug 2003 16:34:26 -0000	1.28
+++ ufs/ext2fs/ext2fs_lookup.c	6 Apr 2004 00:01:56 -0000
@@ -289,7 +289,7 @@
 	int flags =3D cnp->cn_flags;
 	int nameiop =3D cnp->cn_nameiop;
 	ino_t foundino;
-
+	boolean_t toremove;
 	int	dirblksize =3D VTOI(ap->a_dvp)->i_e2fs->e2fs_bsize;
=20
 	bp =3D NULL;
@@ -305,8 +305,9 @@
 	if ((error =3D VOP_ACCESS(vdp, VEXEC, cred, cnp->cn_proc)) !=3D 0)
 		return (error);
=20
-	if ((flags & ISLASTCN) && (vdp->v_mount->mnt_flag & MNT_RDONLY) &&
-	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME))
+	toremove =3D (flags & ISLASTCN) &&
+	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME);
+	if (toremove && (vdp->v_mount->mnt_flag & MNT_RDONLY))
 		return (EROFS);
=20
 	/*
@@ -315,9 +316,18 @@
 	 * Before tediously performing a linear scan of the directory,
 	 * check the name cache to see if the directory/name pair
 	 * we are looking for is known already.
-	 */
-	if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
-		return (error);
+	 *
+	 * Unless of course we're going to remove the entry. Then we
+	 * need to do the linear scan anyway. Take the opportunity
+	 * to clean up the cache entries.
+	 */
+	if (toremove) {
+		cache_purge1(vdp, cnp, 0);
+		cnp->cn_flags &=3D ~MAKEENTRY;
+	} else {
+		if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
+			return (error);
+	}
=20
 	/*
 	 * Suppress search for slots unless creating
Index: ufs/ufs/ufs_lookup.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.55
diff -u -r1.55 ufs_lookup.c
--- ufs/ufs/ufs_lookup.c	6 Mar 2004 06:54:12 -0000	1.55
+++ ufs/ufs/ufs_lookup.c	6 Apr 2004 00:01:57 -0000
@@ -133,6 +133,7 @@
 	const int needswap =3D UFS_MPNEEDSWAP(ap->a_dvp->v_mount);
 	int dirblksiz =3D DIRBLKSIZ;
 	ino_t foundino;
+	boolean_t toremove;
 	if (UFS_MPISAPPLEUFS(ap->a_dvp->v_mount)) {
 		dirblksiz =3D APPLEUFS_DIRBLKSIZ;
 	}
@@ -155,8 +156,9 @@
 	if ((error =3D VOP_ACCESS(vdp, VEXEC, cred, cnp->cn_proc)) !=3D 0)
 		return (error);
=20
-	if ((flags & ISLASTCN) && (vdp->v_mount->mnt_flag & MNT_RDONLY) &&
-	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME))
+	toremove =3D (flags & ISLASTCN) &&
+	    (cnp->cn_nameiop =3D=3D DELETE || cnp->cn_nameiop =3D=3D RENAME);
+	if (toremove && (vdp->v_mount->mnt_flag & MNT_RDONLY))
 		return (EROFS);
=20
 	/*
@@ -165,9 +167,18 @@
 	 * Before tediously performing a linear scan of the directory,
 	 * check the name cache to see if the directory/name pair
 	 * we are looking for is known already.
+	 *
+	 * Unless of course we're going to remove the entry. Then we
+	 * need to do the linear scan anyway. Take the opportunity
+	 * to clean up the cache entries.
 	 */
-	if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
-		return (error);
+	if (toremove) {
+		cache_purge1(vdp, cnp, 0);
+		cnp->cn_flags &=3D ~MAKEENTRY;
+	} else {
+		if ((error =3D cache_lookup(vdp, vpp, cnp)) >=3D 0)
+			return (error);
+	}
=20
 	/*
 	 * Suppress search for slots unless creating

--RhUH2Ysw6aD5utA4--

--AsxXAMtlQ5JHofzM
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (NetBSD)

iD8DBQFAcfWKWz+3JHUci9cRAse8AJsEGHv8rL0T0OT6dlvxwWLB5OtjuACfQYVj
rIoD27Qojo+uy70tvVTE8Ks=
=cK65
-----END PGP SIGNATURE-----

--AsxXAMtlQ5JHofzM--