Subject: Re: ufs-ism in lookup(9)
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 03/29/2004 13:09:24

On Mon, Mar 29, 2004 at 10:19:32AM +0900, YAMAMOTO Takashi wrote:
> >
> > Hmmm... It could.
> >
> > However that would mean that each mount point sits on two vnodes, one for
> > the node and one for the parent.
>
> is it a problem?

It seems wasteful to me. Removing vnodes is not a common event, so I don't
see why we need to have an extra vnode lying around just to see if we are
removing a mount point.

> > I'd rather either leave things as they are, or just put a cached-node
> > lookup in each file system + mount point check as part of the VOP_REMOVE()
> > code.
>
> IMO, mount points belong to the upper layer rather than each filesystem.
> it'd be better to put the code where it logically belongs.

Agreed. The problem, though, is what steps we have to go through to get
the information there.

> > Also, is there any way NFS exporting a file system could cause a mount
> > point to get renamed, thus invalidating the above cache?
>
> i see no reason to keep the current behaviour for such a weird usage.
> it'd be enough to return e.g. EPERM.

Well, we still have to teach the NFS server to scan the mount point cache.

Actually, this discussion has reminded me why we really can't change
things. Fundamentally, we mount file systems on vnodes, not on
directory-parent-plus-name tuples. So while we can come up with hint
caches, at some point we positively have to look at the vnode.
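
To make that concrete, the vnode-level test is roughly the following (a
from-memory sketch of the 4.4BSD-derived check, not verbatim code):

	/*
	 * Sketch of the vnode-level mount point test.  Once namei()
	 * crosses the mount, the vnode we get back is the root of the
	 * mounted file system, so VROOT catches it; v_mountedhere
	 * catches the covered vnode if we stopped short of crossing.
	 */
	if ((vp->v_flag & VROOT) || vp->v_mountedhere != NULL)
		return (EBUSY);

Either way, the test needs the vnode itself in hand.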

Consider a system with an NFS file system mounted, and another file system
mounted over one of its subdirectories. Root-on-NFS with /usr and/or /var
on separate file systems would be an example. Now say someone on the NFS
server renames the directory on which we have this other fs mounted. Since
the local file system is mounted on the vnode, it just moved along with it.

Now say someone on this system goes to remove the newly-named directory.
The current code will do the lookup, find the NFS client vnode on which we
have the mount point, and complain. If, however, we went with a parent dir
+ name check in remove, we would not get a match after the rename. Thus we
would get into the NFS code's remove routine and attempt to remove a
mounted-on vnode.
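
To illustrate, suppose we had recorded a hint at mount time (the structure
below is hypothetical, not an existing interface):

	/* Hypothetical hint recorded when the file system is mounted. */
	struct mnt_hint {
		struct vnode	*mh_dvp;	/* parent dir of mount point */
		char		mh_name[MAXNAMLEN + 1]; /* name at mount time */
	};

A remove of the post-rename name would compare against the stale mh_name,
miss, and fall straight through to nfs_remove() on a mounted-on vnode.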

While we could accept NFS's attempt to remove this directory (we'd trigger
silly-rename issues, as the mount structure has a reference to the
directory, so it's not a free vnode), I think that would be sub-optimal.
So to stay consistent, we have to move a test into nfs_remove to check
whether we have a vnode in-core (which we'd need anyway, as we have to
zap said node), and to check whether it's mounted-on.
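
The shape of that test would be something like this (an assumed sketch;
note that with the current VOP_REMOVE() interface the victim vnode is
already passed in as a_vp):

	int
	nfs_remove(void *v)
	{
		struct vop_remove_args /* {
			struct vnode *a_dvp;
			struct vnode *a_vp;
			struct componentname *a_cnp;
		} */ *ap = v;
		struct vnode *vp = ap->a_vp;

		/*
		 * Refuse to remove a mounted-on node.  We need the
		 * in-core vnode here regardless, since a successful
		 * remove has to purge it from the caches.
		 */
		if (vp->v_mountedhere != NULL || (vp->v_flag & VROOT))
			return (EBUSY);
		/* ... existing silly-rename / remove RPC path ... */
	}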

Since we're going to have to keep checks in the remote file systems, I'd
say that if we stop always doing the VOP_LOOKUP(), we just move the mount
point test into the file systems.

Oh, another issue with keeping the path component is that we break an
abstraction we have now. At present, only a file system knows how to
compare path component names for files on it. Making the (dvp, component)
cache would also require a way for the upper layers to request a component
comparison. For instance, ffs names are 8-bit and case-sensitive. HFS is
case-insensitive, as are FAT and NTFS. Short and long FAT file names add
another twist to the mix.
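
If we went that way, each file system would have to export a comparison
hook; something like the following (the name and shape are made up):

	/*
	 * Hypothetical per-fs component comparison.  For ffs this is a
	 * plain byte comparison; msdosfs would fold case and also have
	 * to match the 8.3 alias against the long name.
	 */
	int
	ffs_namematch(const char *a, size_t alen, const char *b, size_t blen)
	{

		return (alen == blen && memcmp(a, b, alen) == 0);
	}

and the upper layers would call it rather than comparing bytes themselves.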

Sounds like the best thing to do for now is leave VOP_RENAME() alone.

Take care,

Bill
