Subject: Re: ufs-ism in lookup(9)
To: None <wrstuden@netbsd.org>
From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
List: tech-kern
Date: 03/31/2004 08:53:05
hi,
> > > However that would mean that each mount point sits on two vnodes, one for
> > > the node and one for the parent.
> >
> > is it a problem?
>
> It seems wasteful to me. Removing vnodes is not a common event, so I don't
> see why we need to have an extra vnode lying around just to see if we are
> removing a mount point.
removing vnodes is a common event, IMO.
> > > Also, is there any way NFS Exporting a file system could cause a mount
> > > point to get renamed, thus invalidating the above cache?
> >
> > i see no reason to keep the current behaviour for such a weird usage.
> > it'd be enough to return eg. EPERM.
>
> Well, we still have to teach the NFS server to scan the mount point cache.
exactly.
> Actually this discussion has reminded me why we really can't change
> things. Fundamentally we mount file systems on vnodes, not directory-
> parent-plus-name tuples. So while we can come up with hint caches, at some
> point we positively have to look at the vnode.
>
> Consider a system with an NFS file system mounted, and a file system
> mounted over one of the subdirectories. root-on-nfs with /usr and/or
> /var on separate file systems would be an example. Now say someone on the
> NFS server renames the directory on which we have this other fs mounted.
> Since the local file system is mounted on the vnode, it just moved.
>
> Now say someone on this system goes to remove the newly-named directory.
> The current code will do the lookup, find the NFS client vnode on which we
> have the mount point, and complain. If however we went with parent dir +
> name in rename, we would not get a match. Thus we would get into the NFS
> code's remove routine and attempt to remove a mounted-on vnode.
>
> While we could accept NFS's attempt to remove this directory (we'd trigger
> silly-rename issues as the mount structure has a reference to the
> directory, so it's not a free vnode), I think that would be sub-optimal.
> So to stay consistent, we have to move a test into nfs_remove to check and
> see if we have a vnode in-core (which we'd need anyway as we have to zap
> said node), and check to see if it's mounted-on.
do you really want to support such a weird situation? :-)
if so, i think you want to eliminate f_mntonname and do a getcwd-like thing
to get the path of the mountpoint.
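to make the scenario concrete, here is a minimal userland sketch, not kernel
code. every name in it (toy_vnode, toy_mntcache_entry, covered_by_vnode,
covered_by_cache, toy_rename) is hypothetical, and the mounted_here flag only
stands in for the kernel's mounted-on check. it shows that a (dvp, name) key
goes stale after a server-side rename while the vnode itself still knows it
is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a vnode; not the kernel's struct vnode. */
struct toy_vnode {
	struct toy_vnode *parent;	/* directory holding this node */
	char name[32];			/* current name within parent */
	int mounted_here;		/* nonzero if a file system covers it */
};

/* Hypothetical hint-cache entry keyed by (parent vnode, component name). */
struct toy_mntcache_entry {
	const struct toy_vnode *dvp;
	char name[32];
};

/* Vnode-based check: ask the node itself whether it is mounted on. */
static int
covered_by_vnode(const struct toy_vnode *vp)
{
	assert(vp != NULL);
	return vp->mounted_here != 0;
}

/* Name-based check: match (dvp, name) against the cached tuple. */
static int
covered_by_cache(const struct toy_mntcache_entry *e,
    const struct toy_vnode *dvp, const char *name)
{
	assert(e != NULL && name != NULL);
	return e->dvp == dvp && strcmp(e->name, name) == 0;
}

/* Models a rename done behind our back, e.g. on the NFS server. */
static void
toy_rename(struct toy_vnode *vp, const char *newname)
{
	assert(vp != NULL && newname != NULL);
	strncpy(vp->name, newname, sizeof(vp->name) - 1);
	vp->name[sizeof(vp->name) - 1] = '\0';
}
```

after toy_rename() the cache entry still holds the old name, so only
covered_by_vnode() keeps reporting the mount.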
> Oh, another issue with keeping the path component is that we break an
> abstraction we have now. At present, only a file system knows how to
> compare path component names for files on it. Making the (dvp, component)
> cache would also require a way to have the upper layers request component
> comparison. For instance, ffs is 8-bit ascii case sensitive. HFS is case
> insensitive, as are FAT and NTFS. Short and long FAT file names add another
> twist to the mix.
hm, a good point.
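a rough sketch of what "request component comparison" could mean, assuming a
per-file-system comparison hook. the namecmp_fn type and both functions are
hypothetical, not an existing interface, and the case folding here is
ASCII-only; a real case-insensitive file system has its own folding rules.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical per-file-system hook: returns nonzero if two component
 * names refer to the same directory entry on that file system.
 */
typedef int (*namecmp_fn)(const char *a, size_t alen,
    const char *b, size_t blen);

/* Byte-exact comparison, as a case-sensitive file system would want. */
static int
namecmp_exact(const char *a, size_t alen, const char *b, size_t blen)
{
	assert(a != NULL && b != NULL);
	return alen == blen && memcmp(a, b, alen) == 0;
}

/* ASCII-only case-insensitive comparison (illustrative simplification). */
static int
namecmp_nocase(const char *a, size_t alen, const char *b, size_t blen)
{
	size_t i;

	assert(a != NULL && b != NULL);
	if (alen != blen)
		return 0;
	for (i = 0; i < alen; i++) {
		if (tolower((unsigned char)a[i]) !=
		    tolower((unsigned char)b[i]))
			return 0;
	}
	return 1;
}
```

an upper-layer (dvp, component) cache would have to call through such a hook
for every comparison, which is the abstraction cost described above.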
> Sounds like the best thing to do for now is leave VOP_RENAME() alone.
the current "VOP_LOOKUP for dirop" method is wasteful even for ufs.
ie. the in-core inode is bloated unnecessarily to store the result of VOP_LOOKUP.
i guess it's worse for more complicated directory implementations.
YAMAMOTO Takashi