Subject: Re: fsctl(2) [was: Re: Interface to change NFS exports]
To: Julio M. Merino Vidal <jmmv84@gmail.com>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 09/16/2005 07:35:22

On Wed, Sep 14, 2005 at 10:12:44PM +0200, Julio M. Merino Vidal wrote:
> On 9/14/05, Jason Thorpe <thorpej@shagadelic.org> wrote:
> > 
> > On Sep 14, 2005, at 3:49 AM, Julio M. Merino Vidal wrote:
> > 
> > > But these are all different than mounting a file system, thus IMHO,
> > > mount(2) is the wrong place to handle them.  I really find the current
> > > approach of flags to change behavior weird.  (Especially having to
> > > execute completely different operations inside the vfs_mount hook,
> > > where one could use independent and smaller hooks.)
> > 
> > I'm ambivalent on the MNT_UPDATE thing, really.  MNT_UPDATE does have
> > "replace all previous mount options with these new ones" semantics,
> > so it seems sort of "natural" to leave it where it is... but I don't
> > really have a strong feeling either way.
> > 
> > > I'm interested in what you think about adding these features in
> > > fcntl(2).  (Note that the current implementation of fcntl(2) seems to
> > > have been designed leaving room for file system specific operations,
> > > which is what we want.)  Any comments?
> > 
> > I don't think it should be in fcntl(2).  fcntl(2) operates on
> > individual files / directories.  fsctl(2) operates on the file system
> > instance.
> 
> But we already have functionality in fcntl that does not operate on
> individual files/directories (F_CLOSEM, F_MAXFD or all the LFCN*
> commands in lfs).  I'm not saying this is right -- and IMVHO, it's
> not -- but it's already there.


(taking a little side-trip to talk about fcntl())

more precisely, fcntl() operates on file *descriptors*; ioctl() operates
on files.  there should not even need to be a VOP for fcntl() since
manipulating a file descriptor should not affect the file that the
descriptor points to, but I see that we started using this as a mechanism
for LFS to do arbitrary stuff in the kernel at some point (replacing the
LFS-specific syscalls).  we really should have used ioctl() for LFS instead.
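
(a minimal sketch of the descriptor-vs-file point above, not part of the
original discussion.  it assumes a POSIX system with /dev/null, and the
function name is made up for illustration.  it sets the close-on-exec flag
with fcntl() on one of two descriptors that refer to the same open file
and checks that the other descriptor is unaffected.)

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name).  Returns 1 if setting
 * FD_CLOEXEC on a duplicated descriptor leaves the original
 * descriptor's flags unchanged, 0 if not, and -1 on error.
 */
int
cloexec_is_per_descriptor(void)
{
	int fd, fd2, before, after, dupflags;

	fd = open("/dev/null", O_RDONLY);
	if (fd == -1)
		return -1;

	/* dup() gives a second descriptor for the same open file. */
	fd2 = dup(fd);
	if (fd2 == -1) {
		close(fd);
		return -1;
	}
	assert(fd2 != fd);

	before = fcntl(fd, F_GETFD);
	if (before == -1 || fcntl(fd2, F_SETFD, FD_CLOEXEC) == -1) {
		close(fd2);
		close(fd);
		return -1;
	}
	after = fcntl(fd, F_GETFD);	/* flags of the original descriptor */
	dupflags = fcntl(fd2, F_GETFD);	/* flags of the duplicate */

	close(fd2);
	close(fd);

	if (after == -1 || dupflags == -1)
		return -1;
	return before == after && (dupflags & FD_CLOEXEC) != 0;
}
```

the flag changes only on the descriptor it was set on, and the file
behind both descriptors is not touched, which is the kind of operation
fcntl() is described above as being meant for.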

as I recall, VOP_FCNTL() was originally added by bill, I think as a mechanism
to control an HSM-type layered file system.  (at least, I think it was for
a control channel, I'm sure he'll correct me if I'm misremembering.)
I believe it was recommended at the time that he use ioctl() instead of
fcntl() for this purpose, but he added the fs-specific fcntl() stuff anyway,
for reasons that I don't quite remember but that I recall seemed bogus.

in short, I don't think having file-system-specific stuff in an interface
that's intended to control file descriptors makes much sense.  we certainly
shouldn't move further in that direction, and it would be good to eventually
replace our existing use of that mechanism with something else instead,
either ioctl() or possibly this fsctl() thing.

one mechanism that has been used before in commercial products to get
the effect of an fsctl() without adding a syscall is to just use ioctl()
on the root directory of a file system.  this was mostly for fs-specific
fs operations, though, and it doesn't seem very good to put fs-neutral
operations into the ioctl() morass as well.
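
(a rough sketch of the "ioctl() on the root directory" approach, under
stated assumptions: EXAMPLEFS_GETINFO and fs_control() are hypothetical
names invented here, not taken from any real file system or from the
commercial products mentioned above.  a real file system would define
its own command numbers and argument types.)

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical fs-specific command; not a real ioctl. */
#define EXAMPLEFS_GETINFO	_IOR('Z', 1, int)

/*
 * Issue a file-system-level control request by opening the root
 * directory of the mounted file system and calling ioctl() on it.
 * Returns the ioctl() result, or -1 with errno set on failure.
 */
int
fs_control(const char *mountpoint, unsigned long cmd, void *arg)
{
	int fd, error, saved_errno;

	assert(mountpoint != NULL);

	fd = open(mountpoint, O_RDONLY);
	if (fd == -1)
		return -1;

	error = ioctl(fd, cmd, arg);

	/* Keep the errno from ioctl(), not from close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return error;
}
```

a caller would pass the mount point and a command the file system
actually implements; with a command the file system does not know, the
ioctl() is expected to fail (typically with ENOTTY).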


so what was the original question again, just where to put the NFS export
control stuff?

the NFS export control info is not really controlling the file system being
exported, but rather it's controlling the behaviour of the NFS server.
the NFS server is somewhat unique: it's not a device and it's not a
file system, so none of the interfaces for talking to devices or files
or file systems really seems appropriate.  perhaps creating a /dev/nfsd
pseudo-device and using ioctls on that would be the cleanest way to wedge
it into the existing API model.  on the other hand, we already have an
"nfssvc" syscall, so we can add other NFS server control stuff there.
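
(purely as an illustration of what the /dev/nfsd idea might look like
from userland: no such device or ioctl exists, and the structure, the
field names, the NFSDIOC_SETEXPORT command and the nfsd_set_export()
function below are all hypothetical.  the device path is a parameter so
that nothing here depends on a real device node.)

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request structure; not an existing kernel interface. */
struct nfsd_export_req {
	char	ner_path[1024];	/* path of the file system to export */
	int	ner_flags;	/* made-up export flags */
};

/* Hypothetical command number for the imagined pseudo-device. */
#define NFSDIOC_SETEXPORT	_IOW('N', 1, struct nfsd_export_req)

/*
 * Ask the (imagined) NFS server control device to update the export
 * information for one file system.  Returns 0 on success, or -1 with
 * errno set on failure.
 */
int
nfsd_set_export(const char *dev, const char *path, int flags)
{
	struct nfsd_export_req req;
	int fd, error, saved_errno;

	assert(dev != NULL && path != NULL);

	if (strlen(path) >= sizeof(req.ner_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memset(&req, 0, sizeof(req));
	strcpy(req.ner_path, path);	/* length checked above */
	req.ner_flags = flags;

	fd = open(dev, O_RDWR);
	if (fd == -1)
		return -1;

	error = ioctl(fd, NFSDIOC_SETEXPORT, &req);

	/* Keep the errno from ioctl(), not from close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return error;
}
```

the point of the sketch is only that the request is addressed to the
NFS server itself rather than to the file system being exported; the
same request could just as well be carried by the existing nfssvc
syscall.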

I'm with jason on wanting the mountargs stuff to become string-based.

was there any more to the original question?  I've lost track.

-Chuck