Subject: Re: fsctl(2) [was: Re: Interface to change NFS exports]
To: Chuck Silvers <chuq@chuq.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 09/19/2005 10:49:44

On Fri, Sep 16, 2005 at 07:35:22AM -0700, Chuck Silvers wrote:
>
> (taking a little side-trip to talk about fcntl())
>
> more precisely, fcntl() operates on file *descriptors*; ioctl() operates
> on files.  there should not even need to be a VOP for fcntl() since

That's not fully correct. ioctl() operates on the devices underlying a
file. To quote the man page:

     The ioctl() function manipulates the underlying device parameters of
     special files.

And that's why we felt the need for a fcntl() VOP.

Also, we have extended fcntl() to operate on more than just the passed-in
file descriptor. Yes, the F_CLOSEM and F_MAXFD operations have to do with
_other_ file descriptors, but they are an example of not operating on just
the passed-in descriptor. So we've got (IMHO reasonable) prior art for
having fcntl() do more than operate exclusively on the passed-in fd.
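
For reference, here is a rough sketch of what using those two operations
looks like from userland. F_MAXFD and F_CLOSEM are NetBSD extensions, so the
sketch guards them with #ifdef; the wrapper names highest_open_fd() and
close_from() are made up for illustration, and returning ENOSYS elsewhere is
an assumption of the sketch rather than anything the system defines.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Return the highest descriptor number currently open in this process.
 * The answer describes the whole descriptor table, not just fd.
 * Where F_MAXFD does not exist, fail with ENOSYS (sketch assumption).
 */
int
highest_open_fd(int fd)
{
#ifdef F_MAXFD
	return fcntl(fd, F_MAXFD);
#else
	(void)fd;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Close every descriptor numbered lowfd or higher.  Again the effect
 * reaches descriptors other than the one passed in.
 * Where F_CLOSEM does not exist, fail with ENOSYS (sketch assumption).
 */
int
close_from(int lowfd)
{
#ifdef F_CLOSEM
	return fcntl(lowfd, F_CLOSEM);
#else
	(void)lowfd;
	errno = ENOSYS;
	return -1;
#endif
}
```

A daemon might call close_from(3) right after fork() to drop everything
except stdin, stdout, and stderr.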

> manipulating a file descriptor should not affect the file that the
> descriptor points to, but I see that we started using this as mechanism
> for LFS to do arbitrary stuff in the kernel at some point (replacing the
> LFS-specific syscalls).  we really should have used ioctl() for LFS instead.

I disagree. While an fsctl() call may be a better fit, I do not think an
ioctl() ever will be a clean match.

> as I recall, VOP_FCNTL() was originally added by bill, I think as a mechanism
> to control an HSM-type layered file system.  (at least, I think it was for
> a control channel, I'm sure he'll correct me if I'm misremembering.)
> I believe it was recommended at the time that he use ioctl() instead of
> fcntl() for this purpose, but he added the fs-specific fcntl() stuff anyway,
> for reasons that I don't quite remember but that I recall seemed bogus.

Well, as above, using an ioctl() for this would be even more bogus. :-)

The question was between overloading fcntl() and adding a new system call.
While I certainly objected to ioctl(), my feelings were not as strong
between a new system call and extending fcntl(), though fcntl() seemed
cleaner and more general-purpose. It already had the desired parameter
structure (file indicator, operation, data), so it seemed reasonable.

The problem for this with ioctl() is that it goes to different places for
regular files, device nodes, and pipes (see ffs_vnodeop_entries,
ffs_specop_entries, and ffs_fifoop_entries). It has to; that's its point.

However, what we needed at the time was a way to send a control request to
the file system holding the file, not to the file itself. We needed
a call that would semantically not branch out the way the vop_ioctl_desc
operators do.
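
To make the branching concrete, here is a toy userland model of it. It is
not the real ffs operation tables: the enum names, the op_vector struct, and
the route_*() helpers are all invented for this sketch. It only mirrors the
shape described above, where each node type has its own operations vector,
the ioctl slot differs per type, and the fcntl slot always lands in the
file system.

```c
#include <stddef.h>

/* Hypothetical node types, loosely modeled on regular/device/fifo nodes. */
enum node_type { NODE_REG, NODE_CHR, NODE_FIFO, NODE_NTYPES };

/* Which layer ends up handling a request (names invented for the sketch). */
enum handler { H_FILESYSTEM, H_DEVICE_DRIVER, H_FIFO_CODE };

/* A two-slot stand-in for a per-type operations vector. */
struct op_vector {
	enum handler (*op_ioctl)(void);
	enum handler (*op_fcntl)(void);
};

static enum handler fs_handler(void)   { return H_FILESYSTEM; }
static enum handler dev_handler(void)  { return H_DEVICE_DRIVER; }
static enum handler fifo_handler(void) { return H_FIFO_CODE; }

/*
 * One vector per node type.  The ioctl slot fans out to whatever backs
 * the node; the fcntl slot points at the file system in every vector.
 */
static const struct op_vector vectors[NODE_NTYPES] = {
	[NODE_REG]  = { fs_handler,   fs_handler },
	[NODE_CHR]  = { dev_handler,  fs_handler },
	[NODE_FIFO] = { fifo_handler, fs_handler },
};

/* Where an ioctl-style request on a node of type t would be handled. */
enum handler
route_ioctl(enum node_type t)
{
	return vectors[t].op_ioctl();
}

/* Where an fcntl-style request on a node of type t would be handled. */
enum handler
route_fcntl(enum node_type t)
{
	return vectors[t].op_fcntl();
}
```

In this model route_ioctl(NODE_CHR) reaches the device driver, while
route_fcntl(NODE_CHR) still reaches the file system holding the node, which
is the property the HSM control channel needed.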

Also, at the time, it was felt fcntl() used in this way could help
implement ACL operations. ACLs need to operate at exactly the same
semantic level as the call our HSM needed; for a pipe or device node, you
want to operate on the underlying inode, not the device or pipe. I admit
that our ACL implementation may be taking a different approach, so I'm not
sure how strong this motivation will turn out to be.

> in short, I don't think having file-system-specific stuff in an interface
> that's intended to control file descriptors makes much sense.  we certainly
> shouldn't move further in that direction, and it would be good to eventually
> replace our existing use of that mechanism with something else instead,
> either ioctl() or possibly this fsctl() thing.

We decide what the different interfaces are intended to do, so we can
fully decide we are happy with fcntl() doing what it does.

If we are going to stick to existing interface definitions exclusively,
then ioctl() is "control device" and it is as wrong for doing these things
as is fcntl(). :-)

The problem is that we have now described operations that take place on
one of three different semantic levels. You can want to issue operations
on the internals of a file (ioctl() operating on the device backing a
device node), operations on the inode/vnode (what fcntl() is doing now),
and operations on the file system containing a node (what fsctl() would
do). While it may be a bit of an overload to do fsctl() work in fcntl()
(if we wanted to save the system call), we at least would be cleanly
talking to the file system we wanted to manipulate.

> one mechanism that has been used before in commercial products to get
> the effect of an fsctl() without adding a syscall is to just use ioctl()
> on the root directory of a file system.  this was mostly for fs-specific
> fs operations, though, and it doesn't seem very good to put fs-neutral
> operations into the ioctl() morass as well.

I agree that'd be gross.

Take care,

Bill
