Subject: Re: Improving the Unix API
To: Francois-Rene Rideau
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 06/27/1999 12:58:05

> He realized that the device had an immutable attribute.
> He tried to change the attribute with open() and ioctl()

As I think someone already mentioned, BSD has chflags(), which takes a

> Robert had to hand-remove the immutable flag
> (I guess, by accessing the relevant block directly).

(clri didn't work?)

> Indeed, the "open without access rights"
> is useful not only to modify attributes and do other ioctl's,
> but also to effect all operations that should be done w/o the ability
> to open for either read or write
> (fstat, funlink, ioctl, fchown, fchmod, fsync),

funlink makes no sense, unless the fd it takes is the fd of a directory
and you pass in the name of the entry to be removed - which I imagine
is not what most people will think when they think of an fd-based
variant of unlink.  unlink() operates on names, not files, after all.
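To illustrate the names-versus-files point, here is a minimal sketch.
The helper names (link_count, two_names_then_unlink) are made up for
this example; only the system calls are real.  It gives one file two
names, removes one of them, and leaves the caller holding an fd that is
entirely unaffected by the removal.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: link count of the file behind fd, or -1 on error. */
long link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}

/*
 * Hypothetical helper: create name1, give the same file a second name
 * name2, then unlink name1.  Returns the still-open fd, or -1 on error.
 * The unlink() removes a directory entry; the file itself survives
 * under name2 and through the open fd.
 */
int two_names_then_unlink(const char *name1, const char *name2)
{
    int fd = open(name1, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd == -1)
        return -1;
    if (link(name1, name2) == -1 || unlink(name1) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Given only the fd, there is no way to say which of the two names an
"funlink(fd)" ought to have removed.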

I've often wanted open-with-no-access in conjunction with fchdir().
This is because you need only execute access to set your cwd to a
directory, but there's no way to get an fd on a mode-111 directory.
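For reference, the usual save-and-restore idiom with fchdir() looks
something like the sketch below.  The function names save_cwd and
restore_cwd are invented for illustration.  Note that the open() is the
step that demands read access, which is exactly what a mode-111
directory refuses to an unprivileged process.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: remember the current directory as an fd.
 * Returns the fd, or -1 on error.  This open() needs READ permission
 * on the directory, even though chdir() into it would need only
 * execute (search) permission.
 */
int save_cwd(void)
{
    return open(".", O_RDONLY);
}

/*
 * Hypothetical helper: go back to the directory remembered by
 * save_cwd() and release the fd.  Returns 0 on success, -1 on error.
 */
int restore_cwd(int fd)
{
    int rc = fchdir(fd);

    close(fd);
    return rc;
}
```

Typical use is fd = save_cwd(); chdir(somewhere); do the work; then
restore_cwd(fd).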

> and could be used with new syscalls like
> flink (make a new directory link for file given by descriptor),
> freadlink (read link from a file descriptor opened with O_NULL),
> fexec (execute the binary that we checked), etc.

freadlink() implies that open() with O_NULL has the peculiar property
that, unlike all other open()s, it doesn't follow terminal symlinks.

While I think there are ways symlinks could be improved, I don't think
this is one of them.  I can't see any use for opening a symlink except
use of write() to atomically make the link point somewhere different,
and I'd prefer to do that by making symlink() do that when the link
already exists and some appropriate condition is met.
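Until symlink() behaves that way, the common workaround is to create
the new link under a temporary name and rename() it over the old one,
since rename() replaces its destination atomically and operates on a
symlink itself rather than following it.  A minimal sketch, where the
function name repoint_symlink and the temporary-name scheme are
assumptions made for the example:

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: make the symlink at linkpath point to
 * new_target, without a window in which linkpath does not exist.
 * Returns 0 on success, -1 on error.
 */
int repoint_symlink(const char *linkpath, const char *new_target)
{
    char tmp[4096];
    int n;

    /* Temporary name in the same directory, so rename() stays on one
     * filesystem.  The ".tmp.<pid>" suffix is an arbitrary choice. */
    n = snprintf(tmp, sizeof tmp, "%s.tmp.%ld", linkpath, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    if (symlink(new_target, tmp) == -1)
        return -1;
    if (rename(tmp, linkpath) == -1) {
        unlink(tmp);            /* don't leave the temporary behind */
        return -1;
    }
    return 0;
}
```

This costs two system calls and a scratch name, which is the clumsiness
an in-place symlink() replacement would avoid.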

> Of course, you'll want to be able to fcntl(fd,F_SETFL,O_RDWR)
> or something equivalent, to upgrade your access mode
> on a file you opened with O_NULL.

The security weenie in me is _really_ unsure that the ability to
increase the access modes on an open fd is a good idea.
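As fcntl() is specified by POSIX, F_SETFL ignores the access-mode bits
in its argument, so the proposed upgrade is a genuinely new capability
and not something that already half-works.  The sketch below (the
function name try_upgrade is invented here) attempts the upgrade and
reports the access mode the fd actually ends up with.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: ask F_SETFL to switch fd to O_RDWR, then report
 * the access mode the fd really has afterwards (O_RDONLY, O_WRONLY or
 * O_RDWR), or -1 on error.
 */
int try_upgrade(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    /* Replace the access-mode bits with O_RDWR and hand them back. */
    if (fcntl(fd, F_SETFL, (flags & ~O_ACCMODE) | O_RDWR) == -1)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;
}
```

On an fd opened O_RDONLY this is expected to return O_RDONLY: the call
succeeds but the access mode stays what open() granted.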

> About namei() and large directories, Robert suggested
> that news servers, and other large databases
> (terminfo, that web cache, and many more come to my mind),
> should use special database libraries with a well-defined API
> (possibly inspired by the filesystem interface),
> rather than abuse the filesystem API as they do;

At least one news system does this now, I think - instead of keeping
each post in a separate file, it uses one huge file and does its own
space allocation out of it.
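A very rough sketch of the idea, not of any particular news system:
posts are appended to one spool file and later found again by offset
and length.  Append-only is the most trivial "allocator" possible; a
real one would also have to track and reuse freed space.  The names
spool_append and spool_read are made up for this example.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append len bytes to the spool file open on fd.
 * Returns the offset at which the data was stored, or -1 on error.
 */
off_t spool_append(int fd, const void *data, size_t len)
{
    const char *p = data;
    off_t off = lseek(fd, 0, SEEK_END);

    if (off == (off_t)-1)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return off;
}

/*
 * Hypothetical helper: read back len bytes stored at offset off.
 * Returns 0 on success, -1 on error or short file.
 */
int spool_read(int fd, off_t off, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = pread(fd, p, len, off);

        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller keeps its own index of (offset, length) pairs, so finding a
post never goes through namei() at all.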

> Another problem was the ability to change the mount status of a partition
> from read-write to read-only or to unmounted,

See NetBSD (and presumably other BSD) "mount -o update,rdonly" and/or
"umount -f".  (Last I tried, the latter didn't work as it should, but
that's a matter of fixing bugs rather than introducing new features.)

> Finally, we discussed about saving _and restoring_ the state of a process,
> another hack that he did once to preserve a long-winded calculation
> from the service shutdown of a big unix computer.

I did this once, long long ago, under (I think) 4.3BSD.  I found that I
couldn't just dump core, though I forget why.  As for the open file
descriptor question, I punted - I made the relevant call fail unless
the process had no fds open.

> By posting on all free unix kernel mailing-list I know,
> I intend to put free unices in competition as to which
> will implement these features first.

Reasonable as this sounds, I think the last thing we need is yet
another ground on which one free-unix can be doing the "nana nana boo
boo" taunt at another.  Once upon a time I would have hoped the people
involved were sufficiently mature to avoid doing that, or responding
when on the receiving end of it - and many of them *are*, but I've been
involved in this scene too long to retain any real hope that *all* of
them are.

[And replying to another message...]

> 4.4 chflags() works fine for UFS and leaves other filesystems to map
> what they can into the UFS set.  Which is bogus - immutable is not a
> UFS attribute, it's VFS one.

Perhaps, but it's still something that the underlying filesystem has to
support.  Just because the API bit definitions happen to match what FFS
filesystems save on disk doesn't mean it's inherently an FFS thing.

> As for the opening with no permissions - well, it would make *big*
> sense if we could narrow down the API and move chown(), chmod(), etc.
> into libc leaving f-variants in the kernel.

I really don't like that.  The reasons why are (1) this means you have
to have an fd free to do them; (2) it triples the number of user/kernel
crossings involved (an open, the f-call, and a close, where a single
call used to do).
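To make both costs concrete, here is roughly what a libc-level chmod()
built on fchmod() would have to look like.  The name chmod_via_fd is
hypothetical, and with today's open() it additionally needs read access
to the file, which is where the proposed no-access open would come in.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical libc-style chmod() implemented on top of fchmod().
 * Returns 0 on success, -1 on error with errno set.
 */
int chmod_via_fd(const char *path, mode_t mode)
{
    int fd, rc, saved;

    fd = open(path, O_RDONLY);  /* crossing 1: fails if no fd is free */
    if (fd == -1)
        return -1;
    rc = fchmod(fd, mode);      /* crossing 2: the actual work */
    saved = errno;
    close(fd);                  /* crossing 3 */
    errno = saved;              /* report fchmod()'s error, not close()'s */
    return rc;
}
```

A plain chmod(path, mode) does the same job in one crossing and cannot
fail with EMFILE.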

> Extreme variant might include {set,get}sockopt extended to files and
> doing both *stat and *ch{mod,own,flags} via that.

If done, I think the name should be changed.  They are ?etSOCKopt,
after all.  I'm not fond of this, though; it amounts to returning to
using ioctl() for the tasks - albeit with a slightly different name.

					der Mouse

		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
