current-users: Re: Symlink ownership (let's go back)

Subject: Re: Symlink ownership (let's go back)
To: weingart@austin.BrandonU.CA, mouse@Collatz.McRCIM.McGill.EDU
From: greywolf@tomcat.VAS.viewlogic.com (James Graham, Captech)
List: current-users
Date: 08/09/1995 09:25:22

I'm going to enclose a lot of quoted material here -- please bear with me.

Steven J. Dovich writes:

#: der Mouse responding to Tobias Weingartner:
#: 
#: > > I've been watching this thread a little, and every time I see another
#: > > suggestion made, I have to cringe... ;-(
#: >
#: > Why?  You don't really think they're all going to happen, surely?  (Of
#: > the three suggestions I've made - symlink 0555 bits, symlink 0222 bits,
#: > and symlink 06000 bits - even I have implemented only one (the first).)
#: > Or do you think the current setup is so perfect that no suggestions
#: > could be worthwhile?
#: 
#: And I shiver as well...
#: 
#: Why? Because the point of the POSIX standard is portable applications,
#: and I have a hard time seeing how that goal is advanced by expansion of
#: the symlink semantics beyond the point of industry consensus. If the
#: industry segments represented in the POSIX working groups could only get
#: consensus with the current minimal specification of symlink functionality,
#: what purpose is served by insisting that NetBSD must extend the semantics
#: in a way that would encourage non-portable application development?

This "minimal specification" would only serve to hurt performance.  I do
not see that it is all that minimal anyway.  More on this in a bit...

#: 
#: With respect to the mode extensions, symlinks are only a short-cut to
#: traversal of the "real" filesystem. As such all access is already
#: mediated by the modes of each path component, and adding mode
#: interpretation to symlink evaluation contributes complexity and removes
#: performance without a corresponding benefit. I categorically reject any
#: change that would let symlink permissions override the access
#: permissions of "hard" filesystem entities. That is clearly outside the
#: bounds of existing art in this area.

This is actually the first reasonable statement on the whys and wherefores
that I have seen.  It *would* be neat to enable link modes, though, and
for the reasons previously stated, even if it does extend beyond POSIX.

#: 
#: I do not claim the present design is perfect, only that it is sufficient.

The previous design was sufficient.  The current design is broken.

#: I will however summarize
#: the specification, based on the last draft balloted (P1003.1a/D12).
#: 
#: The definition of pathname resolution is modified to follow symlinks by
#: default unless otherwise noted for an individual function. Path resolution
#: in the presence of symlinks is specified in terms of string concatenation.
#: The traditional functions for dealing with symlinks (symlink, readlink,
#: and lstat) are specified. In struct stat, POSIX only specifies that
#: "st_size" and "st_mode" have meaningful information for symlinks, and that
#: the st_mode value is only meaningful as the argument to the S_ISLNK() macro.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'm going to remember this one for later...


#: There are also changes to handle a few corner cases in functions like open().
#: These cases are largely things like opening a path naming a symlink
#: (return with error if O_CREAT|O_EXCL).
#: 
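
The open() corner case mentioned in the quote can be sketched as below. The
helper create_exclusive() is a made-up name, and the EEXIST result is what
the published POSIX open() specification requires; the quoted summary itself
only says "return with error".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to create `path` exclusively.
 * Returns 0 if the file was created, or the errno value on failure.
 * With O_CREAT|O_EXCL, open() does not follow a symlink named by the
 * final path component; it fails with EEXIST even if the link dangles.
 */
int create_exclusive(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Calling this on a path that names a symlink should give EEXIST regardless of
what the link points to; calling it on an unused name should give 0.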
#: > > My $0.02 worth are like so:  traditionally, symlinks were owned by
#: > > UID & GID zero, and had perms 777.

Actually, traditionally (4.2), symlinks were mode 777 & ~(umask), uid of
creator, gid of host directory.  4.3 changed the mode to 777, but uid
and gid were left as "traditional".  4.4/POSIX decided to munge the uid
and gid to be the exact same as those of the symlink's containing directory.

#: 
#: Yup, that is my recollection too, though I would have to find a copy
#: of the original source tapes to verify. I have always considered symlink
#: ownership and permissions a side-effect of their implementation, and not
#: otherwise used by the system. Yes I could change their values, but that
#: never affected their application or interpretation in path resolution.

Ownerships are *not* a 'side-effect' when you're dealing with sticky 
directories.  They need to be present unless POSIX also intends to
re-do directory semantics.

#: 
#: > > It makes little sense to have any sort of protection based on [the
#: > > owner and permission bits of a symlink]
#: > 
#: > I'm not sure why not; every other filesystem entity does something with
#: > its ownership and permissions information.
#: 
#: As I commented above, the effective pathname achieved by evaluating the
#: symlink already has the owner/permissions protection. Adding it to symlinks
#: would introduce a source of conflicts in access protection.

Not really; re-read what der Mouse had come up with.

#: 
#: > > Also, kludging up the directory structure makes no sense to me.  Why
#: > > do you want to hack up something beautiful?  Why masacre (sp?) a
#: > > thing of simplicity, flexibility and function?
#: 
#: A plausible explanation is that during path resolution, if you have already
#: gone to disk for the directory, why go to disk again for the content of
#: a symbolic link?  Symbolic links as directory entries can be considered
#: a disk access optimization. Like all optimizations, the assumptions can
#: be invalidated in a number of ways, but those usually get swept under
#: the rug when justifying such a change.

Here's that "later" for which I was saving the highlighted reference above.

You've already gone to disk for the directory.  Okay, fine.  Now how do
you tell it's a symbolic link?  You have only an inode number.  You have to go
to disk again ANYway because you have to find out what kind of inode this
is.  Okay, it's a symlink.  You've done two disk accesses so far.
Assuming the inode is cached, and that the symlink is implemented reasonably
(meaning that "short" symlinks are kept in the inode), if you have a short
enough symlink, you don't need to do a third access -- just steal the data
from the inode, and you're done.

If you have a long symlink, you do the third access to a direct disk block,
and get the data, and you're done.

Considering that, in order to make a symlink a special kind of directory
entry, you'd need to put some stat information in the directory itself
(instead of in the inode), this would hardly be an optimization.
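
The access counting above can be written down as a toy model. This is only
a sketch: the name toy_symlink_accesses() and the TOY_INLINE_MAX capacity
of 60 bytes are assumptions picked for illustration, not a description of
any particular filesystem's on-disk layout.

```c
#include <assert.h>
#include <string.h>

#define TOY_INLINE_MAX 60   /* assumed inline capacity; illustrative only */

/*
 * Toy model of the accesses needed to read a symlink's text in a lookup:
 *   1. read the directory to map the name to an inode number,
 *   2. read the inode to learn that it is a symlink,
 *   3. read a data block, but only if the text does not fit in the inode.
 */
int toy_symlink_accesses(const char *target)
{
    int accesses = 2;                       /* directory + inode */

    assert(target != NULL);
    if (strlen(target) > TOY_INLINE_MAX)    /* "long" symlink */
        accesses++;                         /* one direct data block */
    return accesses;
}
```

Under this model a short link costs two accesses and a long one costs
three; moving the link into the directory entry would save at most the
inode read, and only if the type information moved there as well.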

#: 
#: > (Massacre, I think.)  I don't.  As far as I can tell, nobody does,
#: > except non-UNIX systems that are trying to be POSIX and implemented
#: > symlinks as funny directory entries on their own.  I'm not even sure
#: > why 4.4 made symlinks unowned; from what's come across the lists here,
#: > it seems to have been an attempt to avoid making lame systems feel
#: > lame.  This is not worth even attempting IMO; considering the problems
#: > it's causing, there's no question in my mind that it should be
#: > reversed.
#: 
#: The core functionality of symlinks is simply a string that gets pathname
#: resolved before finishing the remainder of the original path. Given that,
#: all else is embellishment, and extraneous to the problem at hand. Lameness,
#: or the lack thereof, has nothing to do with it, except as a pejorative
#: intended to factionalize the discussion and deflect away from the goal
#: of supporting interfaces for portable applications.

Indeed; nonetheless, the move to re-define symlinks is going to cause
some serious grief for existing systems -- especially those which rely
on current POSIX specifications which, unless I'm badly mistaken, either
make no specification for symlinks (thus implying the current de facto
standard) or define symbolic links as they were implemented prior
to 4.4BSD/draft-POSIX.
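
For what it's worth, the "string that gets pathname resolved" model quoted
above is small enough to sketch. The function splice_symlink() and its
parameters are invented for this example; it handles only the substitution
step, and assumes `dir` is given without a trailing slash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of one substitution step in path resolution:
 * a symlink found in directory `dir` has link text `target`, and `rest`
 * is the unresolved remainder of the original path (may be empty).
 * Writes the path that would be resolved next into `out`.
 * Returns 0 on success, -1 if the result does not fit in `outlen` bytes.
 */
int splice_symlink(const char *dir, const char *target, const char *rest,
                   char *out, size_t outlen)
{
    int n;
    int absolute;

    assert(dir != NULL && target != NULL && rest != NULL && out != NULL);
    absolute = (target[0] == '/');      /* absolute text replaces the prefix */
    n = snprintf(out, outlen, "%s%s%s%s%s",
                 absolute ? "" : dir,
                 absolute ? "" : "/",
                 target,
                 rest[0] != '\0' ? "/" : "",
                 rest);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

A relative link "local/lib" found in "/usr" with "libfoo.a" left over
becomes "/usr/local/lib/libfoo.a"; an absolute link simply discards the
directory prefix.  Ownership and modes appear nowhere in this step.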

#: 
#: > This leaves us with the problem of that POSIX draft which specifies
#: > that a bunch of syscalls follow terminal symlinks, thus producing the
#: > four-way dilemma I covered in another message.
#: 
#: If you like. The real story is that the POSIX draft does what has not been
#: done previously, and clearly specifies how a symlink gets interpreted by all
#: interfaces that take pathnames. It does this by establishing a blanket rule,
#: and by declaring the cases where exceptions occur. If additional exceptions
#: are needed, by all means suggest them. I will be more than happy to forward
#: them to the POSIX working group. Convince me, and I will actively lobby for
#: useful changes to the draft language. We are dealing with a document that is
#: still in ballot, so clarifications are not impossible (radical change may
#: not be well received though).

The symlink should remain in an inode, plain and simple.  To change this will,
as noted above, break existing systems.  The manner in which symlinks are
proposed to be implemented is more of a kludge than the original
implementation, which was a kludge to begin with.

If it weren't for the fact that symlinks enable one to reference directories
and span filesystems without creating hard filesystem graph loops, I'm certain
that I am not the only one who would suggest doing away with them outright,
thus avoiding this entire debate.

#: 
#: /sjd
#: 

					--*greywolf;