current-users: Re: Symlink ownership

Subject: Re: Symlink ownership
To: current-users@NetBSD.ORG, willett@math.utah.edu
From: greywolf@tomcat.VAS.viewlogic.com (James Graham, Captech)
List: current-users
Date: 07/31/1995 13:00:13

Lon Willett (willett@math.utah.edu) writes:

#: I prefer conceptually ownerless symlinks.  It adds simplicity to the
#: filesystem interface.

How so?  This really only applies if you take ownership away from every
other filesystem object as well.


#:  I don't think that cluttering up the kernel, NFS
#: code, tar/cpio, etc. with an understanding of symlink ownership is a
#: good idea.

It's already been done, and it's been present since (at least) 4.2 BSD.

Not that that's a good argument for keeping it, mind you (NIH springs
to mind) :-).

#:  SYMLINK, READLINK, and LSTAT is all that that is necessary.
#: I don't want LCHOWN, LCHMOD, LUTIMES, or anything else.  If symlink
#: ownership is significant in any way, then you definitely need LCHOWN
#: (e.g. root should be able to restore/move a tree using tar or cpio and
#: not mess up any access that users have).

Points here:
	* symlink()/readlink()/lstat() are truly necessary, if you're
	  installing symbolic links.

	* chown() as far as I could tell never affected the symlink
	  itself, although if, as you say, ownership were to be significant,
	  we'd need lchown().

	* lchmod() would not be necessary, since modes on a symlink are
	  never relevant.

	* lutimes() would also not be necessary -- the only timestamp a
	  symlink needs is the mtime.  Since the mtime would only be set
	  at create time, you would have the added advantage of seeing
	  precisely WHEN the link was created.
	  (Never mind that lchown() would have to change the ctime --
	   the only reason for keeping an mtime at all is that ls needs
	   something to display.  I somehow don't think that having all
	   symlinks show up with either the [cm]time of the directory or
	   "16:00 Dec 31 1969" would go over well with the systems
	   administrator faction of this list (which, I suspect, is at
	   least 30% of us).)
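
For anyone who wants to poke at the three calls in question, here is a
minimal sketch (Python purely for brevity; the file names are made up for
illustration, and what ends up in the uid and mtime fields depends on
your system).  It shows that stat() follows the link while lstat()
describes the link itself:

```python
import os
import stat
import tempfile

def inspect_symlink(workdir):
    """Create a file and a symlink to it; report what stat() and lstat() see."""
    target = os.path.join(workdir, "target.txt")   # hypothetical names
    link = os.path.join(workdir, "link")
    with open(target, "w") as f:
        f.write("hello\n")
    os.symlink(target, link)      # symlink(2)
    followed = os.stat(link)      # stat(2) follows the link to the file
    own = os.lstat(link)          # lstat(2) describes the link itself
    return {
        "contents": os.readlink(link),                     # readlink(2)
        "lstat_is_link": stat.S_ISLNK(own.st_mode),
        "stat_is_regular": stat.S_ISREG(followed.st_mode),
        "link_uid": own.st_uid,       # whatever this system records as owner
        "link_mtime": own.st_mtime,   # whatever this system records as mtime
    }

with tempfile.TemporaryDirectory() as d:
    info = inspect_symlink(d)
    print(info)
```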

#: 
#: Another benefit that this buys you is implementation flexibility.
#: Symlinks can be done however is convenient for your filesystem (as a
#: special kind of "inode"; with their contents stored directly in the
#: directory; using the UID/GID inode fields to store the contents of the
#: link in the inode itself; whatever).

See my earlier comment on fragmenting inodes into linknodes.  If that
message doesn't make it out to the list within a week, I'll send it again.

#: 
#: As for the problems with sticky-bit directories, I think that the real
#: source of the trouble is that sticky-bit directories are themselves a
#: hack which is not properly thought out and implemented.

This is true.  Not to mention that the sticky bit was a poor choice for
these semantics.  I would have elected to use the set-uid bit for this,
and I would have used the sticky bit to deny creation of directories.

#: 
#: Consider the following aspects of *hard* links and sticky-bit
#: directories:
#: 
#:     -- Hard links retain no trace of their creator.

True.

#: 
#:     -- A user can make a hard link from a file he doesn't own into a
#:     sticky-bit directory, and then not be able to remove this link.

True.
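
To spell out why, here is a rough sketch of the sticky-directory removal
rule as I understand it: you need write permission on the directory, and
if the sticky bit is set you must also own the file or the directory, or
be root.  The function name, the uids and the simplified permission check
are assumptions for illustration only, not anybody's kernel code:

```python
import stat

def may_remove(uid, dir_mode, dir_uid, file_uid, dir_writable=True):
    """Simplified model: may `uid` remove an entry from this directory?"""
    if uid == 0:
        return True                      # super-user is never refused
    if not dir_writable:
        return False                     # no write permission on the directory
    if not dir_mode & stat.S_ISVTX:
        return True                      # ordinary writable directory
    return uid in (file_uid, dir_uid)    # sticky: file owner or directory owner

TMP_MODE = stat.S_IFDIR | 0o1777         # a /tmp-style directory owned by root

# Hypothetical users: alice (1001) hard-links bob's (1002) file into /tmp.
alice_can_remove = may_remove(1001, TMP_MODE, 0, 1002)
bob_can_remove = may_remove(1002, TMP_MODE, 0, 1002)
print(alice_can_remove, bob_can_remove)
```

Under this model alice creates the entry but cannot take it back out,
since ownership is looked up on the file, not on whoever made the link.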

#: 
#:     -- A user can fill up a sticky-bit directory with hard links, making
#:     the directory ever larger, and thus using up the disk quota of the
#:     owner of the directory.

This would take a long time to do -- at least compared to simply allocating
a file.  And stickiness of a directory has NOTHING to do with this argument.
If the directory is writable, this is possible anyway.

#: 
#:     -- In general, because a user can make hard links to files he
#:     doesn't own, the quota system is rather compromised anyway.
#:     Especially since many programs (e.g. cc) use /tmp, /var/tmp, or
#:     other publicly readable and executable directories.

Not really -- the quota system counts blocks and inodes allocated.
A hard link consumes a minimal amount of space as a directory entry and
affects neither of these criteria, your previous paragraph notwithstanding.
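
A quick way to convince yourself (a minimal sketch; the file names are
invented, and it assumes the temporary directory lives on a filesystem
that supports hard links): the second name shares the inode of the
first, and only the link count changes.

```python
import os
import tempfile

def hard_link_stats(workdir):
    """Make a file and a hard link to it; compare their inode data."""
    original = os.path.join(workdir, "original")   # hypothetical names
    alias = os.path.join(workdir, "alias")
    with open(original, "w") as f:
        f.write("data\n")
    before = os.stat(original)
    os.link(original, alias)                       # link(2)
    a, b = os.stat(original), os.stat(alias)
    return {
        "same_inode": (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino),
        "nlink_before": before.st_nlink,
        "nlink_after": a.st_nlink,
        "same_size": a.st_size == b.st_size,
    }

with tempfile.TemporaryDirectory() as d:
    link_info = hard_link_stats(d)
    print(link_info)
```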

Take into consideration this about symbolic links, though:

	-- A user can link to a DIRECTORY.

You can't do this with a hard link unless you're the super-user, and you're
really asking for trouble if you do this, since directories with a link
count of >2 refuse to let you rmdir them, and unlink() frequently does not 
work as advertised (i.e. does not explicitly unlink the requested entry if
it's a directory).

Also, with symbolic links, you can link objects across filesystems.
You can't do that with a hard link no matter how hard you try (unless you
muck with the kernel in a very heavy way, and even then you're likely
to break everything else in the process).
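
A small sketch of the directory case (names invented for illustration).
Whether link() on a directory is refused outright or merely restricted
to the super-user varies by system, so the code only records what
happened instead of assuming either outcome:

```python
import os
import tempfile

def link_to_directory(workdir):
    """Try both kinds of link against a directory and report the results."""
    real_dir = os.path.join(workdir, "real_dir")   # hypothetical names
    sym = os.path.join(workdir, "sym_dir")
    hard = os.path.join(workdir, "hard_dir")
    os.mkdir(real_dir)
    os.symlink(real_dir, sym)          # any user may do this
    try:
        os.link(real_dir, hard)        # normally refused for ordinary users
        hard_ok = True
    except OSError:
        hard_ok = False
    return {
        "symlink_is_link": os.path.islink(sym),
        "symlink_is_dir": os.path.isdir(sym),   # isdir() follows the link
        "hard_link_created": hard_ok,
    }

with tempfile.TemporaryDirectory() as d:
    dir_info = link_to_directory(d)
    print(dir_info)
```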


#: 
#: My vote is to fix the sticky-bit hack, so that:
#: 
#:     -- A user can't make a hard link from a file he doesn't own into a
#:     directory where sticky-bit access applies.
#: 
#:     -- A user can't make a symbolic link in a directory where sticky-bit
#:     access applies.

Both of these are absolutely silly, as they place unnecessary restrictions
on potentially necessary operations.

#: 
#:     -- A user can't make a whiteout in a directory where sticky-bit
#:     access applies (although this is really a non-issue, when you think
#:     about it).

Not knowing anything about whiteouts (yet), I can't say anything here.

#: 
#: Given the above, I don't really care what goes in the owner/group fields
#: of a symlink (that is the point).  On the one hand, it would
#: occasionally be nice to see who created them, but this isn't critical
#: (and see the first note about hard links; it would occasionally be nice
#: to see who made them too, but I don't think that this should be
#: implemented in the filesystem).

If I am repeating well-known information, please forgive me.

The difference, though, is that when you make a hard link, the semantics
are well-known, well-documented and have been this way since (time_t) 0:
If you make a "hard" link you are only making a directory entry which has
the same d_fileno as some other entry >on the same filesystem<.  If you make
a hard link in your home directory to a file somewhere else (on the same FS),
and you have read access to that file, and the owner of that "somewhere else"
decides to make the _directory_ non-readable, you still have read access to 
that file, and nothing that the owner of the "somewhere else" can do will
change that (short of changing the mode of the file itself).

You *can* find all the hard links to a particular file, if you're a user
with sufficient read permissions on all the resident directories OR
on the device itself (by using find or ncheck, respectively).
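
The find approach amounts to comparing device and inode numbers.  A
minimal sketch of that idea (the function name and the little test tree
are made up; it only sees directories you are allowed to read):

```python
import os
import tempfile

def find_hard_links(root, path):
    """Return every name under root that refers to the same inode as path."""
    want = os.lstat(path)
    key = (want.st_dev, want.st_ino)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            st = os.lstat(candidate)   # lstat: a symlink has its own inode
            if (st.st_dev, st.st_ino) == key:
                matches.append(candidate)
    return sorted(matches)

with tempfile.TemporaryDirectory() as d:
    for sub in ("a", "b"):
        os.mkdir(os.path.join(d, sub))
    original = os.path.join(d, "a", "file")
    with open(original, "w") as f:
        f.write("x\n")
    os.link(original, os.path.join(d, "b", "alias"))
    with open(os.path.join(d, "b", "other"), "w") as f:
        f.write("y\n")
    names = [os.path.relpath(p, d) for p in find_hard_links(d, original)]
    print(names)
```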

If you make a symbolic link, you are not creating a true link to the file.
You can link to a directory across filesystems.  You can have a symlink
in a UFS which points to a file in an NFS-mounted DOS filesystem, and
you can still get there from here.  However if "there" gets chmod()ed shut,
you're cut off.  No questions asked.
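
The difference is easy to demonstrate.  In this sketch (all names
invented) the "somewhere else" directory is chmod()ed to 0, after which
the hard link still reads fine and the symlink does not.  Run it as an
ordinary user; root bypasses the permission check, so there both reads
succeed:

```python
import os
import tempfile

def can_read(path):
    """True if the file at path can be opened and read."""
    try:
        with open(path) as f:
            f.read()
        return True
    except OSError:
        return False

def compare_after_lockout(workdir):
    """Link to a file both ways, close off its directory, then try to read."""
    elsewhere = os.path.join(workdir, "elsewhere")   # hypothetical names
    home = os.path.join(workdir, "home")
    os.mkdir(elsewhere)
    os.mkdir(home)
    target = os.path.join(elsewhere, "file")
    with open(target, "w") as f:
        f.write("payload\n")
    hard = os.path.join(home, "hard")
    soft = os.path.join(home, "soft")
    os.link(target, hard)
    os.symlink(target, soft)
    os.chmod(elsewhere, 0)             # the owner shuts the directory
    try:
        return {"hard": can_read(hard), "soft": can_read(soft)}
    finally:
        os.chmod(elsewhere, 0o700)     # reopen it so cleanup can proceed

with tempfile.TemporaryDirectory() as d:
    access = compare_after_lockout(d)
    print(access)
```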

You will also have a more difficult time locating all the symbolic links
to a particular location -- short of running "find . -type l -ls | \
grep location", you're out of luck.
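
For comparison with the hard-link case, the same search in sketch form:
walk the tree and readlink() everything that is a symlink.  The function
name and the sample tree are invented, and like the find pipeline it only
matches on the text of the link, so relative links and links that pass
through other links to reach the location will be missed:

```python
import os
import tempfile

def find_symlinks_to(root, needle):
    """Return symlinks under root whose stored target contains needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames but are not descended.
        for name in dirnames + filenames:
            candidate = os.path.join(dirpath, name)
            if os.path.islink(candidate) and needle in os.readlink(candidate):
                hits.append(candidate)
    return sorted(hits)

with tempfile.TemporaryDirectory() as d:
    location = os.path.join(d, "location")
    os.mkdir(location)
    os.mkdir(os.path.join(d, "links"))
    os.symlink(location, os.path.join(d, "links", "to_dir"))
    os.symlink("/nonexistent/elsewhere", os.path.join(d, "links", "unrelated"))
    found = [os.path.basename(p) for p in find_symlinks_to(d, location)]
    print(found)
```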

Knowing the owner of the symlink would not be necessary if not for the
semantics of the sticky directory (which Berkeley came up with long before
POSIX ratified it (ratified =? "made it more ratty"?)).  Without those
semantics this whole debate would be almost moot (almost:  there's no reason
to make the UFS/FFS dirent structure any more complex than absolutely
necessary).

Knowing the *time of a symlink would not be necessary if 'stat' didn't
need something to stuff in there (for 'ls' mostly), and (time_t) 0 or the
*time of the parent don't really seem to sit well with admin types (myself
included).

#:  On the other hand, using the directory
#: owner/group for symlinks is a good LCD way to fill in a "meaningless"
#: field, and could continue to work consistently regardless of the
#: implementation.

There is this, but then we enter the "LCD way" of representing other file
types as well (devices et al).  What then?

I think that hairying up the dirent structure to please POSIX is (pardon
my expression) like pissing into the wind:  It's messy, futile, and will
eventually leave you -- or your system -- wet, cold and smelly all over.

Either leave symlinks as they were or fragment the inode for the symlink
purposes, but stuffing information into the directory entry is an MS-DOS
solution to a system deserving of a more elegant one.

#: 
#: --Lon Willett
#:   willett@math.utah.edu

					--*greywolf;