Subject: Re: Pathnames with trailing /
To: Olaf Seibert <rhialto@polderland.nl>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 09/11/2003 18:16:45
[ On Thursday, September 11, 2003 at 23:31:51 (+0200), Olaf Seibert wrote: ]
> Subject: Re: Pathnames with trailing /
>
> Let me follow your reasoning. "/" is a separator. A separator separates
> (by definition) the two things left and right from it. So "a/b" is
> actually 2 parts, "a" and "b", separated by this separator.
> 
> So what is "a/"? Surely it can be only the two parts "a" and "" (the
> empty string).

No, it's not two parts.  It's one part with an additional, extraneous,
separator character.

> Is the empty string a valid name?

There is no empty string!  :-)  (the pathname is simply terminated)

Separators separate tokens, but if there's nothing to separate then they 
disappear in a poof of invisible syntactic smoke. 

The idea of "separators" at the end of a statement is the same as with
the lines in the paragraph above which have a blank space at their end.
The additional blank space doesn't change its meaning, and may not even
be visible to most readers.                                            


> If yes, then this "" entry (whatever it is) must be in the directory
> "a". Therefore "a" must be an existing directory.

No, the existing directory for "a" (or "a/") is "a/." which is clearly a
very different pathname than "a" since it has two components, not one,
and the additional component has the name "." which by convention is
linked to the same file as the directory which contains it.  So, "a/."
and "a" are expected to be the same file but they do not have the same
name.  "a", "a/", "a//", and so on are all the same name.  However they
may, or may not, represent the name of a directory.
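
As a rough illustration of that distinction, here is a minimal sketch in
Python.  The helper name components() is made up for this example, and the
code models only the interpretation argued for above (slashes are pure
separators and empty tokens vanish); it is not a description of how any
particular kernel's lookup code behaves.

```python
def components(pathname):
    """Split a pathname into its components, treating '/' purely as a
    separator.  Empty tokens, which arise from repeated or trailing
    slashes, are discarded (an assumption matching the argument above)."""
    return [token for token in pathname.split("/") if token]


# "a", "a/" and "a//" all reduce to the same single component...
print(components("a"), components("a/"), components("a//"))

# ...while "a/." is a different name with two components.
print(components("a/."))
```

Under this model the first three spellings cannot be told apart once they
are tokenized, whereas "a/." keeps its extra "." component.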

Try to think of slashes in pathnames as if they were whitespace.
Whitespace has the exact same purpose in C, for example.  Putting
additional whitespace at the end of statements in C doesn't change
anything, just as placing additional slashes at the end of the pathname
_should_ not change the way that pathname is interpreted.

I.e. slashes in pathnames are not like terminators (semicolons in C).

Indeed we could easily and almost mechanically modify the kernel to use
space characters to separate pathname components, though that would of
course cause some serious quoting issues for languages like Shell where
pathnames are normally expected to be unquoted strings (so you couldn't
as easily mechanically modify all scripts).  However if you did make
such a change then you would immediately see that trailing whitespace on
a pathname is meaningless -- just a waste of space.
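
The thought experiment can be tried in user space without touching a
kernel.  The following is only a sketch in Python: split_on() is a
hypothetical helper, and the sample pathname is arbitrary.  It swaps the
separator character and checks that the resulting tokens are unchanged,
with or without trailing separators.

```python
def split_on(pathname, separator):
    """Tokenize a pathname on the given separator character, dropping the
    empty tokens produced by repeated or trailing separators."""
    return [token for token in pathname.split(separator) if token]


slash_form = "usr/local/bin/"
space_form = slash_form.replace("/", " ")   # "usr local bin "

# The same tokens come out whichever separator character is used.
print(split_on(slash_form, "/"))
print(split_on(space_form, " "))

# Extra trailing separators are simply absorbed.
print(split_on(space_form + "   ", " "))
```

Python's own str.split() with no arguments treats whitespace the same way:
"usr local bin   ".split() yields the same three tokens.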

The idea here is that if you ever encounter a place where additional
separator (whitespace) characters change the meaning of a statement (or
the preceding token) then you've botched your interpreter rather badly.

The trailing NUL byte on a pathname stored as a C string, like the
semicolon on a C statement, is conceptually a bit like a token separator
of sorts too, but it also has the additional meaning of terminating the
string (just as the semicolon terminates a C statement).  Note that the
trailing slash, like whitespace before the semicolon, does not terminate
the pathname (any more than that whitespace terminates the C statement).

    And yes, Python fans, there is a direct relationship here with
    indentation!  The leading slash conceptually represents an indented
    statement (if you wish to think of it in that way), and it is indeed
    interpreted in a special way:  it's an indication that the pathname
    which follows is fully qualified and the file it points to is found
    by starting at the root filesystem instead of at the CWD.

There's only one root filesystem though, by definition, so additional
indentation doesn't change the meaning.  :-)
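
A small sketch of that rule, again in Python and again only a model of the
view taken in this message: parse() is a hypothetical helper that records
whether a pathname starts at the root and then tokenizes the rest.  Note
one assumption in particular: POSIX leaves a pathname beginning with
exactly two slashes implementation-defined, and this sketch ignores that
special case.

```python
def parse(pathname):
    """Return (is_absolute, components) for a pathname.

    A leading slash marks the pathname as starting at the root; any
    further leading slashes add nothing (assumption: the POSIX special
    case for exactly two leading slashes is ignored here)."""
    is_absolute = pathname.startswith("/")
    parts = [token for token in pathname.split("/") if token]
    return is_absolute, parts


# One leading slash or several: the same absolute pathname.
print(parse("/etc/passwd"))
print(parse("///etc/passwd"))

# No leading slash: the same components, but resolved from the CWD.
print(parse("etc/passwd"))
```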

Similarly  to  the  way  additional  whitespace  doesn't  change  the  
meaning  of  the  tokens  it  separates ,  additional  slash  characters  
do  not  change  which  file  a  pathname  points  to .  

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>