Subject: Re: Linux emulation and mkdir with trailing /
To: None <tech-kern@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 09/24/2000 21:18:37
[ On Sunday, September 24, 2000 at 18:48:59 (GMT), Christos Zoulas wrote: ]
> Subject: Re: Linux emulation and mkdir with trailing /
>
> Well, there is precedent to have a function behaving differently
> in presence of a POSIX symbol: rename(2).

Hmmm...  and that's a pretty silly one too!  Why oh why wasn't "this
implementation" simply just fixed!?!?!?!?!  The old *BSD behaviour is
bogus, IMNSHO, and I see no rationale explaining why the old *BSD
behaviour might ever be necessary.

> Changing the particular namei()
> behavior of ignoring the trailing slash affects too many system calls
> and will break a lot of userland programs.

Please define "break" and "a lot" (but see below first!  :-).

> A glaring example is doing
> echo */ in a directory, which with this change will return all files.
> [someone will need to visit all vestiges of globbing code and take care
> of this in userland].

That's *not* an example!!!!

That is a simple shell issue, *AND* it's an example in one (or only a
very few) program(s), *AND* no such program will actually be broken
by such a change so far as I can see, not even /bin/csh!!!!

First off let's note some facts about Unix filenames that I'm sure
everyone will say "of course!  obviously!" to, but which are very
important to remember accurately in this context.  To begin with, it is
impossible for any existing filename (be it a directory filename or a
regular filename or some other special filename) to end in a '/' ('/' is
a pathname component separator, not a terminator) so a simple glob-style
matching of "*/" can never ever match anything ever returned by the
filesystem.  Also note that a file is not a directory until you either
try to chdir() into it, or you stat() it and look at its mode bits.

Traditional shells (i.e. V7 /bin/sh) don't do any special parsing, and if
the filenames examined by glob don't match the pattern given then
they're ignored.  E.g. for any program using basic directory scanning and
glob matching routines this means "*/" returns "*/" and never anything
more.
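
To make the point above concrete, here is a rough Python sketch of that
traditional behaviour.  It is not V7 source; the function name v7_glob is
made up for illustration, and dot-file handling is simplified to "always
skip".  Names come straight from the directory listing, none of them can
contain a '/', and a pattern that matches nothing is handed back as-is.

```python
import fnmatch
import os
import tempfile


def v7_glob(pattern, directory):
    """Hypothetical helper: match `pattern` against raw directory entries.

    If nothing matches, return the pattern itself, as a traditional
    shell would leave the word unexpanded.
    """
    # Directory entries are single pathname components, so no name
    # returned here can ever contain (or end in) a '/'.
    names = [n for n in os.listdir(directory) if not n.startswith(".")]
    matches = sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))
    return matches if matches else [pattern]


def demo():
    """Build a scratch directory with one subdirectory and one regular file."""
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "src"))
        open(os.path.join(d, "notes"), "w").close()
        return v7_glob("*", d), v7_glob("*/", d)
```

Calling demo() expands "*" to both entries, while "*/" comes back
unexpanded because no directory entry ends in a slash.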

Other shells which try to help the user and show some discernible
difference between directories and files will artificially append a
trailing slash on directories when they display them as command-line
completion items.  So in order to be self-consistent they must match
"*/" with directories.  I see no inconsistency here except in the fact
that modern shells invent some bogus (i.e. non-real) syntax that is
intended to help the user out somewhat (at the risk of confusing
slightly less naive users due to this very inconsistency).  Personally I
like the fact that some shells will automatically tack on a trailing
slash when expanding a filename interactively on the command-line, and
indeed I almost always use '-F' with "ls".  However I do not really like
this feature in filename matching routines, particularly not when using
the shell as a programming language.

I.e. if a shell imposes a restriction in its filename expansion that
says a filename with a trailing slash must match a file which is a
directory, then that's OK, but this is not something that the kernel
should ever do.
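
A shell that wants this rule can implement it entirely in userland.  The
following Python sketch is only an illustration under assumptions: the
name slash_aware_glob is hypothetical, and no real shell is claimed to
work exactly this way.  The trailing slash is treated as a request to
stat() each candidate and keep only directories; the kernel is only ever
asked about names without the slash.

```python
import fnmatch
import os
import stat
import tempfile


def slash_aware_glob(pattern, directory):
    """Hypothetical helper: a trailing '/' in `pattern` means "directories only"."""
    want_dir = pattern.endswith("/")
    base = pattern.rstrip("/") if want_dir else pattern
    out = []
    for name in sorted(os.listdir(directory)):
        if not fnmatch.fnmatchcase(name, base):
            continue
        if not want_dir:
            out.append(name)
            continue
        try:
            # A name is only known to be a directory once it is stat()ed.
            mode = os.stat(os.path.join(directory, name)).st_mode
        except OSError:
            continue  # e.g. a dangling symlink: not a directory
        if stat.S_ISDIR(mode):
            # The slash is re-attached by the shell, purely for display.
            out.append(name + "/")
    return out


def demo():
    """Build a scratch directory with one subdirectory and one regular file."""
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "src"))
        open(os.path.join(d, "notes"), "w").close()
        return slash_aware_glob("*", d), slash_aware_glob("*/", d)
```

Note that os.stat() follows symlinks, so a symlink pointing at a
directory would also be reported with a trailing slash in this sketch.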

A less unix-like system might reject filenames which contain multiple
slashes or filenames with trailing slashes (either always, or only if
they are not referring to directories).  However such a system would, by
definition, be less unix-like than any similar system which did not
impose these restrictions.

As for whether anything would "break" or not, well there's a note in the
pdksh-5.2.14 source that indicates even SunOS-4.1 strips trailing
slashes from all filenames.  Note though that the code which manages
this is not #ifdef'ed -- it's always there and ready to do its thing.
If NetBSD took on standard Unix behaviour then nothing would have to be
done to "fix" /bin/ksh.  I haven't looked at NetBSD's /bin/sh, but I'd
be surprised if it "broke" either, and since it would seem /bin/csh has
long had such ability even on non-*BSD systems, I doubt it would "break"
either.

I wouldn't mind, though, if /bin/sh (and maybe even /bin/ksh) avoided
imposing any special interpretations on trailing slashes when expanding
filenames....

> Notice that Slowaris 2.7 is still broken with respect to that: try
> echo */ in /bin using /bin/sh, /bin/csh, /bin/jsh, and /bin/ksh.

Huh?  I see nothing broken in 2.6.  Everything works *exactly* as I
would expect it to work!

	20:23 [133] $ uname -srm
	SunOS 5.6 sun4m
	20:24 [134] $ <ctrl-V>Version M-11/16/88i
	20:24 [134] $ echo */                                              
	Mail/ ftp.d/ src/ tmp/
	20:24 [135] $ sh   
	$ echo */
	*/
	$ 20:24 [136] $ csh
	% <ctrl-D>echo */
	Mail/ ftp.d/ src/ tmp/
	% <ctrl-D>20:24 [137] $
	20:24 [137] $ 
	
> Fixing this in userland is a PITA, especially when one considers
> symlinks too.

This is very true, which is why I suggested simply reverting to the
tried and true *standard* behaviour in the kernel.  ("standard" in the
sense that it matches not only recognized standards, but the one true
original defining implementation)

> Having said that, it would be nice if:
> - the emulations DTRT

My guess is that except for emulations trying to exhibit "bugs" in old
*BSD implementations, simply fixing the core code would always make
everything "DTRT".

> - if posix really says that the trailing slashes are ignored:

I'm less concerned about POSIX than I am about regaining the original unix
behaviour that I've long found to be far more "predictable" and "safe"
in practice.

> 	- provide posix_ functions [but that is a pain because
> 	  there are too many of them, and it adds a lot of crud].

Grrr....

> 	- provide a sysctl to turn this on [this is *really*
> 	  bad]

If people really want to retain the buggy old *BSD behaviour then
perhaps this is indeed the best compromise.

HOWEVER, given that one of the stated goals for NetBSD is POSIX
compliance, this flag should default to whatever is proper for
POSIX.

It is, IMNSHO, very wrong for any unix-like kernel to treat "file/" as
"file/.", regardless of whether "file" is a regular file or a directory
file.  This is really basic, really simple, "everthing's a file" stuff!
So far as I've ever encountered in nearly 20 years of playing with all
kinds of unix and unix-like systems, nothing could ever be "broken" by
always stripping all trailing slashes in the kernel, whereas I've
encountered no end of annoyances and difficulties with *BSD kernels that
do not do this.
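
For illustration only, the normalization being argued for can be sketched
in a few lines of Python.  This is a userland model with a made-up
function name, not namei() or any real kernel code, and it assumes the
only special case is a path made up entirely of slashes, which must still
name the root directory.

```python
def strip_trailing_slashes(path):
    """Hypothetical helper: drop every trailing '/' before lookup."""
    stripped = path.rstrip("/")
    if not stripped and path.startswith("/"):
        # "/", "//", "///" ... all collapse to the root directory here.
        return "/"
    return stripped
```

Under this model "file/", "file//" and "file" all resolve to the same
name, whether "file" turns out to be a regular file or a directory.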

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>