Re: bin/45390: more sh/$PWD follies

To: kre%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, dholland%eecs.harvard.edu@localhost
Subject: Re: bin/45390: more sh/$PWD follies
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Wed, 27 Oct 2021 21:55:01 +0000 (UTC)
The following reply was made to PR bin/45390; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/45390: more sh/$PWD follies
Date: Thu, 28 Oct 2021 04:53:57 +0700

 apb (back in 2011) replied to most of this PR.   Since then it has languished.
 
 The first and second reported follies were user finger/memory trouble.
 
 The third:
 
  - cd is supposed to print the name of the directory it changes into
    if it is not the name given, and the shell is interactive, but it
    does not:
 
 does not match what POSIX specifies, which is:
 
 STDOUT
     If a non-empty directory name from CDPATH is used, or if the operand '-'
     is used, an absolute pathname of the new working directory shall be
     written to the standard output as follows:
 
             "%s\n", <new directory>
 
     Otherwise, there shall be no output.
 
 Notice nothing about interactive, nor "not the name given" in the sense
 used in the example given in the PR.
 
 Our man page currently says:
 
     In an interactive shell, the cd command will print out the name of
     the directory that it actually switched to if this is different
     from the name that the user gave, or always if the cdprint option
     is set.  The destination may be different either because the
     CDPATH mechanism was used or if the replace argument was used.
 
 [Of course, nothing in POSIX about a cdprint option, that is an ash shell
 extension, and irrelevant here.  The "replace argument" we copied from ksh,
 or so the comments say, and is also not in POSIX, but printing in that case
 is consistent with the objective.]
 
 I plan on changing out man page to be:
 
      In an interactive shell, or if the posix option is set, the cd
      command will print out the name of the directory that it actually
      switched to if this is different from the name that the user gave,
      or if the cdprint option is set.  The destination may be different
      because a non-empty element of the CDPATH mechanism was used, or
      because the replace argument was used, or because the directory
      parameter was specified as "-".
 
 This will make sh posix conformant in this area, if in posix mode -- almost
 all other shells simply ignore "interactive" for this, and print, whether or
 not ... but for compat with our existing behaviour, I am currently planning
 on only adding the "|| posix" to the test for interactive, not simply removing
 the latter.
 
    [Aside: what cd really tests is the 'i' option, which gets set automatically
     in interactive shells, but can also be turned on and off, with very little
     other effect, any time at all.]
 
 There will also be a small change to the way that the "CDPATH mechanism
 was used" test, but it is quite unlikely that that difference will be
 noticed by almost anyone.   Aside from the man page not mentioning the "-"
 case (it is implemented like that already), adding the test of posix mode
 as an alternative to interactive, and the minor CDPATH use test change, this
 is what we have now.
 
 Note: I am not saying that the implmentation is as it was in 2011 when
 the PR was filed, there have been changes in this area in the past decade,
 but there don't seem to be any specific source changes required any more
 just to fix anything in the PR.
 
 There is one more issue here - note that POSIX requires "an absolute
 pathname" to be written - which is clearly not always possible (and POSIX
 needs to be fixed, I plan on filing a bug report with them about it).
 Most other shells simply write the (usually in this case) relative path
 name used (not the user's arg, if it was that, there would be no output,
 so we're in a case where CDPATH was used - the "-" and replace cases both
 result in absolute path names, so will rarely succeed, without allowing
 the path name to be discovered).   Some shells (sometimes) write an error
 or warning message, but no path name in that case (that's what we have
 been doing), zsh pretends a relative path is absolute, and writes that
 (which is beyond dumb, and probably just a bug) and yash hopes that PWD
 (if set) was correct, and uses that to deduce the new absolute path name
 (which is clever, though only a guess, and more work than this obscure
 case is worth).   I plan on copying most of the other shells and writing
 whatever path we have available (the path actually passed to chdir(2))
 when we cannot determine the absolute path.   My guess is (because of the
 divergence of what happens) POSIX will end up saying that it is unspecifed
 what, if anything, will be output in this case.
 
 
 The fourth folly:
 
  - pwd -L is not supposed to print the cached internal value (which is
    related to but separate from $PWD) unless it names the current
    directory. However, at least in some circumstances it will, such as
    this:
 
 As apb said, that one is a bug.   I have that fixed in my private sources
 (not yet committed).   As the example in the PR might indicate, it isn't
 a bug that anyone is going to encounter very often!
 
 Also on this one, relating to the note in the PR
 	the cached internal value (which is
         related to but separate from $PWD)
 that is correct, they are separate, but they're only ever different
 if the user (or script) has modified PWD - and POSIX says that if that
 happens, what cd and pwd end up doing is unspecified, so the difference
 isn't really material.   This paragraph isn't really material to the PR
 either of course, just an observation.
 
 And the fifth (and final?) folly:
 
  - cd is supposed to both change to the named directory and update
    $PWD with it, but it does not always do this correctly:
 
 This one was fixed as part of an earlier cd cleanup (sometime before
 NetBSD 8 I believe), and the example given seems to work correctly
 to me in all currently supported shells.
 
 
 There's one more issue in the PR, (perhaps not classified as a folly,
 that isn't clear):
 
 	Also, the man page is not explicit about what "update PWD with the
 	specified directory" actually means if the specified directory is not
 	absolute. The wording needs to be improved.
 
 Our man page currently says (I didn't check what it said in 2011):
 
      The -P option instructs the shell to update PWD with the specified
      physical directory path and change to that directory.  This is the
      default.
 
 I agree that is (still) not as clear as it could be, and propose changing
 it to be:
 
      The -P option (which is the unalterable default in this sh)
      instructs the shell to change to the directory specified (or
      determined) and if successful update PWD with the new physical
      directory path.  That is the path name, not traversing any
      symbolic links, of the altered working directory of the shell.
 
 (The man page also says that we do not support the -L option, I have no
 plans to alter that, either in the code or the man page, it also mentions
 that OLDPWD gets updated, which needs no alteration.)
 
 In the forthcoming edition, POSIX is going to add a new option, -e, to cd.
 This is already supported by a few shells (including FreeBSD's, but also
 bash and mksh), its purpose is to allow cd to exit(1) if it successfully
 changes directory, but is unable to determine the new path for PWD.
 Without this option, cd is required to exit(0) if the directory is changed
 (regardless of what happens to PWD, or any output that might be printed).
 
 I have implemented this new option, which will be documented as:
 
       The -e option alters the interpretation of the exit status.  cd
       will exit with status 0 upon complete success.  If the directory
       was successfully changed, but PWD was unable to be updated, cd
       will exit with status 1 if the -e option was given, and status 0
       otherwise.  Upon any other error, including usage errors, and
       failing to successfully change directory, cd will exit with status 2.
 
 POSIX is not as specific on what the status will be, without -e, our "2"
 is just required to be >0, with -e, it is required to be >1 - as we always
 use status 2 for internal shell detected errors, and 2>0 and 2>1 this all
 just magically works!
 
 Lastly, there has been some discussion on the austin group lists about
 forcing (most) utilities that write to stdout to actually check the status
 of that output and exit(1) on write failure.   cd would not be one such
 (errors writing the new directory name will continue to be ignored),  but
 pwd is included.
 
 The effect is that very few POSIX specified utilities which write to
 stdout will be able to simply
 	do { stuff; } while (!finished); printf(whatever); exit(0);
 they will be required to (if no other errors have occurred) insert
 	fflush(stdout);
 	if (ferror(stdout))
 		 fprintf(stderr, "output error\n"), exit(1);
 before the
 	exit(0);
 
 [I believe some versions of linux libc stdio install an atexit() routine
 which does the all this, so there is no need to modify the applications, but
 that kind of hidden alteration to the exit status is evil.   It obviously
 doesn't help with shell builtins, for which there is no real exit().]
 
 IMO this is going too far (I don't much care about pwd, having it exit(1)
 in case of a write error is easy, and it isn't hard to argue that
 echo and printf should be the same (the shell's builtin echo already does
 that, has for a long time, though /bin/echo doesn't).   But for most commands
 which do something, and merely incidentally write output, it isn't really.
 
 This is relevant (just barely) here, as it is related to the shell's pwd
 which is one of the issues of the PR...
 
 I did a (private) change to the shell's pwd to make this happen for it,
 when it was being discussed on the austin list (it's just a couple of lines)
 but as it is intended to apply to almost everything, I thought I'd make it
 more general, and apply to all of the sh builtin commands (if nothing is
 written to stdout, flushing it and checking for error is harmless).  But
 then I started seeing all the exceptions that would need to be handled for
 this to be sane ... I thought of modifying our (shell source) builtins
 description file (unless you look at sh sources you never see this) to add
 a new option for "check stdout" which itself would be easy - but that change
 then needs to be passed back to where the builtin is being executed, and
 doing that would require changes all over.   So I just gave up, and removed
 everything related.   I could put back the pwd check though if it is considered
 useful -- personally I have never seen a script check the exit status of
 any command which simply writes info to stdout, nor are write errors to
 stdout almost ever experienced (other than EPIPE, which usually turns into
 SIGPIPE, and so is rarely seen as an exit status).   This issue was raised
 by someone doing torture tests on a linux system running
 	pwd >/dev/full; echo $0
 and getting 0 (from mksh I think on the system in question).   (/dev/full
 simply returns ENOSPC to any write attempt, no idea what it does on reads).
 
 I will wait a day or two to see if there are any comments on all of this
 before committing the changes referred to above, or perhaps changing some
 of them first.
 
 kre
Prev by Date: NetBSD Nightly Trouble Ticket Report
Next by Date: Re: misc/56468: Boot countdown display glitch
Previous by Thread: Re: bin/45390: more sh/$PWD follies
Next by Thread: Re: kern/45352
Indexes:
Home | Main Index | Thread Index | Old Index