Proposal for several minor changes to sh

To: tech-userlevel%netbsd.org@localhost
Subject: Proposal for several minor changes to sh
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Wed, 16 Mar 2016 12:12:40 +0700
I have just sent away patches for PRs 19832 35423 (those are the same thing)
some of 50958 (the ?: operator, not assignment ops yet) 50959 and 50960
and to use the shell's internal syntax tables rather than <ctype.h> for
parsing purposes (deciding what is a legal var or function name is the
most common use.)

So, with that mostly out of the way, I have a few small enhancements I'd
like to propose.   This message is that.   Some of these are already 
implemented in my sources (none have been placed in CVS yet) others not.

First, it has irritated me for ages that there is no way to know that it is
the NetBSD shell that is running - many (but not all) other shells
implement one, or more, predefined sh variables that can be tested
(and which can give other info about the shell) - NetBSD's sh does not.

I am proposing adding that (or for this one, you can assume that I will
be adding that, as it costs essentially nothing and is useful.)

However, I have no particular concern what the variable is called (bash uses
BASH, BASH_VERSION and more, ksh uses KSH_VERSION, zsh uses ZSH_VERSION and
more) nor what its value is to be.  I do think it should be read-only.

For now, I am using NBSD_SHELL as the variable name (NBSH_VERSION would be
another possibility) and have its value being the shell version number.
Suggestions for other names (or votes between those two) are sought.
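
To make the use case concrete, here is a rough sketch of the kind of test
a script could then do.  BASH_VERSION, ZSH_VERSION and KSH_VERSION are what
those shells already set; NBSD_SHELL is only the name proposed above (no
released shell sets it) and which_shell is a made-up function name.

```sh
# Guess which shell is running by probing the variables that various
# shells predefine.  NBSD_SHELL is only the proposed name, an assumption.
which_shell() {
	if [ -n "${NBSD_SHELL-}" ]; then
		echo "netbsd-sh ${NBSD_SHELL}"
	elif [ -n "${BASH_VERSION-}" ]; then
		echo "bash ${BASH_VERSION}"
	elif [ -n "${ZSH_VERSION-}" ]; then
		echo "zsh ${ZSH_VERSION}"
	elif [ -n "${KSH_VERSION-}" ]; then
		echo "ksh ${KSH_VERSION}"
	else
		echo "unknown"
	fi
}

which_shell
```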

That latter part might surprise some of you "what version number??" -- well,
I invented one (or rather I invented a scheme for making them).  Again,
there is no particular importance to this, it is just a #define in a (new)
.h file (which contains nothing else), so suggestions for different schemes
are welcome.

The scheme I picked (tentatively, for now, no-one but me has ever seen
it, so changing is no problem) is to number the version of the shell after
the NetBSD release version in which this sh version first appeared (that
is, n.m, so 7.0 or 6.1, not 6.1.5 or anything like that) and then to that
append one more number which is the shell update (the one actually in the
release would be 7.1.0 or 8.0.0 and then until the next release, significant
changes to the shell would result in the last value incrementing).
I am using 7.99.1 currently (7.99.0 is whatever version is in NetBSD current 
(7.99) just before this scheme is implemented, and then 7.99.1 is the
next one, the first that has the version number explicit.)
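
Assuming the n.m.u form described above, a script could pick the number
apart and compare versions with nothing but parameter expansion.  This is
only a sketch; split_version and version_at_least are names invented for
the illustration, not anything in the shell.

```sh
# Split a version string of the (tentative) form n.m.u into three parts.
split_version() {
	v=$1
	rel_major=${v%%.*}	# NetBSD release major, "7" for 7.99.1
	rest=${v#*.}		# what follows the first dot, "99.1"
	rel_minor=${rest%%.*}	# NetBSD release minor, "99"
	sh_update=${rest#*.}	# shell update number, "1"
}

# Succeed if version $1 is the same as, or newer than, version $2.
version_at_least() {
	split_version "$1"
	a1=$rel_major a2=$rel_minor a3=$sh_update
	split_version "$2"
	[ "$a1" -gt "$rel_major" ] && return 0
	[ "$a1" -lt "$rel_major" ] && return 1
	[ "$a2" -gt "$rel_minor" ] && return 0
	[ "$a2" -lt "$rel_minor" ] && return 1
	[ "$a3" -ge "$sh_update" ]
}

if version_at_least 7.99.1 7.99.0; then
	echo "new enough"
fi
```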

Even if we don't decide to put a version number in an identifying variable,
I'd like to have one (ie: invent a scheme for version numbers) because once
a few more fixes are made, I think it would be a good idea to put the NetBSD
sh into pkgsrc (shells/nbsh or something) so that others can get the benefit
of what is becoming quite a useable shell (the same as is done for nbftp
and I think a few more) and for that, we really need some kind of version
id, even if it only appears in the tarball name, and pkg name.

The next change is one inspired by the above - as well as being read only, I'd
like to prevent the magic variable from being exported, doing so just makes
things confusing.   Doing that internally (and privately for that var alone)
in the shell is trivial, but this seems like it could be a more generally
useful change, so I am suggesting adding a mechanism (similar to readonly)
which would allow scripts to make any variable non-exportable.

Non-exportable here just means that
	export VAR
won't work, it would not prevent
	VAR=foo some-command
from putting VAR in the environment of some-command (nor if it is done via
the env command) - anything done that explicitly should be allowed, I'd
just like to prevent "sh -a" from exporting the variable, and similar.
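
For reference, this is how the explicit and implicit routes into the
environment behave in any sh today (nothing here uses the proposed
mechanism, whose details are not given yet).  DEMO_VAR and DEMO_OTHER
are throwaway names for the illustration.

```sh
# Set, but not exported.
DEMO_VAR=foo

# A child process does not see an unexported variable: prints "unset".
sh -c 'echo "${DEMO_VAR-unset}"'

# Explicit requests put it in the child's environment; these would
# remain allowed under the proposal.  Both print "foo".
DEMO_VAR=foo sh -c 'echo "${DEMO_VAR-unset}"'
env DEMO_VAR=foo sh -c 'echo "${DEMO_VAR-unset}"'

# With -a (allexport) every later assignment is exported implicitly;
# this is the route a non-exportable variable would be shielded from.
# Prints "bar".
( set -a; DEMO_OTHER=bar; sh -c 'echo "${DEMO_OTHER-unset}"' )
```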

If this seems like a good idea, I will explain the details of the mechanism
(this is one that is implemented, I have been using NBSD_SHELL set up this
way, with the extra "unexportable" stuff for a while now.)


Next, I'd like to fix the -q option to be a little more rational.

That option disables the effects of -x and -v while processing the
startup files (/etc/profile, .profile, and $ENV - or whichever of them
the shell in question is going to read.)   However, it only does that if
set on the command line (those scripts cannot set it themselves) and it
only applies until the startup file does "set -x" or "set -v" internally.
(Then -q would apply again to the next startup script processed, if any).
As implemented the startup script is free to turn the x and v options on
at any time, and they stick if turned on regardless of -q (other than
while processing a later startup script if -q is on), but if -q is on,
the startup scripts can only disable (set +x or set +v) the options for
the remainder of itself, as soon as processing of the startup script ends,
the -x or -v (or both) options will be turned on again if they were on before
the script started.

I'd like to change it so that -q simply suppresses any output from the x or v
options while a script is being processed, and has no effect at all upon the
values of the x or v options.   To forcibly enable output in a startup script
it would be necessary to do "set -x +q" (or -v) to disable the q option while
enabling -x (or -v).   Otherwise those scripts would be free to change the
values of -x and -v however they like, and the change would remain when the
script finishes, but if -q is set, the startup script itself would still
generate no output.   The script would be able to "set -q" itself, if it
wanted to enable -x (or -v) but only for commands after the startup scripts
are finished.

Overall this seems to be more consistent to me.

And speaking of startup scripts, because of the testing & debugging I
have been doing (either using gdb sometimes, or using the shell's debug
output (TRACE()) and since I have been looking at the parser, and quoting,
and stuff, quite a lot, having the shell process startup scripts (when it
was quite likely just going to execute "echo $x" or something similar) was
kind of tedious.   So, I put in what (at the time) I regarded as a simple
hack to skip them ... /etc/profile and .profile were never a problem, as they
only run when sh is called with a name starting with '-' (argv[0]), and
avoiding that is easy (much easier than causing it!)   But $ENV is run
all the time.   Initially I was doing "ENV=/dev/null ./sh ..." whenever I
didn't forget, but that got tedious quite quickly ("unset ENV" would have
affected the environment I was using to test, and I didn't want to do that.)

So, I added a new option (I picked 'Q' for "quick" or "quiet") that simply
suppresses any startup file that would otherwise be executed.  At the time
I really only intended this to be a short term hack, that would never be
revealed outside my system ... but I have grown quite fond of it, and
wondered if perhaps others might like it as well.

For now, it is a one letter option that has no long name (the first such
in the NetBSD sh, and meant a few more minor changes to make that work
correctly in all cases) - nothing meaningful enough to actually discuss.

So, does anyone else think that would be useful ?


And last, for today anyway, the NetBSD shell already has an internal "posix"
mode, that it uses to behave in the way of a more standard posix shell than
it does otherwise.   Right now, it is used only to control whether $ENV
is processed in non-interactive shells (posix says it must not be, but it
seems (and I agree) that it is useful to have in all shells.)

For now, the only way the "option" gets set, is to have POSIXLY_CORRECT
(or some name like that, I forget it...) set in the environment.  If that
is done, sh (and several other commands) act more like posix demands,
rather than in the useful way.   While that is sometimes a useful method,
and I do not propose removing support for POSIXLY_CORRECT (or whatever it is)
in many situations it is annoying (it is a long name to be setting all the
time) and worse - when set to control sh, you do not necessarily want it
to flow down to commands that the sh runs, but if it is in the env, it does.

So, I'd like to do what several other shells have done, and make it a proper
option (which would also allow scripts to determine if it is on or not,
which they can't really do now - POSIXLY_CORRECT only matters for the shell
at the time the shell starts, after that changing its value does not affect
whether the shell is in posix mode or not, so examining that var reveals
nothing useful for this purpose.)  Making it a real option would also allow
scripts to enable/disable it as needed during execution (assuming it one day
affects more than whether $ENV is read or not.)

Several other shells implement "-o posix" (and set -o posix, set +o posix)
to control this, and I would suggest that NetBSD do the same.   If not set
on the command line, its initial value would (like now) come from 
POSIXLY_CORRECT (being set in the env or not.)  Typically this option has
no one letter name, and I see no reason we need depart from that practice
(we already have options like that.)
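
With a real option, a script could find out whether posix mode is on
along these lines.  This is a sketch only: it assumes "set -o" lists
each option as a name followed by on or off (as bash does), and
posix_mode is a made-up helper name.  In a shell that has no such
option it just reports "off".

```sh
# Succeed if the running shell reports its posix option as on.
# Assumes "set -o" prints lines of the form "name  on" / "name  off".
posix_mode() {
	case $(set -o 2>/dev/null | grep -w posix) in
	*on)	return 0 ;;
	*)	return 1 ;;
	esac
}

if posix_mode; then
	echo "posix mode is on"
else
	echo "posix mode is off (or not an option in this shell)"
fi
```

In shells that have the option, "set -o posix" and "set +o posix" then
turn it on and off during execution.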

Opinions?

I have a bunch more in my TODO list, but this is enough for today...

kre