Add static local vars to sh(1) ?

To: tech-userlevel%netbsd.org@localhost
Subject: Add static local vars to sh(1) ?
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Mon, 29 Jan 2018 15:40:28 +0700
For an (irrelevant here) script, I had an issue that would be best
handled by having static local (to a function) sh vars (normally
handled by just using a global, but that raises namespace issues.)
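
For illustration, the global-variable workaround mentioned above can be sketched as follows. The function name counter and the _counter_ prefix are hypothetical; the prefix is only a naming convention to reduce the chance of a clash, which is exactly the namespace issue.

```sh
# State lives in a global; nothing stops other code from touching it.
_counter_n=0

counter() {
	_counter_n=$((_counter_n + 1))
	echo "$_counter_n"
}

counter		# prints 1
counter		# prints 2
```

The function has to be called directly: inside a command substitution it runs in a subshell, and the update to the global is lost.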

So, I wondered how easy it would be to add static local vars to sh.

It turns out to be quite easy (and cheap) - what's more, it is not a
new concept, ksh93 (maybe earlier versions too) has them as well, in
at least one of its function types.

So, I have implemented them - and while the current implementation
has not yet been tested enough (nor have any ATF tests been written)
to actually commit, I thought I'd ask if this is something that others
think would be useful enough to actually include in the NetBSD sh(1).

Rather than describe how it works, I will just include here a part
of what would be (after errors and poor wording etc are fixed) the sh(1)
man page section for the local command.

This is ascii only for this e-mail, so is missing the various markup
clues that would normally appear in a man page (underlining, bolding, etc)
- they do exist, but are suppressed for this e-mail.  Apologies if
that makes some of this look weird...  be on the watch particularly for
the word "variable" - sometimes it is just a word, other uses would appear
underlined or in italics, where it refers to the command line param string.

     local [-INSx] [variable[=value] | `-'] ...
            Define local variables for a function.  Local variables have their
            attributes, and values, as they were before the first execution of
            the local command for that variable in a function, restored when
            the function terminates.  Performing local more than once on the
            same variable during a function execution is permitted, but
            generally pointless.

            The -S flag causes the local variable to be static.  When a
            function containing such variables returns, then just before the
            previous values are restored, the current value and attributes of
            each such variable are saved.  When the ``local -S variable''
            command is executed, if a value for variable had previously been
            saved, then after its outside value has been preserved, the
            previous value and attributes of variable are restored, and the
            -I, -N, and -x flags, and any initialization of the variable
            requested on the local command line, are ignored.  If there was no
            saved value for variable (from a previous execution of the same
            function which had executed ``local -S'' with the same variable
            name) then processing of those flags, and initialization, proceeds
            as if the -S flag had not been given.

            With the -N flag, variables made local, are unset initially inside
            [... continues generally unmodified from sh(1) from -current ... ]

In the above, the 2nd paragraph is the new part - the -S option would
be added.   Neither bash nor zsh has a -S option in the places that
would be affected were they to copy this scheme.  ksh93 does, but it also
uses it (amongst other uses) to declare static local vars (ksh93's -S can
also be used on globals).

In addition, the first paragraph above has been reworded (compared to what
is in -current) to make it clear that the local command is not a "declaration"
(sh has no such concept), but is a normal executable command (that change,
or some better worded version of it, will eventually make it to sh(1)
regardless of the rest of this.)

Aside from the above, the only other man page change is a note in the
paragraph a bit later than the above excerpt that describes "local -",
where I added a sentence to make it clear that none of the option flags
apply to - (which includes the -S option).  That change will also make it
to sh(1) eventually regardless - there is no specific reference to -S there.

Oh, there will also be a sentence added to the description of LINENO to
note that "local -S LINENO" 'works' but has no visible effect.

Adding "local -S" allows stuff like:

	./sh -c '
		fn() { local -S N=0 Q=""; Q=${Q}.; echo ${Q}$((++N)); }
		unset Q
		N=111
		echo "N=${N:-unset} Q=${Q:-unset}"
		fn
		fn
		echo "N=${N:-unset} Q=${Q:-unset}"
		fn
		fn
		echo "N=${N:-unset} Q=${Q:-unset}"
	'
	N=111 Q=unset
	.1
	..2
	N=111 Q=unset
	...3
	....4
	N=111 Q=unset
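
For comparison, the save and restore steps that the man page text describes for -S can be approximated in a shell that lacks it. This is only a rough sketch under stated assumptions: the _fn_* variable names are hypothetical, it handles values only (not attributes such as export or readonly), and it is not how the proposed implementation works internally.

```sh
# Emulation of "local -S N=0" using only POSIX features.
# The _fn_* names are hypothetical helpers, not part of any shell.
fn() {
	# 1. Preserve the caller's N, remembering whether it was set at all.
	if [ "${N+set}" = set ]; then
		_fn_outer_N=$N _fn_outer_N_set=yes
	else
		_fn_outer_N_set=no
	fi

	# 2. Restore the value saved by the previous call, or initialise.
	if [ "${_fn_static_N+set}" = set ]; then
		N=$_fn_static_N
	else
		N=0
	fi

	N=$((N + 1))
	echo "$N"

	# 3. Save the static value, then put the caller's N back.
	_fn_static_N=$N
	if [ "$_fn_outer_N_set" = yes ]; then
		N=$_fn_outer_N
	else
		unset N
	fi
}

N=111
fn		# prints 1
fn		# prints 2
echo "N=$N"	# prints N=111
```

The bookkeeping has to be repeated for every variable and every function, and the helper globals bring back the namespace problem that "local -S" avoids.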

In addition, the mechanism used to implement this, and the demands of
the implementation, make it trivial to fix a long standing bug in sh(1)
that I have known of for ages, but since it never seems to bother anyone,
I was just putting off dealing with it until a later time...   That is
if a function that is currently being executed is removed (which includes
being redefined) we get bad mojo ... that is, the old function is freed,
but we keep on referencing it (where it used to reside) to execute whatever
remains, and whatever happens, happens...

Since no-one has ever been bothered by that (it is rare for a function
to be removed, very very unusual for that to happen while it is active)
there is no PR for the problem ... if the proposal above goes ahead,
it should just get fixed as a side-effect, if not, then if someone creates
some kind of test case to demonstrate it (most simple tests appear to
work just fine, as even though the memory for the function has been
freed, it doesn't get reallocated for anything else, and so executing
what is in it still works) and files a PR, then it would get fixed
anyway (the reason for this is that I need a test case to verify that
the fix works, and right now, I don't have one - even though I know
that what we have is broken, I have yet to force sh to misbehave with
anything simple enough to become an ATF test for this.)

Perhaps using one of the "detect references to freed memory" tools
that some of you have access to would show the issue...  If anyone
with one of those wants to try, a simple test case would be:

	sh -c 'fn() { unset -f fn; echo done; }; fn'

This just says "done" - but obviously when the "echo done" is performed,
fn has already been removed.  You can test this with any NetBSD sh version.
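
A variant covering the "redefined" case rather than the "removed" case might look like the sketch below. The messages are arbitrary, and the output noted in the comments is what a shell that handles this correctly prints; an affected sh will most likely print the same thing, for the reason given earlier (the freed memory is usually not reused in a test this small).

```sh
fn() {
	# Replace the definition of fn while the old body is still executing.
	fn() { echo "new definition"; }
	echo "old body still running"
}

fn	# runs the old body: prints "old body still running"
fn	# runs the replacement: prints "new definition"
```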

Note that stopping it saying "done" (stopping the echo command running)
is not the aim - rather the fix is to retain the old function code, and
only free it when there are no remaining references.  Being executed is
a reference.  Being named (ie: having been defined, and not removed)
is also a reference.   Because of that there is no immediate outward
sign that anything is wrong, and when the bug is fixed, nothing changes
in this simple test.   Other attempts to find a similarly simple test
that actually demonstrates a problem have so far not been successful.

Finally, I am currently living in a very e-mail challenged environment
(which will, with luck, be rectified later this week) so I will probably
not respond to any discussions this message starts (unless there is a
direct question, and even then, it depends...) - I will however read
everything (or at least try to) and it will all be back available
properly once I get a working environ again, and at that time, if
responses are needed, I will make them.

kre