NetBSD-Bugs archive


bin/60099: /bin/sh (unquoted) $* expansion problem(s)



>Number:         60099
>Category:       bin
>Synopsis:       /bin/sh (unquoted) $* expansion problem(s)
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 18 14:20:00 +0000 2026
>Originator:     Robert Elz
>Release:        NetBSD 11.99.5
>Organization:
>Environment:
System: NetBSD jacaranda.noi.kre.to 11.99.5 NetBSD 11.99.5 (JACARANDA:1.1-20260301) #257: Sun Mar 1 13:48:17 +07 2026 kre%jacaranda.noi.kre.to@localhost:/usr/obj/testing/kernels/amd64/JACARANDA amd64
Architecture: x86_64
Machine: amd64
>Description:
	When $* is unquoted (as for $@, which is supposed to behave the
	same in this context), and appears in a context where field
	splitting will happen, POSIX says of $*:

		Expands to the positional parameters, starting from one,
		initially producing one field for each positional parameter
		that is set. When the expansion occurs in a context where
		field splitting will be performed, any empty fields may be
		discarded and each of the non-empty fields shall be further
		split as described in Section 2.6.5.

		When the expansion occurs in a context where field splitting
		will not be performed [...]

	The remainder is irrelevant here.

	That is, assuming that none of the fields are empty for present
	purposes, and that there happen to be 5 positional parameters set,
	then $* should be identical to $1 $2 $3 $4 $5

	However:

sh -c 'set -- a b+ c +d e; IFS=+; args $1 $2 $3 $4 $5; args $*'
6: <a> <b> <c> <> <d> <e>
7: <a> <b> <> <c> <> <d> <e>

	where "args" is the following baby script:

#! /bin/sh

printf '%d:' "$#"
while [ "$#" -gt 0 ]
do
	printf ' <%s>' "$1"
	shift
done
printf '\n'

	I usually use a variation of that without the while loop,
	but this one is slightly safer and should work with any shell.
	Obviously the '<' and '>' in the output are from that script,
	not part of the arg values, and exist just to make it clear
	what each arg value actually is.
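
	For quick experiments it can be convenient to have "args" as a
	shell function, so that it can be defined inside the same
	sh -c '...' string as the test.  The following is only a sketch
	of that idea (the function form is an assumption, it is not the
	script used to produce the output in this report); the output
	format is the same as the script above.

```shell
# Function form of the "args" script: prints the argument count,
# then each argument wrapped in <>.
args() {
	printf '%d:' "$#"
	while [ "$#" -gt 0 ]
	do
		printf ' <%s>' "$1"
		shift		# only affects the function's own parameters
	done
	printf '\n'
}

args a '' 'b c'		# prints: 3: <a> <> <b c>
```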

>How-To-Repeat:

	As above.   Note this is a very unusual set of circumstances:
	it is rare to use unquoted $* at all, rarer still in a context
	where field splitting happens, and rarer again when the first
	character of IFS is not whitespace.  All of those are required
	for the problem to manifest.

	Also note that most current shells produce the same 7 args that
	our current /bin/sh produces; the exceptions are yash and ksh93,
	which do the correct thing.  zsh has a different interpretation
	of field splitting, and produces 7 args from both "args" commands,
	and pdksh (including NetBSD's /bin/ksh) also produces 7 args from
	both, which is just broken according to current standards (some
	time about 40 years ago that might have been acceptable).
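
	To see which group a given shell falls into, a small self-check
	along the following lines could be run under that shell.  This is
	a sketch only; the names "count", "separate" and "star" are made
	up for illustration, and the script reports what it observes
	rather than assuming any particular result.

```shell
#! /bin/sh

# Hypothetical self-check: compare the number of fields produced by
# $1 $2 $3 $4 $5 with the number produced by unquoted $*.
count() { echo "$#"; }

set -- a b+ c +d e
IFS=+
set -f				# keep pathname expansion out of the picture

separate=$(count $1 $2 $3 $4 $5)
star=$(count $*)

if [ "$separate" -eq "$star" ]
then
	echo "consistent: $separate fields from both"
else
	echo "differ: $separate fields from \$1..\$5, $star from \$*"
fi
```

	A shell affected by the problem described here would be expected
	to take the "differ" branch.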

>Fix:

	I have a fix for this for /bin/sh, but (currently anyway) it has
	a side effect related to the "empty fields may be discarded"
	wording in the standard: empty fields are no longer handled the
	same way, and because of that 2 of the current ATF tests would
	fail.  One turns out to be a variant of the same thing as above,
	though involving empty params, so the right thing to do for that
	one will be to fix the test (but what to turn it into depends).
	The other fails because currently we drop the empty fields, and
	after my current change is applied, we wouldn't.  I will see if
	that can be altered easily before committing any /bin/sh changes.

	The problem is due to the way that sh expands $*.  Like (I believe)
	many other shells, it treats $* the same as "$*", making a single
	string, and then field splits that (since the quotes were just
	pretend, and don't really exist).   When IFS[0] (the first char
	of ${IFS}) is "IFS whitespace" that all 'just works', and since
	IFS being $' \t\n' is the common setting, the issue here isn't
	often seen.   But when IFS[0] is not IFS whitespace, this
	technique doesn't work:

sh -c 'set -- a b+ c +d e; IFS=+; args "$*"'
1: <a+b++c++d+e>

	which is absolutely correct.  Further:

sh -c 'V=a+b++c++d+e ; IFS=+ ; args $V'
7: <a> <b> <> <c> <> <d> <e>

	which is also absolutely correct, and illustrates why the
	current technique for handling unquoted $* cannot remain.
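
	The difference between the two strategies can be emulated in
	portable shell without relying on unquoted $* at all.  The sketch
	below uses a made-up helper, "split_each" (not part of /bin/sh or
	its tests), which field splits each of its arguments on its own.
	Given the parameters one by one it models "one field per
	parameter, then split each"; given the single joined string "$*"
	it models "join, then split".  It is an illustration of the idea
	only, not the actual fix.

```shell
#! /bin/sh

# Hypothetical helper: split each argument separately using the
# current IFS, and print the resulting fields in "args" format.
split_each() {
	out=
	count=0
	set -f				# no pathname expansion of the fields
	for param in "$@"		# quoted: one iteration per argument
	do
		for field in $param	# unquoted: field split this argument
		do
			out="$out <$field>"
			count=$((count + 1))
		done
	done
	set +f
	printf '%d:%s\n' "$count" "$out"
}

set -- a b+ c +d e
IFS=+

split_each "$@"		# each parameter split on its own
split_each "$*"		# parameters joined with IFS[0], then split
```

	In a shell that field splits ordinary variables as shown in this
	report, the first call should print the 6 fields
	<a> <b> <c> <> <d> <e> and the second the 7 fields
	<a> <b> <> <c> <> <d> <e>.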

	I am about to commit an extra test case to the t_expand.sh test
	program in the ATF tests for /bin/sh.  It will currently fail in
	the subtest using the example above, and it also contains
	sub-tests which would fail if I committed my current /bin/sh
	changes (though those are in a section of the tests which only
	verify that current sh behaviour doesn't alter accidentally;
	they aren't verifying standards conformance).



