NetBSD-Bugs archive
bin/60099: /bin/sh (unquoted) $* expansion problem(s)
>Number: 60099
>Category: bin
>Synopsis: /bin/sh (unquoted) $* expansion problem(s)
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Mar 18 14:20:00 +0000 2026
>Originator: Robert Elz
>Release: NetBSD 11.99.5
>Organization:
>Environment:
System: NetBSD jacaranda.noi.kre.to 11.99.5 NetBSD 11.99.5 (JACARANDA:1.1-20260301) #257: Sun Mar 1 13:48:17 +07 2026 kre%jacaranda.noi.kre.to@localhost:/usr/obj/testing/kernels/amd64/JACARANDA amd64
Architecture: x86_64
Machine: amd64
>Description:
When $* is unquoted (likewise $@, which is supposed to be the
same in this context) and is used in a context where field
splitting will happen, POSIX says of $*:
Expands to the positional parameters, starting from one,
initially producing one field for each positional parameter
that is set. When the expansion occurs in a context where
field splitting will be performed, any empty fields may be
discarded and each of the non-empty fields shall be further
split as described in Section 2.6.5.
When the expansion occurs in a context where field splitting
will not be performed [...]
The remainder is irrelevant here.
That is, assuming that none of the fields are empty for present
purposes, and that there happen to be 5 positional parameters set,
then $* should be identical to $1 $2 $3 $4 $5
However:
sh -c 'set -- a b+ c +d e; IFS=+; args $1 $2 $3 $4 $5; args $*'
6: <a> <b> <c> <> <d> <e>
7: <a> <b> <> <c> <> <d> <e>
where "args" is the following baby script:
#! /bin/sh
printf '%d:' "$#"
while [ "$#" -gt 0 ]
do
    printf ' <%s>' "$1"
    shift
done
printf '\n'
I usually use a variation of that without the while loop,
but this one is slightly safer and should work with any shell.
Obviously the '<' and '>' in the output are from that script,
not part of the arg values, and exist just to make it clear
what each arg value actually is.
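For experimenting in a single shell session, the same helper can
also be written as a function. This is only a sketch: the function
form is an assumption of this example, the report itself uses the
script above.

```sh
#!/bin/sh
# Function form of the "args" script: print the argument count,
# then every argument wrapped in <> so empty values stay visible.
args() {
    printf '%d:' "$#"
    while [ "$#" -gt 0 ]
    do
        printf ' <%s>' "$1"
        shift
    done
    printf '\n'
}

# One empty argument and one argument containing a space.
args a '' 'b c'
```

A function is only visible in the shell that defined it, so to use
it with the "sh -c" commands shown here the definition would have
to be placed inside the quoted command string as well.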
>How-To-Repeat:
As above. Note this is a very unusual set of circumstances:
it is rare to use unquoted $* at all, let alone in a context where
field splitting happens, and rarer still when the first character
of IFS is not whitespace. All of those are required for the problem
to manifest.
Also note that most current shells produce the same 7 args that
our current /bin/sh produces; the exceptions are yash and ksh93,
which do the correct thing. zsh has a different interpretation
of field splitting, and produces 7 args from both "args" commands,
and pdksh (including NetBSD's /bin/ksh) also produces 7 args from
both, which is just broken according to current standards (sometime
about 40 years ago that might have been acceptable).
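A comparison like this can be repeated with a small loop over
whatever shells are installed. The following is a rough sketch:
count_fields is a hypothetical helper, the list of shell names is
arbitrary, and it prints only the two field counts rather than the
field values.

```sh
#!/bin/sh
# Hypothetical helper: run the shell named in $1 and print
# "<fields from $1 $2 $3 $4 $5> <fields from unquoted $*>".
count_fields() {
    "$1" -c '
        n() { printf "%s" "$#"; }
        set -- a b+ c +d e
        IFS=+
        printf "%s %s\n" "$(n $1 $2 $3 $4 $5)" "$(n $*)"
    '
}

# Try each candidate shell that is actually present on this system.
for shell in sh bash dash ksh93 yash; do
    if command -v "$shell" >/dev/null 2>&1; then
        printf '%s: ' "$shell"
        count_fields "$shell"
    fi
done
```

Going by the results described in this report, a shell with the
problem would be expected to print "6 7", and one that handles
unquoted $* correctly "6 6".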
>Fix:
I have a fix for this for /bin/sh, but (currently anyway) it has
a side effect in the area of that "empty fields may be discarded"
wording in the standard: empty fields are no longer handled the
same way. Because of that, 2 of the current ATF tests would fail.
One turns out to be a variant of the same thing as above, though
involving empty params, so the right thing to do for that one will
be to fix the test (though what to turn it into depends). The other
fails because currently we drop the empty fields, and after my
current change is applied, we wouldn't. I will see if that can be
altered easily before committing any /bin/sh changes.
The problem is due to the way that sh expands $*. As I believe
many other shells do, it treats it the same as "$*", making a
single string, and then field splits that (since the quotes were
just pretend, and don't really exist). When IFS[0] (the first char
of ${IFS}) is "IFS whitespace" that all 'just works', and as
IFS being $' \t\n' is the common setting, the issue here isn't
often seen. But when IFS[0] is not IFS whitespace, this technique
doesn't work:
sh -c 'set -- a b+ c +d e; IFS=+; args "$*"'
1: <a+b++c++d+e>
which is absolutely correct. Further:
sh -c 'V=a+b++c++d+e ; IFS=+ ; args $V'
7: <a> <b> <> <c> <> <d> <e>
which is also absolutely correct, and illustrates why the
current technique for handling unquoted $* cannot remain.
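The difference between the two approaches can be made visible
without using unquoted $* at all. The sketch below (show and demo
are hypothetical names) first splits each positional parameter on
its own, then joins them with IFS[0] and splits the result.

```sh
#!/bin/sh
# Print the argument count, then each argument wrapped in <>.
show() {
    printf '%d:' "$#"
    for arg in "$@"; do
        printf ' <%s>' "$arg"
    done
    printf '\n'
}

# The body runs in a subshell so the IFS change does not leak out.
demo() (
    set -- a b+ c +d e
    IFS=+

    # Split every parameter on its own: six fields in total.
    for param in "$@"; do
        show $param
    done

    # Join with IFS[0] first (what "$*" does), then split: seven fields.
    joined="$*"
    show $joined
)

demo
```

Splitting per parameter, "b+" yields the single field <b>, because
a trailing delimiter only terminates a field, and "+d" yields <>
and <d>. After joining, the "+" that ended "b+" sits next to the
"+" inserted as a separator, and splitting the string turns that
pair into an extra empty field.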
I am about to commit an extra test case to t_expand.sh, the
ATF test program for /bin/sh. It will currently fail in the
subtest using the example above. It also contains sub-tests
which would fail if I committed my current /bin/sh changes
(though those are in a section of the tests which only verifies
that current sh behaviour doesn't alter accidentally; they
aren't verifying standards conformance).