Current-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Forthcoming (minor) change to system(3) and popen(3)

Sometimes (perhaps too many times) I would simply put all what
follows in the commit message -- but this time I thought that
would be way too far over the top, so I am sending this to explain
the motivation for a change I am about to commit.

Everything below discusses system(3) but the issue is identical
in popen(3) (the implementations are different, so the change
required is slightly different, but the issue, cause, and
symptoms are all identical).

There has been a bug report filed with the austin group (posix
maintainers) about:

	Calling `system("-some-tool")` fails
	      (although it is a valid `sh` command)

The issue is that system(3) (aside from the overhead, like forking
setting up the args, and waiting for the child process to exit) just
	execl("/bin/sh", "sh", "-c", cmd, NULL);

When passed "-some-tool" as cmd, this results in:

	execl("/bin/sh", "sh", "-c", "-some-tool", NULL);

To understand this, you need to also be aware that if
sh used getopts() (it doesn't, for several reasons) the
string passed to getoppts as the list of args would be
(something like, I am not trying to include all the
options that exist here, this is just for exposition here)


Note, it is not


That is, the -c arg does not have a parameter following, rather
it sets a flag, which says that after all the options are
processed, the next (non-option) arg is to be treated as the
command to execute, rather than as a file containing commands,

	sh -c foo

says to run the command named "foo" whereas

	sh foo

says to read the file "foo" and run the commands in it (I think
everyone knows that much).

	sh -c -e foo
	sh -e -c foo
	sh -e -c -f foo

are all the same as "sh -c foo" except for other options being
set as well.

Now back to

	execl("/bin/sh", "sh", "-c", "-some-tool", NULL);

With the above, it is immediately clear, I hope, that "-some-tool"
is going to be treated like
	execl("/bin/sh", "sh", "-c", "-s", "-o" "me-tool", NULL);
which is setting the 's' option, an attempt to sen the (non-existant
one presumes, though with some sh variants, who knows "me-tool" option,
and no further args  remaining after the options are processed, so there
is no option to be the command string that -c requires exist.

If you want to experiment with this, without needing to write C code,
our old friend ed makes it easy (that is ed(1) of course).

Just do

	! -f

(no need for any files to be edited, or for any lines in the editor
buffer or anything like that)

For the ! command in ed, ed just does


where "line" (var name invented for the purpose of this e-mail) is
the command typed by the user, line[0] is the '!', already examined
as the command for ed to execute, so that is skipped, and whatever
follows is passed to system(3).

In the first of the 2 ! commands used above, that is " -f" and when
that is done:
	-f: not found
is printed (on stderr if it matters) by the sh created to run the
command.  In that case, the " -f" arg is not an option, as it starts
with a space, not a "-" (or "+" - sh uses both, which is one reason
that getopts() cannot actually be used) but sh ignores leading white
space on command lines (that's how we get to be able to write nicely
indented scripts) and just tries to run the "-f" command, which we
presume does not exist (if you happen to have a -f command, then
please use something different than 'f' here, -a -u -m -x -v  ... any
of those work just as well ... I am not trying to con people into
running some dangerous "-f" command!)

In the second of the 2 ! commands in the example, what we get is
	sh: Bad -c option
(the exact wording might vary depending upon which exact version
of "sh" you are using - that's not important, it will also be
different if you replaced "-f" with "-A" (or anything else which
is not a valid sh option):
	sh: Illegal option -A

The "Bad -c option" is because there was no non-option arg present
to be the command to execute.

OK, that's the issue, the arg to "system()" is always supposed to
be the command, it should never be treated as sh options, whatever
it contains.

Unfortunately, system() has always worked just as posix says it
should behave:
	as if execl(PATH_TO_SH, "sh", "-c", cmd, NULL);
was executed in a child process (the "as if" is because there are other
ways that using execl() to achieve this - our implemenation of
system uses execve() for example, which is fine - we are not required
to actually use execl() just behave the same way).

Once upon a time, long long ago, sh only accepted one options arg
(and there was no -o - just a bunch of single letter options) so
this wasn't an issue, in
	sh -ec -f
	sh -ce -f

The "-ec" or "-ce" are options (the order doesn't matter, those two are
100% equivalent), and "-f" was the next arg, not an option, so with a
shell like that, system() worked properly, always.

Then, someone, somewhere, lost in the anals of history, decided to
make sh act more like other utilities, allowing multiple option args
(and introducing the '+' option introducer to turn options off - aside:
no ever wonders that n "ls -l" "-l" turns on the 'l' option for ls,
but in sh people keep asking "how come -f turns on the 'f' option and
"+f" turns it off, shouldn't it be the other way around?" - No, it
shouldn't!).  Anyway, this (probably without anyone realising, as in
pracice no-one names commands with names like "-foo" or "+bar") broke

What's more, posix standardised this form for sh, it is required (and
these days, all shells do) process multiple option args.

So, in effect, POSIX has specified that system() in a way that is
broken because of the way that sh is speficied to work.
The bug report is to get that fixed.

What is currently being proposed, is to require system() to act
as if it called
	execl(PATH_TO_SH, "sh", "-c", "--", cmd, NULL);

The "--" in sh, as in most other normal commands, ends option
processing, nothing that follows, no matter what it looks like
is an option.

[Aside again, this was originally itroduced to provide a method
that people could understand to allow them to remove files with
names beginning with "-" - "rm -foo" doesn't work to remove "-foo".
and for whatever reason, using "rm ./-foo" wasn't something that
was felt people could remember to do, so -- appeared, and "rm -- -foo"
worked after that.  But that's not relevant here.]

That's actually the wrong thing for POSIX to say (IMO) not because
it doesn't work, and not because it isn't the right thing for
implementations to do, but because it is not the standard - that is
not how current implementations behave, they all do (the equiv of)
	execl(PATH_TO_SH, "sh", "-c", cmd, NULL);
just like posix specifies should be done currently.   That is the

What POSIX should be doing is warning application writers about the
problem (as the ed example above shows, it is trivial to work around
by prepending a space to the command line passed to system()) and
encouraging implentations to correct the issue by actually implementing
it as
	execl(PATH_TO_SH, "sh", "-c", cmd, NULL);
They might even go as far as to warn people that a later version of
the standard might require the "--" form of implementation of system().
But they should not actually make that change until that has become
the real world standard - that is, until the implementations really
have added the "--" arg to the execl() call (or whatever alternative
they are using).

Ie: the correct thing for them to do, is to tell us (the implementors)
to fix system() (and popen() - that one is, for this purpose, the same)
so the problem does not occur.

And simultaneously, as application writers, to fix ed(1) (and other programs
like it) to do something like

	line[0] = ' ';

instead of
which is now done, after which they will work whether system() is fixed
or not.

What the next POSIX standard will actually say about this is not yet
certain, we'll find out a little later (probably later this year for
this particular point).

But for us, right now, this doesn't matter - adding the "--" in the exec*()
calls in system() and popen() is clearly the right thing to do, and so
that is what I propose to commit, sometime within the next 30 mins or
so (the change is already made on my system, a new release build done,
ATF tests run, ...)

I am not going to change ed(1) as suggested above - there will be no
point.   Our ed uses only our libc, so we will know that this is fixed,
and have no need to work around the problem.  (Obviously no other apps
will be updated either).

There will be no libc version bump (not even a minor one), this is all
API compatible, no new symbols added, or anything externally visible changing.
Just one case that did not work before now will work.

Also, for anyone wondering, there is absolutely no security issue here,
any attempt to deliberately pass an option to sh using this bug simply
guaranteed that the system() call would fail (the sh run would issue
an error and exit, executing nothing).

All this is just the long explanation for the coming commit, which will
just have a brief commit message!


Home | Main Index | Thread Index | Old Index