NetBSD-Bugs archive
bin/52640: /bin/sh can "lose" background children when waiting on foreground ones
>Number: 52640
>Category: bin
>Synopsis: /bin/sh can "lose" background children when waiting on foreground ones
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Oct 23 05:40:00 +0000 2017
>Originator: Robert Elz
>Release: NetBSD 8.99.1 (lots and lots of releases...)
>Organization:
>Environment:
System: NetBSD andromeda.noi.kre.to 8.99.1 NetBSD 8.99.1 (VBOX64-1.3-20170812) #39: Sat Aug 12 15:25:04 ICT 2017 kre%magnolia.noi.kre.to@localhost:/usr/obj/current/kernels/amd64/VBOX64 amd64
Architecture: x86_64
Machine: amd64
>Description:
The following script
#! /bin/sh
(sleep 3; exit 3) & PID=$!
sleep 10
(wait $PID; echo "In child: status" $?)
wait $PID; echo "In parent: status" $?
should print:
In child: status 127
In parent: status 3
as every other shell I could find to test does (except bosh,
which is simply broken and appears to return status 0 from
the wait command in all cases, and zsh, which is just weird,
in this and so many other ways).
Instead, on all currently available NetBSD sh's we see
In child: status 127
In parent: status 127
That's because the background job completes while sh is waiting
for the later foreground job, and when that happens (at least
in many cases) the background job is simply discarded. (If it
was killed by a signal, that will be reported immediately, but
the job will still be discarded.)
Fix that problem and we instead get
In child: status 3
In parent: status 3
!!! The sub-shell has no children; it should not be
able to get status from one of its siblings.
This only happens when the child has already exited before
the sub-shell is forked, and only when the status of that
child has not already been discarded (including incorrectly
discarded as above.)
This is because when a sub-shell is forked, the job table
(which holds the results of completed jobs and the status
of active ones) is only marked invalid, not actually cleared
(until a new job needs to be created). The shell's "wait"
command only looks at the "invalid" flag in the case of a
simple "wait" (i.e. not "wait pid"), which is backwards:
the plain "wait" case does not really need the check, though
it avoids wasting CPU time, whereas the "wait pid" case does.
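The two expectations described above (a sub-shell must not see
its sibling's status, while the parent must still be able to
collect it) can be checked separately with a small sketch. The
variable names bgpid, sub_status and parent_status are
illustrative only, and the one second sleep is assumed to be
long enough for the background child to exit first. A shell
that behaves as this report expects should report 127 for the
sub-shell and 3 for the parent.

```sh
#!/bin/sh
# Sketch only: variable names are illustrative, not taken from sh's source.

(exit 3) &              # background child that exits almost at once
bgpid=$!
sleep 1                 # foreground job; the child finishes during it

# The sub-shell did not create $bgpid, so wait should fail there.
# Its exit status is the status of the wait command inside it.
(wait "$bgpid") 2>/dev/null
sub_status=$?

# The parent did create $bgpid, so the status must still be available.
wait "$bgpid"
parent_status=$?

echo "sub-shell: $sub_status  parent: $parent_status"
```

Capturing the sub-shell's exit status directly, rather than
echoing it from inside, keeps the two results easy to compare
in a script.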
>How-To-Repeat:
Write any script that runs a short background job, then a
longer foreground one (which is probably why this hasn't
been noticed - most commonly the timings are inverted),
and observe what happens when the script eventually waits
for the (already completed) background job.
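As one possible way to follow the recipe above, the sketch
below wraps it in a function with adjustable timings. The
function name lost_child_check and the variables lc_pid and
lc_status are hypothetical, made up for this illustration;
the timings only need to satisfy "background shorter than
foreground".

```sh
#!/bin/sh
# Hypothetical helper, not part of any NetBSD tool.
# usage: lost_child_check bg_seconds fg_seconds exit_code
lost_child_check() {
    (sleep "$1"; exit "$3") &   # short background job
    lc_pid=$!
    sleep "$2"                  # longer foreground job
    wait "$lc_pid"              # background job has already completed
    lc_status=$?
    if [ "$lc_status" -eq "$3" ]; then
        echo "ok: got status $lc_status"
    else
        echo "lost: expected $3, got $lc_status"
        return 1
    fi
}

lost_child_check 1 2 3
```

On an affected sh the "lost" branch would be expected to show
status 127, matching the output quoted in the description.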
>Fix:
Coming soon. Will request a pullup to -8; the shells on
the older systems are so out of date that they can just
continue to suffer from this (and many other) problems, which
are usually never noticed in the wild.