NetBSD-Bugs archive
Re: bin/60275: sh(1): race condition in signal handling on background subshell fork
The following reply was made to PR bin/60275; it has been noted by GNATS.
From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Cc:
Subject: Re: bin/60275: sh(1): race condition in signal handling on background subshell fork
Date: Mon, 18 May 2026 19:00:37 +0700
The previous "fix" is insufficient. If I change (the relevant part
of) the test program from:
sleep 1 && echo timeout >&2 && kill $$ & timer=$!
#sleep 0.01
kill $timer; wait $timer 2>/dev/null || :
to:
sleep 1 && echo timeout >&2 && kill $$ & timer=$!
#sleep 0.01
i=0; while [ $((++i)) -lt 5 ]; do continue; done
kill $timer; wait $timer 2>/dev/null || :
then it still fails. With anything bigger than '5' in the new loop
it "works", at least in my DEBUG equipped sh, with the debug options
I had enabled. A bigger value might still exhibit the problem in a non-DEBUG
sh, and the value might even need to be bigger than 5 for the previous "fix"
to fail there.
(Executing an external command, even a very short sleep, takes much, much
longer than a simple loop like this, unless the number of iterations is large.)
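To probe where the window sits on a given sh, the fragment can be wrapped
so that the delay is a parameter. This is only a rough sketch: run_fragment,
its arguments and the SH variable are made-up names (not part of the PR's
test program), the counting loop is written without `++` (which POSIX does
not require in arithmetic expansion, so its iteration count is not identical
to the loop above), and it assumes a failure shows up as the timer firing
and killing the shell.

```sh
#!/bin/sh
# Sketch only: run_fragment is a hypothetical helper, not part of the PR.
# usage: run_fragment SHELL ITERATIONS
run_fragment() {
	"$1" -c '
		limit=$1
		# Background timer: kills this shell if it is still alive after 1s.
		sleep 1 && echo timeout >&2 && kill $$ & timer=$!
		# Busy loop standing in for the delay between fork and kill.
		i=0; while [ "$i" -lt "$limit" ]; do i=$((i + 1)); done
		kill $timer; wait $timer 2>/dev/null || :
	' fragment "$2"
}

# Try a few delays. A clean run exits 0; a run that loses the race is
# assumed to end with a non-zero status once the timer fires.
for n in 0 1 5 20 100; do
	if run_fragment "${SH:-sh}" "$n"; then
		echo "iterations=$n: ok"
	else
		echo "iterations=$n: FAILED (status $?)"
	fi
done
```

Setting SH to the path of the shell under test selects which binary is
exercised. The iteration counts that hit the window, if any, will depend on
the machine and on how the sh was built.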
That is, the previous change only helps when the child has had no time
to execute anything (or almost anything) after the fork() before the signal
is sent. If it has plenty of time (the normal case) all is OK, and always
was. But there is still a small window when the child has passed beyond
where the previous fix helped (ie: it now knows that it is a child process,
and cannot execute any traps that belong to the parent) but still has not yet
reset all the signals to the state they should be in for the child process.
In my DEBUG environment, using '5', the child process was just cleaning
up its signal states (after which receiving the signal would work properly)
and had already (just) fixed signal 14 (SIGALRM) (it processes the signals
in increasing numeric order) when the SIGTERM (signal 15) from the parent
arrived. By this time it was too late for the former fix to help (the child
had done too much), but it "missed it by THAT much" for normal processing
to help.
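The numeric ordering is what makes this such a near miss: SIGTERM is the
very next signal after SIGALRM. As a small illustration (not part of the
test program), the standard `kill -l` form maps a signal number back to its
name:

```sh
# Print the names of signals 14 and 15; on NetBSD these are ALRM and TERM.
for n in 14 15; do
	printf '%s -> SIG%s\n' "$n" "$(kill -l "$n")"
done
```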
I have a new more comprehensive fix which I believe will handle this problem,
no matter how slow, or fast, the child and parent happen to execute after
the fork() has occurred. It should be committed a bit later today (after
more testing).
kre