NetBSD-Bugs archive


Re: bin/60275: sh(1): race condition in signal handling on background subshell fork



The following reply was made to PR bin/60275; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/60275: sh(1): race condition in signal handling on background subshell fork
Date: Sun, 17 May 2026 17:13:35 +0700

     Date:        Sat, 16 May 2026 23:25:00 +0000 (UTC)
     From:        "campbell+netbsd%mumble.net@localhost via gnats" <gnats-admin%NetBSD.org@localhost>
     Message-ID:  <20260516232500.DB6A71A923C%mollari.NetBSD.org@localhost>
 
   | 	I skimmed some of the code in /bin/sh to find where the calls
   | 	to sigaction and sigprocmask were coming from, but it wasn't
   | 	obvious.  Reproduced in 9 and in 11.
 
 And quite likely back towards the dawn of (sh) time - the way signals
 are handled inside sh is weird, complex, and (I suspect) barely
 understood, so no-one generally touches it (traps, which are the shell's
 user level manifestation of signals, on the other hand have been changed
 from time to time).
 
 The source of those sigaction/sigprocmask calls is easy to explain, but I
 won't bother, as they have nothing whatever to do with your problem.  You
 may have found a bug in ktrace however, as the two signals involved should
 be SIGINT and SIGQUIT (#2 & #3) not SIGHUP and SIGINT (#1 & #2) - it looks
 as if something has an off-by-one bug, and I cannot see how that can
 possibly be in sh, the way that the code works.   All that is happening
 there is the normal "ignore SIGINT and SIGQUIT in a background job" stuff
 (though with "set -m" enabled, I'm not sure that should be happening, I
 will look into that).   The sigprocmask() is just defensive - making sure
 that the parent didn't have the signals (SIGINT and SIGQUIT in this case)
 blocked (so they can be properly and immediately ignored).   None of that
 has anything whatever to do with the SIGTERM issue you're seeing.
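
The "ignore SIGINT and SIGQUIT in a background job" step, together with the defensive unblock, can be sketched roughly as follows. This is an illustration only, written in Python rather than sh's C, and it is not taken from the sh source; `spawn_background` and `job` are hypothetical names.

```python
import os
import signal

KEYBOARD_SIGNALS = (signal.SIGINT, signal.SIGQUIT)

def spawn_background(job):
    """Fork a child that ignores SIGINT/SIGQUIT and then runs job().

    Hypothetical helper: job is any callable returning an exit status.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # Child: a background job should not react to keyboard signals.
            for sig in KEYBOARD_SIGNALS:
                signal.signal(sig, signal.SIG_IGN)
            # Defensive: if the parent had these signals blocked, the mask
            # was inherited across fork, so unblock them here.
            signal.pthread_sigmask(signal.SIG_UNBLOCK, set(KEYBOARD_SIGNALS))
            status = job()
        except BaseException:
            status = 1
        os._exit(status)
    return pid
```

The parent would later collect the child with `os.waitpid(pid, 0)`. Note that nothing here touches SIGTERM, which is why these calls are unrelated to the reported problem.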
 
 I think I know what is happening, however.  There is a window in which a
 signal that is trapped in the parent, and is received by a child very
 early, before the child has done essentially anything, will simply be
 ignored.  The child process (inside sh) isn't yet set up to handle the
 signal, but it can't just be left for later, or sh would go into an
 infinite loop leaving it for later, forever.  So currently the shell just
 says "oh well, better luck next time" and forgets the signal ever
 happened.   My vague memory is that this is (or was) a fairly well
 understood problem among sh maintainers in the past.
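
A toy model of that window, assuming (this is not taken from the sh source) that the parent's handler merely records the signal in a pending set and that the freshly forked child throws that set away while resetting its traps. It is written in Python, and every name in it (`pending`, `on_signal`, `child_after_fork`, `run_demo`) is hypothetical. The early arrival is simulated by having the child signal itself before it has reset anything.

```python
import os
import signal

pending = set()  # toy stand-in for the shell's "signal seen" flags

def on_signal(signum, frame):
    # Parent-style handler: just note the signal for later processing.
    pending.add(signum)

def child_after_fork(early_signal):
    # Simulate a signal that arrives before the child has done anything.
    # The handler inherited from the parent is still installed, so the
    # signal is only recorded, not acted upon.
    os.kill(os.getpid(), early_signal)
    # Child start-up: the parent's traps do not apply here, so they are
    # dropped, and the recorded signal is dropped along with them.
    pending.clear()
    signal.signal(early_signal, signal.SIG_DFL)
    return 0  # the child carries on as if nothing had happened

def run_demo(sig=signal.SIGTERM):
    old = signal.signal(sig, on_signal)  # the parent traps sig
    pid = os.fork()
    if pid == 0:
        try:
            status = child_after_fork(sig)
        except BaseException:
            status = 1
        os._exit(status)
    _, status = os.waitpid(pid, 0)
    signal.signal(sig, old)  # restore the parent's previous disposition
    return status
```

In this model the wait status returned by `run_demo()` describes a normal exit rather than death by SIGTERM: the signal was delivered to the child, but nothing ever acted on it.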
 
 I will see if I can find some method to deal better with this, but it
 is not trivial.  Just blocking signals before the fork, and unblocking
 them later, doesn't really work, as it is very difficult to work out just
 when "later" has safely arrived - and of course, sh has been around longer
 than signal blocking, so was never written to be able to do that.
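
For reference, the block-before-fork idea looks roughly like this in isolation. This is a minimal Python sketch with hypothetical names (`fork_with_signals_held`, `child_setup`, `child_main`), not a proposed fix. In the sketch the unblock point is a single obvious line; the difficulty described above is that in sh there is no such obvious place.

```python
import os
import signal

def fork_with_signals_held(signals, child_setup, child_main):
    """Fork with `signals` blocked, unblocking once the child is set up.

    Hypothetical helper: child_setup installs the child's own signal
    dispositions, child_main returns the child's exit status.
    """
    # Block before the fork; the child inherits the blocked mask, so a
    # signal sent to it early stays pending instead of being handled.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    pid = os.fork()
    if pid == 0:
        try:
            child_setup()
            # "Later": restore the original mask. Any signal that arrived
            # in the meantime is delivered now, under the child's rules.
            signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
            status = child_main()
        except BaseException:
            status = 1
        os._exit(status)
    # Parent: restore its own mask and carry on.
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return pid
```

With this arrangement a SIGTERM that reaches the child during `child_setup` is acted on at the unblock point instead of being forgotten, provided `child_setup` leaves SIGTERM with a disposition that acts on it.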
 
 kre
 


