NetBSD-Bugs archive

Re: kern/56979: fork(2) fails to be signal safe



The following reply was made to PR lib/56979; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Tom Lane <tgl%sss.pgh.pa.us@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 10:37:54 +0000

 > Date: Sat, 15 Oct 2022 22:46:52 -0400
 > From: Tom Lane <tgl%sss.pgh.pa.us@localhost>
 > 
 > Well, here is the actual problem: with this implementation, the mere
 > act of invoking a C function is not guaranteed to be async-signal-safe,
 > depending on whether it crosses a not-terribly-well-defined linkage
 > boundary.
 
 This is not accurate.  Symbol binding _is_ async-signal-safe because
 all operations that _change_ rtld state -- which, if interrupted by a
 signal, might lead to rtld symbol binding logic seeing inconsistent
 states of the data structures -- block signals while they hold an
 exclusive thread lock, so symbol binding can't happen in a signal
 _while_ the rtld state is being changed.
 
 > Which nominally-primitive C operations get implemented by calls to
 > libgcc_s.so?
 
 As far as I know, none of these affect or are affected by global state
 (except possibly the floating-point exception state in softfloat,
 which is a known bug), so they should all be async-signal-safe.  If
 you find an exception, please let us know!
 
 > Date: Sat, 15 Oct 2022 23:03:04 -0400
 > From: Tom Lane <tgl%sss.pgh.pa.us@localhost>
 > 
 > Ah, sorry, I wasn't thinking very clearly there: the mainline would
 > need to be doing something that takes the RTLD lock exclusively.
 > Not that there's any shortage of cases that do that.  Then, if a
 > signal handler interrupts that and invokes one of the
 > required-to-be-safe libc functions for the first time in the
 > program, you have a deadlock in a situation that absolutely should
 > be legal per POSIX.
 
 No deadlock: the exclusive lock is held with signals blocked, and it
 keeps other threads from taking the shared lock, so there's no way to
 take the shared lock while the exclusive lock is held.
 
 The problem in postmaster is that a signal interrupted the _shared_
 lock during symbol binding, and then took the forbidden action --
 calling dlopen in a signal handler, which tries to take the exclusive
 lock, which waits for symbol binding to finish, which will never
 happen because symbol binding is waiting for the signal handler to
 return.
 