Current-Users archive

Re: Script command under NetBSD-current



> Date: Mon, 14 Jun 2021 20:01:49 +0000
> From: RVP <rvp%SDF.ORG@localhost>
> 
> #0  0x00007f7fbfa0ab3a in ___lwp_park60 () from /usr/libexec/ld.elf_so
> #1  0x00007f7fbfa0265d in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
> #2  0x00007f7fbfa03125 in _rtld_exit () from /usr/libexec/ld.elf_so
> #3  0x000079097fb6bb1f in __cxa_finalize () from /usr/lib/libc.so.12
> #4  0x000079097fb6b73d in exit () from /usr/lib/libc.so.12
> #5  0x0000000001401771 in done ()
> #6  0x0000000001401853 in finish ()
> #7  <signal handler called>

This indicates a bug in the script(1) program -- it calls exit() in a
signal handler, but exit() is not async-signal-safe.

script(1) should be changed to use only async-signal-safe functions in
its signal handlers -- e.g., by just setting a flag in the signal
handler and either handling EINTR after every blocking syscall or
running with signals masked except during a pselect/pollts loop.

I don't know why it behaves differently in netbsd-9 and current, but
it was already broken in netbsd-9, and there were some changes to some
of the logic which could trigger race conditions differently in current.


> Date: Tue, 15 Jun 2021 08:15:12 +0000
> From: RVP <rvp%SDF.ORG@localhost>
> 
> The small patch below fixes it for me.
> 
> --- START PATCH ---
> --- libexec/ld.elf_so.orig/rtld.c	2020-09-22 00:41:27.000000000 +0000
> +++ libexec/ld.elf_so/rtld.c	2021-06-15 08:11:34.301709238 +0000
> @@ -1750,6 +1750,8 @@
>  	sigdelset(&blockmask, SIGTRAP);	/* Allow the debugger */
>  	sigprocmask(SIG_BLOCK, &blockmask, mask);
> 
> +	membar_enter();

This may change some timing with the effect of rejiggering a race
condition, but it doesn't meaningfully affect the semantics, and
certainly won't prevent a deadlock from calling exit in the signal
handler if it interrupts lazy symbol binding.

The corresponding membar_enter in _rtld_shared_enter at the beginning
doesn't make sense and should be removed.  In particular, the order
generally needs to be something like:

mumblefrotz_enter:
	atomic_r/m/w(lock stuff);
	membar_enter();

	body of critical section;

mumblefrotz_exit:
	membar_exit();
	atomic_r/m/w(lock stuff);

Putting another membar_enter _before_ the atomic_r/m/w(lock stuff) in
mumblefrotz_enter doesn't really do anything.
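
To make the ordering concrete, here is a minimal spin lock sketch in
portable C11.  It is not the ld.elf_so code: C11 acquire/release fences
are used as a rough stand-in for membar_enter/membar_exit, and
lock_word, protected_counter and bump are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
static atomic_int lock_word;
static int protected_counter;	/* data guarded by the lock */

static void
mumblefrotz_enter(void)
{
	/* Atomic r/m/w on the lock comes first ... */
	while (atomic_exchange_explicit(&lock_word, 1,
	    memory_order_relaxed) != 0)
		continue;
	/* ... then the barrier, so the body cannot move above it. */
	atomic_thread_fence(memory_order_acquire);
}

static void
mumblefrotz_exit(void)
{
	/* Barrier first, so the body cannot move below the release ... */
	atomic_thread_fence(memory_order_release);
	/* ... then the atomic operation that gives up the lock. */
	atomic_store_explicit(&lock_word, 0, memory_order_relaxed);
}

/* Body of the critical section: bump the counter under the lock. */
static int
bump(void)
{
	int v;

	mumblefrotz_enter();
	v = ++protected_counter;
	mumblefrotz_exit();
	return v;
}
```

A fence placed before the atomic_exchange in mumblefrotz_enter would
order nothing useful, since there is no critical-section access before
it to keep in place.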

