NetBSD-Users archive


script hanging with locking bug



bup (in sysutils/bup) has lots of tests, and after recent improvements
bup's tests are hanging on NetBSD 10 amd64.  I am almost certain this
is a case of newly added tests hanging, not any kind of regression.

The new test code is:

    #!/usr/bin/env bash

    set -ueo pipefail

    # Callers can test for support via "with-tty true".

    usage() { echo 'Usage: with-tty command [arg ...]'; }
    misuse() { usage 1>&2; exit 2; }

    if script -qec true /dev/null; then
        # linux flavor
        script -qec "$(printf ' %q' "$@")" /dev/null
    elif script -q /dev/null true; then
        # bsd flavor
        script -q /dev/null "$@"
    else
        rc=0
        cmd="$(command -v script)" || rc=$?
        if test "$rc" -eq 0; then
            printf 'Unsupported script command: %q\n' "$cmd" 1>&2
        else
            echo 'No script command' 1>&2
        fi
        exit 2
    fi


After 15 minutes of hanging, I find:

      UID   PID  PPID    CPU PRI NI      VSZ    RSS WCHAN   STAT TTY         TIME COMMAND
    12345  1803     1  28859  29  0    17652   1072 parked  I+   pts/37   0:00.00 script -qec  true /dev/null 

Attaching gdb, I see:

    [Switching to LWP 1803 of process 1803]
    0x00007f7f9aa0abba in ___lwp_park60 () from /usr/libexec/ld.elf_so
    (gdb) bt
    #0  0x00007f7f9aa0abba in ___lwp_park60 () from /usr/libexec/ld.elf_so
    #1  0x00007f7f9aa05d45 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
    #2  0x00007f7f9aa0680d in _rtld_exit () from /usr/libexec/ld.elf_so
    #3  0x000079176e95a6c9 in __cxa_finalize () from /usr/lib/libc.so.12
    #4  0x000079176e95a3ed in exit () from /usr/lib/libc.so.12
    #5  0x00000001cdc0179b in done ()
    #6  0x00000001cdc0189a in finish ()
    #7  <signal handler called>
    #8  0x00007f7f9aa07ff3 in _rtld_symlook_obj () from /usr/libexec/ld.elf_so
    #9  0x00007f7f9aa083ea in _rtld_symlook_list () from /usr/libexec/ld.elf_so
    #10 0x00007f7f9aa0889f in _rtld_symlook_default () from /usr/libexec/ld.elf_so
    #11 0x00007f7f9aa08d4a in _rtld_find_plt_symdef () from /usr/libexec/ld.elf_so
    #12 0x00007f7f9aa00bc0 in _rtld_bind () from /usr/libexec/ld.elf_so
    #13 0x00007f7f9aa0082d in _rtld_bind_start () from /usr/libexec/ld.elf_so
    #14 0x0000000000000246 in ?? ()
    #15 0x0000000000002d4f in ?? ()
    #16 0x000079176e913b80 in _malloc_prefork () from /usr/lib/libc.so.12
    #17 0x00000001cdc0213b in main ()

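which looks like an async-signal-safety botch leading to deadlock:
finish() runs as a signal handler while main is in the middle of lazy
symbol binding inside ld.elf_so (frames #8-#13), and the handler's
exit() then parks in _rtld_exclusive_enter (frames #0-#4).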

But running script outside of bup's test suite works fine.  (Bup does
set LD_LIBRARY_PATH to pick up bup-under-test libs.)
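
To narrow down whether the environment matters, one option is to run the
probe command many times under a deadline and count how often it has to be
killed.  The following is only a sketch: count_hangs is a hypothetical
helper, not part of bup, and it assumes timeout(1) is installed and exits
with status 124 when the deadline expires.  A deadline cannot tell a slow
run from a deadlocked one, and wrapping the command may itself change the
timing.

```shell
#!/bin/sh
# Hypothetical helper (not part of bup): run a command repeatedly under a
# deadline and print how many runs were killed by the deadline.
# Assumes timeout(1) exits with status 124 when the deadline expires.
#
# usage: count_hangs RUNS SECONDS command [arg ...]
count_hangs() (
    # The function body is a subshell, so these variables stay local.
    runs=$1
    secs=$2
    shift 2
    hangs=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        rc=0
        timeout "$secs" "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
)
```

For example, to compare a plain run, a run with a library path set, and a
run with lazy binding disabled (/path/to/bup/lib is a placeholder, not the
real path the tests use):

    count_hangs 50 10 script -qec true /dev/null
    count_hangs 50 10 env LD_LIBRARY_PATH=/path/to/bup/lib script -qec true /dev/null
    count_hangs 50 10 env LD_BIND_NOW=1 script -qec true /dev/null

If hangs show up only with lazy binding in effect, that would be consistent
with the signal arriving during _rtld_bind, as in the backtrace above.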

Any insight?

