NetBSD-Bugs archive
port-arm/55704: multi-threaded applications for earmv[45]{,hf} freeze on COMPAT_NETBSD32 of aarch64
>Number: 55704
>Category: port-arm
>Synopsis: multi-threaded applications for earmv[45]{,hf} freeze on COMPAT_NETBSD32 of aarch64
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-arm-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 08 09:55:00 +0000 2020
>Originator: Rin Okuyama
>Release: 9.99.73
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD rpi 9.99.73 NetBSD 9.99.73 (GENERIC64) #38: Wed Oct 7 17:28:52 JST 2020 rin@latipes:/sys/arch/evbarm/compile/GENERIC64 evbarm aarch64
>Description:
Multi-threaded applications built for the earmv[45]{,hf} userland freeze
indefinitely under COMPAT_NETBSD32 on aarch64 if more than one CPU
core is online. For example, ctfmerge(1) freezes almost every time
during the build of pkgsrc/pkgtools/cwrappers:
----
# uname -p
aarch64
# file /emul/netbsd32/bin/sh
/bin/sh: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /libexec/ld.elf_so, for NetBSD 9.99.73, compiled for: earmv5, not stripped
# chroot /emul/netbsd32 su -
# cd /usr/pkgsrc/pkgtools/cwrappers && make MAKE_JOBS=1
...
ctfmerge -t -g -L VERSION -o c++-wrapper alloc.o cleanup-cc.o common.o reorder-cc.o generic-transform-cc.o normalise-cc.o c++-wrapper.o transform-cc.o
(then stalls here eternally)
----
GDB shows that ctfmerge(1) is sleeping in _lwp_park(2), called from
_rtld_exclusive_enter():
----
# fg
make MAKE_JOBS=1
^Z[1] + Suspended make MAKE_JOBS=1
# bg
[1] make MAKE_JOBS=1
# gdb -p `pgrep ctfmerge`
...
Thread 1 "" received signal SIGCONT, Continued.
[Switching to LWP 3419 of process 3245]
0xf3a3c4c4 in ___lwp_park60 () from /usr/libexec/ld.elf_so
(gdb) bt
#0 0xf3a3c4c4 in ___lwp_park60 () from /usr/libexec/ld.elf_so
#1 0xf3a31e6c in _rtld_exclusive_enter (mask=mask@entry=0xf73fff90)
at /usr/src/libexec/ld.elf_so/rtld.c:1766
#2 0xf3a39e60 in _rtld_tls_get_addr (tls=0xf796f000, idx=2, offset=0)
at /usr/src/libexec/ld.elf_so/tls.c:68
#3 0xf7ac9e48 in __cxa_thread_run_atexit ()
at /usr/src/lib/libc/stdlib/cxa_thread_atexit.c:55
#4 0xf7c1bc1c in pthread_exit (retval=0x0)
at /usr/src/lib/libpthread/pthread.c:629
#5 0xf7c1bd18 in pthread__create_tramp (cookie=0xf7b79000)
at /usr/src/lib/libpthread/pthread.c:562
#6 0xf7af99f4 in __mknod50 () from /usr/lib/libc.so.12
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
----
If only one CPU core is kept online with cpuctl(8), ctfmerge(1) works
without problems under COMPAT_NETBSD32. This strongly suggests a problem
specific to running the earmv[45]{,hf} userland on multi-processor
machines.
>How-To-Repeat:
Described above.
>Fix:
I'm not sure whether we can fix this problem without modifying the
userland binaries for earmv[45]{,hf}. ARM variants prior to v6 implement
atomic_ops(3) with the swp instruction (which we emulate for
COMPAT_NETBSD32), but they do not have working membar_ops(3), since they
were not intended for multi-processor machines. Indeed, our
membar_ops(3) are no-ops for ARM processors prior to v6:
https://nxr.netbsd.org/xref/src/common/lib/libc/arch/arm/atomic/membar_ops.S#33
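
To illustrate why no-op barriers matter once more than one core is
online, here is a minimal sketch. It is not taken from the NetBSD tree:
it uses C11 <stdatomic.h> fences as a stand-in for membar_ops(3), and
the names payload, ready, and run_handoff() are hypothetical. One thread
publishes data and then sets a flag; the other waits for the flag and
then reads the data. If both fences compiled to nothing, as the pre-v6
membar_ops(3) do, a weakly ordered multi-processor could in principle
let the consumer see the flag before the data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data handed from producer to consumer */
static atomic_int ready;   /* flag announcing that payload is valid */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release fence: keeps the payload store ordered before the flag
     * store. This is the job a store barrier has to do on SMP. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    /* Spin until the producer raises the flag. */
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        continue;
    /* Acquire fence: keeps the payload load ordered after the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return NULL;
}

/* Hypothetical driver: runs one handoff and returns what the consumer saw. */
int run_handoff(void)
{
    pthread_t p, c;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);
    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With both fences in place, run_handoff() returns 42 when called from a
main() linked against libpthread. The sketch only shows the ordering
contract that a barrier is expected to provide. It does not reproduce
the hang reported here, whose exact cause has not been established.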