NetBSD-Bugs archive
Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Hi rillig, I wonder whether you might be able to help solve a
make(1)-related mystery?
I'm drafting a change to fix the parallel-safety of the foo.debug
recipe in bsd.prog.mk (a little finicky because it has nontrivial
interaction with other makefiles like libexec/ld.elf_so/Makefile).
But before I commit it, I want to make sure I understand the
underlying cause of PR 57241.
The immediate symptom is that, e.g., `mips64el--netbsd-install ...
ipftest ${DESTDIR}/usr/sbin/ipftest' is crashing because its input
file has been truncated between fstat/mmap and access to file content.
And it looks like there's a concurrent objcopy from the .debug recipe
which has truncated ipftest to rewrite it in place.
But I can't figure out why the concurrent objcopy is happening only in
the mips64 builds of certain programs like ipftest(8) and crash(8),
which seem to have in common the use of compat/exec.mk. (These are
programs that run with the n64 ABI, in order to read out kernel guts
on mips64 CPUs, in a userland where _most_ programs run with the n32
ABI instead because it's more compact and they usually have <4GB RAM.)
And so I think I need a make(1) wizard to help.
Here's an example:
https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
https://web.archive.org/web/20250418154748/https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
[1] Bus error (core dumped) /home/builds/ab/HEAD/evbmips-mips64el/20250416...
--- /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest ---
...
*** Failed target: /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
*** In directory: /home/source/ab/HEAD/src/external/bsd/ipf/bin/ipftest
*** Failed commands:
${_MKTARGET_INSTALL}
=> @# "install " /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE} ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
=> /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-install -U -M /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/METALOG -D /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest -h sha256 -N /home/source/ab/HEAD/src/etc -c -r -o root -g wheel -m 555 ipftest /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
*** [/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest] Error code 138
...
/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-objcopy: libcrypto.so.15.0.debug: section `.note.netbsd.pax' can't be allocated in segment 0
LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini .rodata .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax
The last part -- a warning message about which I just filed another
bug, PR port-mips/59320: objcopy: section `.note.netbsd.pax' can't be
allocated in segment 0 -- is evidence that make(1) is still running
the buggy ipftest.debug recipe which rewrites ipftest in place:
507 ${_PROGDEBUG.${_P}}: ${_P}
508 ${_MKTARGET_CREATE}
509 ( ${OBJCOPY} --only-keep-debug --compress-debug-sections \
510 ${_P} ${_PROGDEBUG.${_P}} && \
511 ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
512 --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
513 ) || (rm -f ${_PROGDEBUG.${_P}}; false)
https://nxr.netbsd.org/xref/src/share/mk/bsd.prog.mk?r=1.355#509
My best guess was that:
1. When doing dependall, the ipftest.debug recipe above:
(a) creates ipftest.debug with objcopy at time t0,
(b) a moment later, modifies ipftest in place with objcopy, at time
t1 = t0 + eps > t0.
2. When doing install, make(1) finds that ${DESTDIR}/usr/sbin/ipftest
and ${DESTDIR}/usr/libdata/debug/usr/sbin/ipftest.debug are both
out of date, so it tries to run, _in parallel_:
(a) mips64el--netbsd-install ... ipftest ${DESTDIR}/usr/sbin/ipftest,
because ipftest exists and is up-to-date
(b) the .debug recipe above again, because ipftest exists and is
up-to-date with timestamp t1, but ipftest.debug exists and is
out-of-date with timestamp t0 < t1
Except this hypothesis doesn't make sense, for two reasons:
- The problem empirically _only_ happens in mips64 builds with a few
programs, and nothing in the hypothesis above is restricted to that.
- We pass `-p' (--preserve-dates) to objcopy(1) in step 1(b), so it
restores the mtime of the input file after truncating and
overwriting it -- and so by the time of make install, it should look
like ipftest.debug is up-to-date.
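The timestamp reasoning above can be sketched in shell. This is only an illustration under stated assumptions: prog and prog.debug are hypothetical stand-ins for ipftest and ipftest.debug, the timestamps are made up, touch(1) stands in for both objcopy invocations, and is_stale is a hypothetical helper that approximates make(1)'s rule that a target is out of date when one of its sources has a newer mtime.

```shell
#!/bin/sh
# Scratch directory with hypothetical stand-ins for ipftest and ipftest.debug.
dir=$(mktemp -d)
prog=$dir/prog
debug=$dir/prog.debug

# Step 1(a): the .debug file is written first, at time t0.
touch -t 202504161330.00 "$debug"
# Step 1(b) without -p: the program is rewritten in place at t1 > t0.
touch -t 202504161330.01 "$prog"

# Approximation of make(1)'s check: the target is stale if the
# source has a newer mtime than the target.
is_stale() {  # usage: is_stale target source
    [ -n "$(find "$2" -newer "$1")" ]
}

if is_stale "$debug" "$prog"; then r1=stale; else r1=fresh; fi
echo "without -p: $r1"

# Step 1(b) with -p: the program's mtime is put back to its
# pre-rewrite value, assumed here to be earlier than t0.
touch -t 202504161329.59 "$prog"
if is_stale "$debug" "$prog"; then r2=stale; else r2=fresh; fi
echo "with -p: $r2"

rm -rf "$dir"
```

In this sketch the first check reports the .debug file as stale and the second reports it as fresh, which is the second objection to the hypothesis: with `-p' in effect, the rewrite in step 1(b) should not by itself make ipftest.debug look out of date.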
So I can't figure out why, under these circumstances, make install is
trying to rerun the .debug recipe. And I can't reproduce it on my
laptop.
I tried reading out `make -d g1' and `make -d m' output but it's kind
of inscrutable to me (I thought `-d g1' would show a graph, with nodes
and edges for dependency relations, but I can't figure out how to read
the edges in it).