NetBSD-Bugs archive
Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
The following reply was made to PR toolchain/57241; it has been noted by GNATS.
From: Christos Zoulas <christos%zoulas.com@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: toolchain-manager%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost,
"martin%netbsd.org@localhost" <martin%NetBSD.org@localhost>
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Fri, 18 Apr 2025 14:34:00 -0400
Why not build the prog.debug file as part of the prog target?
christos
--- bsd.prog.mk 28 Nov 2021 15:49:36 -0000 1.340
+++ bsd.prog.mk 18 Apr 2025 18:33:21 -0000
@@ -529,6 +529,12 @@
.if ${MKSTRIPIDENT} != "no"
${OBJCOPY} -R .ident ${.TARGET}
.endif
+.if defined(_PROGDEBUG.${_P})
+ ( ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
+ && ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
+ --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
+ ) || (rm -f ${_PROGDEBUG.${_P}}; false)
+.endif

CLEANFILES+= ${_P}.d
.if exists(${_P}.d)
@@ -554,21 +560,18 @@
.if ${MKSTRIPIDENT} != "no"
${OBJCOPY} -R .ident ${.TARGET}
.endif
-.endif # !commands(${_P})
-.endif # USE_COMBINE
-
-${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
- ${_MKTARGET_LINK}
- ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}
-
.if defined(_PROGDEBUG.${_P})
-${_PROGDEBUG.${_P}}: ${_P}
- ${_MKTARGET_CREATE}
( ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
&& ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
--add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
) || (rm -f ${_PROGDEBUG.${_P}}; false)
.endif
+.endif # !commands(${_P})
+.endif # USE_COMBINE
+
+${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
+ ${_MKTARGET_LINK}
+ ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}

.endif # defined(OBJS.${_P}) && !empty(OBJS.${_P}) # }

> On Apr 18, 2025, at 12:30 PM, Taylor R Campbell via gnats <gnats-admin%netbsd.org@localhost> wrote:
>
> The following reply was made to PR toolchain/57241; it has been noted by GNATS.
>
> From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
> To: Roland Illig <rillig%NetBSD.org@localhost>
> Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
> Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
> Date: Fri, 18 Apr 2025 16:26:30 +0000
>
> Hi rillig, I wonder whether you might be able to help solve a
> make(1)-related mystery?
>
> I'm drafting a change to fix the parallel-safety of the foo.debug
> recipe in bsd.prog.mk (a little finicky because it has nontrivial
> interaction with other makefiles like libexec/ld.elf_so/Makefile).
>
> But before I commit it, I want to make sure I understand the
> underlying cause of PR 57241.
>
> The immediate symptom is that, e.g., `mips64el--netbsd-install ...
> ipftest ${DESTDIR}/usr/sbin/ipftest' is crashing because its input
> file has been truncated between fstat/mmap and access to file content.
> And it looks like there's a concurrent objcopy from the .debug recipe
> which has truncated ipftest to rewrite it in place.
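A minimal Python sketch of the failure mode described in the paragraph above. It is not taken from install(1) or objcopy(1); the file name `prog` and the helper `stale_size_after_truncate` are hypothetical stand-ins. It shows only that a size obtained from fstat goes stale once another opener truncates the same file:

```python
import os
import tempfile


def stale_size_after_truncate(payload):
    """Return (size recorded by fstat, bytes still readable after a truncate)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "prog")  # hypothetical stand-in for ipftest
        with open(path, "wb") as f:
            f.write(payload)

        reader = open(path, "rb")
        try:
            # The reader records the file size up front, as a copying tool might.
            size_at_fstat = os.fstat(reader.fileno()).st_size

            # A second opener rewrites the file in place: opening with "wb"
            # truncates the same file the reader still has open.
            with open(path, "wb"):
                pass

            # The reader now gets fewer bytes than fstat promised.
            readable = len(reader.read(size_at_fstat))
        finally:
            reader.close()
        return size_at_fstat, readable


if __name__ == "__main__":
    print(stale_size_after_truncate(b"\x7fELF" + b"\0" * 4092))
```

The sketch uses read, which merely returns short. With mmap, touching pages past the new end of file typically raises SIGBUS instead, which would be consistent with the "Bus error" in the build log.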
>
> But I can't figure out why the concurrent objcopy is happening only in
> the mips64 builds of certain programs like ipftest(8) and crash(8),
> which seem to have in common the use of compat/exec.mk. (These are
> programs that run with the n64 ABI, in order to read out kernel guts
> on mips64 CPUs, in a userland where _most_ programs run with the n32
> ABI instead because it's more compact and they usually have <4GB RAM.)
>
> And so I think I need a make(1) wizard to help.
>
>
> Here's an example:
>
> https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
> https://web.archive.org/web/20250418154748/https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
>
> [1] Bus error (core dumped) /home/builds/ab/HEAD/evbmips-mips64el/20250416...
> --- /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest ---
> ...
> *** Failed target: /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> *** In directory: /home/source/ab/HEAD/src/external/bsd/ipf/bin/ipftest
> *** Failed commands:
> ${_MKTARGET_INSTALL}
> => @# "install " /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> ${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE} ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
> => /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-install -U -M /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/METALOG -D /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest -h sha256 -N /home/source/ab/HEAD/src/etc -c -r -o root -g wheel -m 555 ipftest /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> *** [/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest] Error code 138
> ...
> /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-objcopy: libcrypto.so.15.0.debug: section `.note.netbsd.pax' can't be allocated in segment 0
> LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini .rodata .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax
>
> The last part -- a warning message about which I just filed another
> bug, PR port-mips/59320: objcopy: section `.note.netbsd.pax' can't be
> allocated in segment 0 -- is evidence that make(1) is still running
> the buggy ipftest.debug recipe which rewrites ipftest in place:
>
> 507 ${_PROGDEBUG.${_P}}: ${_P}
> 508 ${_MKTARGET_CREATE}
> 509 ( ${OBJCOPY} --only-keep-debug --compress-debug-sections \
> 510 ${_P} ${_PROGDEBUG.${_P}} && \
> 511 ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
> 512 --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
> 513 ) || (rm -f ${_PROGDEBUG.${_P}}; false)
>
> https://nxr.netbsd.org/xref/src/share/mk/bsd.prog.mk?r=1.355#509
>
>
> My best guess was that:
>
> 1. When doing dependall, the ipftest.debug recipe above:
> (a) creates ipftest.debug with objcopy at time t0,
> (b) a moment later, modifies ipftest in place with objcopy, at time
> t1 = t0 + eps > t0.
>
> 2. When doing install, make(1) finds that ${DESTDIR}/usr/sbin/ipftest
> and ${DESTDIR}/usr/libdata/debug/usr/sbin/ipftest.debug are both
> out of date, so it tries to run, _in parallel_:
>
> (a) mips64el--netbsd-install ... ipftest ${DESTDIR}/usr/sbin/ipftest,
> because ipftest exists and is up-to-date
>
> (b) the .debug recipe above again, because ipftest exists and is
> up-to-date with timestamp t1, but ipftest.debug exists and is
> out-of-date with timestamp t0 < t1
>
> Except this hypothesis doesn't make sense, for two reasons:
>
> - The problem empirically _only_ happens in mips64 builds with a few
> programs, and nothing in the hypothesis above is restricted to that.
>
> - We pass `-p' (--preserve-dates) to objcopy(1) in step (1), so it
> restores the mtime of the input file after truncating and
> overwriting it -- and so by the time of make install, it should look
> like ipftest.debug is up-to-date.
>
> So I can't figure out why, under these circumstances, make install is
> trying to rerun the .debug recipe. And I can't reproduce it on my
> laptop.
>
> I tried reading out `make -d g1' and `make -d m' output but it's kind
> of inscrutable to me (I thought `-d g1' would show a graph, with nodes
> and edges for dependency relations, but I can't figure out how to read
> the edges in it).
>
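
The timestamp argument in the quoted message can be sketched in Python as a simplified model of a make(1)-style out-of-date check. This is an illustration under stated assumptions, not make's actual implementation: the names `out_of_date` and `simulate` and the timestamps 1000, 2000 and 2001 are made up, and real make(1) also considers all sources and special targets.

```python
import os
import tempfile


def out_of_date(target, source):
    """Simplified check: rebuild if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    return os.stat(target).st_mtime < os.stat(source).st_mtime


def simulate(preserve_dates):
    """Model the two objcopy steps; return whether prog.debug looks stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        prog = os.path.join(tmpdir, "prog")         # hypothetical program
        debug = os.path.join(tmpdir, "prog.debug")  # hypothetical debug file
        link_time, t0, t1 = 1000, 2000, 2001        # arbitrary example times

        for path in (prog, debug):
            open(path, "wb").close()

        os.utime(prog, (link_time, link_time))  # prog linked at link_time
        os.utime(debug, (t0, t0))               # step (a): prog.debug made at t0

        # Step (b): prog rewritten in place at t1. With -p the old mtime is
        # restored afterwards; without it, prog keeps the newer time t1.
        final = link_time if preserve_dates else t1
        os.utime(prog, (final, final))

        return out_of_date(debug, prog)


if __name__ == "__main__":
    print("without -p:", simulate(preserve_dates=False))
    print("with -p:   ", simulate(preserve_dates=True))
```

Under this model, only the run without preserved dates leaves prog.debug looking out of date, which matches the second objection in the quoted message: with `-p` in the recipe, timestamps alone should not make the install step rerun it.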