NetBSD-Bugs archive
Re: bin/60150: named(8) crashes at startup on NetBSD/i386 11.0_RC2
The following reply was made to PR bin/60150; it has been noted by GNATS.
From: RVP <rvp%SDF.ORG@localhost>
To: Izumi Tsutsui <tsutsui%ceres.dti.ne.jp@localhost>
Cc: gnats-bugs%netbsd.org@localhost, kre%netbsd.org@localhost, christos%netbsd.org@localhost, mrg%netbsd.org@localhost,
joerg%netbsd.org@localhost
Subject: Re: bin/60150: named(8) crashes at startup on NetBSD/i386 11.0_RC2
Date: Sat, 4 Apr 2026 00:23:28 +0000 (UTC)
On Sat, 4 Apr 2026, Izumi Tsutsui wrote:
> The likely cause seems to be misalignment of dns_dispentry_t allocated
> by isc_mem_get() in src/external/mpl/bind/dist/lib/dns/dispatch.c:
>
> https://github.com/NetBSD/src/blob/netbsd-11/external/mpl/bind/dist/lib/dns/dispatch.c#L1453
>
>>> dns_dispentry_t *resp = isc_mem_get(disp->mctx, sizeof(*resp));
>
> On NetBSD/i386, the returned address is not 8-byte aligned,
> per the following printf output:
>
> ---
>
> Index: dist/lib/dns/dispatch.c
> ===================================================================
> RCS file: /cvsroot/src/external/mpl/bind/dist/lib/dns/dispatch.c,v
> retrieving revision 1.13
> diff -u -p -d -r1.13 dispatch.c
> --- dist/lib/dns/dispatch.c 17 Jul 2025 19:01:45 -0000 1.13
> +++ dist/lib/dns/dispatch.c 3 Apr 2026 20:13:47 -0000
> @@ -16,6 +16,7 @@
> /*! \file */
>
> #include <inttypes.h>
> +#include <stddef.h>
> #include <stdbool.h>
> #include <stdlib.h>
> #include <sys/types.h>
> @@ -1451,6 +1452,18 @@ dns_dispatch_add(dns_dispatch_t *disp, i
>
> in_port_t localport = isc_sockaddr_getport(&disp->local);
> dns_dispentry_t *resp = isc_mem_get(disp->mctx, sizeof(*resp));
> + fprintf(stderr,
> + "dispatch.c: dispentry layout "
> + "alignof(node)=%zu alignof(resp)=%zu "
> + "offsetof(ht_node)=%zu sizeof(resp)=%zu "
> + "resp_mod8=%lu ht_node_mod8=%lu\n",
> + (size_t)_Alignof(struct cds_lfht_node),
> + (size_t)_Alignof(struct dns_dispentry),
> + offsetof(struct dns_dispentry, ht_node),
> + sizeof(*resp),
> + (unsigned long)((uintptr_t)resp & 7),
> + (unsigned long)((uintptr_t)&resp->ht_node & 7));
> +
> *resp = (dns_dispentry_t){
> .timeout = timeout,
> .port = localport,
>
> ---
>
> :
> 4-Apr-2026 05:11:55.200 running
> dispatch.c: dispentry layout alignof(node)=8 alignof(resp)=8 offsetof(ht_node)=184 sizeof(resp)=200 resp_mod8=4 ht_node_mod8=4
> assertion "!is_removal_owner(node)" failed: file "/s/netbsd-11/src/external/lgpl2/userspace-rcu/lib/liburcu-cds/../../dist/src/rculfhash.c", line 1097, function "_cds_lfht_add"
> Abort (core dumped)
> #
>
> ---
>
> `dns_dispentry_t` contains `struct cds_lfht_node ht_node`, and
> this node seems to (implicitly?) need 8-byte alignment
> per liburcu requirements. If it is not properly aligned,
> liburcu triggers an assertion in _cds_lfht_add() as noted above.
>
> The misalignment appears to come from the non-"HAVE_JEMALLOC" path
> in external/mpl/bind/dist/lib/isc/jemalloc_shim.h:
>
> https://github.com/NetBSD/src/blob/netbsd-11/external/mpl/bind/dist/lib/isc/jemalloc_shim.h#L33-L54
>
> ---
>
> typedef union {
> size_t size;
> max_align_t __alignment;
> } size_info;
>
> static inline void *
> mallocx(size_t size, int flags) {
> void *ptr = NULL;
>
> size_t bytes = ISC_CHECKED_ADD(size, sizeof(size_info));
> size_info *si = malloc(bytes);
> INSIST(si != NULL);
>
> si->size = size;
> ptr = &si[1];
>
> if ((flags & MALLOCX_ZERO) != 0) {
> memset(ptr, 0, size);
> }
>
> return ptr;
> }
>
> ---
>
> On NetBSD/i386, sizeof(max_align_t) is 12 bytes, so
> sizeof(size_info) is also 12 bytes.
>
> The address returned by mallocx(), calculated by `ptr = &si[1];`,
> is therefore not 8-byte aligned.
>
> Actually the following ugly patch fixes the assertion of _cds_lfht_add()
> in liburcu:
>
> ---
> Index: dist/lib/isc/jemalloc_shim.h
> ===================================================================
> RCS file: /cvsroot/src/external/mpl/bind/dist/lib/isc/jemalloc_shim.h,v
> retrieving revision 1.4
> diff -u -p -d -r1.4 jemalloc_shim.h
> --- dist/lib/isc/jemalloc_shim.h 26 Jan 2025 16:25:37 -0000 1.4
> +++ dist/lib/isc/jemalloc_shim.h 3 Apr 2026 20:13:47 -0000
> @@ -30,9 +30,17 @@ const char *malloc_conf = NULL;
>
> #include <stdlib.h>
>
> +#ifndef ALIGNMENT
> +#define ALIGNMENT 8U
> +#endif
> +#ifndef roundup2
> +#define roundup2(x,m) ((((x) - 1) | ((m) - 1)) + 1)
> +#endif
> +
> typedef union {
> size_t size;
> max_align_t __alignment;
> + uint8_t __roundup[roundup2(sizeof(max_align_t), ALIGNMENT)];
> } size_info;
>
> static inline void *
>
> ---
>
> :
> 04-Apr-2026 05:13:13.869 running
> dispatch.c: dispentry layout alignof(node)=8 alignof(resp)=8 offsetof(ht_node)=184 sizeof(resp)=200 resp_mod8=0 ht_node_mod8=0
> dispatch.c: dispentry layout alignof(node)=8 alignof(resp)=8 offsetof(ht_node)=184 sizeof(resp)=200 resp_mod8=0 ht_node_mod8=0
> dispatch.c: dispentry layout alignof(node)=8 alignof(resp)=8 offsetof(ht_node)=184 sizeof(resp)=200 resp_mod8=0 ht_node_mod8=0
>
> ---
> Izumi Tsutsui
>
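The arithmetic behind resp_mod8=4 can be sketched as below. This is only an
illustration: the macro names are made up, and it assumes malloc() hands back
a chunk aligned to at least 8 bytes (base 0 and base 16 stand in for such
addresses); none of this is taken from BIND or liburcu.

```c
/* Hypothetical names for illustration; not from BIND or liburcu. */
#define REQUIRED_ALIGN 8u   /* alignment the lfht node appears to need */
#define HDR_I386       12u  /* sizeof(size_info) reported on NetBSD/i386 */

/* Round x up to a multiple of m (m a power of two, x nonzero);
 * same expression as the roundup2() in the patch above. */
#define ROUNDUP2(x, m) ((((x) - 1) | ((m) - 1)) + 1)

/* Address modulo REQUIRED_ALIGN after skipping a header of hdr bytes. */
#define MISALIGN(base, hdr) (((base) + (hdr)) & (REQUIRED_ALIGN - 1))

/* A 12-byte header pushes an aligned base off by 4, as the printf showed. */
_Static_assert(MISALIGN(0u, HDR_I386) == 4, "12-byte header misaligns");
_Static_assert(MISALIGN(16u, HDR_I386) == 4, "same for any aligned base");

/* Padding the header to 16 bytes restores 8-byte alignment. */
_Static_assert(ROUNDUP2(HDR_I386, REQUIRED_ALIGN) == 16, "padded size");
_Static_assert(MISALIGN(0u, ROUNDUP2(HDR_I386, REQUIRED_ALIGN)) == 0,
    "padded header keeps alignment");
```

Since offsetof(ht_node) is 184, a multiple of 8, ht_node inherits whatever
misalignment resp has, which matches ht_node_mod8 following resp_mod8 in the
output above.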
The max_align_t value on i386 might actually be at the core of this issue,
as I found out some years back. I had a private discussion with some of the
gurus prior to filing a PR, but there was no clear consensus then about what
to do (or even a palpable fault at hand, aside from my vague unease about the
difference in the value of max_align_t between GCC and NetBSD), and I dropped
it for lack of time.
I reproduce some of my emails below:
---START mail 1---
From rvp%SDF.ORG@localhost Wed Oct 16 23:30:46 2024
Date: Wed, 16 Oct 2024 23:30:45 +0000 (UTC)
From: RVP <rvp%SDF.ORG@localhost>
To: Robert Elz <kre%munnari.OZ.AU@localhost>
Cc: matthew green <mrg%eterna23.net@localhost>, Christos Zoulas <christos%netbsd.org@localhost>
Subject: Re: Header file changes of Tue, 08 Oct 2024
On Wed, 16 Oct 2024, Robert Elz wrote:
> I was a little surprised by the symlink implementation for <stddef.h>
> after the change, rather than the more traditional dummy include file
> that just has the appropriate include guard, and then
> #include <sys/me.h>
>
> I wonder if perhaps reverting to that method, with the guard that
> gcc is looking for (and what an absurd stupid way of doing anything
> that is, really!) in <stddef.h> but all (or most) of the contents
> in <sys/stddef.h>
>
Yes, this is fine, too.
On Wed, 16 Oct 2024, matthew green wrote:
> RVP writes:
>> ie. GCC keys on the include guard for this file. On Linux, I know, this file is
>> part of GCC and not glibc, so it's important; but, I'm not sure if this matters
>> on NetBSD (doesn't look like that file'll be used--except, maybe, when building
>> a/the compiler).
>
> actually, it's used for non-in-tree users:
>
> space-bird ~> pkg_info -L gcc12|grep stddef.h
> /usr/pkg/gcc12/lib/gcc/x86_64--netbsd/12.4.0/include-fixed/stddef.h
> /usr/pkg/gcc12/lib/gcc/x86_64--netbsd/12.4.0/include/stddef.h
>
> the -fixed one is modified ours, the other is the GCC one.
>
That's the same as on Linux where GCC is built using the standard 3-stage
bootstrap, which has a `fixincludes' phase. Here, (ie. on Linux, and also
the pkgsrc GCC), those include dirs will come first (gcc -v output):
```
GNU C17 (Ubuntu 13.2.0-23ubuntu4) version 13.2.0 (x86_64-linux-gnu)
compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
[...]
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-linux-gnu/13/include
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
```
In the system GCC, however, the search path is quite different:
```
GNU C17 (nb1 20240630) version 12.4.0 (x86_64--netbsd)
compiled by GNU C version 12.4.0, GMP version 6.2.1, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
[...]
#include "..." search starts here:
#include <...> search starts here:
/usr/include/gcc-12 // <- doesn't contain stddef.h
/usr/include
End of search list.
```
On Linux, since glibc doesn't provide a stddef.h, _both_ the compiler and
the stuff it compiles end up using the same header (the gcc "fixed" one).
I expect the pkgsrc GCC is the same, since the GCC header files get included
first.
But, with the system compiler, you get different values for `max_align_t' on
32-bit x86 depending on which header is picked up:
```
#include <stddef.h>
#include <stdio.h>

int main(void) {
    printf("size_t = %zu\n", sizeof (size_t));
    printf("ptrdiff_t = %zu\n", sizeof (ptrdiff_t));
    printf("wchar_t = %zu\n", sizeof (wchar_t));
    /* note: this line prints the alignment, not the size */
    printf("max_align_t = %zu\n", _Alignof (max_align_t));
    return 0;
}
$ cc -O2 -march=native -pipe -o x x.c
$ ./x
size_t = 8
ptrdiff_t = 8
wchar_t = 4
max_align_t = 16
$ cc -I/usr/src/external/gpl3/gcc/dist/gcc/ginclude -O2 -march=native -pipe -o x x.c
$ ./x
size_t = 8
ptrdiff_t = 8
wchar_t = 4
max_align_t = 16
$ cc -m32 -O2 -march=native -pipe -o x x.c
$ ./x
size_t = 4
ptrdiff_t = 4
wchar_t = 4
max_align_t = 4
$ cc -I/usr/src/external/gpl3/gcc/dist/gcc/ginclude -m32 -O2 -march=native -pipe -o x x.c
$ ./x
size_t = 4
ptrdiff_t = 4
wchar_t = 4
max_align_t = 16
```
_This_ is what prompted my initial email to you guys: GCC is compiled with one
value of `max_align_t', but, what it compiles uses a different value. Should we
worry? Or, am I just overly vexed about nothing?
Thx,
-RVP
---END mail 1---
---START mail 2---
From rvp%SDF.ORG@localhost Thu Oct 17 23:34:52 2024
Date: Thu, 17 Oct 2024 23:34:51 +0000 (UTC)
From: RVP <rvp%SDF.ORG@localhost>
To: Christos Zoulas <christos%zoulas.com@localhost>
Cc: Robert Elz <kre%munnari.OZ.AU@localhost>, matthew green <mrg%eterna23.net@localhost>, Christos Zoulas <christos%netbsd.org@localhost>
Subject: Re: Header file changes of Tue, 08 Oct 2024
On Wed, 16 Oct 2024, Christos Zoulas wrote:
> Perhaps we should just fix max_align_t to match gcc's in our stddef.h and call it a day.
>
> typedef struct {
> long long __max_align_ll __attribute__((__aligned__(__alignof__(long long))));
> long double __max_align_ld __attribute__((__aligned__(__alignof__(long double))));
> /* _Float128 is defined as a basic type, so max_align_t must be
> sufficiently aligned for it. This code must work in C++, so we
> use __float128 here; that is only available on some
> architectures, but only on i386 is extra alignment needed for
> __float128. */
> #if defined(__i386__)
> #ifdef __clang__
> // 16 is the gcc alignment for __float128
> long long __max_align_128 __attribute__((__aligned__(16)));
> #else
> __float128 __max_align_f128 __attribute__((__aligned__(__alignof(__float128))));
> #endif
> #endif
> } max_align_t;
>
That should be OK, too; but, clang defines that differently. Both FreeBSD and
OpenBSD use clang as the system compiler and they have these:
https://github.com/freebsd/freebsd-src/commit/5dd723425ee0bbe05c08d2c2272be9fc34695886
https://github.com/freebsd/freebsd-src/commit/31ad7c11b393583f7b6b1f8118b27a0339ccd71a
https://github.com/openbsd/src/commit/c9b8f0a24da6d4b249fcafbbc6a920bf0d985179
in addition to having a separate sys/_types.h header file.
Inspired by FreeBSD, I did a `tools + kernel' build just now (I need my old
and poky laptop for other work, so I can't do a full build and ATF test of
amd64/i386/xen*) with the following patch to both /usr/include/sys/stddef.h
and /usr/src/sys/sys/stddef.h.
Also, /usr/src/external/gpl3/gcc/dist/gcc/ginclude/stddef.h was moved out of
the way and a new stddef.h with just `#include_next <stddef.h>' put in place.
No problem with the build or with that kernel.
```
--- /mnt/usr/src/sys/sys/stddef.h.orig 2024-10-08 22:53:20.000000000 +0000
+++ /mnt/usr/src/sys/sys/stddef.h 2024-10-17 22:04:52.932348859 +0000
@@ -33,6 +33,7 @@
#ifndef _SYS_STDDEF_H_
#define _SYS_STDDEF_H_
+#define _STDDEF_H_ /* for GCC build */
#include <sys/cdefs.h>
#include <sys/featuretest.h>
@@ -68,11 +69,14 @@
#endif
#if (__STDC_VERSION__ - 0) >= 201112L || (__cplusplus - 0) >= 201103L
-typedef union {
- void *_v;
- long double _ld;
- long long int _ll;
+#ifndef _GCC_MAX_ALIGN_T
+#define _GCC_MAX_ALIGN_T
+#define __CLANG_MAX_ALIGN_T_DEFINED // XXX
+typedef struct {
+ long long __max_align_ll __aligned(__alignof(long long));
+ long double __max_align_ld __aligned(__alignof(long double));
} max_align_t;
#endif
+#endif
#endif /* _SYS_STDDEF_H_ */
```
-RVP
---END mail 2---
---START mail 3---
From rvp%SDF.ORG@localhost Sun Oct 20 23:20:28 2024
Date: Sun, 20 Oct 2024 23:20:27 +0000 (UTC)
From: RVP <rvp%SDF.ORG@localhost>
To: Christos Zoulas <christos%zoulas.com@localhost>
Cc: Robert Elz <kre%munnari.OZ.AU@localhost>, matthew green <mrg%eterna23.net@localhost>, Christos Zoulas <christos%netbsd.org@localhost>
Subject: Re: Header file changes of Tue, 08 Oct 2024
On Fri, 18 Oct 2024, Christos Zoulas wrote:
> I think that we should follow suit with FreeBSD/OpenBSD which are very similar...
>
Yeah, we should follow the compiler definition--GCC in our case.
https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__stddef_max_align_t.h
says: "Define 'max_align_t' to match the GCC definition.", but, GCC aligns to
16 bytes on i386--better not to change the ABI. (What to do about clang, then?)
Thx,
-RVP
---END mail 3---
---START mail 4---
From rvp%SDF.ORG@localhost Mon Oct 21 22:54:13 2024
Date: Mon, 21 Oct 2024 22:54:12 +0000 (UTC)
From: RVP <rvp%SDF.ORG@localhost>
To: Christos Zoulas <christos%zoulas.com@localhost>
Cc: Robert Elz <kre%munnari.OZ.AU@localhost>, matthew green <mrg%eterna23.net@localhost>, Christos Zoulas <christos%netbsd.org@localhost>
Subject: Re: Header file changes of Tue, 08 Oct 2024
On Mon, 21 Oct 2024, Christos Zoulas wrote:
>> Yeah, we should follow the compiler definition--GCC in our case.
>>
>> https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__stddef_max_align_t.h
>>
>> says: "Define 'max_align_t' to match the GCC definition.", but, GCC aligns to
>> 16 bytes on i386--better not to change the ABI. (What to do about clang, then?)
>
> I think that we should also special-case the i386 both in the clang and gcc cases, effectively
> providing the same max alignment for both compilers that the gcc header does.
>
Yeah, that's my preference as well. OK with mrg (GCC) and kre (standards)?
-RVP
PS. Apart from GCC/Clang, their C++ libraries, and GDB, max_align_t is hardly
used:
$ fgrep -r max_align_t /usr/src /usr/xsrc
[...]
/usr/src/external/mpl/bind/dist/lib/isc/netmgr/netmgr-int.h
/usr/src/external/mpl/bind/dist/lib/isc/jemalloc_shim.h
/usr/src/lib/libc/tls/tls.c
/usr/src/libexec/ld.elf_so/tls.c
/usr/xsrc/external/mit/MesaLib/dist/src/vulkan/util/vk_alloc.c
$
so, an ABI change looks unlikely. And, aligning the TLS bits to 16-bytes on
i386 is also probably better anyway.
Thx,
-RVP
---END mail 4---
Note Jörg's point about GCC vs. SysV ABI difference (this is where I left off
due to lack of time then):
---START mail 5---
From joerg%bec.de@localhost Tue Oct 22 12:33:25 2024
Date: Tue, 22 Oct 2024 14:33:11 +0200
From: Jörg Sonnenberger <joerg%bec.de@localhost>
To: matthew green <mrg%eterna23.net@localhost>, RVP <rvp%SDF.ORG@localhost>
Cc: Robert Elz <kre%munnari.OZ.AU@localhost>, Christos Zoulas <christos%netbsd.org@localhost>, joerg%netbsd.org@localhost, Christos Zoulas <christos%zoulas.com@localhost>
Subject: Re: Header file changes of Tue, 08 Oct 2024
Hi all,
the core issue is that GCC, at least at the time, tended to seriously
overalign long double. On i386 with the SYSV ABI that we use,
max_align_t should be 32-bit. Note that Linux does *not* use the SYSV
ABI; they silently switched a while ago because they couldn't get
GCC to implement the realignment stubs correctly for a long time.
Joerg
On 10/22/24 2:22 AM, matthew green wrote:
> cc: joerg, who had a better understanding of this than i do.
>
>
> .mrg.
>
>
> RVP writes:
>> On Mon, 21 Oct 2024, Christos Zoulas wrote:
>>
>>>> Yeah, we should follow the compiler definition--GCC in our case.
>>>>
>>>> https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__stddef_max_align_t.h
>>>>
>>>> says: "Define 'max_align_t' to match the GCC definition.", but, GCC aligns to
>>>> 16 bytes on i386--better not to change the ABI. (What to do about clang, then?)
>>>
>>> I think that we should also special-case the i386 both in the clang and gcc cases, effectively
>>> providing the same max alignment for both compilers that the gcc header does.
>>>
>>
>> Yeah, that's my preference as well. OK with mrg (GCC) and kre (standards)?
>>
>> -RVP
>>
>> PS. Apart from GCC/Clang, their C++ libraries, and GDB, max_align_t is hardly
>> used:
>>
>> $ fgrep -r max_align_t /usr/src /usr/xsrc
>> [...]
>> /usr/src/external/mpl/bind/dist/lib/isc/netmgr/netmgr-int.h
>> /usr/src/external/mpl/bind/dist/lib/isc/jemalloc_shim.h
>> /usr/src/lib/libc/tls/tls.c
>> /usr/src/libexec/ld.elf_so/tls.c
>> /usr/xsrc/external/mit/MesaLib/dist/src/vulkan/util/vk_alloc.c
>> $
>>
>> so, an ABI change looks unlikely. And, aligning the TLS bits to 16-bytes on
>> i386 is also probably better anyway.
>>
>> Thx,
>>
>> -RVP
>>
---END mail 5---
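Whatever gets decided for max_align_t itself, a build-time guard in the shim
would at least turn this kind of crash into a compile error. A rough sketch
follows; `padded_hdr` and `REQUIRED_ALIGN` are hypothetical names, and 8 is
assumed only because that is what the cds_lfht_node case in this PR appears
to need.

```c
#include <stddef.h>

/* Hypothetical: the alignment callers are assumed to need. */
#define REQUIRED_ALIGN 8

/* Size-prefix header, padded explicitly to a multiple of REQUIRED_ALIGN
 * instead of trusting sizeof(max_align_t) to be one. */
typedef union {
	size_t size;
	max_align_t align;
	unsigned char pad[(sizeof(max_align_t) + REQUIRED_ALIGN - 1) /
	    REQUIRED_ALIGN * REQUIRED_ALIGN];
} padded_hdr;

/* Fails the build if &hdr[1] could lose REQUIRED_ALIGN alignment. */
_Static_assert(sizeof(padded_hdr) % REQUIRED_ALIGN == 0,
    "allocation header would misalign the user pointer");
```

This only covers the header size; it still relies on malloc() itself
returning memory aligned to at least REQUIRED_ALIGN.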
-RVP