NetBSD-Bugs archive


bin/59256: Bogus stack frame in (some) core dumps



>Number:         59256
>Category:       bin
>Synopsis:       Bogus stack frame in (some) core dumps
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Apr 06 20:50:00 +0000 2025
>Originator:     Robert Elz
>Release:        NetBSD 10.99.12
>Organization:
>Environment:
System: NetBSD jacaranda.noi.kre.to 10.99.12 NetBSD 10.99.12 (JACARANDA:1.1-20250403) #193: Thu Apr 3 23:14:02 +07 2025 kre%jacaranda.noi.kre.to@localhost:/usr/obj/testing/kernels/amd64/JACARANDA amd64
Architecture: x86_64
Machine: amd64
>Description:
	Sometimes a core dump from (quite recent) systems looks like this:

Program terminated with signal SIGABRT, Aborted.
#0  0x00007f7ff7c3043a in ?? ()
(gdb) bt
#0  0x00007f7ff7c3043a in ?? ()
#1  0x00007f7ff7c3b874 in ?? ()
#2  0xffffffffffffffdf in ?? ()
#3  0xffffffffffffffff in ?? ()
#4  0x0000000000000246 in ?? ()
#5  0x00007f7fffffdd30 in ?? ()
#6  0x00007f7fffffe150 in ?? ()
#7  0x00007f7ff7c2fcfd in ?? ()
#8  0x00007f7ff7c42927 in ?? ()
#9  0x0000000000016800 in __EH_FRAME_LIST__ ()
#10 0x00007f7ff7c42932 in ?? ()
#11 0x0000006200012723 in ?? ()
#12 0x6f69747265737361 in ?? ()
#13 0x207478656e22206e in ?? ()
#14 0x224c4c554e203d3d in ?? ()
#15 0x3a64656c69616620 in ?? ()
#16 0x2f2220656c696620 in ?? ()
#17 0x796c6e6f64616572 in ?? ()
#18 0x657361656c65722f in ?? ()
#19 0x676e69747365742f in ?? ()
#20 0x6962732f6372732f in ?? ()
#21 0x65642f6966652f6e in ?? ()
#22 0x22632e6874617076 in ?? ()
#23 0x3920656e696c202c in ?? ()
#24 0x74636e7566202c34 in ?? ()
#25 0x6c6f6322206e6f69 in ?? ()
#26 0x696c5f657370616c in ?? ()
#27 0x000000000a227473 in ?? ()
#28 0x00007f7ff7a85028 in ?? ()
#29 0x00007f7f00000001 in ?? ()
#30 0x00007f7f00000000 in ?? ()
#31 0x00007f7ff7ee63a0 in ?? ()
#32 0x00007f7f00000001 in ?? ()
#33 0x00007f7ff7b9254f in ?? ()
#34 0x0000000000000000 in ?? ()

Every time I have seen it, it has looked much the same (I didn't
compare the addresses).  I do have the debug sets installed, so if
there were any symbols they should have been found.
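
The values shown for frames #12 through #27 are not plausible
addresses; each byte is printable ASCII.  The following is a minimal
sketch that unpacks them least significant byte first, as amd64 stores
them in memory.  The helper name decode_frames() is hypothetical, and
the array is simply copied from the backtrace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Frames #12..#27, copied in order from the backtrace above. */
static const uint64_t frames[] = {
	0x6f69747265737361, 0x207478656e22206e,
	0x224c4c554e203d3d, 0x3a64656c69616620,
	0x2f2220656c696620, 0x796c6e6f64616572,
	0x657361656c65722f, 0x676e69747365742f,
	0x6962732f6372732f, 0x65642f6966652f6e,
	0x22632e6874617076, 0x3920656e696c202c,
	0x74636e7566202c34, 0x6c6f6322206e6f69,
	0x696c5f657370616c, 0x000000000a227473,
};

/*
 * Hypothetical helper: copy the bytes of each 64-bit word into out,
 * least significant byte first, stopping at the first NUL byte or
 * when out is full.  Returns the length of the resulting string.
 */
static size_t
decode_frames(const uint64_t *words, size_t nwords, char *out,
    size_t outsize)
{
	size_t n = 0;

	assert(words != NULL && out != NULL && outsize > 0);
	for (size_t i = 0; i < nwords; i++) {
		for (int b = 0; b < 8; b++) {
			char c = (char)((words[i] >> (8 * b)) & 0xff);

			if (c == '\0' || n + 1 >= outsize) {
				out[n] = '\0';
				return n;
			}
			out[n++] = c;
		}
	}
	out[n] = '\0';
	return n;
}
```

Decoded this way, the sixteen values spell out the assert() message
from efi(8), naming devpath.c line 94 and the function collapse_list.
That suggests (it is only an inference from the dump) that gdb is
walking across the stack buffer holding the formatted message rather
than across real frames.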

That particular core dump is from efi(8).  The issue there is just
that it uses assert() to validate that data from the firmware is
exactly what it knows how to handle, and nothing else, and in this
particular case it isn't (efi(8) just needs better handling of
unknown data).  No assistance is needed with that.

I've seen the same thing with formail (from the procmail package).
That one has a single call of iscntrl() which looks like

	char * p = something;

	while ( whatever ) {
		if (iscntrl(*p))
			something;

		/* ... */
	}

and when it hits a *p which has the high bit set, the new libc
ctype trap code catches it - that's an easy fix, which I have
made (and will commit a patch to the package eventually, unless
someone else gets there before me).
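
For reference, the usual fix for this class of bug: the <ctype.h>
functions take an int whose value must be representable as an
unsigned char (or be EOF), so a plain char argument needs a cast.
This is a minimal sketch under that assumption, not the actual
formail patch, and count_cntrl() is a hypothetical helper.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the control characters in a string.
 * The cast to unsigned char keeps bytes with the high bit set from
 * reaching iscntrl() as negative values.
 */
static size_t
count_cntrl(const char *p)
{
	size_t n = 0;

	assert(p != NULL);
	for (; *p != '\0'; p++) {
		if (iscntrl((unsigned char)*p))
			n++;
	}
	return n;
}
```

Without the cast, on a platform where char is signed, a byte such as
0xe9 is passed as a negative int, which is what the new libc ctype
trap code catches.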

I tried to make a simpler test case, but when I did that, whether by
calling abort() (which is what assert() ends up doing) or by
deliberately passing a signed char (high bit set) to one of the
isxxxx() macros, everything worked as intended.  So there must be
something else which is setting up the conditions for these broken
core files.

I first saw this (from efi) with a kernel from late March, but
that might just be because that's when I started fiddling with efi(8).
The formail issue didn't arise until the recent libc ctype guard
page changes were committed, obviously.

>How-To-Repeat:
	Not sure, but for the affected programs this happens all
	the time for me.   That is, every core dump from them
	looks like this (but of course, they're only dumping core
	from the same place in their respective executing environments).

	I have a (useless) BootNNNN variable (no longer used, so I
	could delete it) which triggers the issue in efi(8).  I'll
	leave it there, so efi will keep aborting from that assert()
	(which should not be an assert()) whenever we want to test
	some kernel, or libc, fix which might avoid the bogus
	stack frame when the abort (or SIGSEGV in the other case)
	happens.

>Fix:
	?


