Subject: 2.0_Beta: segfaults in vfprintf_unlocked / nbmake loses in realloc
To: None <port-amiga@NetBSD.org>
From: S.P.Zeidler <spz@serpens.de>
List: port-amiga
Date: 08/30/2004 16:58:44

Hi,

Since I upgraded serpens to 2.0_Beta, programs have been occasionally
segfaulting in vfprintf_unlocked, e.g. innd:

#0  0x0820561c in vfprintf_unlocked () from /usr/lib/libc.so.12
#1  0x082052c8 in vfprintf_unlocked () from /usr/lib/libc.so.12
#2  0x081f94a8 in vsnprintf () from /usr/lib/libc.so.12
#3  0x0002011a in PrettySize (size=0, str=0xdfff8e8 "20.8kb") at status.c:86
#4  0x000209de in STATUSsummary () at status.c:303
#5  0x00020b1e in STATUSmainloophook () at status.c:344
#6  0x0001279c in CHANreadloop () at chan.c:884
#7  0x00015e6a in main (ac=4, av=0xdfffc80) at innd.c:981

I'm using innd as an example because the executable is older than the OS
upgrade and used to run stably, but top and vmstat, for example, also
segfault in the same routine, always while formatting a float.
top lasts about 2 to 5 minutes; innd fails once or twice a day.
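
For anyone who wants to check their own box, a stand-alone stress loop
along these lines might help separate a libc/FPU problem from anything
specific to innd or top. This is only a sketch: stress_float_format is a
made-up name, the "%.1fkb" format merely mimics the string visible in the
PrettySize frame above, and nothing here is taken from the innd sources.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from innd): format `iterations` floating
 * point values through snprintf() twice each and compare the two
 * renderings. Returns the number of mismatches. A crash inside libc
 * would show up as a SIGSEGV rather than as a return value.
 */
static long
stress_float_format(long iterations)
{
	char a[64], b[64];
	long i, mismatches = 0;

	for (i = 0; i < iterations; i++) {
		/* Vary the value so different digit counts get formatted. */
		double v = (double)(i % 100000) / 1024.0;

		snprintf(a, sizeof(a), "%.1fkb", v);
		snprintf(b, sizeof(b), "%.1fkb", v);
		if (strcmp(a, b) != 0)
			mismatches++;
	}
	return mismatches;
}
```

Called from main() with a count of a few million and left running on a
loaded machine, this should either return 0 or die with a backtrace like
the one above, if the problem can be reproduced this way at all.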

Another cutie that started to happen with the upgrade is:
nbmake in realloc(): error: brk(2) failed [internal error]

That may have something to do with low real memory conditions, but swap
never fell below 200M free, and frankly I wouldn't expect brk to fail
then anyway. This may, of course, be a completely separate problem.
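
A similarly small test could show whether realloc() fails outside of
nbmake. Again a sketch under assumptions: stress_realloc, the step size
and the limit are all invented for illustration and do not reflect how
nbmake actually allocates memory.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow a single buffer in `step`-byte increments
 * up to `limit` bytes, writing to each newly added chunk so the pages
 * are really used. Returns 0 on success, or the requested size at
 * which realloc() returned NULL.
 */
static size_t
stress_realloc(size_t step, size_t limit)
{
	char *buf = NULL;
	size_t size;

	if (step == 0)
		return 0;
	for (size = step; size <= limit; size += step) {
		char *tmp = realloc(buf, size);

		if (tmp == NULL) {
			free(buf);
			return size;
		}
		buf = tmp;
		/* Touch only the chunk that was just added. */
		memset(buf + size - step, 0xa5, step);
	}
	free(buf);
	return 0;
}
```

A non-zero return well below the amount of free swap would suggest the
failure is not specific to nbmake.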

Since I suspected a compiler problem from the cross-compile, I also
did a native compile. That didn't solve the problem, but it did rub the
nbmake failure under my nose somewhat more (since it's a Heisenbug, a
second build will happily proceed past the point of failure).

None of these nice little failures happen on i386, and they aren't
readily explainable from the source code either, as far as I can see.

So either they are a compiler bug, or possibly a bug in the 68060
math emulation routines, or the FPU in my box inconveniently fried itself
at or around the upgrade.

Going by earlier messages on this list, 2.0_BETA built itself fine a month
ago, on an 060 too, so either the problems are newer than that, or they
only happen on very busy machines, or, well, insert another CPU. :-7

To analyse this further, it would be nice to know whether anybody else who
runs this month's 2.0_BETA on Amiga sees the same problem(s), and what
CPU they use.

advanThanks,
	spz
-- 
spz@serpens.de (S.P.Zeidler)