Subject: Strange segmentation faults keep occurring in sh
To: None <netbsd-users@netbsd.org>
From: =?ISO-8859-1?Q?Lasse_Hiller=F8e_Petersen?= <lhp@toft-hp.dk>
List: netbsd-users
Date: 09/30/2006 15:29:44
dog:~ $ uname -a
NetBSD dog.toft-hp.dk 4.0_BETA NetBSD 4.0_BETA (GENERIC.MPACPI) #0: Fri Sep 15 03:25:05 UTC 2006  builds@b3.netbsd.org:/home/builds/ab/netbsd-4/i386/200609140000Z-obj/home/builds/ab/netbsd-4/src/sys/arch/i386/compile/GENERIC.MPACPI i386

I recently bought a new Core 2 Duo E6300 machine with an ASRock 
ConRoe945DVI motherboard, which I have posted about a couple of times 
here and there. It initially came with 512 MB of RAM, which gave memory 
faults every time I did something slightly heavy, like compiling. 
Memtest86+ confirmed the problem, and I now have fresh RAM (2 x 1 GB) 
that I have little reason to suspect, as I haven't seen any memory 
faults since, and Memtest86+ seems happy too.

I am trying to build a large batch of packages, but I get strange 
segmentation faults with varying frequency. Sometimes a make clean ; 
make, or just another make, gets past the problem. One thing the errors 
have in common is that they occur in /bin/sh, so I tend not to 
suspect the RAM again. But what do I know?

Has anybody else building packages in pkgsrc with 4.0_BETA seen such 
segfaults in sh?

Here is a backtrace from one occasion:
dog:~/pkgsrc/misc/koffice $ gdb /bin/sh /home/lhp/pkgsrc/www/htdig-devel/work/htdig-3.2.0b6/htword/sh.core
GNU gdb 5.3nb1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386--netbsdelf"...(no debugging symbols found)...
Core was generated by `sh'.
Program terminated with signal 11, Segmentation fault.

warning: current_sos: Can't read pathname for load map: Input/output error

Reading symbols from /lib/libedit.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libedit.so.2
Reading symbols from /lib/libtermcap.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libtermcap.so.0
Reading symbols from /lib/libc.so.12...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.12
#0  0x0804bad7 in evalcommand ()
(gdb) bt
#0  0x0804bad7 in evalcommand ()
#1  0x0804b938 in evaltree ()
#2  0x0804b8e9 in evaltree ()
#3  0x0804b8e9 in evaltree ()
#4  0x0804b8d1 in evaltree ()
#5  0x0804b8d1 in evaltree ()
#6  0x0804b8d1 in evaltree ()
#7  0x0804b8d1 in evaltree ()
#8  0x0804b8d1 in evaltree ()
#9  0x0804cdfe in evalcase ()
#10 0x0804b990 in evaltree ()
#11 0x0804b8e9 in evaltree ()
#12 0x0804b8d1 in evaltree ()
#13 0x0804b8e9 in evaltree ()
#14 0x080549d8 in cmdloop ()
#15 0x08054cf0 in main ()
#16 0x08049cb0 in ___start ()

And here is the make output from another occasion; I couldn't find the 
sh.core file for this one:
===> Overriding tools for fltk-1.1.7nb4
===> Extracting for fltk-1.1.7nb4
===> Patching for fltk-1.1.7nb4
=> Applying pkgsrc patches for fltk-1.1.7nb4
===> Creating toolchain wrappers for fltk-1.1.7nb4
Node type = 1852383340
[1]   Done                    (/usr/sbin/pkg_i... |
      Done                    /usr/bin/sort -u |
      Segmentation fault      while read file;...
*** Error code 139

Stop.
make: stopped in /home/lhp/pkgsrc/x11/fltk
*** Error code 1

Stop.
make: stopped in /home/lhp/pkgsrc/x11/fltk
*** Error code 1

Stop.
make: stopped in /home/lhp/pkgsrc/graphics/openexr
*** Error code 1

Stop.
make: stopped in /home/lhp/pkgsrc/x11/kdelibs3
*** Error code 1

Stop.
make: stopped in /home/lhp/pkgsrc/x11/kdebase3
*** Error code 1

Stop.
make: stopped in /home/lhp/pkgsrc/misc/koffice
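
For reference, the "Error code 139" in the output above fits the usual 
convention in Bourne-style shells of reporting a child killed by a signal 
as 128 plus the signal number, so 139 would be signal 11, the same 
segmentation fault gdb shows. A minimal sketch of that arithmetic, 
assuming a POSIX shell; signal_from_status is just a name made up for 
illustration:

```shell
#!/bin/sh
# signal_from_status (hypothetical helper): print the signal number encoded
# in an exit status above 128, or print nothing if the status is not
# in the "killed by a signal" range.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    fi
}

signal_from_status 139    # 139 - 128 = 11, i.e. SIGSEGV
```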

Suggestions appreciated. I can stumble along, so I don't feel stuck 
yet, but I am a bit worried. I am sure the software will 
eventually improve, so what worries me most is a subtle defect 
in the hardware. I would also much prefer to be able to leave a build of 
all the packages running overnight without fearing that it will stop in 
the middle of something.
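
Since simply running make again often gets past the fault, an unattended 
build could in principle retry a few times before giving up. A rough 
sketch, assuming a POSIX shell; retry and the attempt count are made up 
for illustration, and this only papers over the crashes rather than 
explaining them:

```shell
#!/bin/sh
# retry (hypothetical helper): run a command up to $1 times, stopping at
# the first success. Returns 0 on success, 1 if every attempt failed.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (assumed invocation, not run here):
#   cd /home/lhp/pkgsrc/misc/koffice && retry 5 make
```

The messages on stderr at least leave a record of how often a retry was 
needed, which may help tell a flaky machine from a reproducible bug.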

One workaround I have thought of, and on which I would particularly 
appreciate advice: could I just mv /bin/sh /bin/osh ; ln 
/bin/ksh /bin/sh and perhaps get rid of the shell core dumps that way? 
If I'm not entirely mistaken, ksh should be sufficiently compatible for 
that, right?
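
Before swapping the binaries, one low-risk check would be to see whether 
ksh at least parses the scripts that sh is being asked to run. A sketch, 
assuming both shells honour the standard -n (read but do not execute) 
option; parses_under is a made-up name, and a clean parse says nothing 
about run-time behaviour:

```shell
#!/bin/sh
# parses_under (hypothetical helper): succeed if the shell given as $1 can
# parse the script given as $2. The -n option makes the shell read the
# commands without executing them; syntax errors give a non-zero status.
parses_under() {
    "$1" -n "$2" 2>/dev/null
}

# Example (assumed paths, not run here):
#   parses_under /bin/ksh ./configure && echo "ksh parses it"
```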

-Lasse