Subject: weird benchmarks/bonnie crash may reveal possible GDB or compiler bug on 1.6.1 alpha
To: NetBSD/alpha Discussion List <port-alpha@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: port-alpha
Date: 08/01/2003 15:28:36

I've got a problem with bonnie crashing on my new alpha.

Unfortunately I'm having trouble finding out anything about the crash
because GDB barfs:

$ cc -g -O2 -mieee -pipe -Werror -c bonnie.c
$ cc -g -static -o bonnie  bonnie.o
$ ldd bonnie
ldd: bonnie: not dynamically linked
$ file bonnie
bonnie: ELF 64-bit LSB executable, Alpha (unofficial), version 1 (SYSV), for NetBSD, statically linked, not stripped
$ size bonnie
text    data    bss     dec     hex     filename
84184   7896    12116   104196  19704   bonnie
$ gdb ./bonnie      
GNU gdb 5.0nb1
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "alpha-unknown-netbsd"...
(gdb) run -s 2
Starting program: /var/package-obj/benchmarks/bonnie/work/./bonnie -s 2
File './Bonnie.24517', size: 2097152
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 3...Seeker 2...start 'em...done...done...done...
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU

Program received signal SIGSEGV, Segmentation fault.
0x12000a684 in memchr ()
(gdb) where
#0  0x12000a684 in memchr ()
#1  0x12000d028 in __dtoa ()
#2  0x120009724 in vfprintf ()
warning: Hit beginning of text section without finding
warning: enclosing function for address 0x18
This warning occurs if you are debugging a function without any symbols
(for example, in a stripped executable).  In that case, you may wish to
increase the size of the search with the `set heuristic-fence-post' command.

Otherwise, you told GDB there was a function where there isn't one, or
(more likely) you have encountered a bug in GDB.
(gdb) 
(gdb) show heuristic-fence-post    
The distance searched for the start of a function is 0.
(gdb) 

Note first off that the program is statically linked, and objdump shows
a big chunk of debugging symbols.

Note secondly that heuristic-fence-post is set to zero, which according
to the GDB manual means "there is no limit" to the distance it will
search for the beginning of a function, and indeed the warning that it
hit the beginning of the text section suggests it searched all the way
through the whole program without success.

As you can see, the crash happens just as bonnie is about to print the
results, and it does in fact occur in the following call (which can be
revealed by setting a breakpoint on "printf" and then running again):

	printf("%-8.8s %4d ", machine, size / (1024 * 1024));

Note that breaking on that line shows 'size' to be as expected:

	(gdb) print size
	$1 = 2097152
	(gdb) print size / (1024 * 1024)
	$2 = 2

Just to be sure the value of "machine" is OK too:

	(gdb) print machine
	$3 = 0x1200129d2 ""
	(gdb) print *machine
	$4 = 0 '\000'
	(gdb) 

How the heck does one apparently innocuous printf() of a string and
plain integer value cause such a weird crash?

The first thing I thought of was that the "%-8.8s" spec was revealing an
alpha-specific bug in printf(3) itself, but given the appearance of
__dtoa() in the stack trace I assumed it had to be the "%d" that was
actually triggering the failure (even though the "%-8.8s" may have set
things up to fail).  However, giving the machine a short name, a long
name, or an 8-character name makes no difference whatsoever, so it's
likely nothing to do with the "%-8.8s".

Does anyone have any suggestions for further investigation?  Should I
send-pr?  Should I try the pkgsrc version of gdb (5.2.1)?

My only other idea at this point is to build a libc with '-g', which I
was considering doing anyway, but with such a simple thing failing in
this way I'm wary of my chances for success....

Note this is 1.6.1 as installed from the official distribution binaries:

NetBSD 1.6.1 (GENERIC) #0: Mon Apr  7 07:59:37 UTC 2003
    autobuild@cs20.apochromatic.org:/autobuilder/build/netbsd-1-6/alpha/OBJ/autobuilder/build/netbsd-1-6/src/sys/arch/alpha/compile/GENERIC

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>