Subject: Re: Optimizer bug in NetBSD/SPARC64 1.5?
To: None <port-sparc@netbsd.org>
From: Greg Earle <earle@isolar.DynDNS.ORG>
List: port-sparc
Date: 02/15/2001 07:28:20
> It's safe to say that the sparc64 compiler is, uh, poor at best.  That it
> works as well as it does surprises me every day, I think.  You'll find that
> there are several workarounds in the NetBSD source tree for sparc64 compiler
> bugs.  See doc/HACKS.  This one doesn't look like the others, though; I don't
> know exactly what this program is doing :-)

The NMH "scan" program just reads the headers of each mail message in an NMH
folder.  So all the first column numbers should have been consecutively
numbered file names.  But instead of "1", etc. it was coming up with "?4128"
which to me shows some kind of massive overflow problem with "-O2" in that
particular instance.

How did an entire OS and kernel get built successfully if it's in this kind of
state?  'Cos other than some random programs (like "xntpd") that crash, most
everything else seems to work OK so far ... (I'm glad for that!  hehe)

> Unfortunately, avenues for alternative compilers are poor.  gcc-current won't
> build sparc64-elf for me, nor sparc64-netbsd (hasn't for _months_).  I've
> tried to get the sparclinux compiler working (they use the RedHat `gcc 2.96'
> thing.. it has dozens of sparc64 patches..) but I couldn't get that working,
> even for `sparc64-elf'.  (I have given up trying to fix the compiler for now.)

Arggh.  Well, I wouldn't mind so much, if I could find workarounds.  I got
"perl5-base" to compile, finally, by changing "-g -msoft-quad-float -O2 ... "
in "patch-ag" to simply "-O -msoft-quad-float ...", for example.  (Not sure
that it *works*, mind you, but at least it builds, which is a start.  (-: )

But I'm starting to run into thornier stuff.  For example, "sudo" needs
autoconf-2.13 and m4-1.4.  When it runs "autoconf" to create "configure",
these huge bogus line numbers get generated into "configure", and it barfs
on itself when it runs.  I traced this back to "awk" (I think), and here's
a simple demonstration:

isolar# uname -rm
1.4.2 sparc
isolar# awk 'END { printf("%d\n", 1) ; }'
1

netbsd4me# uname -rm
1.5 sparc64
netbsd4me# awk 'END { printf("%d\n", 1) ; }'
9223372045444710399

Note that this number in hex is 0x80000001FFFFFFFF.  Significant?
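
As a sanity check on that conversion, here is a minimal sketch that turns the
decimal value awk printed into hex and splits it into 32-bit halves.  Python
is used purely for illustration (it is not part of the setup described here),
and the variable names are made up.

```python
# Value printed by awk on the sparc64 box (copied from the transcript above).
printed = 9223372045444710399

# Render it as a 64-bit hex quantity.
as_hex = f"0x{printed:016X}"

# Split into upper and lower 32-bit words to make the pattern easier to see.
high_word = (printed >> 32) & 0xFFFFFFFF
low_word = printed & 0xFFFFFFFF

print(as_hex)
print(f"high=0x{high_word:08X} low=0x{low_word:08X}")
```

The upper word comes out as 0x80000001 and the lower word as 0xFFFFFFFF.
Whether that split actually means anything is still an open question.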

If "awk" can't even print out a simple "1" without serious problems, well ...

("gawk" gets built with "-O", so I'm not sure what exactly is being tickled
 that causes this behavior at run-time.  If I build "gawk" with "-g" instead
 of "-O", I get a completely different result from the above:
 0xBFF0000000000000.

 This smacks of weirdness in /usr/src/gnu/dist/gawk/builtin.c::format_tree(),
 maybe in the "unsigned long" stuff in the handlers for "%d"/"%i"?  I tried
 to run (g)awk under GDB, but attempts to print out variables at breakpoints
 give me "Error accessing memory address 0x7NN: Invalid argument." errors.)
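
One observation about the "-g" result, offered as a hint rather than a
diagnosis: if the bit pattern 0xBFF0000000000000 is reinterpreted as an IEEE
754 double, it decodes to -1.0.  The sketch below shows that reinterpretation.
It assumes nothing about gawk's internals, Python is used only for
illustration, and the variable names are made up.

```python
import struct

# Bit pattern printed by the "-g" build of gawk, taken from the text above.
bits = 0xBFF0000000000000

# Pack as a big-endian unsigned 64-bit integer, then reinterpret the same
# eight bytes as a big-endian IEEE 754 double.
as_double = struct.unpack(">d", struct.pack(">Q", bits))[0]

print(as_double)
```

If that is not a coincidence, it might point at a floating-point value being
handed to an integer conversion somewhere, but that is only a guess.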

This kind of thing is propagating throughout the system, making it really
difficult for me to get anything built in the pkg tree.  <Sigh>

(I suppose this really should've been a simple send-pr, but I'm wondering if
 I'm the only person running into these kinds of things on the Ultra.)

	- Greg