Port-mips archive

mipsel/cobalt: GCC producing extremely slow code



Hello everyone,

I've been running NetBSD/cobalt on a Qube 2 with 64 MiB of memory
since 8.0, and from the beginning I've noticed some really bizarre
performance irregularities when building and running standard C
benchmarks with the available pre-built GCC compilers. Sometimes I
get lucky and the build produces a binary that runs as fast as I
expect, but most of the time the result runs even slower than an old
25 MHz VAXstation I tested with the same sources under VMS and DEC C
with no optimizations! The best example so far is probably LINPACK,
where building both the single- and double-precision binaries in one
run can end up with some very strange disparities, like a 900 KFLOPS
single-precision peak versus a 27 MFLOPS double-precision peak (or
vice versa). I've also seen this with CoreMark, STREAM, Nbench and a
few others I've tried. Whether or not I get good performance seems
to be luck of the draw and, even weirder, not really dependent on
system load. Sticking with the LINPACK example, binaries that run
well do so consistently every time, while the slow binaries never
improve no matter how many times I rerun them, so it seems to be
something happening at build time.
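
In case it's useful for anyone trying to reproduce this, a
stripped-down timing loop along these lines should be enough to show
the single/double gap without building all of LINPACK (just a rough
sketch; the multiply-add kernel, array sizes and gettimeofday()
timing are arbitrary choices on my part, nothing taken from LINPACK):

/* fploop.c -- rough sketch of a minimal FP timing loop.
 * Build with the same flags as the benchmarks, e.g.
 *   gcc -O2 -o fploop fploop.c
 * and compare the reported single- vs. double-precision rates.
 */
#include <stdio.h>
#include <sys/time.h>

#define N    100000
#define REPS 50

static float  sa[N], sb[N];
static double da[N], db[N];

static double
now(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int
main(void)
{
    volatile float  ssum = 0.0f;   /* volatile so the loops aren't */
    volatile double dsum = 0.0;    /* optimized away entirely      */
    double t0, t1, flops = 2.0 * N * REPS;
    int i, r;

    for (i = 0; i < N; i++) {
        sa[i] = sb[i] = 1.0f + i * 1e-6f;
        da[i] = db[i] = 1.0 + i * 1e-6;
    }

    /* single-precision multiply-add loop */
    t0 = now();
    for (r = 0; r < REPS; r++)
        for (i = 0; i < N; i++)
            ssum += sa[i] * sb[i];
    t1 = now();
    printf("single: %.2f MFLOPS\n", flops / (t1 - t0) / 1e6);

    /* double-precision multiply-add loop */
    t0 = now();
    for (r = 0; r < REPS; r++)
        for (i = 0; i < N; i++)
            dsum += da[i] * db[i];
    t1 = now();
    printf("double: %.2f MFLOPS\n", flops / (t1 - t0) / 1e6);

    return 0;
}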

At first I thought it might be memory-related, but upgrading from 64
to 96 MiB with what I had on hand didn't improve performance at all,
and the system doesn't appear to be thrashing swap very hard. Since
then I've tried attaching a heatsink to the CPU and running a
sysupgrade from 8.0 to 9.2, and after noticing some I/O errors in
the system log I replaced the hard disk and did a clean reinstall of
9.2, all to no avail. Most recently I've tried forcing
floating-point instructions with -mhard-float, and even the
pre-built GCC3 package just to see if anything changes, but I still
get the same oddities. One gcc3 -O2 build gave me a ~900 KFLOPS
single-precision peak; the next build gave 7.2 MFLOPS, reproduced
consistently over three manual runs and sustained again on the
following build; then the build after that dropped back to a
consistent 900 KFLOPS. It doesn't seem to be an issue with GCC
specifically, given that the behavior can be reproduced with two
different versions. I also dimly remember not having this problem
when I ran 5.2 on a different Qube 2 some years ago.
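
One more thing I'm planning to check, in case someone can tell me
whether it even makes sense: compiling a trivial double-precision
kernel with the exact same flags and looking at the disassembly with
objdump -d, to see whether the compiler is actually emitting FPU
instructions (mul.d/add.d) or calling the libgcc soft-float helpers
(__muldf3 and friends). Something as small as this should do (the
file and function names are just placeholders, nothing from my
actual builds):

/* fpcheck.c -- tiny kernel just for inspecting the generated code.
 * Compile with the same flags as the suspect benchmark builds, e.g.
 *   gcc -O2 -mhard-float -c fpcheck.c
 * then disassemble with "objdump -d fpcheck.o". Hardware FP code
 * should show mul.d/add.d, while soft-float code will instead call
 * helpers like __muldf3/__adddf3.
 */
double
dot(const double *a, const double *b, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}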

Any ideas on how I can dig further into this? I've never seen this
kind of behavior before and am definitely a little out of my element.
I appreciate your time and any suggestions!

