Subject: Re: Best optimization (cpuflags) with gcc-3.3.2 + Pentium 3?
To: Andrew Gillham <gillham@vaultron.com>
From: Frederick Bruckman <fredb@immanent.net>
List: port-i386
Date: 11/03/2003 13:03:57

On Sat, 1 Nov 2003, Andrew Gillham wrote:

> What optimizer flags are people using successfully with -current?
> Specifically I'm interested in flags for a Pentium III 1Ghz that
> are known good with pretty much everything.
>
> I guess I'm interesting in what people are doing besides the cpuflags
> recommendation of '-march=pentium3' and how your settings perform.

Here's some grist for the mill. I'm building kernels with COPTS="-O2
-march=pentium3 -finline-functions -fprefetch-loop-arrays
-maccumulate-outgoing-args -momit-leaf-frame-pointer", and I just
rebuilt X with CCOPTIONS the same except "-march=pentium4" (to run on
a P4). I'm avoiding "-frename-registers" because there was a bug in it
earlier in the gcc3.3* cycle, though it's reportedly fixed now, and
"-fomit-frame-pointer" because some projects failed to build with that
at all, also earlier in the cycle. Most operations on a P3 or P4 are
not CPU bound, so the goal is to reduce consumption of memory
bandwidth. Most of the flags above strive to do that by reducing cache
misses; the extra register that "-momit-leaf-frame-pointer" frees in
"leaf functions", which are presumably doing the grunt work, may allow
some more efficient constructions for moving bits.
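
To make two of those flags concrete, here is a small C sketch. The
function names and the AHEAD distance are made up for illustration, and
the second function only approximates by hand what
"-fprefetch-loop-arrays" tries to arrange automatically; what the
compiler actually emits depends on the target and the gcc version.

```c
#include <stddef.h>

/* A "leaf function": it calls nothing else. With
 * -momit-leaf-frame-pointer the compiler does not have to keep a
 * frame pointer here, which leaves one more register for the loop. */
long sum_plain(const int *a, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical prefetch distance in elements; not a tuned value. */
#define AHEAD 16

/* The same loop with an explicit prefetch hint, using gcc's
 * __builtin_prefetch. The hint asks for data a few iterations
 * ahead so that it is (hopefully) in cache by the time it is read.
 * The result is identical to sum_plain(); only timing can differ. */
long sum_prefetch(const int *a, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i + AHEAD < n)
            __builtin_prefetch(&a[i + AHEAD], 0, 0);
        s += a[i];
    }
    return s;
}
```

Both functions return the same sum for the same input, so any
difference between them would only show up in timing, and then
probably only on arrays much larger than the cache.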

I can't honestly say that I'm noticing any difference in performance.
The only performance tests I made were on replacing mplayer's
preferred "-funroll-loops -funroll-all-loops" with
"-maccumulate-outgoing-args" on a K6-2 (both with -O3 -ffast-math
-march=k6-2). The latter produced slightly larger code, but did
slightly better, or at least no worse.

Frederick