NetBSD-Users archive

Re: About GCC optimizations



Joel CARNAT <joel%carnat.net@localhost> wrote:

> I'm trying to get the best performance from my VIA C3 machine.
> So I started looking into GCC optimizations. According to the Gentoo
> Linux wiki, a good set of flags would be "-march=c3-2 -Os
> -fomit-frame-pointer". I've compiled a few things to check binary
> size and test for crashes, and it looks OK. But I still have a few
> questions:
> 
> 1. As far as I understand, "-Os" builds small code that fits well in
> the C3's small cache. Does this mean that the binary will only load
> faster, or does it mean that it will also run better because each
> function will fit better in the small cache?

It's hard to give any specific answer.  It's hard to talk about
performance "in general" (if that is meaningful at all).

IIRC, the main alleged benefit of -Os (vs. -O2) is more compact code
and hence less paging (fewer pages to load on start, fewer pages to
page out under memory shortage, fewer pages to keep in the page cache
and so likely less need to page them out).  Cache effects are likely
to be in the noise compared to paging for something like ls.
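
To make the "fewer pages" argument concrete, here is a minimal sketch
of the arithmetic.  The function names are made up for illustration,
and the 4096-byte page size and the text sizes in the usage note are
assumptions, not measurements from any real binary.

```c
#include <stddef.h>

/* Number of pages needed to hold `bytes` bytes, rounding up.
 * page_size must be non-zero. */
size_t pages_needed(size_t bytes, size_t page_size)
{
    return bytes / page_size + (bytes % page_size != 0);
}

/* Pages saved by the smaller of two builds of the same program.
 * Returns 0 if the "small" build is not actually smaller. */
size_t pages_saved(size_t big_text, size_t small_text, size_t page_size)
{
    size_t big = pages_needed(big_text, page_size);
    size_t small = pages_needed(small_text, page_size);
    return big > small ? big - small : 0;
}
```

With hypothetical text sizes of 100000 bytes (-O2) and 80000 bytes
(-Os) and 4096-byte pages, pages_saved(100000, 80000, 4096) gives 5.
Real text sizes can be read with size(1) on the two binaries.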

ISTR a story (mid 90s) about Sun folks who were looking for
"generic" performance tuning for Solaris, and they settled on -Os for
these reasons.  (Apologies to Sun folks if I remember wrong.)

OTOH, for certain types of long-running programs (e.g. media
transcoding) cache effects will dominate the picture.

I'd guess that on a system with plenty of memory and not under any
heavy load the difference between -Os and -O2 is going to be very
small, and if most of your code is hot in the page cache, -O2 might
actually be faster b/c the "less paging" benefit of -Os no longer
plays any role.

I use -Os -freorder-blocks for sh3 instead of -O2 b/c the -falign-*
options included in -O2 tend to increase sh3 code size quite a bit,
and sh3 machines often have little memory (64MB in usl-p5, 16MB in
Jornada 680).  CF size also used to be a consideration until recently
(now 1GB CFs seem to be "entry level" models, compared to 64MB a
few years ago).  Not that I bothered to actually measure any
performance impact, though :)

Having worked with people who do hardcore performance tuning as their
day job, one thing I've learned from them is that you probably
shouldn't even bother thinking about performance unless you have
measured it and you understand the specific conditions under which
your system runs and what kind of workloads you are optimizing for.
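
As an illustration of "measure first", here is a minimal sketch of a
timing helper that could be compiled once with -Os and once with -O2
to compare the two.  best_cpu_seconds() and sample_workload() are
hypothetical names, and the loop is only a stand-in for whatever code
you actually care about.

```c
#include <time.h>

/* Best-of-N CPU time, in seconds, for one call of fn.
 * Returns -1.0 if runs is not positive. */
double best_cpu_seconds(void (*fn)(void), int runs)
{
    double best = -1.0;

    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        fn();
        clock_t end = clock();
        double elapsed = (double)(end - start) / CLOCKS_PER_SEC;

        /* Keep the fastest run; it is the least disturbed one. */
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Written to by the workload so the compiler cannot drop the loop. */
static volatile unsigned long sink;

/* Hypothetical stand-in workload: replace with the real code. */
void sample_workload(void)
{
    unsigned long sum = 0;

    for (unsigned long i = 0; i < 1000000UL; i++)
        sum += i * i;
    sink = sum;
}
```

Note that clock() reports CPU time, so time spent waiting for pages
to come in from disk does not show up here.  To see paging effects,
compare wall-clock times instead, e.g. with time(1) on a freshly
booted system.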

So don't lose much time on this: you'll waste more time than any
improvements from highly fine-tuned -O* -f* are going to save you
over the next few years while the system you are optimizing is still
useful. :)  If -Os is what makes you feel good, just stick with it :)


> 2. My C3 is a multimedia station (running freevo and MPlayer). Should I
> add flags like "-mmmx" or "-msse" or are they included in
> "-march=c3-2"?
> 
> 3. More generally, how do I know which "-m" or "-f" options are
> included in the "-march=FOO" parameter?

It's all in gcc.info.
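
Besides reading the manual, you can ask the compiler itself.  GCC
predefines __MMX__ and __SSE__ when those instruction sets are
enabled, and __OPTIMIZE_SIZE__ when building with -Os.  The sketch
below reports them from C; the function names are made up for
illustration.  As far as I recall, the manual describes c3-2 as a
VIA C3-2 with MMX and SSE support, so both should report 1 when
building with -march=c3-2, but check on your own compiler.

```c
/* Each function returns 1 if the compiler predefined the macro
 * for this build, and 0 otherwise. */

int built_with_mmx(void)
{
#ifdef __MMX__
    return 1;
#else
    return 0;
#endif
}

int built_with_sse(void)
{
#ifdef __SSE__
    return 1;
#else
    return 0;
#endif
}

int built_for_size(void)
{
#ifdef __OPTIMIZE_SIZE__
    return 1;
#else
    return 0;
#endif
}
```

The same macros can be dumped without writing any code, e.g. with
"gcc -march=c3-2 -dM -E - </dev/null" on an i386 compiler, and then
searched for MMX or SSE.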


SY, Uwe
-- 
uwe%stderr.spb.ru@localhost                       |       Zu Grunde kommen
http://snark.ptc.spbu.ru/~uwe/          |       Ist zu Grunde gehen


