Subject: Re: improving ssh performance on sun4m systems
To: None <port-sparc@netbsd.org>
From: David Laight <david@l8s.co.uk>
List: port-sparc
Date: 03/14/2002 08:21:47
On Thu, Mar 14, 2002 at 05:41:12AM +0300, Valeriy E. Ushakov wrote:
> On Thu, Mar 14, 2002 at 13:29:52 +1100, matthew green wrote:
> 
> > interesting that "-O2" is better than "-O3" and
> > "-O3 -fomit-frame-pointer".
> 
> gcc can sometimes make some weird decisions about inlining (which -O3
> turns on).  Perhaps that is the case and you start to pay the price of
> faulting in those extra pages?  I've seen -O3 blow the file up 5 times
> b/c simple, very frequently used helper functions were inlined all
> over the place.

Yes - both inlining and unrolling loops can cause the code's active
size to increase, which has 2 detrimental effects:
1) more instructions must be read into the cache to execute a loop
2) other code is displaced from the I-Cache.

The net effect is that the benchmark runs faster, but any real
workload runs slower!
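
As a rough illustration of the size argument, here is a back-of-envelope
sketch in C. It is only a model under stated assumptions: the function
names are made up for this example, it counts static code bytes only, and
it ignores the fact that not every call site is hot at the same time.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not taken from gcc or ssh. */

/* Code bytes when the helper stays out of line: one copy of the body
 * plus a call sequence at each call site. */
size_t footprint_call(size_t helper_bytes, size_t call_bytes, size_t sites)
{
    return helper_bytes + call_bytes * sites;
}

/* Code bytes when the helper body is duplicated at every call site. */
size_t footprint_inline(size_t helper_bytes, size_t sites)
{
    return helper_bytes * sites;
}

/* Nonzero if code_bytes fits in an I-cache of icache_bytes. */
int fits_icache(size_t code_bytes, size_t icache_bytes)
{
    return code_bytes <= icache_bytes;
}
```

With invented numbers (a 64-byte helper, an assumed 8-byte call sequence,
200 call sites, and a hypothetical 4096-byte I-cache), footprint_call()
gives 1664 bytes, which fits, while footprint_inline() gives 12800 bytes,
which does not. A benchmark that hammers one call site never sees the
difference; a workload that touches many of them does.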

I suspect some of these optimisations were invented when systems
had no caches to speak of, and when memory speeds were
similar to execution speed.

	David

-- 
David Laight: david@l8s.co.uk