Subject: GCC/EGCS issues
To: None <port-alpha@netbsd.org>
From: Bill Dorsey <dorsey@lila.lila.com>
List: port-alpha
Date: 03/14/2000 18:55:15
Hi,

I've been doing some code optimization using gcc/egcs on my PWS433a
running NetBSD 1.4.2_ALPHA and have noticed a minor anomaly in the
compiler that hurts the performance of compiled executables.

When you do not specify the -mcpu= option, the compiler defaults to the ev4
cpu.  This causes the alpha_memory_latency variable (in config/alpha/alpha.c)
to be set to 3 (cycles).  If you specify -mcpu=ev5 or -mcpu=ev56,
alpha_memory_latency is set to 2 cycles instead.  The comment next to this
value in alpha.c says that it is the Bcache value on a PC164 as
determined by LMbench.

I ran LMbench on my PWS433a and the results clearly show that the
L1 cache has a latency of 3 cycles, not 2 as specified in alpha.c.  I then
compiled several different benchmarks from the pkgsrc tree with
-mcpu=ev56 and -mmemory-latency={1,2,3,4} to see how performance
would be affected.  In _every_ case, best performance was achieved
with memory-latency set to 3.
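
For anyone who wants to check the load latency on their own machine,
here is a rough sketch of the pointer-chasing technique that LMbench's
memory latency test is based on.  It is not LMbench code; the function
names (build_chain, chase, ns_per_load) are made up for illustration,
and clock() is a coarse timer, so use a large step count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Link every stride-th slot of buf into one cycle: each linked slot
 * holds the address of the next one, and the last points back to buf[0]. */
void build_chain(void **buf, size_t n, size_t stride)
{
    size_t count, k;

    assert(stride > 0 && n >= stride);
    count = n / stride;                 /* number of linked slots */
    for (k = 0; k < count; k++)
        buf[k * stride] = &buf[((k + 1) % count) * stride];
}

/* Follow the chain for `steps` dependent loads.  Each load needs the
 * result of the previous one, so the loop time is dominated by the
 * load-to-use latency. */
void **chase(void **p, unsigned long steps)
{
    while (steps--)
        p = (void **)*p;
    return p;
}

/* Average time per dependent load, in nanoseconds, for a buffer of
 * `bytes` bytes walked with a stride of `stride_bytes` bytes.
 * Returns a negative value if the arguments are unusable. */
double ns_per_load(size_t bytes, size_t stride_bytes, unsigned long steps)
{
    size_t n = bytes / sizeof(void *);
    size_t stride = stride_bytes / sizeof(void *);
    void **buf;
    void **volatile sink;               /* keeps the chase from being optimized away */
    clock_t t0, t1;

    if (stride == 0 || n < stride || steps == 0)
        return -1.0;
    buf = malloc(n * sizeof(void *));
    if (buf == NULL)
        return -1.0;

    build_chain(buf, n, stride);
    sink = chase(buf, n / stride);      /* warm-up pass to load the cache */
    t0 = clock();
    sink = chase(buf, steps);
    t1 = clock();
    (void)sink;

    free(buf);
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)steps;
}
```

Call ns_per_load() with a buffer small enough to stay in the L1 cache
and multiply the result by the clock rate to get cycles.  Assuming a
433 MHz clock, one cycle is about 2.3 ns, so roughly 6.9 ns per load
would correspond to 3 cycles.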

The memory-latency value is used by gcc/egcs to compute scheduling
dependencies.  In the typical case, it should be set to a value equal to
the L1 cache latency.  Since the PC164 and my PWS433a both use
essentially the same CPUs (with the same L1 cache latencies, anyway),
I must conclude that the value of 2 in the alpha.c file is either a typo or
an error based on a misinterpretation of the results from running
LMbench.
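
To illustrate why an underestimate hurts, here is a deliberately
simplified model, not anything taken from gcc: assume an in-order cpu
where the scheduler places the first use of a loaded value
assumed_latency cycles after the load, while the hardware really needs
actual_latency cycles.  The function name is hypothetical.

```c
/* Stall cycles for one dependent load-use pair under the simplified
 * model: the cpu waits for whatever part of the real latency the
 * scheduler did not cover with other instructions. */
int load_use_stall(int actual_latency, int assumed_latency)
{
    int gap = actual_latency - assumed_latency;

    return gap > 0 ? gap : 0;
}
```

With an actual latency of 3 and an assumed latency of 2, every such
pair stalls for one cycle; with the assumption set to 3 the stall
disappears.  The model ignores the costs of overestimating, such as
longer schedules and extra register pressure, so it says nothing about
why 4 did no better than 3.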

Those of you who are using machines with 21164s in them should keep
in mind that you will generally get better results if you specify
both -mcpu={ev5,ev56} and -mmemory-latency=3.  If you specify only
-mcpu={ev5,ev56}, you will often get worse results than with the
compiler's default (ev4) cpu assumption.  It may be worth changing the code
in the NetBSD source tree to correct this unfortunate behavior.

--
Bill Dorsey