Subject: Re: PC164 memory bus speed (was: pciide performance on alpha)
To: None <tls@rek.tjls.com>
From: Simon Burge <simonb@netbsd.org>
List: port-alpha
Date: 11/03/1999 00:26:37
Thor Lancelot Simon wrote:

> Okay, so this is an *old* thread.  Sorry 'bout that.

Cool, nothing like an old thread to stretch the memory :)

> However: you should consider the high cost of incorrectly estimating the
> cache and main memory latencies in the compiler.  I suspect that the reason
> almost *all* code seems to run much faster with gcc -mcpu=21164a isn't just
> the use of BWX instructions but also the use of cache and memory latency
> numbers that are a lot closer to reality for the pc164.
> 
> Since we know how fast the L1, L2, and L3 caches are -- L1 and L2 are the
> same for all 21164, and the speed of the L3 parts should be stamped on
> them -- and the memory's 60ns, it should be possible to feed gcc figures
> that are exactly right, and I'd be curious to see what this does for the
> various memory-sensitive benchmarks people have been disappointed by.

How does one do this?  I've tried:

	gcc -O2
	gcc -O3 -mcpu=21164a -Wa,-m21164a -mmemory-latency=L1
	gcc -O3 -mcpu=21164a -Wa,-m21164a -mmemory-latency=L2
	gcc -O3 -mcpu=21164a -Wa,-m21164a -mmemory-latency=main
	gcc -O3 -mcpu=21164a -Wa,-m21164a -mmemory-latency=L1 -fsched-interblock -fsched-spec -fsched-spec-load -fbranch-count-reg

	gcc -v
	Using builtin specs.
	gcc version egcs-2.91.66 19990314 (egcs-1.1.2 release)

and get pretty much the same time for my benchmark (between 26.0 and
26.7 seconds).  Here's the top of my table of results - I would have
expected the PC164 to be much higher up the list, perhaps less than
10 seconds.

	wave    gcc -O2            3.11  0.04   DEC EV6/667 Tru64 Unix 4.0F
	quatre  gcc -O2            5.81  0.06   DEC EV65/600 Tru64 Unix 4.0F
	numbat  gcc -O2            7.22  0.06   Ultra E3000 4/250
	x96a    gcc -O2            8.76  0.04   Compaq Deskpro EN PII/450 NetBSD 1.4
	simonpc gcc -O2           10.46  0.06   PII/400 NetBSD 1.3I
	x96c    gcc -O2           11.13  0.11   Compaq Deskpro EP/SB Celeron/400 NetBSD 1.4
	axp500  cc -O2            12.13  0.08   DEC 500/333
	wincen  gcc -O2           15.13  0.13   DEC Prioris 6166MP/2 NetBSD 1.4.1
	axposf  gcc -O2           15.79  0.16   DEC 4000/710
	yallara cc -xO3           16.85  0.11   Ultra 2/2200
	fbog    gcc -O2           19.66  0.06   Ultra 1/200
	nsw120  gcc -O2           22.80  0.10   Ultra 2/2170
	davros  gcc -O2           25.21  0.18   HP 9000/887
	scylla  gcc -O2           25.87  0.19   DEC PC164/500 NetBSD 1.4
	wraith  gcc -O2           26.83  0.50   P5/133 NetBSD 1.3.3

"axp500" and "axposf" were some _OLD_ alphas that DEC put on the 'net
years ago for people to play with.  I'm sure I don't know the real model
numbers of some of these machines.

Also, this PC164 does have 8 SIMMs, not 4.
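
On the question above of feeding gcc exact figures: the GCC documentation
for -mmemory-latency lists a plain decimal number of clock cycles as well
as the L1/L2/L3/main keywords.  Here is a rough sketch of the arithmetic,
assuming the quoted 60ns memory figure and a 500MHz clock.  ns_to_cycles
is a made-up helper name, and 60ns is only the rating of the DRAM parts,
so the latency an application really sees is probably higher.  Whether
this particular egcs release accepts a numeric value is also an assumption.

```shell
#!/bin/sh
# Convert a memory latency in nanoseconds to CPU clock cycles.
# usage: ns_to_cycles LATENCY_NS CLOCK_MHZ   (hypothetical helper)
ns_to_cycles() {
    # cycles = ns * MHz / 1000, rounded up to a whole cycle
    echo $(( ($1 * $2 + 999) / 1000 ))
}

# Assumed inputs: 60ns memory, 500MHz 21164A.
cycles=$(ns_to_cycles 60 500)

# Print the command line to try; nothing is compiled here.
echo "gcc -O2 -mcpu=21164a -mmemory-latency=$cycles"
```

With those inputs it works out to 30 cycles, which would be one more
variation to time alongside the ones listed above.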

Simon.