Subject: Re: PC164 memory bus speed (was: pciide performance on alpha)
To: None <email@example.com>
From: Thor Lancelot Simon <firstname.lastname@example.org>
Date: 11/02/1999 01:52:24
On Sun, Aug 15, 1999 at 01:16:37PM +1000, Simon Burge wrote:
> Dave Huang wrote:
> > On Sat, 14 Aug 1999, Simon Burge wrote:
> > > Almost all of my results are with "gcc -O2", so I'm pretty sure the
> > > compiler isn't getting in the way. The code seems to be relatively
> > > simple - just multiple traversals of a linked list.
> > Okay, just checking :) I was kinda disappointed with my PC164 until I
> > realized that I wasn't testing with the same compiler... now I'm just
> > disappointed with gcc ;) My K6-2 460MHz does compile stuff faster than
> > my alpha though...
> gcc does have its weaknesses - that's for sure. Floating point seems
> to be particularly bad, and on a lot of architectures...
Okay, so this is an *old* thread. Sorry 'bout that.
However: you should consider the high cost of incorrectly estimating the
cache and main memory latencies in the compiler. I suspect that the reason
almost *all* code seems to run much faster with gcc -mcpu=21164a isn't just
the use of BWX instructions but also the use of cache and memory latency
numbers that are a lot closer to reality for the pc164.
Since we know how fast the L1, L2, and L3 caches are -- L1 and L2 are the
same for all 21164, and the speed of the L3 parts should be stamped on
them -- and the memory's 60ns, it should be possible to feed gcc figures
that are exactly right, and I'd be curious to see what this does for the
various memory-sensitive benchmarks people have been disappointed by.
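For anyone who wants to measure those latencies rather than read them off the parts, here is a minimal pointer-chasing sketch in C. It is not the benchmark used earlier in this thread, only an illustration of the same "traverse a linked list" idea; the names build_cycle, chase and ns_per_load are made up for this example, and clock() is a coarse timer, so treat the numbers as rough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: link n slots into one random cycle (Sattolo's
   algorithm), so following next[] visits every slot exactly once before
   returning to the start. rand() is not uniform for large n, but any
   j < i still yields a single cycle. */
static void build_cycle(size_t *next, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Perform `steps` dependent loads; each load's address comes from the
   previous one, so the loop runs at load latency, not bandwidth. */
static size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}

/* Rough average nanoseconds per dependent load for a working set of
   n * sizeof(size_t) bytes. */
static double ns_per_load(size_t n, size_t steps)
{
    assert(steps > 0);
    size_t *next = malloc(n * sizeof *next);
    assert(next != NULL);
    build_cycle(next, n);

    clock_t t0 = clock();
    volatile size_t sink = chase(next, 0, steps); /* keep the result live */
    clock_t t1 = clock();
    (void)sink;

    free(next);
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

Calling ns_per_load with working sets chosen to fit in L1, then L2, then L3, then well past L3 should show a step at each level. Building it once with plain gcc -O2 and once with -mcpu=21164a added would show whether the compiler's assumptions matter for this kind of loop.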
FWIW the pc164 that's now anoncvs.netbsd.org achieved the highest STREAM
benchmark result I'd ever seen at the time, several hundred megabytes per
second in 256-bit mode. So the memory bandwidth of the pc164 is probably
okay, and as people have noticed, it's gcc that sucks.
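To make the bandwidth side concrete, here is a rough sketch of a STREAM-style "triad" measurement in C. It is not the real STREAM source and will not reproduce its official numbers; the names triad and triad_mb_per_s are invented for this example, and it assumes clock() is fine-grained enough to time one pass over the arrays (it returns 0 if not).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Triad kernel: a[i] = b[i] + scalar * c[i]. */
static void triad(double *a, const double *b, const double *c,
                  double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Hypothetical driver: best-of-reps bandwidth in MB/s (1 MB = 1e6 bytes),
   counting two loads and one store of a double per element. Returns 0.0
   if no pass took a measurable amount of time. */
static double triad_mb_per_s(size_t n, int reps)
{
    assert(n > 0 && reps > 0);
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
        c[i] = 2.0;
    }

    double best = 0.0; /* shortest measurable pass, in seconds */
    for (int r = 0; r < reps; r++) {
        clock_t t0 = clock();
        triad(a, b, c, 3.0, n);
        clock_t t1 = clock();
        double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (secs > 0.0 && (best == 0.0 || secs < best))
            best = secs;
    }

    /* Read a result so the stores cannot be optimised away. */
    volatile double sink = a[n / 2];
    (void)sink;

    free(a);
    free(b);
    free(c);
    return best > 0.0
        ? 1e-6 * 3.0 * sizeof(double) * (double)n / best
        : 0.0;
}
```

The arrays need to be several times larger than the L3 cache for the figure to reflect main memory rather than cache; something like triad_mb_per_s(4 * 1024 * 1024, 10) is a plausible starting point, though the right size depends on the board.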
Thor Lancelot Simon email@example.com
"And where do all these highways go, now that we are free?"