Subject: Re: PII vs 21164
To: None <r.phillips@jkmrc.uq.edu.au>
From: Christian von Kleist <cvk@zybx.com>
List: port-alpha
Date: 05/16/2003 03:08:09
Ray,

     We all know that the Alpha is the god of processors (doesn't it go
without saying?).  There are several things that I can think of that
might explain what you're seeing.  Compiling is a memory-intensive
process so memory (cache, main, and disk) speed is an important
factor.  Lots of traffic goes back and forth from the disk so the
file system is important.  Finally, one processor is RISC and the
other is CISC, so code size is an issue.

     One thing that comes to mind is the cache difference between the
21164 and the PII.  The 21164 has 8 KB each of L1 instruction and
data cache and 96 KB of on-chip L2 cache, and the PII has 32 KB of L1
and 256 KB to 2048 KB of L2 (yours probably has 256 KB or 512 KB).
Size isn't all that matters in caches, but assuming the cache
hardware works about equally well in both processors the PII should
have a distinct advantage with its larger L2 cache, even if it's only
running at 66 MHz (could also be 100 MHz).  Fortunately the 21164 can
also operate an L3 cache of considerable size on the motherboard.
When I put a 2 MB beta cache in my PWS500a I saw a near doubling of
compile performance!  If you don't have a beta cache module in your
machine, installing one might really help.

     There is also a substantial amount of disk traffic when compiling;
therefore, the amount of write caching done by the filesystem could
have a big effect.  Specifically, turning on soft dependencies (the
softdep mount option) for your /usr partition in /etc/fstab might
help a lot.  Also, if the PII has a fast UDMA/66 drive and your Alpha
is running a slow old SCSI drive like the one mine came with, then
the PII has an advantage.
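
     As a rough sketch of the /etc/fstab suggestion, an entry for /usr
with soft dependencies enabled might look like the line below.  The
device name /dev/sd0e and the dump/pass numbers are placeholders, not
taken from Ray's machine; the only point is the softdep mount option.

```
/dev/sd0e  /usr  ffs  rw,softdep  1  2
```

Keep whatever device and options your existing /usr line already has
and add only the softdep option; it takes effect the next time the
filesystem is mounted.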

     Finally, there's a big difference in code size between the 21164 and
the PII.  The PII has a huge collection of CISC instructions that
result in small assembly code size because fewer assembly
instructions are required per line of source code.  Small
assembly code size means less memory traffic during compilation and
assembly and less disk traffic when the resultant object code is
written to disk.  In contrast the 21164 has a very small RISC
instruction set that requires more machine code instructions per line
of source code.  Also, every Alpha instruction is encoded in a fixed
32 bits.  x86 instructions vary from 1 to 15 bytes, but the average
length in compiled code is probably no more than about 32 bits.  So
with more instructions needed and no short encodings, the 21164's
code comes out larger, and the PII is moving less information from
disk to main memory to cache to processor during both the compile
and assembly stages.
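
     The code-size argument comes down to instruction count times
average instruction length.  Here is a minimal sketch of that
arithmetic.  Note that Alpha instructions are a fixed 32 bits
(4 bytes) wide; the instruction counts and the x86 average length
below are invented purely for illustration and are not measurements
from either machine.

```python
# Back-of-the-envelope code-size comparison.  The instruction counts and
# the x86 average length are hypothetical; only the Alpha encoding width
# (a fixed 4 bytes per instruction) is an architectural fact.

ALPHA_INSN_BYTES = 4  # every Alpha instruction is 32 bits wide


def code_bytes(instruction_count, avg_insn_bytes):
    """Approximate text size: instruction count times average length."""
    return instruction_count * avg_insn_bytes


def size_ratio(alpha_insns, x86_insns, x86_avg_bytes):
    """How many times larger the Alpha text is than the x86 text."""
    alpha = code_bytes(alpha_insns, ALPHA_INSN_BYTES)
    x86 = code_bytes(x86_insns, x86_avg_bytes)
    return alpha / x86


# Assumed for illustration: a routine that takes 100 x86 instructions
# averaging 3.5 bytes each needs 130 Alpha instructions.
ratio = size_ratio(alpha_insns=130, x86_insns=100, x86_avg_bytes=3.5)
print(f"Alpha text is about {ratio:.2f}x the x86 text")
```

To get real numbers instead of guesses, compare the text sizes that
size(1) reports for the same object file built on each machine.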

     The 21164 is definitely as fast as the PII in integer performance. 
(Of course it is way faster at FP performance, but aside from using
the FP registers as a kind of cache FP is not an issue here.)  In the
end I think the only explanation for the longer compile time on your
21164 is that more memory transfer is being done.  That includes all
three levels: cache, main memory, and disk.

--
c v k @ z y b x . c o m

<quote who="Ray Phillips">
> Dear NetBSD/alpha:
>
> Could someone help me to reconcile the fact that running
>
> # cd /usr/src
> # ./build.sh -D ../DistribDir -R ../ReleaseDir release
>
> takes around eleven hours on a 500 MHz PWS with 1 GB RAM but around
> seven hours on a PII with 640 MB RAM?
>
> I don't know how the SPEC values (for example) for the two machines
> compare but I'd have thought and hoped the PWS would be considerably
> faster.  Is building an alpha release of NetBSD more "difficult"
> somehow than an i386 release?
>
> I guess there wouldn't be many (any?) floating point instructions
> executed when performing this operation?
>
>
> Ray