Subject: Re: ppc benchmarks, quick and dirty. 604ev, g3, mips
To: Riccardo Mottola <rollei@tiscalinet.it>
From: Michael <macallan18@earthlink.net>
List: port-macppc
Date: 03/12/2005 11:48:09

Hello,

> but you can see that the g3 appears much faster even in this almost pure
> FPU bound processing! I think the assumptions that the G3 has a bad FPU
> don't stem from hard data.
Well, I ran a few benchmarks a while ago too, and my results were pretty different from yours. I had a 300MHz G3 in my S900 vs. a 300MHz 604e in a Motorola PowerStack II, both with 1MB of L2 cache, which runs at 150MHz on the G3 and at 66MHz on the 604e. The Mac runs NetBSD, the PowerStack runs AIX, and I used gcc 3.3.something on both. In all FPU-bound benchmarks the 604e had a 10%-20% edge over the G3; in integer-bound stuff the G3 was slightly faster. Memory-bound tests were of course vastly faster on the G3 as long as the data fit into the L2 cache. Once it no longer fits, the 604e is faster again (it runs its bus at 66MHz in the PowerStack, vs. 50MHz in the Mac).
And just for fun - when I use the xlc compiler on AIX the 604 is suddenly more than twice as fast as the G3 in FPU-bound tasks and has an edge almost everywhere else, but that's hardly fair ;-)
So - how do we convince IBM to port xlc to NetBSD? The one I used was quite archaic, 5.0.2 or so.
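
For anyone who wants to repeat the memory-bound comparison, here is a minimal sketch of the kind of test I mean. It is not the code I actually ran; the names sweep_seconds and sweep_sink are made up for this example. It sums a buffer of a given size a number of times and reports the CPU time used, so running it with sizes below and above the L2 size should show where the cache stops helping.

```c
#include <stdlib.h>
#include <time.h>

/* Written after each run so the compiler cannot drop the summing loop. */
volatile double sweep_sink;

/*
 * Sum a buffer of `bytes` bytes `passes` times and return the CPU time
 * used in seconds, or -1.0 if the buffer is empty or cannot be allocated.
 * (Hypothetical helper, for illustration only.)
 */
double sweep_seconds(size_t bytes, int passes)
{
    size_t n = bytes / sizeof(double);
    double *buf = malloc(n * sizeof(double));
    double sum = 0.0;
    clock_t start, stop;
    size_t i;
    int p;

    if (buf == NULL || n == 0) {
        free(buf);              /* free(NULL) is harmless */
        return -1.0;
    }

    /* Touch every element once so the pages are really mapped. */
    for (i = 0; i < n; i++)
        buf[i] = (double)i;

    start = clock();
    for (p = 0; p < passes; p++)
        for (i = 0; i < n; i++)
            sum += buf[i];
    stop = clock();

    sweep_sink = sum;
    free(buf);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Call it from a small main() with a range of sizes, say 64KB up to 8MB, and scale the pass count so each run touches the same total number of bytes. clock() is coarse on some systems, so use enough passes that each run takes at least a second or so.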

> I have about the same results when running "setiathome" and checking
> average running times, but here I have controls over the compiler
> options and the dataset is probably much smaller here, so more a kind of
> "microbench"
Hmm, I guess I'll try that too; the 'official' seti@home client for AIX is horribly slow.

Could you send me the code? I'd like to run the same test on some of my boxes.

> FFT running times:

> 4096 samples, the data set is
> 2+4 arrays of doubles of 4096 samples
> that is 6 * sizeof(double) * 4096 = 6 * 6 * 4096 = 196608KBytes assuming
> 8 bytes doubles.
Umm, sizeof(double) is 8 if I remember correctly, so that's 6 * 8 * 4096 = 196608 bytes (not KBytes), i.e. 48 * 4KB = 192KB, which is certainly small enough for the L2.
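
The arithmetic is easy to check in code. This is just a sketch with made-up helper names (fft_working_set_bytes, fits_in_cache), assuming the six arrays of 4096 doubles from your description and a 1MB L2:

```c
#include <stddef.h>

/* Bytes occupied by `arrays` arrays of `samples` doubles each. */
size_t fft_working_set_bytes(size_t arrays, size_t samples)
{
    return arrays * samples * sizeof(double);
}

/* Nonzero if a working set of `working_set` bytes fits in `cache_bytes`. */
int fits_in_cache(size_t working_set, size_t cache_bytes)
{
    return working_set <= cache_bytes;
}
```

With 8-byte doubles, fft_working_set_bytes(6, 4096) gives 196608, and fits_in_cache(196608, 1024 * 1024) is true with plenty of room to spare.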

> I don't understand why the ppc604ev seems so slow. Everybody thought it
> would be faster than the g3! and the fft calculus should be more cpu
> bound than memory bound and the dataset should fit in the cache.
The dataset is small and fits into the L2 cache, so I'd expect the G3 to have an edge because it runs the cache at a much higher speed than the 604. Even the 604ev runs the cache at only 100MHz, while the G3 usually runs it at about half the CPU speed.

> Could
> the os do something wrong (the 604 is not certified for 10.15 macos)
> like not enabling or enabling "badly" the cpu L2 cache?
Hmm, does -mcpu=604e -mtune=604e change anything? The cache should be enabled by the firmware, but who knows, it's OF 1.0.5 after all... How does the benchmark behave under NetBSD? There we'd at least know whether the cache is active.

have fun
Michael
