Subject: Weird (but good) stuff. Was: Re: context switch benchmark...
To: Herb Peyerl , Jason Thorpe <thorpej@cs.orst.edu>
From: David Carrel <carrel@cisco.com>
List: port-hp300
Date: 11/06/1994 14:01:51
Kids,

Ok, I made one more try at the 50 MHz upgrade of the newly converted hp380.
But still no go.  I added the 100MHz oscillator and the machine wouldn't
even start the ROM diagnostics.  The LEDs on the motherboard all go on and
stay that way.  They don't even start counting through the ROM diagnostics
tests.  I think it just ain't gonna work on the 380.  Oh well...

But here's the interesting part.  Since I was opening the box, I tried
running at 33 MHz with the memory jumper set to 25 (instead of 33) to see
how things benchmarked.  (I've been curious to figure out exactly what that
jumper did.)  I ran dhry21 and saw no significant change, but when I ran
ctxsig, here's what I got:

100000 context switches in 39.51 secs, 2.53/millisec 395 microsec/switch
                                                     ^^^
Check it out, that's a fifteen percent improvement over the numbers that I
got yesterday.  They were:

> hp380 68040 @ 33 MHz single user mode
> 100000 context switches in 46.77 secs, 2.14/millisec 468 microsec/switch

Well, so I went back and reran the numbers with the jumper set to 33 and
here's what I got:

100000 context switches in 43.53 secs, 2.30/millisec 435 microsec/switch

Now that's about a 7 percent jump over yesterday.  The only thing I changed
since yesterday was to rebuild a new kernel with DEBUG and DIAGNOSTIC
removed (so that I could reboot the damn thing).  Well, it sort of makes
sense that debugging and diagnostics would slow things down.

As for the other eight percent, I'm convinced that the jumper is adding
another wait state when it's set to 33.  This is consistent with HP's very
conservative (read: solid) designs.  But I wouldn't have thought that this
would affect context switch times so much more than the Dhrystone
benchmark.  I guess the Dhrystone benchmark is hanging out in the 040
on-chip caches while ctxsig is requiring off-chip memory accesses.

But it definitely seems that the memory can handle the 33 MHz CPU without
the extra wait state.
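
For anyone who wants to check the arithmetic, here is a small sketch in
Python.  It was not part of the original runs, and the function names are
made up for illustration.  It just recomputes the per-switch times and the
percentage gains from the three ctxsig timings quoted above.

```python
# Timings quoted in this message: seconds for 100000 context switches.
SWITCHES = 100000
YESTERDAY = 46.77   # DEBUG + DIAGNOSTIC kernel, jumper at 33
JUMPER_33 = 43.53   # new kernel, jumper at 33
JUMPER_25 = 39.51   # new kernel, jumper at 25


def usec_per_switch(secs, switches=SWITCHES):
    """Average time per context switch, in microseconds."""
    return secs / switches * 1e6


def switches_per_msec(secs, switches=SWITCHES):
    """Context switches completed per millisecond."""
    return switches / (secs * 1000.0)


def pct_faster(old_secs, new_secs):
    """Reduction in run time, as a percentage of the old run time."""
    return (old_secs - new_secs) / old_secs * 100.0


if __name__ == "__main__":
    for label, secs in (("yesterday", YESTERDAY),
                        ("jumper at 33", JUMPER_33),
                        ("jumper at 25", JUMPER_25)):
        print("%-13s %6.2f/millisec  %4.0f microsec/switch"
              % (label, switches_per_msec(secs), usec_per_switch(secs)))

    # Gains measured against yesterday's run.
    print("kernel rebuild alone:  %.1f%%" % pct_faster(YESTERDAY, JUMPER_33))
    print("rebuild plus jumper:   %.1f%%" % pct_faster(YESTERDAY, JUMPER_25))
    # Jumper alone, measured against the jumper-at-33 run on the new kernel.
    print("jumper alone:          %.1f%%" % pct_faster(JUMPER_33, JUMPER_25))
```

Note that the 7 and 15 percent figures are both relative to yesterday's
46.77 secs.  Measured against the 43.53 sec run on the same kernel, the
jumper by itself comes out closer to 9 percent.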

Kinda interesting...

Dave