Subject: Re: Nubus card debugging
To: David A. Gatwood <dgatwood@gatwood.net>
From: Scott Reynolds <scottr@clank.org>
List: port-mac68k
Date: 09/06/2002 02:34:31
On Monday, August 5, 2002, at 03:43 PM, David A. Gatwood wrote:
> Judging by that, the code is really, really overkill, and could be
> reduced to eight accesses (four pairs) and two non-nested for loops.
It's not as much overkill as you seem to think. In its worst case the
current code makes 32 accesses. Sure, it's 4x what you're talking about
here, but how much time does this actually take? (By the way, notice
how the iterator in the outer loop is not merely decremented but rather
right-shifted.)
> char array[4];
>
> for (i=1; i<=4; i++) {
> array[4-i] = 0;
> if (mac68k_bus_space_probe(base + size - i)) {
> array[4-i] = inb(base + size - i);
> }
> }
>
> for (i=1; i<16; i++) {
> if (i<2) pos=1;
> else if (i<4) pos=2;
> else if (i<8) pos=3;
> else pos=4;
> if (array[i] == (~(i << 4) & 0xf0) | i) {
> break;
> }
> }
I'm not sure why this is any more clear than the existing code. If
you're concerned about efficiency I'd propose that you simply move the
mac68k_bus_space_probe() and the bus_space_read_1() (or in your case
the inb()) out of the inner loop and use a single temporary variable,
rather than an array. This reduces the worst case number of accesses
from 32 to 8, just as yours does.
Also, all of these conditionals will add significantly to the execution
time of your code, as they'll cause some amount of flushing to happen
in the execution stages on missed branches. I'm not sure your code is
more efficient than mine even without modification. At least it puts a
bit more load on the instruction cache. :-)
> if (i== 16) {
> // no (valid) card present
> } else {
> // the byte lanes are a bitmask stored in i,
> // i.e. byte lane 0 is 2^0, byte lane 1 is 2^1, etc.
> }
Finally, and this is just a personal preference, having to test an
index to see if it went past the end of the values used in a for loop
seems less clear than using a flag variable. I generally code flags
into loops simply because they make the intent of the loops clear. They
are not clever, nor are they particularly efficient, but I'd rather not
lose clarity for the sake of efficiency when it's simply not critical.
I seem to recall that there's a relevant Dykstra quote about the evils
of optimizing, but I'll be hanged if I can remember it...
--scott