Subject: Re: Busspace sanity ...
To: grefen@hprc.tandem.com, Jason Thorpe <thorpej@nas.nasa.gov>
From: John Nemeth <jnemeth@cue.bc.ca>
List: port-i386
Date: 06/13/1998 01:59:19
On Jun 10, 12:41am, Stefan Grefen wrote:
} In message <199806092202.PAA10864@lestat.nas.nasa.gov>  Jason Thorpe wrote:
} > On Tue, 09 Jun 1998 23:25:15 +0200 
} >  Stefan Grefen <grefen@hprc.tandem.com> wrote:
} > 
} >  > I agree, but please with an additional switch. This is helpful for
} >  > driver development but not for daily use.
} > 
} > The DEBUG option isn't really "for daily use", either...  I guess you
} > and I disagree on the semantics of the DEBUG option.
} 
} The problem is the message goes away without DEBUG, but a 'fixed'
} driver stays. I can add a ifdef DEBUG in the driver around the 

     The message goes away when the driver is fixed too.

} I still don't see why the alignment of the data pointer (not the
} busspace address) needs to be stricter aligned than mandatory for the
} CPU (unless you check for another architecture, which I think isn't covered
} by DEBUG).

     On processors that don't require strict alignment, there is often
a performance penalty for unaligned accesses, especially in the case
of a cache miss (an access that straddles a cache line boundary has to
wait for two lines to fill instead of one, or part of one).  And of
course, if it crosses a page boundary and one of the pages has been
paged out, there is a major penalty.  Due
to the performance penalties, I consider it a bug not to use proper
alignment.  It would be interesting to enable alignment checking,
except that it only works in user mode (at least on the i486).
Personally, I always configure compilers to use strict alignment.  The
space tradeoff is minor compared to the performance difference.
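
     To make the cache line and page cases concrete, here is a
minimal sketch in C of a check for whether an access straddles a
boundary.  The helper name spans_boundary() and both size constants
are assumptions for illustration only (16-byte lines as on the i486,
4 KB pages); nothing like this exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE  16u   /* assumption: 16-byte cache line, as on the i486 */
#define PAGE_BYTES 4096u /* assumption: 4 KB pages */

/*
 * Hypothetical helper: nonzero if the len-byte access starting at addr
 * touches more than one unit-sized block (cache line, page, ...).
 */
static int
spans_boundary(uintptr_t addr, size_t len, size_t unit)
{
	assert(unit != 0);
	if (len == 0)
		return 0;
	/* Compare the block holding the first byte with the one holding the last. */
	return (addr / unit) != ((addr + len - 1) / unit);
}
```

     A 4-byte access whose address is a multiple of 4 can never
straddle either kind of boundary, since both sizes are multiples of 4;
a 4-byte access at an odd or 2-byte-aligned address can.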

} >  > > 	(2) there may be a performance gain for doing aligned access on
} >  > > 	    _any_ port, even when it is not strictly necessary.
} >  > 
} >  > Not if this is the transfer to the device. It would just introduce a
} >  > copy of the data for no reason.
} > 
} > That's only because your m_pullup() does a copy; there are other ways of
} > dealing with the problem.  See the CS8900 that _IS_ in the SUP tree at
} 
} I'm using ports, so it has to be 2 byte access and I think insw beats 
} any other solution. Doing a dance through a tmp variable may be faster than 

     Sure, just make sure the buffer you put the data in is properly
aligned.
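
     One way to get such a buffer, sketched in C under assumptions:
the union name rx_buf, its 1536-byte size, and the is_aligned() helper
are all hypothetical and not taken from any driver.  The union member
of a wider type forces the byte array to at least that type's
alignment, and the helper could sit behind a DEBUG check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical receive buffer.  The uint32_t member is never used for
 * data; it only forces bytes[] to start on a 32-bit boundary.
 */
union rx_buf {
	uint32_t      align;
	unsigned char bytes[1536];	/* assumption: room for one frame */
};

/* Nonzero if p is a multiple of align, which must be a power of two. */
static int
is_aligned(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

     With this, a 16-bit string read into buf.bytes starts on an even
address, while one into buf.bytes + 1 would not.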

} Since I'm using this 386 I'm looking at CPU-cycles again :-)) 

     Not using a properly aligned buffer is going to cost you in terms
of performance regardless of how you get the data from the device.

} You stop doing that if you play with bigger machines in your day-time job.

     This is very unfortunate.  I think about performance issues all
the time.  Just because you have a big machine doesn't mean you can
get sloppy.  Bigger machines generally have bigger problems to solve,
and being sloppy means you need a much bigger machine.

}-- End of excerpt from Stefan Grefen