port-arm: Re: Kernel copyin/out optimizations for ARM...

Subject: Re: Kernel copyin/out optimizations for ARM...
To: None <Richard.Earnshaw@arm.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: port-arm
Date: 08/08/2002 20:32:38

Jumping back into this conversation...

On Thu, Mar 14, 2002 at 03:18:08PM +0000, Richard Earnshaw wrote:

 > Hmm, I didn't say that LDMs would be slower than ldr, just that I doubted 
 > that they would make much difference to the performance here.  The one 
 > time I looked at this code suggested that it wasn't often copying large 
 > amounts of data, so the overheads were presumably elsewhere.

Allen Briggs and I have been looking at this issue recently (Allen is
in the process of writing up some mail about it as I type this), and
our research shows that, on XScale, at least, you pay an insn-issue-latency
penalty for LDM, but that the data-fetch latency is actually *slightly*
better.

Now, if you know your data is in the cache, then the data-fetch latency
doesn't matter quite as much, and so you want to worry about the issue
latency.
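
To make the trade-off concrete, here is a minimal C sketch of a toy cost
model. It is an illustration only: the struct, the function names, and
every cycle count are hypothetical placeholders chosen to show the shape
of the argument, not XScale measurements and not numbers from the
performance counters. The assumption is simply that issue latency is
always paid, while data-fetch latency is paid only on a cache miss.

```c
/* Toy cost model: all names and cycle counts here are hypothetical. */
struct load_cost {
    unsigned issue_cycles;  /* instruction-issue latency, always paid */
    unsigned fetch_cycles;  /* data-fetch latency, paid only on a miss */
};

/* Four single-word ldr instructions (placeholder numbers). */
const struct load_cost ldr_seq = { 4, 40 };

/* One four-register ldm: higher issue cost, slightly better fetch
 * latency (placeholder numbers). */
const struct load_cost ldm_one = { 6, 36 };

/* Estimated cycles to load four words under this model. */
unsigned
estimate_cycles(struct load_cost c, int in_cache)
{
    return c.issue_cycles + (in_cache ? 0 : c.fetch_cycles);
}

/* Nonzero if the ldm variant is estimated to be cheaper. */
int
ldm_is_cheaper(int in_cache)
{
    return estimate_cycles(ldm_one, in_cache) <
        estimate_cycles(ldr_seq, in_cache);
}
```

With these made-up numbers, ldm_is_cheaper(1) is false (only issue
latency counts when the data is cached) and ldm_is_cheaper(0) is true
(the slightly better fetch latency outweighs the issue penalty on a
miss). Real values would have to come from measurement.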

BTW -- we used the new performance counter API that Allen did in order
to test this stuff :-)

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>