Subject: Re: Kernel copyin/out optimizations for ARM...
To: None <Richard.Earnshaw@arm.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: port-arm
Date: 08/08/2002 20:32:38
Jumping back into this conversation...
On Thu, Mar 14, 2002 at 03:18:08PM +0000, Richard Earnshaw wrote:
> Hmm, I didn't say that LDMs would be slower than ldr, just that I doubted
> that they would make much difference to the performance here. The one
> time I looked at this code suggested that it wasn't often copying large
> amounts of data, so the overheads were presumably elsewhere.
Allen Briggs and I have been looking at this issue recently (Allen is
in the process of writing up some mail about it as I type this), and
our research shows that, on XScale, at least, you pay an insn-issue-latency
penalty for LDM, but that the data-fetch latency is actually *slightly*
better.
Now, if you know your data is in the cache, then the data-fetch latency
doesn't matter quite as much, and so you want to worry about the issue
latency.
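To make the two shapes concrete, here is a rough C sketch of the kind of
inner loops being compared. This is not the actual copyin/copyout code; the
function names are made up for illustration, and whether the compiler really
emits LDM/STM for the second loop depends on the compiler and its flags, so
treat that as an assumption and check the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: one word per iteration.  On ARM this usually
 * compiles to an ldr/str pair per word (low issue latency per insn).
 */
static void
copy_words_single(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	while (nwords-- > 0)
		*dst++ = *src++;
}

/*
 * Hypothetical helper: four words per iteration.  The loads are grouped
 * ahead of the stores so that a compiler *may* turn them into ldm/stm;
 * that is an assumption, not something C guarantees.
 */
static void
copy_words_block4(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	while (nwords >= 4) {
		uint32_t a = src[0], b = src[1], c = src[2], d = src[3];

		dst[0] = a;
		dst[1] = b;
		dst[2] = c;
		dst[3] = d;
		src += 4;
		dst += 4;
		nwords -= 4;
	}
	/* Copy the remaining 0..3 words one at a time. */
	while (nwords-- > 0)
		*dst++ = *src++;
}
```

Both routines copy the same data; the only difference is how the loads and
stores are grouped, which is what the issue-latency vs. data-fetch-latency
trade-off above is about.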
BTW -- we used the new performance counter API that Allen did in order
to test this stuff :-)
--
-- Jason R. Thorpe <thorpej@wasabisystems.com>