Subject: Re: copyin/out
To: Jason R Thorpe <thorpej@wasabisystems.com>
From: Chris Gilbert <chris@paradox.demon.co.uk>
List: port-arm
Date: 08/09/2002 20:03:07
----- Original Message -----
From: "Jason R Thorpe" <thorpej@wasabisystems.com>
To: "Chris Gilbert" <chris@paradox.demon.co.uk>
Cc: "Allen Briggs" <briggs@wasabisystems.com>; <port-arm@netbsd.org>
Sent: Friday, August 09, 2002 5:16 PM
Subject: Re: copyin/out


> On Fri, Aug 09, 2002 at 10:16:52AM +0100, Chris Gilbert wrote:
>
>  > Quick look over it, do you need to preload the addresses you're storing
>  > to?  or does that cause it to fetch the tlb entries for speed?  IE aren't
>  > you just filling the cache with stuff you're about to overwrite?
>
> On some processors, in certain modes, the cache does not allocate a line
> on a write-miss, and you essentially get write-through semantics.
> Prefetching the destination into the cache means you get write-back
> semantics always, and lets the cache clean the line to put that data in
> before you actually *need* it.

Do you mean write-through, or does the write bypass the cache?
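
To make sure I'm reading the idea right, here is a rough C sketch of
prefetching the destination ahead of the stores.  It is not the actual
copyin/copyout code; the name copy_prefetch_dst and the 32-byte CACHE_LINE
are made up for illustration, and __builtin_prefetch is the GCC builtin
(its second argument of 1 means "prefetch for write").

```c
#include <stddef.h>
#include <string.h>

#define CACHE_LINE 32   /* assumed cache line size in bytes; varies by core */

/*
 * Hypothetical helper: copy len bytes one cache line at a time, asking the
 * cache to fetch the *next* destination line for writing before we get to
 * it, so the store hits an already-allocated line.
 */
void *copy_prefetch_dst(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len >= CACHE_LINE) {
        /* Only prefetch ahead while the next line is still inside the buffers. */
        if (len >= 2 * CACHE_LINE) {
            __builtin_prefetch(d + CACHE_LINE, 1);  /* destination: for write */
            __builtin_prefetch(s + CACHE_LINE, 0);  /* source: for read */
        }
        memcpy(d, s, CACHE_LINE);
        d += CACHE_LINE;
        s += CACHE_LINE;
        len -= CACHE_LINE;
    }

    /* Copy whatever is left over after the last full line. */
    memcpy(d, s, len);
    return dst;
}
```

The prefetch is only a hint, so the copy is correct with or without it;
whether it helps depends on the write-miss policy of the core in question.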

>  > Hmm, I see near enough that already on cats 1.6D.
>  > 1073741824 bytes transferred in 17.343 secs (61912115 bytes/sec)
>
> Interesting.  The performance characteristics of the old code were
> VERY different on a 400MHz i80321 (XScale core).  Indeed the old code
> on my Shark can do:
>
> 1073741824 bytes transferred in 15.120 secs (71014670 bytes/sec)
>
> and the new code on the Shark yields:
>
> 1073741824 bytes transferred in 8.447 secs (127115167 bytes/sec)
>
> That is a SIGNIFICANT improvement.

That's about the same gain I get on cats:
1073741824 bytes transferred in 11.443 secs (93833944 bytes/sec)

A 50% improvement is pretty good 8)
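
For anyone checking the arithmetic, the rate and the percentage gain can be
worked out as below.  The function names are just for illustration; the
inputs are the byte counts, times and rates quoted in this thread.

```c
/* Transfer rate in bytes per second. */
double throughput(double bytes, double secs)
{
    return bytes / secs;
}

/* Percentage gain of new_rate over old_rate. */
double improvement_pct(double old_rate, double new_rate)
{
    return (new_rate - old_rate) / old_rate * 100.0;
}
```

Feeding in the cats rates (61912115 and 93833944 bytes/sec) gives roughly
52%, and the Shark rates (71014670 and 127115167 bytes/sec) give roughly 79%.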

Chris