Subject: Re: Accelerating memset/memcpy
To: Paul Koning <pkoning@equallogic.com>
From: Nigel Stephens <nigel@mips.com>
List: port-mips
Date: 10/01/2002 18:25:05
Paul Koning wrote:

>>>>>>"Nigel" == Nigel Stephens <nigel@mips.com> writes:
>
>Ah... interesting.  I overlooked that one.  I've seen implementations
>of CACHE(create-dirty) in some of the MIPS processors I've used, but
>as far as I can remember they didn't implement this PREF operation.
>So you'd probably end up having to use the CACHE based approach on at
>least some platforms if you wanted to implement this optimization.
>

That's right - and possibly even for some MIPS32/MIPS64 processors (see 
cgd's message).

BTW I know from a real-world example that using pref for both 
prefetching the next source line of data and "preparing" the next output 
line is a significant win on the RM7000, at least. So definitely include 
the prefetch as well as the "prepare" in any implementation of 
bcopy/memcpy. The prefetch of course does not require the source pointer 
to be cache line aligned, but the "prepare" does require it of the dest 
pointer, given its side effect of clearing the line to zero.
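
To make the alignment point concrete, here is a rough sketch of the loop 
structure in C. It is not the RM7000 code referred to above. The name 
copy_with_prefetch, the 32-byte LINE value and the opt-in macro 
USE_MIPS_PREPARE_FOR_STORE are all assumptions made for illustration, and 
__builtin_prefetch is a GCC/Clang builtin used as a portable stand-in. On 
MIPS32/MIPS64 the "prepare" would be pref hint 30 (PrepareForStore), which 
is only safe on a destination line that the copy overwrites completely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE 32u /* ASSUMPTION: cache line size in bytes; check your CPU */

#if defined(__mips__) && defined(USE_MIPS_PREPARE_FOR_STORE)
/* pref hint 30 = PrepareForStore: may allocate the line as zeroes without
 * reading memory, so the whole line MUST be overwritten afterwards.
 * USE_MIPS_PREPARE_FOR_STORE is a hypothetical opt-in macro. */
#define PREPARE_FOR_STORE(p) \
    __asm__ __volatile__("pref 30, 0(%0)" : : "r"(p) : "memory")
#else
/* Portable fallback: an ordinary prefetch-for-write, which never zeroes. */
#define PREPARE_FOR_STORE(p) __builtin_prefetch((p), 1)
#endif

/* memcpy semantics: src and dst must not overlap. */
void *copy_with_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy bytes until the DEST pointer is cache-line aligned.
     * The source pointer may stay unaligned; prefetch does not care. */
    while (n > 0 && ((uintptr_t)d % LINE) != 0) {
        *d++ = *s++;
        n--;
    }
    assert(n == 0 || ((uintptr_t)d % LINE) == 0);

    /* Body: one full destination line per iteration. */
    while (n >= LINE) {
        /* Prefetch the next source line (a hint only, so it is harmless
         * if it reaches just past the end of the source buffer). */
        __builtin_prefetch(s + LINE, 0);

        /* Prepare the next output line only if it lies entirely inside
         * the range we are about to overwrite. */
        if (n >= 2 * LINE)
            PREPARE_FOR_STORE(d + LINE);

        for (size_t i = 0; i < LINE; i++)
            d[i] = s[i];
        d += LINE;
        s += LINE;
        n -= LINE;
    }

    /* Tail: a partial line is never "prepared", just copied. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A real implementation would copy words or doublewords in the body rather 
than bytes, and would fall back to the CACHE(create-dirty) approach, or to 
no "prepare" at all, on processors that lack this pref hint.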

Nigel