Subject: Re: Accelerating memset/memcpy
To: None <nigel@mips.com>
From: Paul Koning <pkoning@equallogic.com>
List: port-mips
Date: 10/01/2002 12:01:43
>>>>> "Nigel" == Nigel Stephens <nigel@mips.com> writes:

 Nigel> Paul Koning wrote:
 >> instruction would let you avoid the cacheline fill when you're
 >> writing a full cacheline; PREF only lets you move that fill
 >> earlier in time.  If you're memory-bound, PREF may produce a small
 >> performance improvement, but CACHE will give a significantly
 >> larger improvement.
 >> 
 >> Unfortunately the create dirty exclusive operation is a
 >> platform-dependent operation.  Some MIPS processors have it, some
 >> (including some very recent ones) do not.
 >> 
 >> 
 Nigel> In MIPS32 and MIPS64 compliant processors the "pref"
 Nigel> instruction with code 30 is defined as "prepare for store"
 Nigel> with the following description:

 Nigel> PrepareForStore ...

Ah... interesting.  I overlooked that one.  I've seen implementations
of CACHE(create-dirty) in some of the MIPS processors I've used, but
as far as I can remember they didn't implement this PREF operation.
So you'd probably end up having to use the CACHE-based approach on at
least some platforms if you wanted to implement this optimization.
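
For illustration, here is a rough sketch of how the PREF variant might
look in a memset-style loop. It is not taken from any existing
implementation. The names line_memset, prepare_for_store, LINE_SIZE and
USE_PREF30 are all made up for this example, and the 32-byte line size
is an assumption; real code would have to determine it per platform.
The hint is only issued for lines that are about to be overwritten in
full, since a prepare-for-store operation may leave the rest of the
line with undefined or zeroed contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cache line size in bytes; a real port must query the CPU. */
#define LINE_SIZE 32u

/* Hypothetical helper: tell the cache that the whole line at p is about
 * to be overwritten, so it need not be filled from memory first. */
static inline void prepare_for_store(void *p)
{
#if defined(__mips__) && defined(USE_PREF30)
    /* PREF with hint 30 is "prepare for store" on MIPS32/MIPS64. */
    __asm__ __volatile__(
        ".set push\n\t"
        ".set mips32\n\t"
        "pref 30, 0(%0)\n\t"
        ".set pop"
        : /* no outputs */
        : "r"(p)
        : "memory");
#else
    (void)p;    /* no-op on other targets, or when not opted in */
#endif
}

/* memset variant that issues the hint only for whole, aligned lines. */
void *line_memset(void *dst, int c, size_t n)
{
    unsigned char *p = dst;

    /* Head: bytes up to the next line boundary, stored normally. */
    size_t head = (LINE_SIZE - ((uintptr_t)p % LINE_SIZE)) % LINE_SIZE;
    if (head > n)
        head = n;
    memset(p, c, head);
    p += head;
    n -= head;

    /* Body: every line here is fully overwritten, so the hint is safe. */
    while (n >= LINE_SIZE) {
        prepare_for_store(p);
        memset(p, c, LINE_SIZE);
        p += LINE_SIZE;
        n -= LINE_SIZE;
    }

    /* Tail: partial line, stored normally with no hint. */
    memset(p, c, n);
    return dst;
}
```

On a processor that only offers CACHE(create-dirty), the body of
prepare_for_store would have to be replaced by that operation, and the
routine would then have to run where CACHE is permitted.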

 Nigel> The other advantage of the pref instruction is that it can be
 Nigel> included in user code, whereas the cache instruction is only
 Nigel> available to the kernel.

True -- although the description of the CACHE instruction in the
MIPS64 architecture spec (rev 0.95) does not say that explicitly...

   paul