Subject: Re: Accelerating memset/memcpy
To: None <cgd@broadcom.com>
From: Nigel Stephens <nigel@mips.com>
List: port-mips
Date: 10/01/2002 18:20:15
cgd@broadcom.com wrote:

>Note the MIPS32 and MIPS64 specs also include the following in their
>description of the 'pref' opcode, which are inconsistent with
>PrepareForStore's description:
>
>* "The action taken for a specific PREF instruction is both system and
>  context dependent.  Any action, including doing nothing, is
>  permitted as long as it does not change architecturally visible
>  state or alter the meaning of a program."
>
>* "A hint value cannot cause an action to modify architecturally
>  visible state."
>
>(Zeroing a line of memory is most definitely a modification of
>architecturally visible state.  8-)
>

Right, you obviously shouldn't rely on PrepareForStore doing anything at 
all, and in particular shouldn't rely on its side effect of clearing the 
line to zero (i.e. to do an ultra-fast bzero!).
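
To make that concrete, here is a minimal sketch of a zeroing routine that 
treats the hint purely as a hint. It is an illustration, not code from any 
real libc or kernel: LINE_SIZE, prepare_for_store() and zero_region() are 
made-up names, and the 32-byte line size is an assumption that does not hold 
for every part. The hint is only issued for lines that lie wholly inside the 
buffer, and every byte is still stored explicitly, so the result is the same 
whether the hardware zeroes the line, merely allocates it, or does nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32u   /* ASSUMPTION: cache-line size; differs per CPU */

/* Hint that a whole line is about to be overwritten.  The hardware may
 * ignore it, so callers must not depend on any effect. */
static inline void prepare_for_store(void *line)
{
#if defined(__mips_isa_rev) && defined(__GNUC__)
    /* Only built when targeting MIPS32/MIPS64, where hint 30 is defined. */
    __asm__ __volatile__("pref 30, 0(%0)" : : "r"(line) : "memory");
#else
    (void)line;         /* the "none" strategy: no hint available */
#endif
}

void zero_region(void *dst, size_t len)
{
    unsigned char *p, *end;
    size_t i;

    if (len == 0)
        return;
    assert(dst != NULL);

    p = dst;
    end = p + len;

    /* Head: plain stores up to the first line boundary. */
    while (p < end && ((uintptr_t)p % LINE_SIZE) != 0)
        *p++ = 0;

    /* Body: hint only lines fully inside the buffer, then store the
     * zeros anyway instead of trusting the hint's side effect. */
    while ((size_t)(end - p) >= LINE_SIZE) {
        prepare_for_store(p);
        for (i = 0; i < LINE_SIZE; i++)
            p[i] = 0;
        p += LINE_SIZE;
    }

    /* Tail: a partial line never gets the hint, since an implementation
     * that zeroes the line would clobber bytes past the buffer. */
    while (p < end)
        *p++ = 0;
}
```

The explicit stores are what make this correct; the hint can only save the 
read of a line that is about to be overwritten in full.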

> (I mention this because, well,
>hey, you're a channel that might be used to get documentation fixes
>back in.  Those are from MIPS64 Volume II, rev 0.95, page 243.)
>
>  
>
Sure, I'll feed this back to the author.

>Anyway, despite the pseudo-standardization of the 'hint' fields
>("pseudo" because "any action, including doing nothing, is permitted")
>because of:
>
>* historical differences from the standardized hints,
>
>* differences in even MIPS32/MIPS64 processors about which are
>  implemented and how, and, of course,
>
>* microarchitectural differences,
>
>it really doesn't make sense to try to apply a blanket
>'mips32/mips64-optimized' memcpy (et al) to the kernel.  They really
>should be selected on a per-cpu basis.
>  
>

Good point. So there are at least three alternatives to start with: "pref 
30", "create_dirty_exclusive" and "none". Let's hope that covers most of 
them - now we just have to do a survey! But at least "pref 30" should be 
a safe initial assumption for any MIPS32/MIPS64 processor.
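
A rough sketch of what the per-cpu selection might look like, assuming 
the survey results end up as a simple CPU classification. Everything here 
is hypothetical: the cpu_class values, select_line_prep() and 
line_prep_name() are names invented for illustration, and a real kernel 
would derive the class from the processor ID rather than take it as an 
argument. Unsurveyed CPUs fall back to "none", since that is the only 
choice that is safe everywhere.

```c
#include <assert.h>
#include <string.h>

/* The three alternatives discussed above. */
enum line_prep {
    PREP_NONE,
    PREP_PREF30,
    PREP_CREATE_DIRTY_EXCL
};

/* HYPOTHETICAL classification; to be filled in from a survey. */
enum cpu_class {
    CPU_UNSURVEYED,
    CPU_MIPS32_MIPS64,
    CPU_HAS_CREATE_DIRTY_EXCL,
    CPU_NO_LINE_PREP
};

/* Pick the line-preparation strategy once, e.g. at boot time. */
enum line_prep select_line_prep(enum cpu_class c)
{
    switch (c) {
    case CPU_MIPS32_MIPS64:
        return PREP_PREF30;             /* initial assumption */
    case CPU_HAS_CREATE_DIRTY_EXCL:
        return PREP_CREATE_DIRTY_EXCL;
    case CPU_NO_LINE_PREP:
    case CPU_UNSURVEYED:
    default:
        return PREP_NONE;               /* always safe */
    }
}

/* Human-readable name, e.g. for a boot message. */
const char *line_prep_name(enum line_prep p)
{
    switch (p) {
    case PREP_NONE:              return "none";
    case PREP_PREF30:            return "pref 30";
    case PREP_CREATE_DIRTY_EXCL: return "create_dirty_exclusive";
    }
    assert(!"unknown line_prep value");
    return "?";
}
```

The memset/memcpy variants themselves would then be chosen from this 
value rather than from a blanket mips32/mips64 build option.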

Nigel