Port-sparc64 archive


Re: SPARC hardware with weaker ordering than TSO



On Mon, 25 Jul 2022 22:00:28 +0000
Taylor R Campbell <riastradh%NetBSD.org@localhost> wrote:

> > Date: Mon, 25 Jul 2022 22:22:52 +0100
> > From: Sad Clouds <cryintothebluesky%gmail.com@localhost>
> > 
> > Yes I believe so, see UltraSPARC II and IIi user's manuals
> > 
> > Copied verbatim:
> > 
> > "SPARC-V9 defines the semantics of memory operations for three memory
> > models. From strongest to weakest, they are Total Store Order (TSO),
> > Partial Store Order (PSO), and Relaxed Memory Order (RMO). The
> > differences in these models lie in the freedom an implementation is
> > allowed in order to obtain higher performance during program execution.
> > The purpose of the memory models is to specify any constraints placed
> > on the ordering of memory operations in uniprocessor and shared-memory
> > multi-processor environments. UltraSPARC-IIi supports all three memory
> > models."
> 
> I understand the CPU advertises support for these memory orders in
> that you can set the PSTATE.MM bits, and perhaps even read them back,
> but that's not what I'm asking.
> 
> What I'm asking is: Does the CPU _actually_ reorder loads and stores
> in violation of TSO, or in violation of PSO; or does the CPU
> _actually_ guarantee TSO no matter what bits you set PSTATE.MM to?
> 
> Obviously the architecture doesn't guarantee TSO when PSTATE.MM is set
> to PSO or RMO for _all_ future SPARC CPUs in principle.  But I want to
> know whether there is any real hardware where it makes an observable
> difference in practice.
>

I haven't done any practical tests to confirm any of this. The way I
read the manual, "UltraSPARC-IIi supports all three memory models"
means exactly that: for example, if you set RMO, the CPU may reorder
memory operations as far as the selected memory model allows. The
manuals for later CPUs such as UltraSPARC-III explicitly state that
the only supported memory model is TSO: PSTATE.MM is hardwired to 00
and the other values are reserved.
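
For what it's worth, a practical test could look roughly like the
message-passing litmus sketch below. It is untested on real SPARC
hardware and all names in it (mp_litmus_run, MP_ROUNDS, struct
mp_result) are made up for illustration. The writer stores to `data`
and then to `flag`; the reader loads `flag` and then `data`. Under TSO
the reader should never see `data` older than `flag`, whereas PSO or
RMO would permit it. Two caveats: relaxed C11 atomics also allow the
compiler to reorder the accesses, so the generated assembly has to be
checked, and the program does not set PSTATE.MM itself, so it only
observes whatever model the kernel runs userland under.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MP_ROUNDS 1000000u      /* hypothetical iteration count */

struct mp_result {
    uint64_t samples;           /* (flag, data) pairs the reader observed */
    uint64_t violations;        /* pairs where data was older than flag */
};

static _Atomic uint32_t mp_data;
static _Atomic uint32_t mp_flag;

/* Writer: publish data first, then flag, with no barrier in between. */
static void *mp_writer(void *arg)
{
    (void)arg;
    for (uint32_t i = 1; i <= MP_ROUNDS; i++) {
        atomic_store_explicit(&mp_data, i, memory_order_relaxed);
        atomic_store_explicit(&mp_flag, i, memory_order_relaxed);
    }
    return NULL;
}

/* Reader: load flag first, then data, again with no barrier. */
static void *mp_reader(void *arg)
{
    struct mp_result *res = arg;
    uint32_t f, d;

    do {
        f = atomic_load_explicit(&mp_flag, memory_order_relaxed);
        d = atomic_load_explicit(&mp_data, memory_order_relaxed);
        res->samples++;
        /* Both values only grow, so d < f means a store-store or
         * load-load reordering became visible. */
        if (d < f)
            res->violations++;
    } while (f < MP_ROUNDS);
    return NULL;
}

/* Run one writer and one reader to completion and report the counts. */
struct mp_result mp_litmus_run(void)
{
    struct mp_result res = { 0, 0 };
    pthread_t w, r;
    int rc;

    atomic_store(&mp_data, 0);
    atomic_store(&mp_flag, 0);

    rc = pthread_create(&r, NULL, mp_reader, &res);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, mp_writer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return res;
}
```

A non-zero `violations` count would be evidence of reordering weaker
than TSO (or of compiler reordering); a zero count proves nothing,
since the race window may simply never have been hit.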

 
> > "Block load and store operations do not obey the ordering restrictions
> > of the currently selected processor memory model (TSO, PSO, or RMO);
> > block operations always execute under an RMO memory ordering model.
> > Explicit MEMBAR instructions are required to order block operations
> > among themselves or with respect to normal loads and stores. In
> > addition, block operations do not conform to dependence order on the
> > issuing processor; that is, no read-after-write or write-after-read
> > checking occurs between block loads and stores. Explicit MEMBARs are
> > required to enforce dependence ordering between block operations that
> > reference the same address."
> 
> Are these instructions allowed for ordinary pointer dereferences in
> the SPARC ABI that compilers adhere to, or do they occur only in
> special machine-dependent code designed carefully to take advantage of
> the instructions with the necessary barriers like
> common/lib/libc/arch/sparc64/string/memcpy.S?
> 
> They resemble the x86 non-temporal store instructions (MOVNT*), which
> are not normally used except by highly machine-dependent code with the
> appropriate SFENCE instructions that normal loads and stores on normal
> memory mappings don't need, so I suspect compilers won't use them
> unless you go out of your way to ask for them.

I guess it depends on the compiler. If you do a large struct copy via a
pointer dereference, the compiler may find it more efficient to use the
64-byte block load/store instructions in some cases.
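
To make that concrete, here is the kind of source I have in mind, as a
small sketch. The type and function names (struct big, copy_big,
big_fill, big_equal) are hypothetical. Nothing here shows that any
particular compiler emits block loads/stores for it; a compiler may
just as well call memcpy or inline ordinary loads and stores. The only
way to know for a given compiler and set of flags is to look at the
generated assembly (for example with the compiler's -S option).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A struct large enough that copying it takes several 64-byte chunks.
 * The 64-byte alignment is an assumption chosen to match the block
 * size mentioned above. */
struct big {
    _Alignas(64) uint8_t bytes[256];
};

static_assert(sizeof(struct big) == 256, "unexpected padding in struct big");

/* Plain struct assignment through pointers: how this is lowered
 * (memcpy call, inline loads/stores, or something else) is entirely
 * up to the compiler. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}

/* Helper: fill every byte of the struct with one value. */
void big_fill(struct big *b, uint8_t value)
{
    memset(b->bytes, value, sizeof b->bytes);
}

/* Helper: return 1 if both structs hold identical bytes, else 0. */
int big_equal(const struct big *a, const struct big *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}
```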

