NetBSD-Bugs archive

Re: kern/57515: sparc32 GCC defaults to SC memory ordering, which is not true on SPARCv8 processors



The following reply was made to PR port-sparc/57515; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Martin Husemann <martin%duskware.de@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/57515: sparc32 GCC defaults to SC memory ordering, which is not true on SPARCv8 processors
Date: Sat, 8 Jul 2023 20:19:37 +0000

 > Date: Sat, 8 Jul 2023 19:48:09 +0200
 > From: Martin Husemann <martin%duskware.de@localhost>
 > 
 > I don't think this is a bug. We default to v7 everywhere on 32bit
 > sparc.
 
 Consider the following program:
 
 int
 stldmul(_Atomic int *p, _Atomic int *q, int x)
 {
 	int y;
 
 	__atomic_store(p, &x, __ATOMIC_SEQ_CST);
 	__atomic_load(q, &y, __ATOMIC_SEQ_CST);
 
 	return x * y;
 }
 
 
 Compiling it with gcc -mcpu=v7 gives:
 
 00000000 <stldmul>:
    0:	9d e3 bf a0 	save  %sp, -96, %sp
    4:	f4 26 00 00 	st  %i2, [ %i0 ]
    8:	d0 06 40 00 	ld  [ %i1 ], %o0
    c:	40 00 00 00 	call  c <stldmul+0xc>
 			c: R_SPARC_WDISP30	.umul
   10:	92 10 00 1a 	mov  %i2, %o1
   14:	81 c7 e0 08 	ret 
   18:	91 e8 00 08 	restore  %g0, %o0, %o0
 
 
 Compiling it with gcc -mcpu=v8 gives:
 
 00000000 <stldmul>:
    0:	81 43 c0 00 	stbar 
    4:	d4 22 00 00 	st  %o2, [ %o0 ]
    8:	81 43 c0 00 	stbar 
    c:	c0 6b bf ff 	ldstub  [ %sp + -1 ], %g0
   10:	c0 6b bf ff 	ldstub  [ %sp + -1 ], %g0
   14:	d0 02 40 00 	ld  [ %o1 ], %o0
   18:	81 c3 e0 08 	retl 
   1c:	90 5a 00 0a 	smul  %o0, %o2, %o0
 
 
 If a `NetBSD/sparc' userland is supposed to work on sparcv7 and
 sparcv8 CPUs, well, that's a problem, because these are both broken:
 
 - The -mcpu=v7 output is broken on sparcv8 CPUs, which run in TSO,
   because it's missing an ldstub instruction between the store at
   offset 4 and the load at offset 8 to guarantee the sequential
   consistency ordering the program asked for, e.g. for Dekker's
   algorithm.
 
   (The -mcpu=v8 output does more than it needs -- a single ldstub is
   enough, no need for stbar or a second ldstub, but that's a speed
   issue, not a correctness issue.)
 
 - The -mcpu=v8 output is broken on sparcv7 CPUs because it uses the
   smul instruction, which doesn't exist in sparcv7, if I recall
   correctly.
 
   (If my memory has faded and smul does exist but just isn't used by
   gcc for some reason, there's probably some other sparcv8 instruction
   that doesn't exist on sparcv7 which gcc will use with `-mcpu=v8'.)
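 
 To make the Dekker's algorithm case concrete, here is a minimal
 sketch of the store-then-load pattern that needs the barrier.  It is
 a simplified try-lock variant of Dekker's entry protocol, not the
 full algorithm, and the names want, try_enter, and leave are made up
 for illustration; nothing here comes from the PR or from gcc.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-thread "I want in" flags, one per thread (0 or 1). */
static atomic_int want[2];

/*
 * Simplified Dekker-style entry attempt for thread `self'.
 * Returns true if the caller may enter the critical section.
 */
bool
try_enter(int self)
{
	int other = 1 - self;

	/* Announce intent with a store... */
	atomic_store_explicit(&want[self], 1, memory_order_seq_cst);

	/*
	 * ...then load the other thread's flag.  If the hardware lets
	 * this load pass the store above (as TSO does without a
	 * barrier such as ldstub), both threads can read 0 here and
	 * both enter.
	 */
	if (atomic_load_explicit(&want[other], memory_order_seq_cst) != 0) {
		/* Contention: withdraw and let the caller retry. */
		atomic_store_explicit(&want[self], 0, memory_order_seq_cst);
		return false;
	}
	return true;
}

/* Leave the critical section by clearing our flag. */
void
leave(int self)
{
	atomic_store_explicit(&want[self], 0, memory_order_seq_cst);
}
```

 The store followed by the load in try_enter is the same shape as the
 st/ld pair in stldmul above, so code like this compiled with
 -mcpu=v7 would presumably get no barrier between them either.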
 
 However, gcc developers don't view this as an issue -- `compile for V8
 if you want to run on V8':
 
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110592
 
 
 I think maybe the right thing is to have `-mcpu=v7 -mmemory-model=tso'
 generate the ldstub, and use that in NetBSD/sparc; that way the
 default semantics in a non-NetBSD gcc build for a solitary `-mcpu=v7'
 option can still be as if you had passed `-mcpu=v7 -mmemory-model=sc',
 as it is today.
 
 But currently that doesn't work because the rules to generate memory
 barrier instructions or equivalent are all gated on TARGET_V8 ||
 TARGET_V9, so they just don't kick in for `-mcpu=v7' even if you also
 pass `-mmemory-model=tso'.
 