Source-Changes-HG archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

[src/trunk]: src/sys/arch/mips/mips mips: Membar audit.



details:   https://anonhg.NetBSD.org/src/rev/1886f2ce8a20
branches:  trunk
changeset: 362454:1886f2ce8a20
user:      riastradh <riastradh%NetBSD.org@localhost>
date:      Sun Feb 27 19:21:53 2022 +0000

description:
mips: Membar audit.

This change should be safe because it doesn't remove or weaken any
memory barriers, but does add, clarify, or strengthen barriers.

Goals:

- Make sure mutex_enter/exit and mutex_spin_enter/exit have
  acquire/release semantics.

- New macros make maintenance easier and purpose clearer:

  . SYNC_ACQ is for load-before-load/store barrier, and BDSYNC_ACQ
    for a branch delay slot -- currently defined as plain sync for MP
    and nothing, or nop, for UP; thus it is no weaker than SYNC and
    BDSYNC as currently defined, which is syncw on Octeon, plain sync
    on non-Octeon MP, and nothing/nop on UP.

    It is not clear to me whether load-then-syncw or ll/sc-then-syncw
    or even bare load provides load-acquire semantics on Octeon -- if
    no, this will fix bugs; if yes (like it is on SPARC PSO), we can
    relax SYNC_ACQ to be syncw or nothing later.

  . SYNC_REL is for load/store-before-store barrier -- currently
    defined as plain sync for MP and nothing for UP.

    It is not clear to me whether syncw-then-store is enough for
    store-release on Octeon -- if no, we can leave this as is; if
    yes, we can relax SYNC_REL to be syncw on Octeon.

  . SYNC_PLUNGER is there to flush clogged Cavium store buffers, and
    BDSYNC_PLUNGER for a branch delay slot -- syncw on Octeon,
    nothing or nop on non-Octeon.

    => This is not necessary (or, as far as I'm aware, sufficient)
       for acquire semantics -- it serves only to flush store buffers
       where stores might otherwise linger for hundreds of thousands
       of cycles, which would, e.g., cause spin locks to be held for
       unreasonably long durations.

  Newerish revisions of the MIPS ISA also have finer-grained sync
  variants that could be plopped in here.

Mechanism:

Insert these barriers in the right places, replacing only those where
the definition is currently equivalent, so this change is safe.

- Replace #ifdef _MIPS_ARCH_OCTEONP / syncw / #endif at the end of
  atomic_cas_* by SYNC_PLUNGER, which is `sync 4' (a.k.a. syncw) if
  __OCTEON__ and empty otherwise.

  => From what I can tell, __OCTEON__ is defined in at least as many
     contexts as _MIPS_ARCH_OCTEONP -- i.e., there are some Octeons
     with no _MIPS_ARCH_OCTEONP, but I don't know if any of them are
     relevant to us or ever saw the light of day outside Cavium; we
     seem to buid with `-march=octeonp' so this is unlikely to make a
     difference.  If it turns out that we do care, well, now there's
     a central place to make the distinction for sync instructions.

- Replace post-ll/sc SYNC by SYNC_ACQ in _atomic_cas_*, which are
  internal kernel versions used in sys/arch/mips/include/lock.h where
  it assumes they have load-acquire semantics.  Should move this to
  lock.h later, since we _don't_ define __HAVE_ATOMIC_AS_MEMBAR on
  MIPS and so the extra barrier might be costly.

- Insert SYNC_REL before ll/sc, and replace post-ll/sc SYNC by
  SYNC_ACQ, in _ucas_*, which is used without any barriers in futex
  code and doesn't mention barriers in the man page so I have to
  assume it is required to be a release/acquire barrier.

- Change BDSYNC to BDSYNC_ACQ in mutex_enter and mutex_spin_enter.
  This is necessary to provide load-acquire semantics -- unclear if
  it was provided already by syncw on Octeon, but it seems more
  likely that either (a) no sync or syncw is needed at all, or (b)
  syncw is not enough and sync is needed, since syncw is only a
  store-before-store ordering barrier.

- Insert SYNC_REL before ll/sc in mutex_exit and mutex_spin_exit.
  This is currently redundant with the SYNC already there, but
  SYNC_REL more clearly identifies the necessary semantics in case we
  want to define it differently on different systems, and having a
  sync in the middle of an ll/sc is a bit weird and possibly not a
  good idea, so I intend to (carefully) remove the redundant SYNC in
  a later change.

- Change BDSYNC to BDSYNC_PLUNGER at the end of mutex_exit.  This has
  no semantic change right now -- it's syncw on Octeon, sync on
  non-Octeon MP, nop on UP -- but we can relax it later to nop on
  non-Cavium MP.

- Leave LLSCSYNC in for now -- it is apparently there for a Cavium
  erratum, but I'm not sure what the erratum is, exactly, and I have
  no reference for it.  I suspect these can be safely removed, but we
  might have to double up some other syncw instructions -- Linux uses
  it only in store-release sequences, not at the head of every ll/sc.

diffstat:

 common/lib/libc/arch/mips/atomic/atomic_cas.S    |  12 ++-----
 common/lib/libc/arch/mips/atomic/atomic_op_asm.h |   8 +-----
 common/lib/libc/arch/mips/atomic/atomic_swap.S   |   8 ++--
 sys/arch/mips/include/asm.h                      |  17 +++++++++++-
 sys/arch/mips/mips/lock_stubs_llsc.S             |  33 +++++++++++++++++------
 5 files changed, 49 insertions(+), 29 deletions(-)

diffs (280 lines):

diff -r 8c32d0cab31e -r 1886f2ce8a20 common/lib/libc/arch/mips/atomic/atomic_cas.S
--- a/common/lib/libc/arch/mips/atomic/atomic_cas.S     Sun Feb 27 19:21:44 2022 +0000
+++ b/common/lib/libc/arch/mips/atomic/atomic_cas.S     Sun Feb 27 19:21:53 2022 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: atomic_cas.S,v 1.8 2020/08/06 10:00:21 skrll Exp $     */
+/*     $NetBSD: atomic_cas.S,v 1.9 2022/02/27 19:21:53 riastradh Exp $ */
 
 /*-
  * Copyright (c) 2008 The NetBSD Foundation, Inc.
@@ -29,7 +29,7 @@
 #include <machine/asm.h>
 #include "atomic_op_asm.h"
 
-RCSID("$NetBSD: atomic_cas.S,v 1.8 2020/08/06 10:00:21 skrll Exp $")
+RCSID("$NetBSD: atomic_cas.S,v 1.9 2022/02/27 19:21:53 riastradh Exp $")
 
        .text
        .set    noat
@@ -47,9 +47,7 @@
        beq             t0, zero, 1b
         nop
        move            v0, a1
-#ifdef _MIPS_ARCH_OCTEONP
-       syncw
-#endif
+       SYNC_PLUNGER
 2:
        j               ra
         nop
@@ -69,9 +67,7 @@
        beq             t0, zero, 1b
         nop
        move            v0, a1
-#ifdef _MIPS_ARCH_OCTEONP
-       syncw
-#endif
+       SYNC_PLUNGER
 2:
        j               ra
         nop
diff -r 8c32d0cab31e -r 1886f2ce8a20 common/lib/libc/arch/mips/atomic/atomic_op_asm.h
--- a/common/lib/libc/arch/mips/atomic/atomic_op_asm.h  Sun Feb 27 19:21:44 2022 +0000
+++ b/common/lib/libc/arch/mips/atomic/atomic_op_asm.h  Sun Feb 27 19:21:53 2022 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: atomic_op_asm.h,v 1.4 2020/08/01 09:26:49 skrll Exp $  */
+/*     $NetBSD: atomic_op_asm.h,v 1.5 2022/02/27 19:21:53 riastradh Exp $      */
 
 /*-
  * Copyright (c) 2007 The NetBSD Foundation, Inc.
@@ -44,10 +44,4 @@
 
 #endif /* _KERNEL */
 
-#ifdef __OCTEON__
-#define        SYNCW   syncw
-#else
-#define        SYNCW   nop
-#endif
-
 #endif /* _ATOMIC_OP_ASM_H_ */
diff -r 8c32d0cab31e -r 1886f2ce8a20 common/lib/libc/arch/mips/atomic/atomic_swap.S
--- a/common/lib/libc/arch/mips/atomic/atomic_swap.S    Sun Feb 27 19:21:44 2022 +0000
+++ b/common/lib/libc/arch/mips/atomic/atomic_swap.S    Sun Feb 27 19:21:53 2022 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: atomic_swap.S,v 1.7 2020/08/06 10:00:21 skrll Exp $    */
+/*     $NetBSD: atomic_swap.S,v 1.8 2022/02/27 19:21:53 riastradh Exp $        */
 
 /*-
  * Copyright (c) 2008 The NetBSD Foundation, Inc.
@@ -29,7 +29,7 @@
 #include <machine/asm.h>
 #include "atomic_op_asm.h"
 
-RCSID("$NetBSD: atomic_swap.S,v 1.7 2020/08/06 10:00:21 skrll Exp $")
+RCSID("$NetBSD: atomic_swap.S,v 1.8 2022/02/27 19:21:53 riastradh Exp $")
 
        .text
        .set    noreorder
@@ -54,7 +54,7 @@
         nop
 2:
        j               ra
-        SYNCW
+        BDSYNC_PLUNGER
 END(_atomic_swap_32)
 ATOMIC_OP_ALIAS(atomic_swap_32, _atomic_swap_32)
 
@@ -69,7 +69,7 @@
         nop
 2:
        j               ra
-        SYNCW
+        BDSYNC_PLUNGER
 END(_atomic_swap_64)
 ATOMIC_OP_ALIAS(atomic_swap_64, _atomic_swap_64)
 #endif
diff -r 8c32d0cab31e -r 1886f2ce8a20 sys/arch/mips/include/asm.h
--- a/sys/arch/mips/include/asm.h       Sun Feb 27 19:21:44 2022 +0000
+++ b/sys/arch/mips/include/asm.h       Sun Feb 27 19:21:53 2022 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: asm.h,v 1.65 2021/02/18 12:28:01 simonb Exp $  */
+/*     $NetBSD: asm.h,v 1.66 2022/02/27 19:21:53 riastradh Exp $       */
 
 /*
  * Copyright (c) 1992, 1993
@@ -576,14 +576,29 @@
 #define        LLSCSYNC        sync 4; sync 4
 #define        SYNC            sync 4          /* sync 4 == syncw - sync all writes */
 #define        BDSYNC          sync 4          /* sync 4 == syncw - sync all writes */
+#define        BDSYNC_ACQ      sync
+#define        SYNC_ACQ        sync
+#define        SYNC_REL        sync
+#define        BDSYNC_PLUNGER  sync 4
+#define        SYNC_PLUNGER    sync 4
 #elif __mips >= 3 || !defined(__mips_o32)
 #define        LLSCSYNC        sync
 #define        SYNC            sync
 #define        BDSYNC          sync
+#define        BDSYNC_ACQ      sync
+#define        SYNC_ACQ        sync
+#define        SYNC_REL        sync
+#define        BDSYNC_PLUNGER  nop
+#define        SYNC_PLUNGER    /* nothing */
 #else
 #define        LLSCSYNC        /* nothing */
 #define        SYNC            /* nothing */
 #define        BDSYNC          nop
+#define        BDSYNC_ACQ      nop
+#define        SYNC_ACQ        /* nothing */
+#define        SYNC_REL        /* nothing */
+#define        BDSYNC_PLUNGER  nop
+#define        SYNC_PLUNGER    /* nothing */
 #endif
 
 /* CPU dependent hook for cp0 load delays */
diff -r 8c32d0cab31e -r 1886f2ce8a20 sys/arch/mips/mips/lock_stubs_llsc.S
--- a/sys/arch/mips/mips/lock_stubs_llsc.S      Sun Feb 27 19:21:44 2022 +0000
+++ b/sys/arch/mips/mips/lock_stubs_llsc.S      Sun Feb 27 19:21:53 2022 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: lock_stubs_llsc.S,v 1.14 2022/02/27 19:21:44 riastradh Exp $   */
+/*     $NetBSD: lock_stubs_llsc.S,v 1.15 2022/02/27 19:21:53 riastradh Exp $   */
 
 /*-
  * Copyright (c) 2007 The NetBSD Foundation, Inc.
@@ -36,7 +36,7 @@
 
 #include <machine/asm.h>
 
-RCSID("$NetBSD: lock_stubs_llsc.S,v 1.14 2022/02/27 19:21:44 riastradh Exp $")
+RCSID("$NetBSD: lock_stubs_llsc.S,v 1.15 2022/02/27 19:21:53 riastradh Exp $")
 
 #include "assym.h"
 
@@ -63,6 +63,10 @@
 /*
  * unsigned long atomic_cas_ulong_llsc(volatile unsigned long *val,
  *     unsigned long old, unsigned long new);
+ *
+ *     For hysterical raisins in sys/arch/mips/include/lock.h, success
+ *     implies load-acquire.  The SYNC_ACQ here could be moved there
+ *     instead.
  */
 STATIC_LEAF(llsc_atomic_cas_ulong)
        LLSCSYNC
@@ -73,7 +77,7 @@
        LONG_SC t1, (a0)
        beqz    t1, 1b
         nop
-       SYNC
+       SYNC_ACQ
        j       ra
         move   v0, a1
 2:
@@ -84,6 +88,10 @@
 /*
  * unsigned int _atomic_cas_uint_llsc(volatile unsigned int *val,
  *    unsigned int old, unsigned int new);
+ *
+ *     For hysterical raisins in sys/arch/mips/include/lock.h, success
+ *     implies load-acquire.  The SYNC_ACQ here could be moved there
+ *     instead.
  */
 STATIC_LEAF(llsc_atomic_cas_uint)
        LLSCSYNC
@@ -94,7 +102,7 @@
        INT_SC  t1, (a0)
        beqz    t1, 1b
         nop
-       SYNC
+       SYNC_ACQ
        j       ra
         move   v0, a1
 2:
@@ -105,6 +113,9 @@
 /*
  * int llsc_ucas_32(volatile uint32_t *ptr, uint32_t old,
  *     uint32_t new, uint32_t *ret)
+ *
+ *     Implies release/acquire barriers until someone tells me
+ *     otherwise about _ucas_32/64.
  */
 STATIC_LEAF(llsc_ucas_32)
        .set at
@@ -115,6 +126,7 @@
        bltz    a0, _C_LABEL(llsc_ucaserr)
         nop
        move    v0, zero
+       SYNC_REL
 
        LLSCSYNC
 1:     ll      t0, 0(a0)
@@ -123,7 +135,7 @@
        sc      t1, 0(a0)
        beqz    t1, 1b
         nop
-       SYNC
+       SYNC_ACQ
 
 2:     PTR_S   zero, PCB_ONFAULT(v1)
        j       ra
@@ -144,6 +156,7 @@
        bltz    a0, _C_LABEL(llsc_ucaserr)
         nop
        move    v0, zero
+       SYNC_REL
 
        LLSCSYNC
 1:     lld     t0, 0(a0)
@@ -152,7 +165,7 @@
        scd     t1, 0(a0)
        beqz    t1, 1b
         nop
-       SYNC
+       SYNC_ACQ
 
 2:     PTR_S   zero, PCB_ONFAULT(v1)
        j       ra
@@ -181,7 +194,7 @@
        beqz    t2, 1b
         PTR_LL t0, MTX_OWNER(a0)
        j       ra
-        BDSYNC
+        BDSYNC_ACQ
 2:
        j       _C_LABEL(mutex_vector_enter)
         nop
@@ -191,6 +204,7 @@
  * void        mutex_exit(kmutex_t *mtx);
  */
 STATIC_LEAF(llsc_mutex_exit)
+       SYNC_REL
        LLSCSYNC
        PTR_LL  t0, MTX_OWNER(a0)
        SYNC
@@ -201,7 +215,7 @@
        beqz    t2, 1b
         PTR_LL t0, MTX_OWNER(a0)
        j       ra
-        BDSYNC
+        BDSYNC_PLUNGER
 2:
        j       _C_LABEL(mutex_vector_exit)
         nop
@@ -264,7 +278,7 @@
        beqz    t1, 3b
         INT_LL t3, MTX_LOCK(t0)
        j       ra
-        BDSYNC
+        BDSYNC_ACQ
 4:
        j       _C_LABEL(mutex_spin_retry)
         move   a0, t0
@@ -274,6 +288,7 @@
  * void        mutex_spin_exit(kmutex_t *mtx);
  */
 LEAF(llsc_mutex_spin_exit)
+       SYNC_REL
        PTR_L   t2, L_CPU(MIPS_CURLWP)
 #if defined(DIAGNOSTIC)
        INT_L   t0, MTX_LOCK(a0)



Home | Main Index | Thread Index | Old Index