Source-Changes-HG archive


[src/trunk]: src/lib/libc/atomic Update membar_ops(3) man page with examples ...



details:   https://anonhg.NetBSD.org/src/rev/589b62514c28
branches:  trunk
changeset: 975630:589b62514c28
user:      riastradh <riastradh%NetBSD.org@localhost>
date:      Thu Sep 03 00:00:06 2020 +0000

description:
Update membar_ops(3) man page with examples and relation to C11.

Add exhortation to always always always document how membars come in
pairs for synchronization between two CPUs when you use them.
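
As a rough illustration of what such pairing comments might look like,
here is a minimal userland sketch.  It is not taken from the commit.
It assumes C11 fences as stand-ins for the barriers, with
atomic_thread_fence(memory_order_release) in place of membar_exit()
and atomic_thread_fence(memory_order_acquire) in place of
membar_enter() after the atomic read/modify/write.  struct obj,
thread_a(), thread_b() and run_handoff() are hypothetical names, and
thread B spins rather than giving up so that the outcome is
deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical object, modelled loosely on the man page example. */
struct obj {
	atomic_uint lock;	/* 1 = held, 0 = free */
	int mumblefrotz;	/* plain data protected by lock */
};

static void *
thread_a(void *cookie)
{
	struct obj *obj = cookie;

	obj->mumblefrotz = 42;
	/*
	 * Release fence, standing in for membar_exit().
	 * Pairs with the acquire fence in thread_b().
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&obj->lock, 0, memory_order_relaxed);
	return NULL;
}

static void *
thread_b(void *cookie)
{
	struct obj *obj = cookie;
	unsigned expected;

	/* Spin until thread A has released the lock, then take it. */
	do {
		expected = 0;
	} while (!atomic_compare_exchange_weak_explicit(&obj->lock,
	    &expected, 1, memory_order_relaxed, memory_order_relaxed));
	/*
	 * Acquire fence, standing in for membar_enter().
	 * Pairs with the release fence in thread_a().
	 */
	atomic_thread_fence(memory_order_acquire);
	assert(obj->mumblefrotz == 42);
	obj->mumblefrotz--;
	return NULL;
}

/* Hypothetical driver: run both threads once, return the final value. */
static int
run_handoff(void)
{
	struct obj obj;
	pthread_t a, b;

	atomic_init(&obj.lock, 1);	/* thread A starts out owning the lock */
	obj.mumblefrotz = 0;
	pthread_create(&b, NULL, thread_b, &obj);
	pthread_create(&a, NULL, thread_a, &obj);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return obj.mumblefrotz;
}
```

Calling run_handoff() from a test program should yield 41.  The
comments that name the matching fence on the other side are the part
the exhortation above is about.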

diffstat:

 lib/libc/atomic/membar_ops.3 |  263 +++++++++++++++++++++++++++++++++++++++---
 1 files changed, 242 insertions(+), 21 deletions(-)

diffs (truncated from 323 to 300 lines):

diff -r 81797554330f -r 589b62514c28 lib/libc/atomic/membar_ops.3
--- a/lib/libc/atomic/membar_ops.3      Wed Sep 02 23:42:58 2020 +0000
+++ b/lib/libc/atomic/membar_ops.3      Thu Sep 03 00:00:06 2020 +0000
@@ -1,4 +1,4 @@
-.\"    $NetBSD: membar_ops.3,v 1.5 2017/10/24 18:19:17 abhinav Exp $
+.\"    $NetBSD: membar_ops.3,v 1.6 2020/09/03 00:00:06 riastradh Exp $
 .\"
 .\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
 .\" All rights reserved.
@@ -27,7 +27,7 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd November 20, 2014
+.Dd September 2, 2020
 .Dt MEMBAR_OPS 3
 .Os
 .Sh NAME
@@ -38,7 +38,7 @@
 .Nm membar_consumer ,
 .Nm membar_datadep_consumer ,
 .Nm membar_sync
-.Nd memory access barrier operations
+.Nd memory ordering barriers
 .\" .Sh LIBRARY
 .\" .Lb libc
 .Sh SYNOPSIS
@@ -58,33 +58,213 @@
 .Fn membar_sync "void"
 .Sh DESCRIPTION
 The
-.Nm membar_ops
-family of functions provide memory access barrier operations necessary
+.Nm
+family of functions prevent reordering of memory operations, as needed
 for synchronization in multiprocessor execution environments that have
 relaxed load and store order.
-.Bl -tag -width "mem"
+.Pp
+In general, memory barriers must come in pairs \(em a barrier on one
+CPU, such as
+.Fn membar_exit ,
+must pair with a barrier on another CPU, such as
+.Fn membar_enter ,
+in order to synchronize anything between the two CPUs.
+Code using
+.Nm
+should generally be annotated with comments identifying how they are
+paired.
+.Pp
+.Nm
+affect only operations on regular memory, not on device
+memory; see
+.Xr bus_space 9
+and
+.Xr bus_dma 9
+for machine-independent interfaces to handling device memory and DMA
+operations for device drivers.
+.Pp
+Unlike C11,
+.Em all
+memory operations \(em that is, all loads and stores on regular
+memory \(em are affected by
+.Nm ,
+not just C11 atomic operations on
+.Vt _Atomic Ns -qualified
+objects.
+.Bl -tag -width abcd
 .It Fn membar_enter
 Any store preceding
 .Fn membar_enter
-will reach global visibility before all loads and stores following it.
+will happen before all memory operations following it.
+.Pp
+An atomic read/modify/write operation
+.Pq Xr atomic_ops 3
+followed by a
+.Fn membar_enter
+implies a
+.Em load-acquire
+operation in the language of C11.
+.Pp
+.Sy WARNING :
+A load followed by
+.Fn membar_enter
+.Em does not
+imply a
+.Em load-acquire
+operation, even though
+.Fn membar_exit
+followed by a store implies a
+.Em store-release
+operation; the symmetry of these names and asymmetry of the semantics
+is a historical mistake.
+In the
+.Nx
+kernel, you can use
+.Xr atomic_load_acquire 9
+for a
+.Em load-acquire
+operation without any atomic read/modify/write.
 .Pp
 .Fn membar_enter
 is typically used in code that implements locking primitives to ensure
-that a lock protects its data.
+that a lock protects its data, and is typically paired with
+.Fn membar_exit ;
+see below for an example.
 .It Fn membar_exit
-All loads and stores preceding
+All memory operations preceding
+.Fn membar_exit
+will happen before any store that follows it.
+.Pp
+A
 .Fn membar_exit
-will reach global visibility before any store that follows it.
+followed by a store implies a
+.Em store-release
+operation in the language of C11.
+For a regular store, rather than an atomic read/modify/write store, you
+should use
+.Xr atomic_store_release 9
+instead of
+.Fn membar_exit
+followed by the store.
 .Pp
 .Fn membar_exit
 is typically used in code that implements locking primitives to ensure
-that a lock protects its data.
+that a lock protects its data, and is typically paired with
+.Fn membar_enter .
+For example:
+.Bd -literal -offset abcdefgh
+/* thread A */
+obj->state.mumblefrotz = 42;
+KASSERT(valid(&obj->state));
+membar_exit();
+obj->lock = 0;
+
+/* thread B */
+if (atomic_cas_uint(&obj->lock, 0, 1) != 0)
+       return;
+membar_enter();
+KASSERT(valid(&obj->state));
+obj->state.mumblefrotz--;
+.Ed
+.Pp
+In this example,
+.Em if
+the
+.Fn atomic_cas_uint
+operation in thread B witnesses the store
+.Li "obj->lock = 0"
+from thread A,
+.Em then
+everything in thread A before the
+.Fn membar_exit
+is guaranteed to happen before everything in thread B after the
+.Fn membar_enter ,
+as if the machine had sequentially executed:
+.Bd -literal -offset abcdefgh
+obj->state.mumblefrotz = 42;   /* from thread A */
+KASSERT(valid(&obj->state));
+\&...
+KASSERT(valid(&obj->state));   /* from thread B */
+obj->state.mumblefrotz--;
+.Ed
+.Pp
+.Fn membar_exit
+followed by a store, serving as a
+.Em store-release
+operation, may also be paired with a subsequent load followed by
+.Fn membar_sync ,
+serving as the corresponding
+.Em load-acquire
+operation.
+However, you should use
+.Xr atomic_store_release 9
+and
+.Xr atomic_load_acquire 9
+instead in that situation, unless the store is an atomic
+read/modify/write which requires a separate
+.Fn membar_exit .
 .It Fn membar_producer
-All stores preceding the memory barrier will reach global visibility
-before any stores after the memory barrier reach global visibility.
+All stores preceding
+.Fn membar_producer
+will happen before any stores following it.
+.Pp
+.Fn membar_producer
+has no analogue in C11.
+.Pp
+.Fn membar_producer
+is typically used in code that produces data for read-only consumers
+which use
+.Fn membar_consumer ,
+such as
+.Sq seqlocked
+snapshots of statistics; see below for an example.
 .It Fn membar_consumer
-All loads preceding the memory barrier will complete before any loads
-after the memory barrier complete.
+All loads preceding
+.Fn membar_consumer
+will complete before any loads after it.
+.Pp
+.Fn membar_consumer
+has no analogue in C11.
+.Pp
+.Fn membar_consumer
+is typically used in code that reads data from producers which use
+.Fn membar_producer ,
+such as
+.Sq seqlocked
+snapshots of statistics.
+For example:
+.Bd -literal
+struct {
+       /* version number and in-progress bit */
+       unsigned        seq;
+
+       /* read-only statistics, too large for atomic load */
+       unsigned        foo;
+       int             bar;
+       uint64_t        baz;
+} stats;
+
+       /* producer (must be serialized, e.g. with mutex(9)) */
+       stats->seq |= 1;        /* mark update in progress */
+       membar_producer();
+       stats->foo = count_foo();
+       stats->bar = measure_bar();
+       stats->baz = enumerate_baz();
+       membar_producer();
+       stats->seq++;           /* bump version number */
+
+       /* consumer (in parallel w/ producer, other consumers) */
+restart:
+       while ((seq = stats->seq) & 1)  /* wait for update */
+               SPINLOCK_BACKOFF_HOOK;
+       membar_consumer();
+       foo = stats->foo;       /* read out a candidate snapshot */
+       bar = stats->bar;
+       baz = stats->baz;
+       membar_consumer();
+       if (seq != stats->seq)  /* try again if version changed */
+               goto restart;
+.Ed
 .It Fn membar_datadep_consumer
 Same as
 .Fn membar_consumer ,
@@ -100,9 +280,21 @@
 consume(v);
 .Ed
 .Pp
-Does not guarantee ordering of loads in branches, or
+.Fn membar_datadep_consumer
+is typically paired with
+.Fn membar_exit
+by code that initializes an object before publishing it.
+However, you should use
+.Xr atomic_store_release 9
+and
+.Xr atomic_load_consume 9
+instead, to avoid obscure edge cases in case the consumer is not
+read-only.
+.Pp
+.Fn membar_datadep_consumer
+does not guarantee ordering of loads in branches, or
 .Sq control-dependent
-loads -- you must use
+loads \(em you must use
 .Fn membar_consumer
 instead:
 .Bd -literal -offset indent
@@ -120,12 +312,41 @@
 .Fn membar_datadep_consumer
 is a no-op on those CPUs.
 .It Fn membar_sync
-All loads and stores preceding the memory barrier will complete and
-reach global visibility before any loads and stores after the memory
-barrier complete and reach global visibility.
+All memory operations preceding
+.Fn membar_sync
+will happen before any memory operations following it.
+.Pp
+.Fn membar_sync
+is a sequential consistency acquire/release barrier, analogous to
+.Li "atomic_thread_fence(memory_order_seq_cst)"
+in C11.
+.Pp
+.Fn membar_sync
+is typically paired with
+.Fn membar_sync .
+.Pp
+A load followed by
+.Fn membar_sync ,
+serving as a


