membar_ops.3 revision 1.8
$NetBSD: membar_ops.3,v 1.8 2020/10/09 19:41:02 uwe Exp $

Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
All rights reserved.

This code is derived from software contributed to The NetBSD Foundation
by Jason R. Thorpe.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

.Dd September 2, 2020
.Dt MEMBAR_OPS 3
.Os
.Sh NAME
.Nm membar_ops ,
.Nm membar_enter ,
.Nm membar_exit ,
.Nm membar_producer ,
.Nm membar_consumer ,
.Nm membar_datadep_consumer ,
.Nm membar_sync
.Nd memory ordering barriers
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/atomic.h
.Ft void
.Fn membar_enter "void"
.Ft void
.Fn membar_exit "void"
.Ft void
.Fn membar_producer "void"
.Ft void
.Fn membar_consumer "void"
.Ft void
.Fn membar_datadep_consumer "void"
.Ft void
.Fn membar_sync "void"
.Sh DESCRIPTION
The
.Nm
family of functions prevent reordering of memory operations, as needed
for synchronization in multiprocessor execution environments that have
relaxed load and store order.

.Pp
In general, memory barriers must come in pairs \(em a barrier on one
CPU, such as
.Fn membar_exit ,
must pair with a barrier on another CPU, such as
.Fn membar_enter ,
in order to synchronize anything between the two CPUs.
Code using
.Nm
should generally be annotated with comments identifying how they are
paired.

.Pp
.Nm
affect only operations on regular memory, not on device memory; see
.Xr bus_space 9
and
.Xr bus_dma 9
for machine-independent interfaces to handling device memory and DMA
operations for device drivers.

.Pp
Unlike C11,
.Em all
memory operations \(em that is, all loads and stores on regular
memory \(em are affected by
.Nm ,
not just C11 atomic operations on
.Vt _Atomic\^ Ns -qualified
objects.
.Bl -tag -width abcd
.It Fn membar_enter
Any store preceding
.Fn membar_enter
will happen before all memory operations following it.

.Pp
An atomic read/modify/write operation
.Pq Xr atomic_ops 3
followed by a
.Fn membar_enter
implies a
.Em load-acquire
operation in the language of C11.

.Pp
.Sy WARNING :
A load followed by
.Fn membar_enter
.Em does not
imply a
.Em load-acquire
operation, even though
.Fn membar_exit
followed by a store implies a
.Em store-release
operation; the symmetry of these names and asymmetry of the semantics
is a historical mistake.
In the
.Nx
kernel, you can use
.Xr atomic_load_acquire 9
for a
.Em load-acquire
operation without any atomic read/modify/write.

.Pp
.Fn membar_enter
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
.Fn membar_exit ;
see below for an example.
.It Fn membar_exit
All memory operations preceding
.Fn membar_exit
will happen before any store that follows it.

.Pp
A
.Fn membar_exit
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_exit
should only be used before atomic read/modify/write, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_exit(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .

.Pp
.Fn membar_exit
is typically paired with
.Fn membar_enter ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_exit ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_enter ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_exit();
atomic_dec_uint(&obj->refcnt);

/*
 * thread B -- busy-wait until last reference is released,
 * then lock it by setting refcnt to UINT_MAX
 */
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
	continue;
membar_enter();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed

.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_exit
is guaranteed to happen before everything in thread B after the
.Fn membar_enter ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42;	/* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state));	/* from thread B */
obj->state.mumblefrotz--;
.Ed
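.Pp
For readers more familiar with C11, the release/acquire pairing above
can be approximated in portable C as in the following sketch.
It is not the
.Nx
implementation: the C11 fences are stand-ins for
.Fn membar_exit
and
.Fn membar_enter ,
and
.Vt struct obj ,
.Fn obj_release ,
and
.Fn obj_try_lock_drained
are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; field names loosely mirror the example above. */
struct obj {
	atomic_uint refcnt;
	int mumblefrotz;	/* plain, non-atomic payload */
};

/* Thread A side: update the state, then drop a reference. */
static void
obj_release(struct obj *o)
{
	o->mumblefrotz = 42;
	/* stand-in for membar_exit(): earlier accesses stay before the RMW */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/*
 * Thread B side: one attempt to lock the object once refcnt is zero.
 * Returns true on success; a caller would retry on false.
 */
static bool
obj_try_lock_drained(struct obj *o)
{
	unsigned expected = 0;

	if (!atomic_compare_exchange_strong_explicit(&o->refcnt, &expected,
	    UINT_MAX, memory_order_relaxed, memory_order_relaxed))
		return false;
	/* stand-in for membar_enter() after the atomic read/modify/write */
	atomic_thread_fence(memory_order_acquire);
	assert(o->mumblefrotz == 42);	/* thread A's store is visible */
	o->mumblefrotz--;
	return true;
}
```

The fences are kept separate from the relaxed read/modify/write
operations so that the shape matches the barrier-plus-atomic style of
the example above; in new C11 code one would more likely put the
ordering on the atomic operations themselves.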

.Pp
.Fn membar_exit
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_sync ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_exit .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.

.Pp
.Fn membar_producer
has no analogue in C11.

.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will complete before any loads after it.

.Pp
.Fn membar_consumer
has no analogue in C11.

.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
	/* version number and in-progress bit */
	unsigned	seq;

	/* read-only statistics, too large for atomic load */
	unsigned	foo;
	int		bar;
	uint64_t	baz;
} stats;

	/* producer (must be serialized, e.g. with mutex(9)) */
	stats.seq |= 1;	/* mark update in progress */
	membar_producer();
	stats.foo = count_foo();
	stats.bar = measure_bar();
	stats.baz = enumerate_baz();
	membar_producer();
	stats.seq++;	/* bump version number */

	/* consumer (in parallel w/ producer, other consumers) */
restart:
	while ((seq = stats.seq) & 1)	/* wait for update */
		SPINLOCK_BACKOFF_HOOK;
	membar_consumer();
	foo = stats.foo;	/* read out a candidate snapshot */
	bar = stats.bar;
	baz = stats.baz;
	membar_consumer();
	if (seq != stats.seq)	/* try again if version changed */
		goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;

p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed

.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_exit
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
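.Pp
The initialize-then-publish pattern described here might look as
follows in portable C11.
This is only a sketch under stated assumptions:
.Vt struct widget ,
.Va published ,
.Fn publish ,
and
.Fn try_consume
are hypothetical names, and the consumer uses
.Li memory_order_acquire
as a conservative stand-in for a consume load, which is at least as
strong.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct widget {		/* hypothetical payload type */
	int value;
};

static struct widget *_Atomic published;	/* NULL until published */

/* Producer: finish initialization, then publish the pointer. */
static void
publish(struct widget *w, int value)
{
	w->value = value;
	/* store-release: the initialization above is ordered before it */
	atomic_store_explicit(&published, w, memory_order_release);
}

/* Consumer: returns 0 and fills *out if an object has been published. */
static int
try_consume(int *out)
{
	struct widget *w;

	/* acquire used here as a conservative stand-in for consume */
	w = atomic_load_explicit(&published, memory_order_acquire);
	if (w == NULL)
		return -1;
	*out = w->value;	/* data-dependent load through w */
	return 0;
}
```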

.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;

if (*ok) {
	membar_consumer();
	v = *p;
	consume(v);
}
.Ed

.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.

.Pp
.Fn membar_sync
is a sequential consistency acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.

.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
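.Pp
A common case that needs a full barrier on both sides is
store-then-load ordering, where each party announces itself and then
checks for the other.
The sketch below uses the C11 fence named above as a stand-in for
.Fn membar_sync ;
.Va flag_a ,
.Va flag_b ,
and
.Fn try_enter
are hypothetical names, not part of any
.Nx
interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int flag_a, flag_b;	/* hypothetical "I want in" flags */

/*
 * Raise our own flag, issue a full barrier, then check the other side.
 * When both callers do this, the paired full barriers ensure that at
 * least one of them observes the other's flag, so both cannot succeed.
 */
static bool
try_enter(atomic_int *mine, atomic_int *theirs)
{
	atomic_store_explicit(mine, 1, memory_order_relaxed);
	/* stand-in for membar_sync(): orders the store before the load */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(theirs, memory_order_relaxed) != 0) {
		/* the other side is already in or trying; back off */
		atomic_store_explicit(mine, 0, memory_order_relaxed);
		return false;
	}
	return true;
}
```

One thread would call
.Li "try_enter(&flag_a, &flag_b)"
and the other
.Li "try_enter(&flag_b, &flag_a)" ;
weaker barriers such as a release or acquire fence alone would not
rule out both calls returning true.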

.Pp
A load followed by
.Fn membar_sync ,
serving as a
.Em load-acquire
operation, may also be paired with a prior
.Fn membar_exit
followed by a store, serving as the corresponding
.Em store-release
operation.
However, you should use
.Xr atomic_load_acquire 9
instead of
.No load-then- Ns Fn membar_sync
if it is a regular load, or
.Fn membar_enter
instead of
.Fn membar_sync
if the load is in an atomic read/modify/write operation.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .