History log of /src/common/lib/libc/arch/mips/atomic/atomic_swap.S
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: perseant-exfatfs-base-20250801 netbsd-11-base netbsd-10-1-RELEASE perseant-exfatfs-base-20240630 perseant-exfatfs-base netbsd-10-0-RELEASE netbsd-10-0-RC6 netbsd-10-0-RC5 netbsd-10-0-RC4 netbsd-10-0-RC3 netbsd-10-0-RC2 thorpej-ifq-base thorpej-altq-separation-base netbsd-10-0-RC1 netbsd-10-base
# 1.8 27-Feb-2022 riastradh

mips: Membar audit.

This change should be safe because it doesn't remove or weaken any
memory barriers, but does add, clarify, or strengthen barriers.

Goals:

- Make sure mutex_enter/exit and mutex_spin_enter/exit have
acquire/release semantics.

- New macros make maintenance easier and purpose clearer:

. SYNC_ACQ is for load-before-load/store barrier, and BDSYNC_ACQ
for a branch delay slot -- currently defined as plain sync for MP
and nothing, or nop, for UP; thus it is no weaker than SYNC and
BDSYNC as currently defined, which is syncw on Octeon, plain sync
on non-Octeon MP, and nothing/nop on UP.

It is not clear to me whether load-then-syncw or ll/sc-then-syncw
or even bare load provides load-acquire semantics on Octeon -- if
no, this will fix bugs; if yes (like it is on SPARC PSO), we can
relax SYNC_ACQ to be syncw or nothing later.

. SYNC_REL is for load/store-before-store barrier -- currently
defined as plain sync for MP and nothing for UP.

It is not clear to me whether syncw-then-store is enough for
store-release on Octeon -- if no, we can leave this as is; if
yes, we can relax SYNC_REL to be syncw on Octeon.

. SYNC_PLUNGER is there to flush clogged Cavium store buffers, and
BDSYNC_PLUNGER for a branch delay slot -- syncw on Octeon,
nothing or nop on non-Octeon.

=> This is not necessary (or, as far as I'm aware, sufficient)
for acquire semantics -- it serves only to flush store buffers
where stores might otherwise linger for hundreds of thousands
of cycles, which would, e.g., cause spin locks to be held for
unreasonably long durations.

Newerish revisions of the MIPS ISA also have finer-grained sync
variants that could be plopped in here.

Mechanism:

Insert these barriers in the right places, replacing only those where
the definition is currently equivalent, so this change is safe.

- Replace #ifdef _MIPS_ARCH_OCTEONP / syncw / #endif at the end of
atomic_cas_* by SYNC_PLUNGER, which is `sync 4' (a.k.a. syncw) if
__OCTEON__ and empty otherwise.

=> From what I can tell, __OCTEON__ is defined in at least as many
contexts as _MIPS_ARCH_OCTEONP -- i.e., there are some Octeons
with no _MIPS_ARCH_OCTEONP, but I don't know if any of them are
relevant to us or ever saw the light of day outside Cavium; we
seem to buid with `-march=octeonp' so this is unlikely to make a
difference. If it turns out that we do care, well, now there's
a central place to make the distinction for sync instructions.

- Replace post-ll/sc SYNC by SYNC_ACQ in _atomic_cas_*, which are
internal kernel versions used in sys/arch/mips/include/lock.h where
it assumes they have load-acquire semantics. Should move this to
lock.h later, since we _don't_ define __HAVE_ATOMIC_AS_MEMBAR on
MIPS and so the extra barrier might be costly.

- Insert SYNC_REL before ll/sc, and replace post-ll/sc SYNC by
SYNC_ACQ, in _ucas_*, which is used without any barriers in futex
code and doesn't mention barriers in the man page so I have to
assume it is required to be a release/acquire barrier.

- Change BDSYNC to BDSYNC_ACQ in mutex_enter and mutex_spin_enter.
This is necessary to provide load-acquire semantics -- unclear if
it was provided already by syncw on Octeon, but it seems more
likely that either (a) no sync or syncw is needed at all, or (b)
syncw is not enough and sync is needed, since syncw is only a
store-before-store ordering barrier.

- Insert SYNC_REL before ll/sc in mutex_exit and mutex_spin_exit.
This is currently redundant with the SYNC already there, but
SYNC_REL more clearly identifies the necessary semantics in case we
want to define it differently on different systems, and having a
sync in the middle of an ll/sc is a bit weird and possibly not a
good idea, so I intend to (carefully) remove the redundant SYNC in
a later change.

- Change BDSYNC to BDSYNC_PLUNGER at the end of mutex_exit. This has
no semantic change right now -- it's syncw on Octeon, sync on
non-Octeon MP, nop on UP -- but we can relax it later to nop on
non-Cavium MP.

- Leave LLSCSYNC in for now -- it is apparently there for a Cavium
erratum, but I'm not sure what the erratum is, exactly, and I have
no reference for it. I suspect these can be safely removed, but we
might have to double up some other syncw instructions -- Linux uses
it only in store-release sequences, not at the head of every ll/sc.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
# 1.7 06-Aug-2020 skrll

Centralise SYNC/BDSYNC in asm.h and introduce a new LLCSCSYNC and use it
before any ll/sc sequences.

Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS
has errat{um,a} that means the first can fail.


# 1.6 01-Aug-2020 skrll

Trailing whitespace


Revision tags: netbsd-8-3-RELEASE netbsd-9-4-RELEASE netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 bouyer-xenpvh-base phil-wifi-20200411 is-mlppp-base phil-wifi-20200406 netbsd-8-2-RELEASE ad-namecache-base3 netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-20171202 matt-nb8-mediatek-base perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 pgoyette-localcount-20170320 bouyer-socketcan-base pgoyette-localcount-20170107 pgoyette-localcount-20161104 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base
# 1.5 01-Jun-2015 matt

branches: 1.5.16;
Include OCTEON support for syncw and saa/saad (Store Atomic Add).


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base yamt-pagecache-base9 tls-earlyentropy-base riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 tls-maxphys-base yamt-pagecache-base5 yamt-pagecache-base4
# 1.4 14-Mar-2012 christos

don't include <sys/cdefs.h> from assembly.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 netbsd-6-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.3 27-Aug-2011 bouyer

branches: 1.3.2;
loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is useable before every jump register
instruction (exept when register is k0 or k1) because -mfix-loongson2f-btb
needs the at register for its workaround
- add code to mips_fixup.c to handle the instructions added by
-mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have separate handler
for the xtlbmiss exeption.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong
register)


Revision tags: cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base matt-mips64-premerge-20101231 yamt-nfs-mp-base11 yamt-nfs-mp-base10 rmind-uvmplock-base yamt-nfs-mp-base9
# 1.2 14-Dec-2009 matt

Merge from matt-nb5-mips64


Revision tags: jym-xensuspend-nbase yamt-nfs-mp-base8
# 1.1 05-Sep-2009 matt

branches: 1.1.2;
file atomic_swap.S was initially added on branch matt-nb5-mips64.