History log of /src/common/lib/libc/arch/mips/atomic |
Revision | Date | Author | Comments |
1.16 | 30-Mar-2023 |
riastradh | libc: Define __atomic_is_lock_free.
Limited to architectures where it is actually needed by gcc for any calls to stdatomic.h atomic_is_lock_free for now.
We should also add it to other architectures too, along with lockful atomic r/m/w operations for sizes that can't be handled natively, but that's a lot more work. It is also necessary for -fno-inline-atomics but we're missing a lot of other symbols for that too, to be fixed. For now, this should enable the OpenSSL build to complete on these architectures again after I reverted a local change.
XXX pullup-10
|
1.15 | 25-Apr-2021 |
christos | branches: 1.15.6; use ${MACHINE_MIPS64}
|
1.14 | 28-Feb-2019 |
isaki | Add missing atomic_and_{8,16}_nv_cas.c for __sync_and_and_fetch_{1,2}. XXX Why is only atomic_and_* asymmetric, unlike the others (in common/lib/libc/atomic/)?
|
1.13 | 13-Oct-2014 |
martin | branches: 1.13.16; Provide <atomic> C++ 2011 support functions for mips and sh3.
|
1.12 | 24-Feb-2014 |
martin | branches: 1.12.4; Provide cas_16 and cas_8 emulation via cas_32 and use that for mips64
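The idea of emulating a narrow CAS with a 32-bit CAS can be sketched in portable C11. This is a minimal illustration, not the NetBSD source: the name sketch_cas_8 and its lane parameter are hypothetical, and the sketch takes the containing 32-bit word plus a byte lane numbered by bit position (lane 0 is bits 0-7), so it sidesteps the endianness-dependent step of mapping a byte address to a lane.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare-and-swap one byte lane of *word using
 * only a 32-bit CAS.  Returns the byte value observed in the lane; the
 * swap happened if and only if that value equals `expected`.
 */
static uint8_t
sketch_cas_8(_Atomic uint32_t *word, unsigned lane, uint8_t expected,
    uint8_t desired)
{
	const unsigned shift = 8u * (lane & 3u);
	const uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = atomic_load(word);

	for (;;) {
		uint8_t cur = (uint8_t)((old & mask) >> shift);

		if (cur != expected)
			return cur;	/* mismatch: report what we saw */

		/* Splice the new byte into the word, keeping the other lanes. */
		uint32_t repl = (old & ~mask) | ((uint32_t)desired << shift);

		/* On failure `old` is refreshed with the current word; retry. */
		if (atomic_compare_exchange_weak(word, &old, repl))
			return expected;
	}
}
```

The retry loop is needed because the 32-bit CAS can fail when a neighbouring byte changes, even though the target byte still matches.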
|
1.11 | 21-Feb-2014 |
martin | Provide all __sync_* ops in libc.
|
1.10 | 14-Dec-2009 |
matt | branches: 1.10.6; 1.10.12; Merge from matt-nb5-mips64
|
1.9 | 04-Jan-2009 |
pooka | allow inclusion of atomic ops in librump
|
1.8 | 29-Sep-2008 |
ad | branches: 1.8.8; Allow atomic ops to be built as part of libpthread.
|
1.7 | 30-Apr-2008 |
ad | Assembly _atomic_cas_up() for mips. PR lib/38482.
|
1.6 | 11-Feb-2008 |
ad | branches: 1.6.4; Only build atomic ops for libkern/libc.
|
1.5 | 10-Feb-2008 |
ad | Enable the atomic ops in userspace.
|
1.4 | 30-Nov-2007 |
ad | branches: 1.4.4; Memory barriers for MIPS.
|
1.3 | 29-Nov-2007 |
ad | Use the CAS-based inc/dec variants, since these CPUs don't have atomic add in hardware (does arm?).
|
1.2 | 29-Nov-2007 |
ad | Make the 64-bit operations available when possible.
|
1.1 | 29-Nov-2007 |
ad | Atomic ops for MIPS. Use the CAS functions already provided by the kernel, and use the generic C code to provide the rest. Unfortunately the C code assembles up pretty badly on MIPS, but at least it will work.
|
1.4.4.3 | 23-Mar-2008 |
matt | sync with HEAD
|
1.4.4.2 | 09-Jan-2008 |
matt | sync with HEAD
|
1.4.4.1 | 30-Nov-2007 |
matt | file Makefile.inc was added on branch matt-armv6 on 2008-01-09 01:20:58 +0000
|
1.6.4.1 | 18-May-2008 |
yamt | sync with head.
|
1.8.8.2 | 05-Sep-2009 |
matt | Resolve some conflicts.
|
1.8.8.1 | 05-Sep-2009 |
matt | Enable the new atomic op routines on mips64e[bl].
|
1.10.12.1 | 19-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.10.6.1 | 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.12.4.1 | 12-Nov-2014 |
snj | Pull up following revision(s) (requested by martin in ticket #218):
common/lib/libc/arch/arm/atomic/Makefile.inc: revision 1.24-1.26
common/lib/libc/arch/hppa/atomic/Makefile.inc: revision 1.13
common/lib/libc/arch/mips/atomic/Makefile.inc: revision 1.13
common/lib/libc/arch/sh3/atomic/Makefile.inc: revision 1.7
common/lib/libc/arch/sparc/atomic/Makefile.inc: revision 1.18
common/lib/libc/arch/vax/atomic/Makefile.inc: revision 1.7
common/lib/libc/atomic/atomic_and_16_nv_cas.c: revision 1.2
common/lib/libc/atomic/atomic_and_8_nv_cas.c: revision 1.2
common/lib/libc/atomic/atomic_c11_compare_exchange_cas_16.c: revision 1.1-1.2
common/lib/libc/atomic/atomic_c11_compare_exchange_cas_32.c: revision 1.1-1.2
common/lib/libc/atomic/atomic_c11_compare_exchange_cas_8.c: revision 1.1-1.2
common/lib/libc/atomic/atomic_cas_by_cas32.c: revision 1.4
common/lib/libc/atomic/atomic_op_namespace.h: revision 1.7
Add __sync_val_compare_and_swap_{1,2} aliases for _atomic_cas_{8,16}
Provide __atomic_compare_exchange_N (as needed for the C11 2011 <atomic> ops) via the corresponding CAS.
Hook __atomic_compare_exchange_N into vax libc.
Provide __sync_and_and_fetch_2 and __sync_and_and_fetch_1 for pre-ARMv6, they are needed for the C++ 2011 <atomic> stuff.
Add C++ 2011 <atomic> support functions.
Move the and_{16,8}_nv sources into the right (libc only) block.
Provide <atomic> C++ 2011 support functions for mips and sh3.
Provide C++ 2011 <atomic> support functions for hppa and arm.
Provide prototypes to fix build with clang.
|
1.13.16.2 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.13.16.1 | 10-Jun-2019 |
christos | Sync with HEAD
|
1.15.6.1 | 31-Jul-2023 |
martin | Pull up following revision(s) (requested by riastradh in ticket #275):
common/lib/libc/arch/sparc/atomic/Makefile.inc: revision 1.24
common/lib/libc/arch/m68k/atomic/Makefile.inc: revision 1.16
common/lib/libc/arch/mips/atomic/Makefile.inc: revision 1.16
common/lib/libc/arch/hppa/atomic/Makefile.inc: revision 1.15
common/lib/libc/arch/vax/atomic/Makefile.inc: revision 1.9
common/lib/libc/atomic/atomic_is_lock_free.c: revision 1.1
common/lib/libc/arch/sh3/atomic/Makefile.inc: revision 1.9
libc: Define __atomic_is_lock_free.
Limited to architectures where it is actually needed by gcc for any calls to stdatomic.h atomic_is_lock_free for now.
We should also add it to other architectures too, along with lockful atomic r/m/w operations for sizes that can't be handled natively, but that's a lot more work. It is also necessary for -fno-inline-atomics but we're missing a lot of other symbols for that too, to be fixed.
For now, this should enable the OpenSSL build to complete on these architectures again after I reverted a local change.
|
1.7 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.6 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.5 | 01-Jun-2015 |
matt | branches: 1.5.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.4 | 14-Mar-2012 |
christos | don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_add.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.5.16.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.5.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.6 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.5 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.4 | 14-Mar-2012 |
christos | branches: 1.4.34; don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_and.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.4.34.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.4.34.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.9 | 27-Feb-2022 |
riastradh | mips: Membar audit.
This change should be safe because it doesn't remove or weaken any memory barriers, but does add, clarify, or strengthen barriers.
Goals:
- Make sure mutex_enter/exit and mutex_spin_enter/exit have acquire/release semantics.
- New macros make maintenance easier and purpose clearer:
. SYNC_ACQ is for load-before-load/store barrier, and BDSYNC_ACQ for a branch delay slot -- currently defined as plain sync for MP and nothing, or nop, for UP; thus it is no weaker than SYNC and BDSYNC as currently defined, which is syncw on Octeon, plain sync on non-Octeon MP, and nothing/nop on UP.
It is not clear to me whether load-then-syncw or ll/sc-then-syncw or even bare load provides load-acquire semantics on Octeon -- if no, this will fix bugs; if yes (like it is on SPARC PSO), we can relax SYNC_ACQ to be syncw or nothing later.
. SYNC_REL is for load/store-before-store barrier -- currently defined as plain sync for MP and nothing for UP.
It is not clear to me whether syncw-then-store is enough for store-release on Octeon -- if no, we can leave this as is; if yes, we can relax SYNC_REL to be syncw on Octeon.
. SYNC_PLUNGER is there to flush clogged Cavium store buffers, and BDSYNC_PLUNGER for a branch delay slot -- syncw on Octeon, nothing or nop on non-Octeon.
=> This is not necessary (or, as far as I'm aware, sufficient) for acquire semantics -- it serves only to flush store buffers where stores might otherwise linger for hundreds of thousands of cycles, which would, e.g., cause spin locks to be held for unreasonably long durations.
Newerish revisions of the MIPS ISA also have finer-grained sync variants that could be plopped in here.
Mechanism:
Insert these barriers in the right places, replacing only those where the definition is currently equivalent, so this change is safe.
- Replace #ifdef _MIPS_ARCH_OCTEONP / syncw / #endif at the end of atomic_cas_* by SYNC_PLUNGER, which is `sync 4' (a.k.a. syncw) if __OCTEON__ and empty otherwise.
=> From what I can tell, __OCTEON__ is defined in at least as many contexts as _MIPS_ARCH_OCTEONP -- i.e., there are some Octeons with no _MIPS_ARCH_OCTEONP, but I don't know if any of them are relevant to us or ever saw the light of day outside Cavium; we seem to build with `-march=octeonp' so this is unlikely to make a difference. If it turns out that we do care, well, now there's a central place to make the distinction for sync instructions.
- Replace post-ll/sc SYNC by SYNC_ACQ in _atomic_cas_*, which are internal kernel versions used in sys/arch/mips/include/lock.h where it assumes they have load-acquire semantics. Should move this to lock.h later, since we _don't_ define __HAVE_ATOMIC_AS_MEMBAR on MIPS and so the extra barrier might be costly.
- Insert SYNC_REL before ll/sc, and replace post-ll/sc SYNC by SYNC_ACQ, in _ucas_*, which is used without any barriers in futex code and doesn't mention barriers in the man page so I have to assume it is required to be a release/acquire barrier.
- Change BDSYNC to BDSYNC_ACQ in mutex_enter and mutex_spin_enter. This is necessary to provide load-acquire semantics -- unclear if it was provided already by syncw on Octeon, but it seems more likely that either (a) no sync or syncw is needed at all, or (b) syncw is not enough and sync is needed, since syncw is only a store-before-store ordering barrier.
- Insert SYNC_REL before ll/sc in mutex_exit and mutex_spin_exit. This is currently redundant with the SYNC already there, but SYNC_REL more clearly identifies the necessary semantics in case we want to define it differently on different systems, and having a sync in the middle of an ll/sc is a bit weird and possibly not a good idea, so I intend to (carefully) remove the redundant SYNC in a later change.
- Change BDSYNC to BDSYNC_PLUNGER at the end of mutex_exit. This has no semantic change right now -- it's syncw on Octeon, sync on non-Octeon MP, nop on UP -- but we can relax it later to nop on non-Cavium MP.
- Leave LLSCSYNC in for now -- it is apparently there for a Cavium erratum, but I'm not sure what the erratum is, exactly, and I have no reference for it. I suspect these can be safely removed, but we might have to double up some other syncw instructions -- Linux uses it only in store-release sequences, not at the head of every ll/sc.
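The barrier placement this audit describes (an acquire barrier after the atomic that takes a lock, a release barrier before the store that drops it) can be sketched in portable C11. This is only an analogue under stated assumptions, not the MIPS assembly: struct sketch_lock and both functions are hypothetical names, the relaxed exchange stands in for the ll/sc sequence, and the C11 fences stand in for the roles of SYNC_ACQ and SYNC_REL.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration; not a NetBSD structure. */
struct sketch_lock {
	atomic_uint held;	/* 0 = free, 1 = held */
};

/* Try once to take the lock; returns true on success. */
static bool
sketch_lock_tryenter(struct sketch_lock *l)
{
	/* Relaxed atomic r/m/w, standing in for the ll/sc sequence. */
	if (atomic_exchange_explicit(&l->held, 1u, memory_order_relaxed) != 0)
		return false;

	/* Acquire fence after the atomic: the role SYNC_ACQ plays. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Drop the lock; the caller must currently hold it. */
static void
sketch_lock_exit(struct sketch_lock *l)
{
	/* Release fence before the unlocking store: the role SYNC_REL plays. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&l->held, 0u, memory_order_relaxed);
}
```

Keeping the fences separate from the atomic operation mirrors the log's structure, where the barrier macros can be redefined per CPU without touching the ll/sc loops.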
|
1.8 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.7 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.6 | 20-Feb-2019 |
rin | Export atomic_cas_32_ni in a similar manner to its 64-bit counterpart.
Compile test only, but seems trivial enough for me.
Fix build error due to test/lib/libc/atomic/t_atomic_cas.
Note that mips32 does not use atomic_cas.S.
|
1.5 | 19-Feb-2019 |
martin | Add atomic_cas_64_ni alias
|
1.4 | 01-Jun-2015 |
matt | branches: 1.4.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.3 | 14-Mar-2012 |
christos | don't include <sys/cdefs.h> from assembly.
|
1.2 | 14-Dec-2009 |
matt | branches: 1.2.6; Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_cas.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.2.6.1 | 17-Apr-2012 |
yamt | sync with head
|
1.4.16.3 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.4.16.2 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.4.16.1 | 10-Jun-2019 |
christos | Sync with HEAD
|
1.3 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.2 | 25-May-2008 |
chs | branches: 1.2.62; enable profiling of assembly functions.
|
1.1 | 30-Apr-2008 |
ad | branches: 1.1.2; 1.1.4; Assembly _atomic_cas_up() for mips. PR lib/38482.
|
1.1.4.3 | 04-Jun-2008 |
yamt | sync with head
|
1.1.4.2 | 18-May-2008 |
yamt | sync with head.
|
1.1.4.1 | 30-Apr-2008 |
yamt | file atomic_cas_up.S was added on branch yamt-pf42 on 2008-05-18 12:28:45 +0000
|
1.1.2.1 | 23-Jun-2008 |
wrstuden | Sync w/ -current. 34 merge conflicts to follow.
|
1.2.62.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.2.62.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.7 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.6 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.5 | 01-Jun-2015 |
matt | branches: 1.5.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.4 | 14-Mar-2012 |
christos | don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_dec.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.5.16.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.5.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.7 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.6 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.5 | 01-Jun-2015 |
matt | branches: 1.5.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.4 | 14-Mar-2012 |
christos | don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_inc.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.5.16.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.5.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.5 | 27-Feb-2022 |
riastradh | mips: Membar audit.
This change should be safe because it doesn't remove or weaken any memory barriers, but does add, clarify, or strengthen barriers.
Goals:
- Make sure mutex_enter/exit and mutex_spin_enter/exit have acquire/release semantics.
- New macros make maintenance easier and purpose clearer:
. SYNC_ACQ is for load-before-load/store barrier, and BDSYNC_ACQ for a branch delay slot -- currently defined as plain sync for MP and nothing, or nop, for UP; thus it is no weaker than SYNC and BDSYNC as currently defined, which is syncw on Octeon, plain sync on non-Octeon MP, and nothing/nop on UP.
It is not clear to me whether load-then-syncw or ll/sc-then-syncw or even bare load provides load-acquire semantics on Octeon -- if no, this will fix bugs; if yes (like it is on SPARC PSO), we can relax SYNC_ACQ to be syncw or nothing later.
. SYNC_REL is for load/store-before-store barrier -- currently defined as plain sync for MP and nothing for UP.
It is not clear to me whether syncw-then-store is enough for store-release on Octeon -- if no, we can leave this as is; if yes, we can relax SYNC_REL to be syncw on Octeon.
. SYNC_PLUNGER is there to flush clogged Cavium store buffers, and BDSYNC_PLUNGER for a branch delay slot -- syncw on Octeon, nothing or nop on non-Octeon.
=> This is not necessary (or, as far as I'm aware, sufficient) for acquire semantics -- it serves only to flush store buffers where stores might otherwise linger for hundreds of thousands of cycles, which would, e.g., cause spin locks to be held for unreasonably long durations.
Newerish revisions of the MIPS ISA also have finer-grained sync variants that could be plopped in here.
Mechanism:
Insert these barriers in the right places, replacing only those where the definition is currently equivalent, so this change is safe.
- Replace #ifdef _MIPS_ARCH_OCTEONP / syncw / #endif at the end of atomic_cas_* by SYNC_PLUNGER, which is `sync 4' (a.k.a. syncw) if __OCTEON__ and empty otherwise.
=> From what I can tell, __OCTEON__ is defined in at least as many contexts as _MIPS_ARCH_OCTEONP -- i.e., there are some Octeons with no _MIPS_ARCH_OCTEONP, but I don't know if any of them are relevant to us or ever saw the light of day outside Cavium; we seem to build with `-march=octeonp' so this is unlikely to make a difference. If it turns out that we do care, well, now there's a central place to make the distinction for sync instructions.
- Replace post-ll/sc SYNC by SYNC_ACQ in _atomic_cas_*, which are internal kernel versions used in sys/arch/mips/include/lock.h where it assumes they have load-acquire semantics. Should move this to lock.h later, since we _don't_ define __HAVE_ATOMIC_AS_MEMBAR on MIPS and so the extra barrier might be costly.
- Insert SYNC_REL before ll/sc, and replace post-ll/sc SYNC by SYNC_ACQ, in _ucas_*, which is used without any barriers in futex code and doesn't mention barriers in the man page so I have to assume it is required to be a release/acquire barrier.
- Change BDSYNC to BDSYNC_ACQ in mutex_enter and mutex_spin_enter. This is necessary to provide load-acquire semantics -- unclear if it was provided already by syncw on Octeon, but it seems more likely that either (a) no sync or syncw is needed at all, or (b) syncw is not enough and sync is needed, since syncw is only a store-before-store ordering barrier.
- Insert SYNC_REL before ll/sc in mutex_exit and mutex_spin_exit. This is currently redundant with the SYNC already there, but SYNC_REL more clearly identifies the necessary semantics in case we want to define it differently on different systems, and having a sync in the middle of an ll/sc is a bit weird and possibly not a good idea, so I intend to (carefully) remove the redundant SYNC in a later change.
- Change BDSYNC to BDSYNC_PLUNGER at the end of mutex_exit. This has no semantic change right now -- it's syncw on Octeon, sync on non-Octeon MP, nop on UP -- but we can relax it later to nop on non-Cavium MP.
- Leave LLSCSYNC in for now -- it is apparently there for a Cavium erratum, but I'm not sure what the erratum is, exactly, and I have no reference for it. I suspect these can be safely removed, but we might have to double up some other syncw instructions -- Linux uses it only in store-release sequences, not at the head of every ll/sc.
|
1.4 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.3 | 01-Jun-2015 |
matt | branches: 1.3.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.2 | 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
1.1 | 30-Nov-2007 |
ad | branches: 1.1.4; 1.1.8; Memory barriers for MIPS.
|
1.1.8.1 | 18-May-2008 |
yamt | sync with head.
|
1.1.4.2 | 09-Jan-2008 |
matt | sync with HEAD
|
1.1.4.1 | 30-Nov-2007 |
matt | file atomic_op_asm.h was added on branch matt-armv6 on 2008-01-09 01:20:58 +0000
|
1.3.16.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.3.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.6 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.5 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.4 | 14-Mar-2012 |
christos | branches: 1.4.34; don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_or.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.4.34.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.4.34.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.8 | 27-Feb-2022 |
riastradh | mips: Membar audit.
This change should be safe because it doesn't remove or weaken any memory barriers, but does add, clarify, or strengthen barriers.
Goals:
- Make sure mutex_enter/exit and mutex_spin_enter/exit have acquire/release semantics.
- New macros make maintenance easier and purpose clearer:
. SYNC_ACQ is for load-before-load/store barrier, and BDSYNC_ACQ for a branch delay slot -- currently defined as plain sync for MP and nothing, or nop, for UP; thus it is no weaker than SYNC and BDSYNC as currently defined, which is syncw on Octeon, plain sync on non-Octeon MP, and nothing/nop on UP.
It is not clear to me whether load-then-syncw or ll/sc-then-syncw or even bare load provides load-acquire semantics on Octeon -- if no, this will fix bugs; if yes (like it is on SPARC PSO), we can relax SYNC_ACQ to be syncw or nothing later.
. SYNC_REL is for load/store-before-store barrier -- currently defined as plain sync for MP and nothing for UP.
It is not clear to me whether syncw-then-store is enough for store-release on Octeon -- if no, we can leave this as is; if yes, we can relax SYNC_REL to be syncw on Octeon.
. SYNC_PLUNGER is there to flush clogged Cavium store buffers, and BDSYNC_PLUNGER for a branch delay slot -- syncw on Octeon, nothing or nop on non-Octeon.
=> This is not necessary (or, as far as I'm aware, sufficient) for acquire semantics -- it serves only to flush store buffers where stores might otherwise linger for hundreds of thousands of cycles, which would, e.g., cause spin locks to be held for unreasonably long durations.
Newerish revisions of the MIPS ISA also have finer-grained sync variants that could be plopped in here.
Mechanism:
Insert these barriers in the right places, replacing only those where the definition is currently equivalent, so this change is safe.
- Replace #ifdef _MIPS_ARCH_OCTEONP / syncw / #endif at the end of atomic_cas_* by SYNC_PLUNGER, which is `sync 4' (a.k.a. syncw) if __OCTEON__ and empty otherwise.
=> From what I can tell, __OCTEON__ is defined in at least as many contexts as _MIPS_ARCH_OCTEONP -- i.e., there are some Octeons with no _MIPS_ARCH_OCTEONP, but I don't know if any of them are relevant to us or ever saw the light of day outside Cavium; we seem to build with `-march=octeonp' so this is unlikely to make a difference. If it turns out that we do care, well, now there's a central place to make the distinction for sync instructions.
- Replace post-ll/sc SYNC by SYNC_ACQ in _atomic_cas_*, which are internal kernel versions used in sys/arch/mips/include/lock.h where it assumes they have load-acquire semantics. Should move this to lock.h later, since we _don't_ define __HAVE_ATOMIC_AS_MEMBAR on MIPS and so the extra barrier might be costly.
- Insert SYNC_REL before ll/sc, and replace post-ll/sc SYNC by SYNC_ACQ, in _ucas_*, which is used without any barriers in futex code and doesn't mention barriers in the man page so I have to assume it is required to be a release/acquire barrier.
- Change BDSYNC to BDSYNC_ACQ in mutex_enter and mutex_spin_enter. This is necessary to provide load-acquire semantics -- unclear if it was provided already by syncw on Octeon, but it seems more likely that either (a) no sync or syncw is needed at all, or (b) syncw is not enough and sync is needed, since syncw is only a store-before-store ordering barrier.
- Insert SYNC_REL before ll/sc in mutex_exit and mutex_spin_exit. This is currently redundant with the SYNC already there, but SYNC_REL more clearly identifies the necessary semantics in case we want to define it differently on different systems, and having a sync in the middle of an ll/sc is a bit weird and possibly not a good idea, so I intend to (carefully) remove the redundant SYNC in a later change.
- Change BDSYNC to BDSYNC_PLUNGER at the end of mutex_exit. This has no semantic change right now -- it's syncw on Octeon, sync on non-Octeon MP, nop on UP -- but we can relax it later to nop on non-Cavium MP.
- Leave LLSCSYNC in for now -- it is apparently there for a Cavium erratum, but I'm not sure what the erratum is, exactly, and I have no reference for it. I suspect these can be safely removed, but we might have to double up some other syncw instructions -- Linux uses it only in store-release sequences, not at the head of every ll/sc.
|
1.7 | 06-Aug-2020 |
skrll | Centralise SYNC/BDSYNC in asm.h and introduce a new LLSCSYNC and use it before any ll/sc sequences.
Define LLSCSYNC as syncw; syncw for cnMIPS - issue two as early cnMIPS has errat{um,a} that means the first can fail.
|
1.6 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.5 | 01-Jun-2015 |
matt | branches: 1.5.16; Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.4 | 14-Mar-2012 |
christos | don't include <sys/cdefs.h> from assembly.
|
1.3 | 27-Aug-2011 |
bouyer | branches: 1.3.2; loongson2f support:
- Add some loongson2 definitions to cpuregs.h, from OpenBSD
- Make sure that the at register is usable before every jump register instruction (except when the register is k0 or k1) because -mfix-loongson2f-btb needs the at register for its workaround
- Add code to mips_fixup.c to handle the instructions added by -mfix-loongson2f-btb
- Add a ls2-specific tlb miss handler: it doesn't have a separate handler for the xtlbmiss exception.
- Fixes for some #ifdef MIPS3_LOONGSON2 assembly code (using the wrong register)
|
1.2 | 14-Dec-2009 |
matt | Merge from matt-nb5-mips64
|
1.1 | 05-Sep-2009 |
matt | branches: 1.1.2; file atomic_swap.S was initially added on branch matt-nb5-mips64.
|
1.1.2.1 | 05-Sep-2009 |
matt | Add native ll/sc or lld/scd versions of the atomic ops.
|
1.3.2.1 | 17-Apr-2012 |
yamt | sync with head
|
1.5.16.2 | 21-Apr-2020 |
martin | Ooops, restore accidentally removed files from merge mishap
|
1.5.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|
1.13 | 21-Apr-2022 |
riastradh | mips/cavium: Take advantage of Octeon's guaranteed r/rw ordering.
|
1.12 | 09-Apr-2022 |
riastradh | Introduce membar_acquire/release. Deprecate membar_enter/exit.
The names membar_enter/exit were unclear, and the documentation of membar_enter has disagreed with the implementations on sparc, powerpc, and even x86(!) for the entire time it has been in NetBSD.
The terms `acquire' and `release' are ubiquitous in the literature today, and have been adopted in the C and C++ standards to mean load-before-load/store and load/store-before-store, respectively, which are exactly the orderings required by acquiring and releasing a mutex, as well as other useful applications like decrementing a reference count and then freeing the underlying object if it went to zero.
Originally I proposed changing one word in the documentation for membar_enter to make it load-before-load/store instead of store-before-load/store, i.e., to make it an acquire barrier. I proposed this on the grounds that
(a) all implementations guarantee load-before-load/store, (b) some implementations fail to guarantee store-before-load/store, and (c) all uses in-tree assume load-before-load/store.
I verified parts (a) and (b) (except, for (a), powerpc didn't even guarantee load-before-load/store -- isync isn't necessarily enough; need lwsync in general -- but it _almost_ did, and it certainly didn't guarantee store-before-load/store).
Part (c) might not be correct, however: under the mistaken assumption that atomic-r/m/w then membar-w/rw is equivalent to atomic-r/m/w then membar-r/rw, I only audited the cases of membar_enter that _aren't_ immediately after an atomic-r/m/w. All of those cases assume load-before-load/store. But my assumption was wrong -- there are cases of atomic-r/m/w then membar-w/rw that would be broken by changing to atomic-r/m/w then membar-r/rw:
https://mail-index.netbsd.org/tech-kern/2022/03/29/msg028044.html
Furthermore, the name membar_enter has been adopted in other places like OpenBSD where it actually does follow the documentation and guarantee store-before-load/store, even if that order is not useful. So the name membar_enter currently lives in a bad place where it means either of two things -- r/rw or w/rw.
With this change, we deprecate membar_enter/exit, introduce membar_acquire/release as better names for the useful pair (r/rw and rw/w), and make sure the implementation of membar_enter guarantees both what was documented _and_ what was implemented, making it an alias for membar_sync.
While here, rework all of the membar_* definitions and aliases. The new logic follows a rule to make it easier to audit:
membar_X is defined as an alias for membar_Y iff membar_X is guaranteed by membar_Y.
The `no stronger than' relation is (the transitive closure of):
- membar_consumer (r/r) is guaranteed by membar_acquire (r/rw)
- membar_producer (w/w) is guaranteed by membar_release (rw/w)
- membar_acquire (r/rw) is guaranteed by membar_sync (rw/rw)
- membar_release (rw/w) is guaranteed by membar_sync (rw/rw)
And, for the deprecated membars:
- membar_enter (whether r/rw, w/rw, or rw/rw) is guaranteed by membar_sync (rw/rw)
- membar_exit (rw/w) is guaranteed by membar_release (rw/w)
(membar_exit is identical to membar_release, but the name is deprecated.)
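The alias rule above can be checked mechanically. The following is only an illustrative sketch, not code from the tree: it models each barrier as a set of ordered pairs (load-before-load, load-before-store, store-before-load, store-before-store) and treats "X is guaranteed by Y" as a subset test. All identifiers (ORD_*, MB_*, guaranteed_by, check_alias_rule) are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Each bit names one "earlier access / later access" pair a barrier orders. */
enum {
	ORD_LL = 1 << 0,	/* load-before-load */
	ORD_LS = 1 << 1,	/* load-before-store */
	ORD_SL = 1 << 2,	/* store-before-load */
	ORD_SS = 1 << 3		/* store-before-store */
};

/* Orderings as described in the commit message (hypothetical names). */
enum {
	MB_CONSUMER  = ORD_LL,					/* r/r */
	MB_PRODUCER  = ORD_SS,					/* w/w */
	MB_ACQUIRE   = ORD_LL | ORD_LS,				/* r/rw */
	MB_RELEASE   = ORD_LS | ORD_SS,				/* rw/w */
	MB_ENTER_DOC = ORD_SL | ORD_SS,				/* w/rw, as documented */
	MB_SYNC      = ORD_LL | ORD_LS | ORD_SL | ORD_SS	/* rw/rw */
};

/* True iff every ordering promised by x is also promised by y. */
static bool
guaranteed_by(unsigned x, unsigned y)
{
	return (x & ~y) == 0;
}

/* Assert the relations listed in the commit message. */
static void
check_alias_rule(void)
{
	assert(guaranteed_by(MB_CONSUMER, MB_ACQUIRE));
	assert(guaranteed_by(MB_PRODUCER, MB_RELEASE));
	assert(guaranteed_by(MB_ACQUIRE, MB_SYNC));
	assert(guaranteed_by(MB_RELEASE, MB_SYNC));
	/* membar_enter under either reading is covered only by membar_sync. */
	assert(guaranteed_by(MB_ENTER_DOC | MB_ACQUIRE, MB_SYNC));
	assert(!guaranteed_by(MB_ENTER_DOC, MB_ACQUIRE));
}
```

Under this model, aliasing membar_enter to anything weaker than membar_sync would drop the store-before-load pair, which is why the deprecated name ends up as an alias for membar_sync.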
Finally, while here, annotate some of the instructions with their semantics. For powerpc, leave an essay with citations on the unfortunate but -- as far as I can tell -- necessary decision to use lwsync, not isync, for membar_acquire and membar_consumer.
Also add membar(3) and atomic(3) man page links.
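As a rough illustration of the reference-count use case mentioned above, here is a minimal sketch written with standard C11 <stdatomic.h> rather than the membar_ops(3) interface, so it is an analogue and not the code touched by this commit. The struct and function names (struct obj, obj_release) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_uint refcnt;
	int payload;
};

/*
 * Drop one reference.  Returns true if the caller dropped the last
 * reference and may now destroy the object.
 */
static bool
obj_release(struct obj *o)
{
	/*
	 * Release: all of this thread's earlier loads and stores to *o
	 * are ordered before the decrement (load/store-before-store).
	 */
	unsigned old = atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_release);

	assert(old > 0);	/* catch over-release */
	if (old != 1)
		return false;

	/*
	 * Acquire: having observed the count reach zero, order that
	 * observation before the destruction that follows
	 * (load-before-load/store).
	 */
	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

The release half is needed on every decrement; the acquire half is only needed on the path that goes on to free the object.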
|
1.11 | 12-Feb-2022 |
riastradh | mips: Brush up __cpu_simple_lock.
- Eradicate last vestiges of mb_* barriers.
- In __cpu_simple_lock_init, omit needless barrier. It is the caller's responsibility to ensure __cpu_simple_lock_init happens before other operations on it anyway, so there was never any need for a barrier here.
- In __cpu_simple_lock_try, leave comments about memory ordering guarantees of the kernel's _atomic_cas_uint, which are inexplicably different from the non-underscored atomic_cas_uint.
- In __cpu_simple_unlock, use membar_exit instead of mb_memory, and do it unconditionally.
This ensures that in __cpu_simple_lock/.../__cpu_simple_unlock, all memory operations in the ellipsis happen before the store that releases the lock.
- On Octeon, the barrier was omitted altogether, which is a bug -- it needs to be there or else there is no happens-before relation and whoever takes the lock next might see stale values stored or even stomp over the unlocking CPU's delayed loads.
- On non-Octeon, the mb_memory was sync. Using membar_exit preserves this.
XXX On Octeon, membar_exit only issues syncw -- this seems wrong, only store-before-store and not load/store-before-store, unless the CNMIPS architecture guarantees it is sufficient here like SPARCv8/v9 PSO (`Partial Store Order').
- Leave an essay with citations about why we have an apparently pointless syncw _after_ releasing a lock, to work around a design bug^W^Wquirk in cnmips which sometimes buffers stores for hundreds of thousands of cycles for fun unless you issue syncw.
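The ordering requirements in the list above can be sketched in portable C11. This is an assumption-laden analogue, not the MIPS __cpu_simple_lock implementation: the type and function names (simple_lock_t, simple_lock_init, simple_lock_try, simple_unlock) are hypothetical, and the Octeon-specific trailing syncw is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_uint locked;	/* 0 = free, 1 = held */
} simple_lock_t;

/* No barrier: the caller must publish the lock before others use it. */
static void
simple_lock_init(simple_lock_t *l)
{
	atomic_init(&l->locked, 0);
}

/* Try once to take the lock; acquire ordering on success only. */
static bool
simple_lock_try(simple_lock_t *l)
{
	unsigned expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked,
	    &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/*
 * Release the lock unconditionally with release ordering, so every
 * load and store in the critical section happens before the store
 * that frees the lock.
 */
static void
simple_unlock(simple_lock_t *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A store-before-store barrier alone would not be enough for the unlock path in this model, because loads inside the critical section also have to complete before the releasing store.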
|
1.10 | 10-Aug-2020 |
skrll | More SYNC centralisation
|
1.9 | 01-Aug-2020 |
skrll | Trailing whitespace
|
1.8 | 23-Jun-2015 |
matt | branches: 1.8.16; Always use sync if mips3 or later or not using O32 ABI. (A little redundant since not using O32 means you are using mips3 or later.)
|
1.7 | 22-Jun-2015 |
matt | #include "assym.h", but don't include "assym.h" with _RUMPKERNEL defined.
|
1.6 | 01-Jun-2015 |
matt | Include OCTEON support for syncw and saa/saad (Store Atomic Add).
|
1.5 | 03-Aug-2012 |
matt | Add a missing .set noreorder
|
1.4 | 14-Dec-2009 |
matt | branches: 1.4.6; Merge from matt-nb5-mips64
|
1.3 | 25-May-2008 |
chs | branches: 1.3.10; enable profiling of assembly functions.
|
1.2 | 28-Apr-2008 |
martin | branches: 1.2.2; Remove clause 3 and 4 from TNF licenses
|
1.1 | 30-Nov-2007 |
ad | branches: 1.1.4; 1.1.8; Memory barriers for MIPS.
|
1.1.8.2 | 04-Jun-2008 |
yamt | sync with head
|
1.1.8.1 | 18-May-2008 |
yamt | sync with head.
|
1.1.4.2 | 09-Jan-2008 |
matt | sync with HEAD
|
1.1.4.1 | 30-Nov-2007 |
matt | file membar_ops.S was added on branch matt-armv6 on 2008-01-09 01:20:59 +0000
|
1.2.2.1 | 23-Jun-2008 |
wrstuden | Sync w/ -current. 34 merge conflicts to follow.
|
1.3.10.3 | 03-Aug-2012 |
matt | Add missing .set noreorder
|
1.3.10.2 | 05-Sep-2009 |
matt | Resolve some conflicts.
|
1.3.10.1 | 05-Sep-2009 |
matt | Only allow to null on o32
|
1.4.6.1 | 30-Oct-2012 |
yamt | sync with head
|
1.8.16.2 | 21-Apr-2020 |
martin | Oops, restore accidentally removed files from merge mishap
|
1.8.16.1 | 21-Apr-2020 |
martin | Sync with HEAD
|