History log of /src/sys/kern/subr_kcpuset.c
Revision  Date  Author  Comments
 1.20  23-Sep-2023  ad Reapply this change with a couple of bugs fixed:

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds, presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.
 1.19  12-Sep-2023  ad Back out recent change to replace pool_cache with the general allocator.
Will return to this when I have time again.
 1.18  11-Sep-2023  martin Add missing <sys/intr.h> include (previously indirectly hidden via pool.h)
 1.17  10-Sep-2023  ad - Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds, presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.
 1.16  01-Sep-2023  skrll Trailing whitespace.
 1.15  09-Apr-2023  riastradh kern: KASSERT(A && B) -> KASSERT(A); KASSERT(B)
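For illustration (a hypothetical assertion, not a line from the commit), the split form makes the failing conjunct visible in the panic message:

	/* before: a failure doesn't say which half was false */
	KASSERT(kc != NULL && kc->kc_refcnt > 0);
	/* after: each check fails, and reports, on its own */
	KASSERT(kc != NULL);
	KASSERT(kc->kc_refcnt > 0);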
 1.14  09-Apr-2022  riastradh sys: Use membar_release/acquire around reference drop.

This just goes through my recent reference count membar audit and
changes membar_exit to membar_release and membar_enter to
membar_acquire -- this should make everything cheaper on most CPUs
without hurting correctness, because membar_acquire is generally
cheaper than membar_enter.
 1.13  12-Mar-2022  riastradh sys: Membar audit around reference count releases.

If two threads are using an object that is freed when the reference
count goes to zero, we need to ensure that all memory operations
related to the object happen before freeing the object.

Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one
thread takes responsibility for freeing, but it's not enough to
ensure that the other thread's memory operations happen before the
freeing.

Consider:

	Thread A                        Thread B
	obj->foo = 42;                  obj->baz = 73;
	mumble(&obj->bar);              grumble(&obj->quux);
	/* membar_exit(); */            /* membar_exit(); */
	atomic_dec -- not last          atomic_dec -- last
	                                /* membar_enter(); */
	                                KASSERT(invariant(obj->foo,
	                                    obj->bar));
	                                free_stuff(obj);

The memory barriers ensure that

obj->foo = 42;
mumble(&obj->bar);

in thread A happens before

KASSERT(invariant(obj->foo, obj->bar));
free_stuff(obj);

in thread B. Without them, this ordering is not guaranteed.

So in general it is necessary to do

membar_exit();
if (atomic_dec_uint_nv(&obj->refcnt) != 0)
return;
membar_enter();

to release a reference, for the `last one out hit the lights' style
of reference counting. (This is in contrast to the style where one
thread blocks new references and then waits under a lock for existing
ones to drain with a condvar -- no membar needed thanks to mutex(9).)
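A self-contained sketch of that pattern (illustrative only: the helper name obj_rele is hypothetical, and rev 1.14 above later renames these barriers to membar_release()/membar_acquire()):

	#include <sys/types.h>
	#include <sys/atomic.h>

	/*
	 * Hypothetical helper: returns true iff the caller dropped the
	 * last reference and must therefore free the object.
	 */
	static inline bool
	obj_rele(volatile unsigned int *refcnt)
	{
		membar_exit();		/* publish our writes first */
		if (atomic_dec_uint_nv(refcnt) != 0)
			return false;
		membar_enter();		/* observe the other threads' writes */
		return true;
	}

A caller then frees only on a true return: if (obj_rele(&obj->refcnt)) free_stuff(obj);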

I searched for atomic_dec to find all these. Obviously we ought to
have a better abstraction for this because there's so much copypasta.
This is a stop-gap measure to fix actual bugs until we have that. It
would be nice if an abstraction could gracefully handle the different
styles of reference counting in use -- some years ago I drafted an
API for this, but making it cover everything got a little out of hand
(particularly with struct vnode::v_usecount) and I ended up setting
it aside to work on psref/localcount instead for better scalability.

I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I
only put it on things that look performance-critical on 5sec review.
We should really adopt membar_enter_preatomic/membar_exit_postatomic
or something (except they are applicable only to atomic r/m/w, not to
atomic_load/store_*, making the naming annoying) and get rid of all
the ifdefs.
 1.12  26-Jul-2019  msaitoh Set kcpuset's bit correctly to avoid undefined behavior. Found by KUBSan.
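For context, the classic KUBSan finding in bitmap code is a signed left shift; a sketch of the pattern (illustrative only, the exact change is not reproduced in this log):

	/* Undefined behavior when (i & 31) == 31: 1 << 31 overflows a signed int. */
	bits[i >> 5] |= 1 << (i & 31);
	/* Well-defined: shift an unsigned constant instead. */
	bits[i >> 5] |= 1U << (i & 31);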
 1.11  19-May-2014  rmind branches: 1.11.28;
Constify kcpuset_countset() and cpu_index() parameters.
 1.10  25-Oct-2013  martin branches: 1.10.2;
Turn a few __unused into __diagused
 1.9  17-Jul-2013  matt Some constification.
Add kcpuset_clone, kcpuset_insersection, kcpuset_remove,
kcpuset_ffs, kcpuset_ffs_intersecting,
kcpuset_atomicly_merge, kcpuset_atomicly_intersect, kcpuset_atomicly_remove
 1.8  16-Sep-2012  rmind branches: 1.8.2; 1.8.8;
Rename kcpuset_copybits() to kcpuset_export_u32() and thus be more specific
about the interface.
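A hedged usage sketch (the prototype follows the kcpuset(9) interface, with the destination length given in bytes; treat the details as assumptions):

	#include <sys/kcpuset.h>

	uint32_t mask[4];

	/* Export the set into a fixed-width u32 bitfield, e.g. for MD code. */
	kcpuset_export_u32(kcp, mask, sizeof(mask));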
 1.7  20-Aug-2012  rmind branches: 1.7.2;
kcpuset_copybits: fix potential endianness problem. Spotted by matt@.
 1.6  06-Jun-2012  rmind A few fixes for Xen:
- cpu_load_pmap: use atomic kcpuset(9) operations; fixes rare crashes.
- Add kcpuset_copybits(9) and replace xen_kcpuset2bits(). Avoids incorrect
ncpu problem in early boot. Also, micro-optimises xen_mcast_invlpg() and
xen_mcast_tlbflush() routines.

Tested by chs@.
 1.5  20-Apr-2012  rmind - Convert x86 MD code, mainly pmap(9), e.g. the TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
 1.4  29-Jan-2012  rmind branches: 1.4.2;
- Add kcpuset_isotherset() and kcpuset_countset().
- Fix KC_NFIELDS_EARLY. Make kcpuset_isset() return bool.
 1.3  07-Aug-2011  rmind branches: 1.3.2; 1.3.6;
- Add an argument to kcpuset_create() for zeroing.
- Add kcpuset_atomic_set(), kcpuset_atomic_clear() and kcpuset_merge().
 1.2  07-Aug-2011  rmind Remove the LW_AFFINITY flag and fix some bugs in affinity mask handling.
 1.1  07-Aug-2011  rmind Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@
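A minimal usage sketch of the interface as it settled in later revisions (names per kcpuset(9); the zeroing argument arrives in rev 1.3 above, so treat this as illustrative rather than the 2011 API verbatim):

	#include <sys/cpu.h>
	#include <sys/kcpuset.h>

	kcpuset_t *kcp;

	kcpuset_create(&kcp, true);		/* allocate a zeroed set */
	kcpuset_set(kcp, cpu_index(curcpu()));	/* mark the current CPU */
	if (kcpuset_isset(kcp, 0)) {
		/* CPU 0 is a member of the set */
	}
	kcpuset_destroy(kcp);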
 1.3.6.2  29-Apr-2012  mrg sync to latest -current.
 1.3.6.1  18-Feb-2012  mrg merge to -current.
 1.3.2.4  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.3.2.3  30-Oct-2012  yamt sync with head
 1.3.2.2  23-May-2012  yamt sync with head.
 1.3.2.1  17-Apr-2012  yamt sync with head
 1.4.2.2  12-Jun-2012  riz Pull up following revision(s) (requested by rmind in ticket #314):
sys/arch/xen/x86/cpu.c: revision 1.92
sys/kern/subr_kcpuset.c: revision 1.6
sys/sys/kcpuset.h: revision 1.6
sys/arch/xen/x86/x86_xpmap.c: revision 1.44
A few fixes for Xen:
- cpu_load_pmap: use atomic kcpuset(9) operations; fixes rare crashes.
- Add kcpuset_copybits(9) and replace xen_kcpuset2bits(). Avoids incorrect
ncpu problem in early boot. Also, micro-optimises xen_mcast_invlpg() and
xen_mcast_tlbflush() routines.
Tested by chs@.
 1.4.2.1  09-May-2012  riz Pull up following revision(s) (requested by rmind in ticket #202):
sys/arch/x86/include/cpuvar.h: revision 1.46
sys/arch/xen/include/xenpmap.h: revision 1.34
sys/arch/i386/include/param.h: revision 1.77
sys/arch/x86/x86/pmap_tlb.c: revision 1.5
sys/arch/x86/x86/pmap_tlb.c: revision 1.6
sys/arch/i386/i386/genassym.cf: revision 1.92
sys/arch/xen/x86/cpu.c: revision 1.91
sys/arch/x86/x86/pmap.c: revision 1.177
sys/arch/xen/x86/xen_pmap.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.31
sys/kern/subr_kcpuset.c: revision 1.5
sys/arch/amd64/include/param.h: revision 1.18
sys/sys/kcpuset.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.26
sys/arch/x86/x86/mtrr_i686.c: revision 1.27
sys/arch/xen/x86/x86_xpmap.c: revision 1.43
sys/arch/x86/x86/cpu.c: revision 1.98
sys/arch/amd64/amd64/mptramp.S: revision 1.14
sys/kern/sys_sched.c: revision 1.42
sys/arch/amd64/amd64/genassym.cf: revision 1.50
sys/arch/i386/i386/mptramp.S: revision 1.24
sys/arch/x86/include/pmap.h: revision 1.52
sys/arch/x86/include/cpu.h: revision 1.50
- Convert x86 MD code, mainly pmap(9), e.g. the TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
- pmap_tlb_shootdown: do not overwrite tp_cpumask with pm_cpus, but merge
like pm_kernel_cpus. Remove unnecessary intersection with kcpuset_running.
Do not reset tp_userpmap if pmap_kernel().
- Remove pmap_tlb_mailbox_t wrapping, which is pointless after recent changes.
- pmap_tlb_invalidate, pmap_tlb_intr: constify for packet structure.
i686_mtrr_init_first: handle the case when there are no variable-size MTRR
registers available (i686_mtrr_vcnt == 0).
 1.7.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.7.2.1  20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.8.8.1  23-Jul-2013  riastradh sync with HEAD
 1.8.2.2  18-May-2014  rmind sync with head
 1.8.2.1  28-Aug-2013  rmind sync with head
 1.10.2.1  10-Aug-2014  tls Rebase.
 1.11.28.1  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
