History log of /src/sys/uvm/uvm_pdpolicy_clock.c
Revision  Date  Author  Comments
 1.42  20-May-2025  bouyer Remove the redundant kpreempt_disable/kpreempt_enable now that we're
running at splsoftbio. Pointed out by thorpej@
 1.41  19-May-2025  bouyer uvmpdpol_pagerealize(): ucpu->pdqhead is used by a single CPU; but
kpreempt_disable() isn't enough to guard against concurrent access;
interrupts also need to be disabled.
If my analysis is correct, the only place using ucpu->pdqhead which
can be called from interrupt context is uvmpdpol_pagerealize(), and only
from softbio().
So:
- introduce splsoftbio() in sys/spl.h
- protect all accesses to ucpu->pdqhead with splsoftbio()

fixes pr kern/59412: uvmpdpol_pagerealize() queue index out of bound
 1.40  12-Apr-2022  andvar branches: 1.40.4; 1.40.10;
s/stablize/stabilize/
 1.39  11-Jun-2020  ad Counter tweaks:

- Don't need to count anonpages+filepages any more; clean+unknown+dirty for
each kind of page can be summed to get the totals.

- Track the number of free pages with a counter so that it's one less thing
for the allocator to do, which opens up further options there.

- Remove cpu_count_sync_one(). It has no users and doesn't save a whole lot.
For the cheap option, give cpu_count_sync() a boolean parameter indicating
that a cached value is okay, and rate limit the updates for cached values
to hz.
 1.38  11-Jun-2020  ad uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
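
The "cached value is okay" option described in revisions 1.38 and 1.39 can be pictured with a small userspace sketch. It is not the kernel code: `counter_cache`, `counter_total()`, the four-CPU array and the `HZ` value are all hypothetical names chosen for illustration, and the assumption is that "rate limit to hz" means a cached total is refreshed at most once per `HZ` ticks.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ	100	/* assumed ticks per second, illustration only */
#define NCPU	4	/* assumed CPU count, illustration only */

/* Hypothetical stand-in for a set of per-CPU counters plus a cached sum. */
struct counter_cache {
	long per_cpu[NCPU];		/* per-CPU contributions */
	long cached_total;		/* last computed sum */
	unsigned long last_sync;	/* tick at which the sum was computed */
	unsigned long syncs;		/* number of full (expensive) sums done */
};

/*
 * Return the total.  If cached_ok is true and the cached sum is younger
 * than HZ ticks, return it without touching the per-CPU values; otherwise
 * walk every CPU's counter and refresh the cache.
 */
static long
counter_total(struct counter_cache *c, unsigned long now, bool cached_ok)
{
	if (cached_ok && c->syncs != 0 && now - c->last_sync < HZ)
		return c->cached_total;

	long sum = 0;
	for (int i = 0; i < NCPU; i++)
		sum += c->per_cpu[i];
	c->cached_total = sum;
	c->last_sync = now;
	c->syncs++;
	return sum;
}
```

A caller that runs thousands of times a second would pass `true` and pay for the full walk only about once per second; a caller that needs the exact figure passes `false`.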
 1.37  17-May-2020  ad Start trying to reduce cache misses on vm_page during fault processing.

- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter. Mark
pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages. In
all likelihood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
don't activate centre pages unless uvmpdpol says it's useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
 1.36  02-Apr-2020  maxv Hide 'hardclock_ticks' behind a new getticks() function, and use relaxed
atomics internally. Only one caller is converted for now.

Discussed with riastradh@ and ad@.
 1.35  14-Mar-2020  ad uvm_pdpolicy: Require a write lock on the object only for dequeue.
No sense in requiring that for enqueue/activate/deactivate.
 1.34  08-Mar-2020  ad Don't zap the non-pdpolicy bits in pg->pqflags.
 1.33  23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.32  30-Jan-2020  ad uvmpdpol_estimatepageable(): Don't take any locks here. This can be called
from DDB, and in any case the numbers are stale the instant the lock is
dropped, so it just doesn't matter.
 1.31  21-Jan-2020  ad uvmpdpol_pageactive(): the change to not re-activate recently activated
pages worked great with uvm_pageqlock, but it doesn't buy anything any more,
because now the busy pages are likely in a per-CPU queue somewhere waiting
to be processed, and changing the intent on those queued pages costs next
to nothing. Remove this and get back all the bits in pg->pqflags.
 1.30  01-Jan-2020  ad branches: 1.30.2;
Fix a comment.
 1.29  01-Jan-2020  mlelstv explicitly include sys/atomic.h for atomic operations.
 1.28  31-Dec-2019  ad - Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
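
The batching scheme described in revision 1.28 can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the code in uvm_pdpolicy_clock.c: `struct page`, `struct cpu_queue`, `pdq_set_intent()` and `pdq_flush()` are hypothetical names, locking is omitted entirely, and the `flushes` counter merely stands in for "times the global state would have been locked". Only the 128-entry queue size comes from the log entry.

```c
#include <assert.h>
#include <stddef.h>

#define PDQ_SIZE 128	/* per-CPU queue length, as stated in the log */

enum intent { INTENT_NONE, INTENT_ACTIVE, INTENT_INACTIVE, INTENT_DEQUEUE };

/* Hypothetical, heavily reduced page structure. */
struct page {
	enum intent intent;	/* desired state, set by the page's user */
	enum intent state;	/* state actually applied to the global queues */
	int queued;		/* nonzero while sitting in a per-CPU queue */
};

/* Hypothetical per-CPU queue of pages with pending state changes. */
struct cpu_queue {
	struct page *slot[PDQ_SIZE];
	size_t count;
	unsigned long flushes;	/* stands in for global lock acquisitions */
};

/* Make all pending intents real in one batch. */
static void
pdq_flush(struct cpu_queue *q)
{
	if (q->count == 0)
		return;
	q->flushes++;
	for (size_t i = 0; i < q->count; i++) {
		struct page *pg = q->slot[i];
		pg->state = pg->intent;
		pg->intent = INTENT_NONE;
		pg->queued = 0;
	}
	q->count = 0;
}

/* Record the intended state on a page and defer the real update. */
static void
pdq_set_intent(struct cpu_queue *q, struct page *pg, enum intent want)
{
	pg->intent = want;
	if (pg->queued)
		return;		/* already queued: only the intent changes */
	if (q->count == PDQ_SIZE)
		pdq_flush(q);	/* queue full: apply the batch first */
	pg->queued = 1;
	q->slot[q->count++] = pg;
}
```

In this sketch, changing the intent of a page that is already queued costs a single store, which is the property revision 1.31 later relies on.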
 1.27  31-Dec-2019  ad Rename uvm_free() -> uvm_availmem().
 1.26  31-Dec-2019  ad Rename uvm_page_locked_p() -> uvm_page_owner_locked_p()
 1.25  30-Dec-2019  ad Whitespace.
 1.24  30-Dec-2019  ad pagedaemon:

- Use marker pages to keep place in the queue when scanning, rather than
relying on assumptions.

- In uvmpdpol_balancequeue(), lock the object once instead of twice.

- When draining pools, the situation is getting desperate, but try to avoid
saturating the system with xcall, lock and interrupt activity by sleeping
for 1 clock tick if being continually awoken and all pools have been
cycled through at least once.

- Pause & resume the freelist cache during pool draining.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.23  27-Dec-2019  ad vm_page: Now that listq is gone, give the pagedaemon its own private
TAILQ_ENTRY, so that update of page replacement state can be made
asynchronous/lazy. No functional change.
 1.22  23-Dec-2019  ad uvmpdpol_selectvictim: don't assert wire_count == 0, as we can (safely)
race with object owner and wired pages can very briefly appear on the queue.
 1.21  21-Dec-2019  ad uvmexp.free -> uvm_free()
 1.20  16-Dec-2019  ad - Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).
 1.19  16-Dec-2019  ad Use the high bits of pqflags for PQ_TIME, not low.
 1.18  13-Dec-2019  ad Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.17  30-Jan-2012  para branches: 1.17.48;
removed code from uvmpdpol_needsscan_p that got there by mistake
pointed out by yamt@
 1.16  28-Jan-2012  rmind pool_page_alloc, pool_page_alloc_meta: avoid extra compare, use const.
ffs_mountfs,sys_swapctl: replace memset with kmem_zalloc.
sys_swapctl: move kmem_free outside the lock path.
uvm_init: fix comment, remove pointless numeration of steps.
uvm_map_enter: remove meflagval variable.
Fix some indentation.
 1.15  27-Jan-2012  para extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
 1.14  12-Jun-2011  rmind branches: 1.14.6;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.13  02-Feb-2011  chuck branches: 1.13.2;
update license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
 1.12  04-Jun-2008  ad branches: 1.12.16; 1.12.20; 1.12.26; 1.12.28;
vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both.
 1.11  07-Mar-2008  martin branches: 1.11.2; 1.11.4; 1.11.6;
Swap sysctl -d description of vm.filemin and vm.execmin. Noted by
Raymond Meyer on current-users.
 1.10  18-Jan-2008  yamt branches: 1.10.2; 1.10.6;
push pmap_clear_reference calls into pdpolicy code, where reference bits
actually matter.
 1.9  02-Jan-2008  ad Merge vmlocking2 to head.
 1.8  22-Feb-2007  thorpej branches: 1.8.4; 1.8.18; 1.8.24; 1.8.26; 1.8.30;
TRUE -> true, FALSE -> false
 1.7  21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.6  19-Jan-2007  skrll branches: 1.6.2;
Remove useless double assignment.

PR 35442
 1.5  01-Nov-2006  yamt branches: 1.5.4;
remove some __unused from function parameters.
 1.4  12-Oct-2006  yamt move some knowledge about vnode into uvm_vnode.c.
 1.3  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.2  15-Sep-2006  yamt branches: 1.2.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy
 1.1  05-Mar-2006  yamt branches: 1.1.2; 1.1.6;
file uvm_pdpolicy_clock.c was initially added on branch yamt-pdpolicy.
 1.1.6.2  01-Feb-2007  ad Sync with head.
 1.1.6.1  18-Nov-2006  ad Sync with head.
 1.1.2.2  15-Sep-2006  yamt make UVM_KICK_PDAEMON() a real function and stop including
uvm_pdpolicy.h from uvm.h. this also fixes build of pmap(1).
 1.1.2.1  05-Mar-2006  yamt separate page replacement policy from the rest of kernel.
 1.2.2.2  10-Dec-2006  yamt sync with head.
 1.2.2.1  22-Oct-2006  yamt sync with head
 1.5.4.5  17-Mar-2008  yamt sync with head.
 1.5.4.4  21-Jan-2008  yamt sync with head
 1.5.4.3  26-Feb-2007  yamt sync with head.
 1.5.4.2  30-Dec-2006  yamt sync with head.
 1.5.4.1  01-Nov-2006  yamt file uvm_pdpolicy_clock.c was added on branch yamt-lazymbuf on 2006-12-30 20:51:05 +0000
 1.6.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.8.30.2  19-Jan-2008  bouyer Sync with HEAD
 1.8.30.1  02-Jan-2008  bouyer Sync with HEAD
 1.8.26.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.8.24.1  18-Feb-2008  mjf Sync with HEAD.
 1.8.18.2  23-Mar-2008  matt sync with HEAD
 1.8.18.1  09-Jan-2008  matt sync with HEAD
 1.8.4.1  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.10.6.2  05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.10.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.10.2.1  24-Mar-2008  keiichi sync with head.
 1.11.6.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.11.4.1  04-May-2009  yamt sync with head.
 1.11.2.1  17-Jun-2008  yamt sync with head.
 1.12.28.1  08-Feb-2011  bouyer Sync with HEAD
 1.12.26.1  06-Jun-2011  jruoho Sync with HEAD.
 1.12.20.2  05-Mar-2011  rmind sync with head
 1.12.20.1  17-Mar-2010  rmind Reorganise UVM locking to protect P->V state and serialise pmap(9)
operations on the same page(s) by always locking their owner. Hence
lock order: "vmpage"-lock -> pmap-lock.

Patch, proposed on tech-kern@, from Andrew Doran.
 1.12.16.9  07-May-2012  matt Fix typo.
 1.12.16.8  27-Apr-2012  matt Don't decrement pgrp_active in radioactive page dequeue since we don't
increment it when activating a radioactive page.
 1.12.16.7  17-Apr-2012  matt If freemin is 0, don't say a scan is needed.
 1.12.16.6  12-Apr-2012  matt Use PQ_SWAPBACKED to determine radioactiveness of page.
Make sure to add in number of radioactive pages to actives pages.
 1.12.16.5  12-Apr-2012  matt Separate object-less anon pages out of the active list if there is no swap
device. Make uvm_reclaimable and uvm.*estimatable understand colors and
kmem allocations.
 1.12.16.4  17-Feb-2012  matt Assert the page isn't free before munging with its pageq.
 1.12.16.3  12-Feb-2012  matt Disable some of the more aggressive debug checks since with lots of pages, they
cause O(n^2) increases in time.
 1.12.16.2  09-Feb-2012  matt Major changes to uvm.
Support multiple collections (groups) of free pages and run the page
reclamation algorithm on each group independently.
 1.12.16.1  21-Jan-2012  matt Use pg instead of p as a pointer to struct uvm_page.
 1.13.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.14.6.1  18-Feb-2012  mrg merge to -current.
 1.17.48.1  08-Apr-2020  martin Merge changes from current as of 20200406
 1.30.2.2  29-Feb-2020  ad Sync with head.
 1.30.2.1  25-Jan-2020  ad Sync with head.
 1.40.10.1  02-Aug-2025  perseant Sync with HEAD
 1.40.4.1  28-May-2025  martin Pull up following revision(s) (requested by bouyer in ticket #1121):

sys/arch/ia64/include/intr.h: revision 1.9
sys/uvm/uvm_pdpolicy_clock.c: revision 1.41
sys/sys/spl.h: revision 1.11
sys/uvm/uvm_pdpolicy_clock.c: revision 1.42
sys/arch/sparc64/include/psl.h: revision 1.66

uvmpdpol_pagerealize(): ucpu->pdqhead is used by a single CPU; but

kpreempt_disable() isn't enough to guard against concurrent access;
interrupts also need to be disabled.

If my analysis is correct, the only place using ucpu->pdqhead which
can be called from interrupt context is uvmpdpol_pagerealize(), and only
from softbio().

So:
- introduce splsoftbio() in sys/spl.h
- protect all accesses to ucpu->pdqhead with splsoftbio()
fixes pr kern/59412: uvmpdpol_pagerealize() queue index out of bound

Provide splsoftbio()

Remove the redundant kpreempt_disable/kpreempt_enable now that we're
running at splsoftbio. Pointed out by thorpej@
