History log of /src/sys/uvm/uvm_pdaemon.c
Revision  Date  Author  Comments
 1.134  10-Sep-2023  ad uvmpd_trylockowner(): release pg->interlock before calling rw_obj_free()
since it can call back into the VM system.
 1.133  17-Apr-2021  mrg branches: 1.133.16;
fix error in previous: UVMHIST_PDHIST_SIZE needs to stay next to pdhistbuf[].
 1.132  17-Apr-2021  mrg remove KERNHIST_INIT_STATIC(). it straddles the line between usable
early in boot and broken early in boot by requiring a partly static
structure with another structure that must be present by the time
any uses are performed. theoretically platform code could allocate
a chunk while setting up memory and assign it here, giving a dynamic
sizing for the entry list, but the reality is that all users have
a statically allocated entry list as well.

the existing KERNHIST_LINK_STATIC() is used in conjunction with
KERNHIST_INITIALIZER() instead.

this stops a NULL pointer deref when the _LOG() macro is called
before the storage is linked in, which happens with GCC 10 on OCTEON
with UVMHIST enabled, crashing in very early kernel init.
 1.131  04-Nov-2020  chs branches: 1.131.2;
In uvmpd_trylockowner(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.
 1.130  09-Jul-2020  skrll branches: 1.130.2;
Consistently use UVMHIST(__func__)

Convert UVMHIST_{CALLED,LOG} into UVMHIST_CALLARGS
 1.129  11-Jun-2020  ad Counter tweaks:

- Don't need to count anonpages+filepages any more; clean+unknown+dirty for
each kind of page can be summed to get the totals.

- Track the number of free pages with a counter so that it's one less thing
for the allocator to do, which opens up further options there.

- Remove cpu_count_sync_one(). It has no users and doesn't save a whole lot.
For the cheap option, give cpu_count_sync() a boolean parameter indicating
that a cached value is okay, and rate limit the updates for cached values
to hz.
 1.128  11-Jun-2020  ad uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
 1.127  25-May-2020  ad uvm_pageout_done(): do nothing when npages is zero.
 1.126  13-Apr-2020  maxv hardclock_ticks -> getticks()
 1.125  23-Feb-2020  ad branches: 1.125.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.124  18-Feb-2020  chs remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.
 1.123  15-Jan-2020  ad Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
 1.122  31-Dec-2019  ad branches: 1.122.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
 1.121  31-Dec-2019  ad Rename uvm_free() -> uvm_availmem().
 1.120  31-Dec-2019  ad Rename uvm_page_locked_p() -> uvm_page_owner_locked_p()
 1.119  30-Dec-2019  ad pagedaemon:

- Use marker pages to keep place in the queue when scanning, rather than
relying on assumptions.

- In uvmpdpol_balancequeue(), lock the object once instead of twice.

- When draining pools, the situation is getting desperate, but try to avoid
saturating the system with xcall, lock and interrupt activity by sleeping
for 1 clock tick if being continually awoken and all pools have been
cycled through at least once.

- Pause & resume the freelist cache during pool draining.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
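
The marker-page idea from the first bullet of this commit can be sketched with the TAILQ macros from <sys/queue.h>. This is a single-threaded illustration under stated assumptions, not the pagedaemon's code: `struct page`, `scan_with_marker()` and `drop_odd()` are hypothetical names, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical page record; marker entries carry no data. */
struct page {
	TAILQ_ENTRY(page) link;
	bool marker;
	int  id;
};
TAILQ_HEAD(pageq, page);

/*
 * Visit every real page once.  visit() may remove the page it is
 * given from the queue; the marker inserted after that page keeps
 * the scan position valid.  Returns the number of pages visited.
 */
int
scan_with_marker(struct pageq *q, void (*visit)(struct pageq *, struct page *))
{
	struct page marker = { .marker = true };
	struct page *pg, *next;
	int visited = 0;

	assert(q != NULL && visit != NULL);

	for (pg = TAILQ_FIRST(q); pg != NULL; pg = next) {
		if (pg->marker) {
			/* Another scanner's marker: step over it. */
			next = TAILQ_NEXT(pg, link);
			continue;
		}
		TAILQ_INSERT_AFTER(q, pg, &marker, link);
		visit(q, pg);	/* pg may be off the queue after this */
		visited++;
		next = TAILQ_NEXT(&marker, link);
		TAILQ_REMOVE(q, &marker, link);
	}
	return visited;
}

/* Example visitor: take odd-numbered pages off the queue. */
void
drop_odd(struct pageq *q, struct page *pg)
{
	if (pg->id % 2 != 0)
		TAILQ_REMOVE(q, pg, link);
}
```

Because the next position is read from the marker rather than from the page just processed, the scan does not depend on that page still being queued.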
 1.118  21-Dec-2019  ad Fix merge error - don't init uvmpd_lock twice.
 1.117  21-Dec-2019  ad Detangle the pagedaemon from uvm_fpageqlock:

- Have a single lock (uvmpd_lock) to protect pagedaemon state that was
previously covered by uvmpd_pool_drain_lock plus uvm_fpageqlock.
- Don't require any locks be held when calling uvm_kick_pdaemon().
- Use uvm_free().
 1.116  21-Dec-2019  ad uvm_reclaimable(): need to sum the per-CPU values for filepages/execpages.
 1.115  14-Dec-2019  ad The uvmexp.pdpending change was incorrect - revert for now.
 1.114  14-Dec-2019  ad Adjust pdpending in uvm_pageout_start() and uvm_pageout_done() to avoid
the value going temporarily negative.
 1.113  13-Dec-2019  ad Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.112  01-Dec-2019  ad - Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static.
- A bit more __cacheline_aligned on mutexes.
 1.111  01-Oct-2019  chs in uvm_wait(), panic if the pagedaemon thread does not exist.
this avoids a hang if the system runs out of memory before
the mechanisms for reclaiming memory have been set up.
 1.110  21-Apr-2019  chs Draining pools from the pagedaemon thread can deadlock, because draining
a pool can involve taking a lock which can be held by a thread which is
blocked waiting for memory. Avoid this by moving the pool-draining work
to a separate worker thread.
 1.109  28-Oct-2017  pgoyette branches: 1.109.4;
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
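
The pointer and format-string convention described in this commit can be illustrated in userland C. This is only a sketch of the casting and the 'j' length modifier; `hist_ptr_arg()` and `hist_format_hex()` are hypothetical helpers, not part of kernhist(9).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Widen a pointer as the log describes: first to uintptr_t, then to
 * uintmax_t, which avoids a "different size" cast warning.
 */
uintmax_t
hist_ptr_arg(const void *p)
{
	return (uintmax_t)(uintptr_t)p;
}

/*
 * Format one uintmax_t argument with "%#jx", the replacement the log
 * gives for "%p".  Returns what snprintf() returns.
 */
int
hist_format_hex(char *buf, size_t len, uintmax_t arg)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%#jx", arg);
}
```

Since every argument has the same type, a non-literal format string cannot cause printf() to misjudge argument sizes.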
 1.108  25-Oct-2013  martin branches: 1.108.22;
Mark a diagnostic-only variable
 1.107  30-Jul-2012  matt branches: 1.107.2; 1.107.4;
-fno-common broke kernhist since it used commons.
Add a KERNHIST_DEFINE which defines the kernel history.
Change UVM to deal with the new usage.
 1.106  05-Jun-2012  jym Now that pool_cache_invalidate() is synchronous and can handle per-CPU
caches, merge together pool_drain_start() and pool_drain_end() into

bool pool_drain(struct pool **ppp);

"bool" value indicates whether reclaiming was fully done (true) or not (false)
"ppp" will contain a pointer to the pool that was drained (optional).

See http://mail-index.netbsd.org/tech-kern/2012/06/04/msg013287.html
 1.105  01-Feb-2012  para allocate uareas and buffers from kernel_map again
add code to drain pools if kmem_arena runs out of space
 1.104  27-Jan-2012  para extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
 1.103  12-Jun-2011  rmind branches: 1.103.2; 1.103.6;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.102  02-Feb-2011  chuck branches: 1.102.2;
update license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
 1.101  02-Jun-2010  pooka branches: 1.101.2; 1.101.4;
it's a wonderful static
 1.100  21-Oct-2009  rmind branches: 1.100.2; 1.100.4;
Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.99  18-Aug-2009  yamt whitespace fixes. no functional changes.
 1.98  10-Aug-2009  haad Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.
 1.97  13-Dec-2008  ad PR 40027/pagedaemon loops on memory shortage

uvmpd_scan_queue:

- Fix a bug that prevented the pagedaemon from making forward progress
if (a) swap was full (b) the first 16 pages on the inactive list were
unbusy anons not already backed by swap.

- Remove redundant uvm_swapisfull() check and just try to allocate a slot.
If it fails we know swap is full.
 1.96  03-Dec-2008  ad Make adjustment of uvm_extrapages atomic since it's done without a lock.
XXX This is still a hack.
 1.95  02-Dec-2008  ad uvmpd_tune: make the adjustments to individual variables atomic.
 1.94  14-Nov-2008  ad - If the system encounters a severe memory shortage, start unloading
unused kernel modules.
- Try to unload any autoloaded kernel modules 10 seconds after their
load was successful.
- Keep a counter to track module load/unload events.
 1.93  23-Sep-2008  ad branches: 1.93.2; 1.93.4;
- Make free target 0.5%, but limit to between 128k and 1024k.
- Scale free target by number of CPUs.
- Prefer paging to swapping.

Proposed on tech-kern.
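
The free-target rule in the first two bullets can be sketched as a small calculation. This works in bytes for clarity; `free_target_bytes()` is a hypothetical name, and multiplying by the CPU count after clamping is an assumption, since the log only says the target is scaled by the number of CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define FREE_TARGET_MIN (128u * 1024u)   /* 128k lower bound */
#define FREE_TARGET_MAX (1024u * 1024u)  /* 1024k upper bound */

/*
 * Compute a free target: 0.5% of total memory, clamped to
 * [128k, 1024k], then scaled by the number of CPUs (assumed to be
 * a plain multiplication in this sketch).
 */
uint64_t
free_target_bytes(uint64_t total_bytes, unsigned ncpu)
{
	uint64_t target;

	assert(ncpu > 0);

	target = total_bytes / 200;	/* 0.5% */
	if (target < FREE_TARGET_MIN)
		target = FREE_TARGET_MIN;
	if (target > FREE_TARGET_MAX)
		target = FREE_TARGET_MAX;
	return target * ncpu;
}
```

With these bounds, any machine with more than 200 MiB of memory hits the 1024k cap per CPU, so on large systems the CPU count is what drives the target.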
 1.92  29-Feb-2008  yamt branches: 1.92.4; 1.92.6; 1.92.10;
uvm_swap_io: if pagedaemon, don't wait for iobuf.
 1.91  07-Feb-2008  yamt branches: 1.91.2; 1.91.6;
swapcluster_flush: handle nused==0, which can happen if swapcluster_add failed.
PR/37669 from Andrew Doran.
 1.90  28-Jan-2008  yamt remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.
 1.89  02-Jan-2008  ad Merge vmlocking2 to head.
 1.88  07-Nov-2007  ad branches: 1.88.2; 1.88.6;
Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.87  21-Jul-2007  ad branches: 1.87.4; 1.87.6; 1.87.10; 1.87.12; 1.87.14;
Merge unobtrusive locking changes from the vmlocking branch.
 1.86  09-Jul-2007  ad branches: 1.86.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.85  15-Jun-2007  ad Add a sysctl to disable swapout of kernel stacks. Discussed on tech-kern@.
 1.84  22-Feb-2007  thorpej branches: 1.84.4; 1.84.6;
TRUE -> true, FALSE -> false
 1.83  21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.82  27-Dec-2006  alc branches: 1.82.2;
CID-4192: ensure we have `uobj != NULL' here

ok christos@ and yamt@
 1.81  21-Dec-2006  yamt merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.
 1.80  01-Nov-2006  yamt remove some __unused from function parameters.
 1.79  12-Oct-2006  yamt remove unnecessary #include of vnode.h.
 1.78  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.77  15-Sep-2006  yamt branches: 1.77.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy
 1.76  14-Feb-2006  yamt branches: 1.76.2; 1.76.14;
share some code between uvmpd_scan_inactive and uvmpd_scan.
 1.75  14-Feb-2006  yamt fix a compilation problem where PAGE_SHIFT is not a constant.
pointed by Chuck Silvers.
 1.74  13-Feb-2006  yamt remove an outdated comment.
 1.73  12-Feb-2006  yamt factor out swap clustering code.
 1.72  05-Jan-2006  yamt branches: 1.72.2; 1.72.4;
uvmpd_scan_inactive: when reactivating a page,
use pmap_is_referenced rather than pmap_clear_reference.
we don't need to clear the bit here as we'll do so when
moving pages back to inactive queue again. pointed by Chuck Silvers.
 1.71  21-Dec-2005  yamt branches: 1.71.2;
uvmpd_scan: when deactivating a page, clear its reference bit.
discussed on tech-kern@.
 1.70  21-Dec-2005  yamt make length of inactive queue tunable by sysctl. (vm.inactivepct)
 1.69  29-Nov-2005  yamt read-ahead statistics.
 1.68  13-Sep-2005  yamt branches: 1.68.6;
wrap swap related code by #ifdef VMSWAP. always #define VMSWAP for now.
 1.67  31-Jul-2005  yamt revert "defflag VMSWAP" changes for now.
there seems to be far more people who don't want to edit
their kernel config files than i thought.
 1.66  30-Jul-2005  yamt defflag VMSWAP.
 1.65  27-Jun-2005  thorpej branches: 1.65.2;
Use ANSI function decls.
 1.64  11-May-2005  yamt allocate anons on-demand, rather than reserving static amount of
them on boot/swapon.
 1.63  04-May-2005  yamt uvm_reclaimable: add an XXX comment.
 1.62  12-Apr-2005  yamt fix unreasonably frequent "killed: out of swap" on systems which have
little or no swap.
- even on a severe swap shortage, if we have some amount of file-backed pages,
don't bother to kill processes.
- if all pages in queue will be likely reactivated, just give up
page type balancing rather than spinning unnecessarily.
 1.61  30-Jan-2005  chs hack around a UVM problem that causes hangs when large processes fork.
see PR 26908 for details.
 1.60  03-Oct-2004  enami branches: 1.60.4; 1.60.6;
- Don't let pagedaemon sleep while draining buf.
- Estimate amount of memory to free at a time.
Address PR#27057 (and similar hangs I saw several months ago).
 1.59  24-Mar-2004  junyoung branches: 1.59.2;
Nuke __P().
 1.58  30-Jan-2004  tls Buffer cache fixes to avoid thrashing between high and low water marks
and uncontrolled growth.

The key fix is from Dan Carasone, who noticed that buf_canfree() was
counting in _bytes_ but freeing in _buffers_, which caused the instant
drop to lowater observed by some users.

We now control the rate of growth; the probability of getting a new
allocation is inversely proportional to the current size of the
cache. This idea is from a long-ago conversation with Kirk McKusick
and, if memory serves, was used for the file-system cache in some
other BSD variant at some point in history.

With growth and shrinkage more or less dealt with, we return the
default maximum cache size to 15%. The default _minimum_ cache size
is raised from 1/16 of the maximum cache size to 1/8, since 1/16 was
chosen when the maximum size was 30% of memory.

Finally, after observing the behaviour of the pagedaemon and the
buffer cache drainer under pathological workloads (e.g. a benchmark
that steps through 75% of available memory backwards) I have moved
the call to buf_drain() to the beginning of the pagedaemon from the
end; if the pagedaemon bogs down, it still won't get run as often
as it should, but at least this way it will see the state of the
free count and free target _before_ the scan step does its thing.
 1.57  04-Jan-2004  jdolecek Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread
 1.56  30-Dec-2003  pk Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
 1.55  26-Sep-2003  chs don't dereference a vm_page pointer after we free the page.
 1.54  01-Sep-2003  yamt remove an obsolete comment.
(we now have only one inactive list.)
 1.53  28-Aug-2003  pk When retiring a swap device with marked bad blocks on it we should update
the `# swap page in use' and `# swap page only' counters. However, at the
time of swap device removal we can no longer figure out how many of the
bad swap pages are actually also `swap only' pages.

So, on swap I/O errors arrange things to not include the bad swap pages in
the `swpgonly' counter as follows: uvm_swap_markbad() decrements `swpgonly'
by the number of bad pages, and the various VM object deallocation routines
do not decrement `swpgonly' for swap slots marked as SWSLOT_BAD.
 1.52  11-Aug-2003  pk Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.
 1.51  23-Apr-2003  tls branches: 1.51.2;
Correct use of MAXBSIZE where MAXPHYS was intended. This is a necessary
first step towards per-device MAXPHYS, and has the beneficial side effect
of allowing clustering to MAXPHYS even on systems that need to run with
a reduced MAXBSIZE to get more metadata buffers.
 1.50  25-Feb-2003  simonb Cast result of pgo_put() to (void) as is the style with other calls to
pgo_put() in UVM.

Pointed out by Andrew Brown.
 1.49  23-Feb-2003  simonb Remove assigned-to but not used variable.
 1.48  24-Nov-2002  scw Quell uninitialised variable warnings.
 1.47  20-Jun-2002  chs count aobj pages (most notably kernel stack pages) as anon pages
for memory usage-balancing purposes.
 1.46  05-May-2002  chs branches: 1.46.2; 1.46.4;
look in the right flags field for PQ_INACTIVE.
make uvmpd_scan_inactive() return void since its return value is ignored.
 1.45  21-Jan-2002  wiz branches: 1.45.4;
deamon -> daemon
 1.44  31-Dec-2001  chs fix locking for loaning. in general we should be looking at the page's
uobject and uanon pointers rather than at the PQ_ANON flag to determine
which lock to hold, since PQ_ANON can be clear even when the anon's lock
is the one which we should hold (if the page was loaned from an object
and then freed by the object).
 1.43  09-Dec-2001  chs add {anon,file,exec}max as a upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.
 1.42  10-Nov-2001  lukem add RCSIDs, and in some cases, slightly cleanup #include order
 1.41  06-Nov-2001  chs several changes prompted by loaning problems:
- fix the loaned case in uvm_pagefree().
- redo uvmexp.swpgonly accounting to work with page loaning.
add an assertion before each place we adjust uvmexp.swpgonly.
- fix uvm_km_pgremove() to always free any swap space associated with
the range being removed.
- get rid of UVM_LOAN_WIRED flag. instead, we just make sure that
pages loaned to the kernel are never on the page queues.
this allows us to assert that pages are not loaned and wired
at the same time.
- add yet more assertions.
 1.40  06-Nov-2001  simonb Remove some variables that are set but never used.
 1.39  30-Sep-2001  chs branches: 1.39.2;
skip the swap-out code if there's no swap space configured.
avoid some hangs in low-memory situations.
 1.38  26-Sep-2001  chs move call to pool_drain() outside the pageq lock.
 1.37  15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.36  27-Jun-2001  thorpej branches: 1.36.2; 1.36.4;
G/c a comment that no longer applies.
 1.35  23-Jun-2001  chs don't sleep for memory in uao_set_swslot() since we're holding spinlocks,
instead return -1. adjust callers to handle this new error return.
fixes PR 13194.
 1.34  25-May-2001  chs remove trailing whitespace.
 1.33  22-May-2001  ross Merge the swap-backed and object-backed inactive lists.
 1.32  07-May-2001  thorpej Fix a silly mistake I made when reworking the uvm inactive list
some time ago. The mistake was to check that the page was not
referenced since the last active scan before moving it to inactive.
Now we just clear reference and move it to inactive (which is where
the second clock hand sweep occurs).
 1.31  10-Mar-2001  chs eliminate the VM_PAGER_* error codes in favor of the traditional E* codes.
the mapping is:

VM_PAGER_OK 0
VM_PAGER_BAD <unused>
VM_PAGER_FAIL <unused>
VM_PAGER_PEND 0 (see below)
VM_PAGER_ERROR EIO
VM_PAGER_AGAIN EAGAIN
VM_PAGER_UNLOCK EBUSY
VM_PAGER_REFAULT ERESTART

for async i/o requests, it used to be possible for the request to
be converted to sync, and the pager would return VM_PAGER_OK or VM_PAGER_PEND
to indicate whether the caller should perform post-i/o cleanup.
this is no longer allowed; pagers must now return 0 to indicate that
the async i/o was successfully started, and the caller never needs to
worry about doing the post-i/o cleanup.
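
The mapping table above can be written as a lookup function. This is a sketch only: the enum values are hypothetical (the log does not give the historical numeric values), `PAGER_NO_MAPPING` is an invented sentinel for the two codes listed as unused, and the `ERESTART` fallback exists only so the example builds on systems that do not expose that errno to userland.

```c
#include <assert.h>
#include <errno.h>

#ifndef ERESTART
#define ERESTART (-1)	/* placeholder where the errno is kernel-only */
#endif

#define PAGER_NO_MAPPING (-2)	/* hypothetical: code had no E* equivalent */

/* Hypothetical numbering; only the names come from the log. */
enum vm_pager_code {
	VM_PAGER_OK, VM_PAGER_BAD, VM_PAGER_FAIL, VM_PAGER_PEND,
	VM_PAGER_ERROR, VM_PAGER_AGAIN, VM_PAGER_UNLOCK, VM_PAGER_REFAULT
};

/* Translate an old pager code to the E* code listed in the log. */
int
vm_pager_to_errno(enum vm_pager_code code)
{
	assert(code >= VM_PAGER_OK && code <= VM_PAGER_REFAULT);

	switch (code) {
	case VM_PAGER_OK:
	case VM_PAGER_PEND:	/* async i/o started successfully */
		return 0;
	case VM_PAGER_ERROR:
		return EIO;
	case VM_PAGER_AGAIN:
		return EAGAIN;
	case VM_PAGER_UNLOCK:
		return EBUSY;
	case VM_PAGER_REFAULT:
		return ERESTART;
	case VM_PAGER_BAD:
	case VM_PAGER_FAIL:
		break;		/* listed as <unused> in the log */
	}
	return PAGER_NO_MAPPING;
}
```

Note that VM_PAGER_OK and VM_PAGER_PEND both collapse to 0, which matches the statement that callers no longer distinguish the two cases.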
 1.30  09-Mar-2001  chs add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
 1.29  28-Jan-2001  thorpej branches: 1.29.2;
Page scanner improvements, behavior is actually a bit more like
Mach VM's now. Specific changes:
- Pages now need not have all of their mappings removed before being
put on the inactive list. They only need to have the "referenced"
attribute cleared. This makes putting pages onto the inactive list
much more efficient. In order to eliminate redundant clearings of
"referenced", callers of uvm_pagedeactivate() must now do this
themselves.
- When checking the "modified" attribute for a page (for clearing
PG_CLEAN), make sure to only do it if PG_CLEAN is currently set on
the page (saves a potentially expensive pmap operation).
- When scanning the inactive list, if a page is referenced, reactivate
it (this part was actually added in uvm_pdaemon.c,v 1.27). This
works properly now that pages on the inactive list are allowed to
have mappings.
- When scanning the inactive list and considering a page for freeing,
remove all mappings, and then check the "modified" attribute if the
page is marked PG_CLEAN.
- When scanning the active list, if the page was referenced since its
last sweep by the scanner, don't deactivate it. (This part was
actually added in uvm_pdaemon.c,v 1.28.)

These changes greatly improve interactive performance during
moderate to high memory and I/O load.
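The inactive-list rules listed above can be sketched roughly as follows. This
is a simplified model, not the kernel code: struct page_state, enum
scan_action and scan_inactive_page() are hypothetical, the boolean fields
stand in for pmap attribute queries, and the final "free if clean, otherwise
page out" step is an assumption added to complete the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum scan_action { REACTIVATE, FREE_PAGE, PAGE_OUT };

/* Hypothetical, simplified view of a page on the inactive list. */
struct page_state {
	bool referenced;	/* pmap "referenced" attribute */
	bool clean;		/* stands in for PG_CLEAN */
	bool modified;		/* pmap "modified" attribute */
};

static enum scan_action
scan_inactive_page(struct page_state *pg)
{
	assert(pg != NULL);

	/* A referenced page goes back to the active list. */
	if (pg->referenced)
		return REACTIVATE;

	/* All mappings would be removed at this point. */

	/* Only consult "modified" when the page is still marked clean;
	 * this skips a potentially expensive pmap operation otherwise. */
	if (pg->clean && pg->modified)
		pg->clean = false;

	/* Assumption: clean pages can be freed, dirty ones need pageout. */
	return pg->clean ? FREE_PAGE : PAGE_OUT;
}
```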
 1.28  25-Jan-2001  thorpej When considering a page for deactivation, check to see if the
page has been referenced since the last time it was considered.
If it was, don't deactivate the page.
 1.27  25-Jan-2001  mycroft Put back the pmap_is_referenced() check from the original UVM code in the
inactive list scans. Without this, the referenced bit was essentially ignored.
 1.26  13-Dec-2000  chs continue processing the inactive queue past the free target when
we're enforcing the limit on the number of vnode pages.
 1.25  30-Nov-2000  simonb Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.
 1.24  27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.23  20-Aug-2000  bjh21 Ensure that uvmexp.freemin is above the kernel reserved-page count.

When it wasn't (which could happen on a 4MB machine with 32KB pages),
uvm_pagealloc_strat could refuse to allocate user memory, while the pagedaemon
didn't think it was worth freeing any more, resulting in the system seizing up.
 1.22  12-Aug-2000  thorpej Don't bother with a trampoline to start the pagedaemon and
reaper threads.
 1.21  27-Jun-2000  mrg remove include of <vm/vm.h>
 1.20  26-Jun-2000  mrg remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.
 1.19  04-Nov-1999  thorpej Const poison uvm_wait().
 1.18  12-Sep-1999  chs branches: 1.18.2; 1.18.4; 1.18.8;
eliminate the PMAP_NEW option by making it required for all ports.
ports which previously had no support for PMAP_NEW now implement
the pmap_k* interfaces as wrappers around the non-k versions.
 1.17  22-Jul-1999  thorpej Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.
 1.16  24-May-1999  thorpej - Change uvm_{lock,unlock}_fpageq() to return/take the previous interrupt
level directly, instead of making the caller wrap the calls in
splimp()/splx().
- Add a comment documenting that interrupts that cause memory allocation
must be blocked while the free page queue is locked.

Since interrupts must be blocked while this lock is asserted, tying them
together like this helps to prevent mistakes.
 1.15  30-Mar-1999  mycroft branches: 1.15.4;
Adjust a comparison so that the pagedaemon doesn't get stuck ping-ponging with
a process trying to allocate memory.
 1.14  26-Mar-1999  chs add uvmexp.swpgonly and use it to detect out-of-swap conditions.

numerous pagedaemon improvements were needed to make this useful:
- don't bother waking up procs waiting for memory if there's none to be had.
- start 4 times as many pageouts as we need free pages.
this should reduce latency in low-memory situations.
- in inactive scanning, if we find dirty swap-backed pages when swap space
is full of non-resident pages, reactivate some number of these to flush
less active pages to the inactive queue so we can consider paging them out.
this replaces the previous scheme of inactivating pages beyond the
inactive target when we failed to free anything during inactive scanning.
- during both active and inactive scanning, free any swap resources from
dirty swap-backed pages if swap space is full. this allows other pages
to be paged out into that swap space.
 1.13  25-Mar-1999  mrg remove now >1 year old pre-release message.
 1.12  04-Nov-1998  chs branches: 1.12.2;
remove outdated comment.
 1.11  18-Oct-1998  chs shift by PAGE_SHIFT instead of multiplying or dividing by PAGE_SIZE.
 1.10  13-Aug-1998  eeh Merge paddr_t changes into the main branch.
 1.9  23-Jul-1998  pk branches: 1.9.2;
Include pool_drain() in page scans.
 1.8  09-Mar-1998  mrg KNF.
 1.7  10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.6  09-Feb-1998  mrg keep statistics on pageout/pagein, total pages, and total operations.
 1.5  07-Feb-1998  mrg implement counters for pages paged in/out
 1.4  07-Feb-1998  mrg restore rcsids
 1.3  07-Feb-1998  chs keep track of how many pages are currently being paged out,
stop initiating new pageouts when "(free + paging) > freetarg".
fix pageq locking.
 1.2  06-Feb-1998  thorpej RCS ID police.
 1.1  05-Feb-1998  mrg branches: 1.1.1;
Initial revision
 1.1.1.1  05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the UVM kernel code portion.


this will be KNF'd shortly. :-)
 1.9.2.1  30-Jul-1998  eeh Split vm_offset_t and vm_size_t into paddr_t, psize_t, vaddr_t, and vsize_t.
 1.12.2.5  30-May-1999  chs when processing aiodones, only adjust uvmexp.paging if the aio
was flagged as being started by the pagedaemon.
 1.12.2.4  29-Apr-1999  chs remove a mistaken simple_unlock().
 1.12.2.3  09-Apr-1999  chs split aiodone handling out from the pagedaemon into its own thread,
the "aiodone daemon". the aiodone daemon never allocates memory,
so the pagedaemon will be able to safely block waiting for memory
as long as there are some pageouts in progress. the paging queue
scheme needs to change before this is done tho.
 1.12.2.2  25-Feb-1999  chs treat pages being paged out as "free" when determining whether to
scan the page queues.
in uvmpd_scan_inactive(), keep initiating pageouts until we'll have
4 times the number of pages clean as we want free.
(this fudge factor may need adjustment).
move adjustment of uvmexp.paging to uvm_pager_put(), which is a mistake.
I think the issue was that uvm_pager_put() might fail the pageout
and retry internally with just one page, so the pagedaemon has no way
to tell how many pages are actually being cleaned. this needs more thought.
in uvmpd_scan(), put back the business where we deactivate pages
beyond uvmexp.inactarg (there's a big comment explaining this).
rename some variables for clarity.
use TAILQ_* macros instead of poking the structs directly.
 1.12.2.1  09-Nov-1998  chs initial snapshot. lots left to do.
 1.15.4.4  31-Jul-1999  chs have the aiodone daemon wakeup the pagedaemon if there are still not
enough free pages after processing everything.
 1.15.4.3  04-Jul-1999  chs update for uvm.aio_done being struct buf instead of struct uvm_aiodesc.
pull in a fix from -current.
 1.15.4.2  21-Jun-1999  thorpej Sync w/ -current.
 1.15.4.1  07-Jun-1999  chs merge everything from chs-ubc branch.
 1.18.8.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.18.4.1  15-Nov-1999  fvdl Sync with -current
 1.18.2.5  12-Mar-2001  bouyer Sync with HEAD.
 1.18.2.4  11-Feb-2001  bouyer Sync with HEAD.
 1.18.2.3  05-Jan-2001  bouyer Sync with HEAD
 1.18.2.2  08-Dec-2000  bouyer Sync with HEAD.
 1.18.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.29.2.14  11-Dec-2002  thorpej Sync with HEAD.
 1.29.2.13  01-Aug-2002  nathanw Catch up to -current.
 1.29.2.12  16-Jul-2002  nathanw pagedaemon_proc really should be a proc, not a LWP.
 1.29.2.11  24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.29.2.10  20-Jun-2002  nathanw Catch up to -current.
 1.29.2.9  28-Feb-2002  nathanw Catch up to -current.
 1.29.2.8  08-Jan-2002  nathanw Catch up to -current.
 1.29.2.7  14-Nov-2001  nathanw Catch up to -current.
 1.29.2.6  08-Oct-2001  nathanw Catch up to -current.
 1.29.2.5  26-Sep-2001  nathanw Catch up to -current.
Again.
 1.29.2.4  21-Sep-2001  nathanw Catch up to -current.
 1.29.2.3  24-Aug-2001  nathanw Catch up with -current.
 1.29.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.29.2.1  09-Apr-2001  nathanw Catch up with -current.
 1.36.4.1  01-Oct-2001  fvdl Catch up with -current.
 1.36.2.4  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.36.2.3  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.36.2.2  11-Feb-2002  jdolecek Sync w/ -current.
 1.36.2.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.39.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.45.4.2  12-Mar-2002  thorpej Convert the fpageqlock to a spin mutex at IPL_VM and rename it
to fpageq_mutex.
 1.45.4.1  11-Mar-2002  thorpej Convert swap_syscall_lock and uvm.swap_data_lock to adaptive mutexes,
and rename them appropriately.
 1.46.4.2  26-Aug-2003  tron Pull up revision 1.51 (requested by tls in ticket #1434):
Correct use of MAXBSIZE where MAXPHYS was intended. This is a necessary
first step towards per-device MAXPHYS, and has the beneficial side effect
of allowing clustering to MAXPHYS even on systems that need to run with
a reduced MAXBSIZE to get more metadata buffers.
 1.46.4.1  21-Jun-2002  lukem Pull up revision 1.47 (requested by chs in ticket #329):
count aobj pages (most notably kernel stack pages) as anon pages
for memory usage-balancing purposes.
 1.46.2.1  15-Jul-2002  gehenna catch up with -current.
 1.51.2.7  11-Dec-2005  christos Sync with head.
 1.51.2.6  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.51.2.5  04-Feb-2005  skrll Sync with HEAD.
 1.51.2.4  19-Oct-2004  skrll Sync with HEAD
 1.51.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.51.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.51.2.1  03-Aug-2004  skrll Sync with HEAD
 1.59.2.2  16-Mar-2005  tron Pull up revision 1.61 (requested by chs in ticket #1137):
hack around a UVM problem that causes hangs when large processes fork.
see PR 26908 for details.
 1.59.2.1  08-Oct-2004  jmc branches: 1.59.2.1.2;
Pullup rev 1.60 (requested by simonb in ticket #908)

- Don't let pagedaemon sleep while draining buf.
- Estimate amount of memory to free at a time.
- Factor out code to set watermark and ensure high > low.
- Make the step of allocation possibility a bit seamless by moving the origin
of curve from 0 to lowater mark.
Improves interactive performance when there is heavy disk activity.
PR#27057
 1.59.2.1.2.1  16-Mar-2005  tron Pull up revision 1.61 (requested by chs in ticket #1137):
hack around a UVM problem that causes hangs when large processes fork.
see PR 26908 for details.
 1.60.6.1  12-Feb-2005  yamt sync with head.
 1.60.4.1  29-Apr-2005  kent sync with -current
 1.65.2.9  17-Mar-2008  yamt sync with head.
 1.65.2.8  11-Feb-2008  yamt sync with head.
 1.65.2.7  04-Feb-2008  yamt sync with head.
 1.65.2.6  21-Jan-2008  yamt sync with head
 1.65.2.5  15-Nov-2007  yamt sync with head.
 1.65.2.4  03-Sep-2007  yamt sync with head.
 1.65.2.3  26-Feb-2007  yamt sync with head.
 1.65.2.2  30-Dec-2006  yamt sync with head.
 1.65.2.1  21-Jun-2006  yamt sync with head.
 1.68.6.1  29-Nov-2005  yamt sync with head.
 1.71.2.2  18-Feb-2006  yamt sync with head.
 1.71.2.1  15-Jan-2006  yamt sync with head.
 1.72.4.1  22-Apr-2006  simonb Sync with head.
 1.72.2.1  09-Sep-2006  rpaulo sync with head
 1.76.14.2  12-Jan-2007  ad Sync with head.
 1.76.14.1  18-Nov-2006  ad Sync with head.
 1.76.2.3  15-Sep-2006  yamt make UVM_KICK_PDAEMON() a real function and stop including
uvm_pdpolicy.h from uvm.h. this also fixes build of pmap(1).
 1.76.2.2  12-Mar-2006  yamt - change the way to account read-ahead stats.
- fix UVM_PQFLAGBITS.
 1.76.2.1  05-Mar-2006  yamt separate page replacement policy from the rest of kernel.
 1.77.2.3  10-Dec-2006  yamt sync with head.
 1.77.2.2  22-Oct-2006  yamt use workqueue for aiodoned.
 1.77.2.1  22-Oct-2006  yamt sync with head
 1.82.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.84.6.1  11-Jul-2007  mjf Sync with head.
 1.84.4.11  01-Nov-2007  ad Yielding to avoid livelock doesn't work well, so just sleep for 1 tick.
This too is inadequate and a better solution must be found. Discussed
with yamt@.
 1.84.4.10  27-Oct-2007  yamt uvmpd_scan_queue: avoid too long busy-loops.
 1.84.4.9  26-Oct-2007  ad - Use a cross call to drain the per-CPU component of pool caches.
- When draining, skip over pools that are completely inactive.
 1.84.4.8  27-Aug-2007  yamt fix an uninitialized variable.
 1.84.4.7  24-Aug-2007  ad Sync with buffer cache locking changes. See buf.h/vfs_bio.c for details.
Some minor portions are incomplete and need to be verified as a whole.
 1.84.4.6  22-Aug-2007  yamt update a comment.
 1.84.4.5  21-Aug-2007  yamt fix some races around pagedaemon and uvm_wait. ok'ed by Andrew Doran.
 1.84.4.4  20-Aug-2007  ad Sync with HEAD.
 1.84.4.3  15-Jul-2007  ad Sync with head.
 1.84.4.2  09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.84.4.1  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.86.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.87.14.2  21-Jul-2007  ad Merge unobtrusive locking changes from the vmlocking branch.
 1.87.14.1  21-Jul-2007  ad file uvm_pdaemon.c was added on branch matt-mips64 on 2007-07-21 19:21:56 +0000
 1.87.12.2  18-Feb-2008  mjf Sync with HEAD.
 1.87.12.1  19-Nov-2007  mjf Sync with HEAD.
 1.87.10.1  13-Nov-2007  bouyer Sync with HEAD
 1.87.6.3  23-Mar-2008  matt sync with HEAD
 1.87.6.2  09-Jan-2008  matt sync with HEAD
 1.87.6.1  08-Nov-2007  matt sync with -HEAD
 1.87.4.1  11-Nov-2007  joerg Sync with HEAD.
 1.88.6.1  02-Jan-2008  bouyer Sync with HEAD
 1.88.2.2  04-Dec-2007  ad Fix merge botch.
 1.88.2.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.91.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.91.6.2  28-Sep-2008  mjf Sync with HEAD.
 1.91.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.91.2.1  24-Mar-2008  keiichi sync with head.
 1.92.10.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.92.10.1  19-Oct-2008  haad Sync with HEAD.
 1.92.6.1  10-Oct-2008  skrll Sync with HEAD.
 1.92.4.4  11-Aug-2010  yamt sync with head.
 1.92.4.3  11-Mar-2010  yamt sync with head
 1.92.4.2  19-Aug-2009  yamt sync with head.
 1.92.4.1  04-May-2009  yamt sync with head.
 1.93.4.2  02-Feb-2009  snj branches: 1.93.4.2.4;
Apply patch (requested by ad in ticket #357):
Make adjustment of some critical variables atomic.
 1.93.4.1  27-Dec-2008  snj Pull up following revision(s) (requested by bouyer in ticket #211):
sys/uvm/uvm_pdaemon.c: revision 1.97
PR 40027/pagedaemon loops on memory shortage
uvmpd_scan_queue:
- Fix a bug that prevented the pagedaemon from making forward progress
if (a) swap was full (b) the first 16 pages on the inactive list were
unbusy anons not already backed by swap.
- Remove redundant uvm_swapisfull() check and just try to allocate a slot.
If it fails we know swap is full.
 1.93.4.2.4.15  07-May-2012  matt Fix free wakeup
 1.93.4.2.4.14  27-Apr-2012  matt Don't decrement pgrp_active in radioactive page dequeue since we don't
increment it when activating a radioactive page.
 1.93.4.2.4.13  17-Apr-2012  matt Don't kick off the page daemon if it's not going to be able to do anything.
 1.93.4.2.4.12  14-Apr-2012  matt If the pagedaemon is stalling, don't wake it. Unless pages were freed for
a group, don't wake things up if paging is 0 (stop spurious wakeups).
 1.93.4.2.4.11  13-Apr-2012  matt Make sure color passed to uvm_reclaimable is valid.
 1.93.4.2.4.10  12-Apr-2012  matt If the pagedaemon is woken, processes the queues, and makes no
progress (frees no pages), wait 2 seconds instead of immediately trying again.
 1.93.4.2.4.9  12-Apr-2012  matt Separate object-less anon pages out of the active list if there is no swap
device. Make uvm_reclaimable and uvm.*estimatable understand colors and
kmem allocations.
 1.93.4.2.4.8  29-Feb-2012  matt Improve UVM_PAGE_TRKOWN.
Add more asserts to uvm_page.
 1.93.4.2.4.7  17-Feb-2012  matt Change way waiters are handled.
 1.93.4.2.4.6  16-Feb-2012  matt Track the victims selected by the pagedaemon and what happens to them.
Keep a hint for what page group has the most free pages for a given color.
 1.93.4.2.4.5  14-Feb-2012  matt Add more KASSERTs (more! more! more!).
When returning pages to the free pool, make sure to dequeue them beforehand,
or free page queue corruption will happen.
 1.93.4.2.4.4  13-Feb-2012  matt Use separate pending and paging tailq entries.
Add a queue check routine to validate the queues aren't corrupt.
 1.93.4.2.4.3  09-Feb-2012  matt Major changes to uvm.
Support multiple collections (groups) of free pages and run the page
reclamation algorithm on each group independently.
 1.93.4.2.4.2  03-Jun-2011  matt Restore $NetBSD$
 1.93.4.2.4.1  03-Jun-2011  matt Rework page free lists to be sorted by color first rather than free_list.
Kept per color PGFL_* counter in each page free list.
Minor cleanups.
 1.93.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.100.4.4  05-Mar-2011  rmind sync with head
 1.100.4.3  03-Jul-2010  rmind sync with head
 1.100.4.2  17-Mar-2010  rmind Reorganise UVM locking to protect P->V state and serialise pmap(9)
operations on the same page(s) by always locking their owner. Hence
lock order: "vmpage"-lock -> pmap-lock.

Patch, proposed on tech-kern@, from Andrew Doran.
 1.100.4.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.100.2.1  17-Aug-2010  uebayasi Sync with HEAD.
 1.101.4.1  08-Feb-2011  bouyer Sync with HEAD
 1.101.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.102.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.103.6.1  18-Feb-2012  mrg merge to -current.
 1.103.2.6  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.103.2.5  30-Oct-2012  yamt sync with head
 1.103.2.4  17-Apr-2012  yamt sync with head
 1.103.2.3  26-Dec-2011  yamt - use O->A loan to serve read(2). based on a patch from Chuck Silvers
- associated O->A loan fixes.
 1.103.2.2  18-Nov-2011  yamt - use mutex obj for pageable object
- add a function to wait for a mutex obj being available
- replace some "livelock" kpauses with it
 1.103.2.1  02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.107.4.1  18-May-2014  rmind sync with head
 1.107.2.2  03-Dec-2017  jdolecek update from HEAD
 1.107.2.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.108.22.2  22-Apr-2019  martin Pull up following revision(s) (requested by chs in ticket #1238):

sys/uvm/uvm_pdaemon.c: revision 1.110

Draining pools from the pagedaemon thread can deadlock, because draining
a pool can involve taking a lock which can be held by a thread which is
blocked waiting for memory. Avoid this by moving the pool-draining work
to a separate worker thread.
 1.108.22.1  02-Nov-2017  snj Pull up following revision(s) (requested by pgoyette in ticket #335):
share/man/man9/kernhist.9: 1.5-1.8
sys/arch/acorn26/acorn26/pmap.c: 1.39
sys/arch/arm/arm32/fault.c: 1.105 via patch
sys/arch/arm/arm32/pmap.c: 1.350, 1.359
sys/arch/arm/broadcom/bcm2835_bsc.c: 1.7
sys/arch/arm/omap/if_cpsw.c: 1.20
sys/arch/arm/omap/tiotg.c: 1.7
sys/arch/evbarm/conf/RPI2_INSTALL: 1.3
sys/dev/ic/sl811hs.c: 1.98
sys/dev/usb/ehci.c: 1.256
sys/dev/usb/if_axe.c: 1.83
sys/dev/usb/motg.c: 1.18
sys/dev/usb/ohci.c: 1.274
sys/dev/usb/ucom.c: 1.119
sys/dev/usb/uhci.c: 1.277
sys/dev/usb/uhub.c: 1.137
sys/dev/usb/umass.c: 1.160-1.162
sys/dev/usb/umass_quirks.c: 1.100
sys/dev/usb/umass_scsipi.c: 1.55
sys/dev/usb/usb.c: 1.168
sys/dev/usb/usb_mem.c: 1.70
sys/dev/usb/usb_subr.c: 1.221
sys/dev/usb/usbdi.c: 1.175
sys/dev/usb/usbdi_util.c: 1.67-1.70
sys/dev/usb/usbroothub.c: 1.3
sys/dev/usb/xhci.c: 1.75
sys/external/bsd/drm2/dist/drm/i915/i915_gem.c: 1.34
sys/kern/kern_history.c: 1.15
sys/kern/kern_xxx.c: 1.74
sys/kern/vfs_bio.c: 1.275-1.276
sys/miscfs/genfs/genfs_io.c: 1.71
sys/sys/kernhist.h: 1.21
sys/ufs/ffs/ffs_balloc.c: 1.63
sys/ufs/lfs/lfs_vfsops.c: 1.361
sys/ufs/lfs/ulfs_inode.c: 1.21
sys/ufs/lfs/ulfs_vnops.c: 1.52
sys/ufs/ufs/ufs_inode.c: 1.102
sys/ufs/ufs/ufs_vnops.c: 1.239
sys/uvm/pmap/pmap.c: 1.37-1.39
sys/uvm/pmap/pmap_tlb.c: 1.22
sys/uvm/uvm_amap.c: 1.108
sys/uvm/uvm_anon.c: 1.64
sys/uvm/uvm_aobj.c: 1.126
sys/uvm/uvm_bio.c: 1.91
sys/uvm/uvm_device.c: 1.66
sys/uvm/uvm_fault.c: 1.201
sys/uvm/uvm_km.c: 1.144
sys/uvm/uvm_loan.c: 1.85
sys/uvm/uvm_map.c: 1.353
sys/uvm/uvm_page.c: 1.194
sys/uvm/uvm_pager.c: 1.111
sys/uvm/uvm_pdaemon.c: 1.109
sys/uvm/uvm_swap.c: 1.175
sys/uvm/uvm_vnode.c: 1.103
usr.bin/vmstat/vmstat.c: 1.219
Reorder to test for null before null deref in debug code
--
Reorder to test for null before null deref in debug code
--
KNF
--
No need for '\n' in UVMHIST_LOG
--
normalise a BIOHIST log message
--
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...
(As proposed on tech-kern@ with additional changes and enhancements.)
Details of changes:
* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)
* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.
* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.
* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."
* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.
* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).
* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).
* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.
* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.
[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".
[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
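The format-string conventions described above can be illustrated with plain
snprintf(3) in userland. This is only a sketch: format_hist() is a
hypothetical helper, and snprintf() stands in for the kernhist(9) logging
macros, whose exact interface is not shown here. The point is the 'j' length
modifier, "%#jx" in place of "%p", and the double cast for pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a pointer and a counter with every argument
 * widened to uintmax_t, so each conversion uses the 'j' length modifier. */
static int
format_hist(char *buf, size_t len, const void *ptr, unsigned count)
{
	assert(buf != NULL);
	/* The pointer is cast to uintptr_t first, then to uintmax_t, which
	 * avoids warnings about casting a pointer to an integer of a
	 * different size. */
	return snprintf(buf, len, "ptr=%#jx count=%ju",
	    (uintmax_t)(uintptr_t)ptr, (uintmax_t)count);
}
```

Because every argument has the same known width, the formatter never has to
guess how many bytes a given argument occupies.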
--
For some reason this single kernel seems to have outgrown its declared
size as a result of the kernhist(9) changes. Bump the size.
XXX The amount of increase may be excessive - anyone with more detailed
XXX knowledge please feel free to further adjust the value appropriately.
--
Missed one cast of pointer --> uintptr_t in previous kernhist(9) commit
--
And yet another one. :(
--
Use correct mark-up for NetBSD version.
--
More improvements in grammar and readability.
--
Remove a stray '"' (obvious typo) and add a couple of casts that are
probably needed.
--
And replace an instance of "%p" conversion with "%#jx"
--
Whitespace fix. Give Bl tag table a width. Fix Xr.
 1.109.4.3  21-Apr-2020  martin Sync with HEAD
 1.109.4.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.109.4.1  10-Jun-2019  christos Sync with HEAD
 1.122.2.2  29-Feb-2020  ad Sync with head.
 1.122.2.1  17-Jan-2020  ad Sync with head.
 1.125.4.1  20-Apr-2020  bouyer Sync with HEAD
 1.130.2.1  14-Dec-2020  thorpej Sync w/ HEAD.
 1.131.2.1  17-Apr-2021  thorpej Sync with HEAD.
 1.133.16.1  02-Oct-2023  martin Pull up following revision(s) (requested by ad in ticket #379):

sys/uvm/uvm_pdaemon.c: revision 1.134

uvmpd_trylockowner(): release pg->interlock before calling rw_obj_free()
since it can call back into the VM system.
