History log of /src/sys/uvm/uvm_km.c
Revision  Date  Author  Comments
 1.166  07-Dec-2024  chs kmem: improve behavior when using all of physical memory as kmem

On systems where kmem does not need to be limited by kernel virtual
space (essentially 64-bit platforms), we currently try to size the
"kmem" space to be big enough for all of physical memory to be
allocated as kmem, which really means that we will always run short of
physical memory before we run out of kernel virtual space. However
this does not take into account that uvm_km_va_starved_p() starts
reporting that we are low on kmem virtual space when we have used 90%
of it, in an attempt to avoid kmem space becoming too fragmented,
which means on large memory systems we will still start reacting to
being short of virtual space when there is plenty of physical memory
still available. Fix this by overallocating the kmem space by a
factor of 10/9 so that we always run low on physical memory first,
as we want.
 1.165  09-Apr-2023  riastradh uvm(9): KASSERT(A && B) -> KASSERT(A); KASSERT(B)
 1.164  26-Feb-2023  skrll nkmempages should be size_t
 1.163  12-Feb-2023  andvar s/strucure/structure/ and s/structues/structures/ in comments.
 1.162  06-Aug-2022  chs branches: 1.162.4;
allow KMSAN to work again by restoring the limiting of kva even with
NKMEMPAGES_MAX_UNLIMITED. we used to limit kva to 1/8 of physmem
but limiting to 1/4 should be enough, and 1/4 still gives the kernel
enough kva to map all of the RAM that KMSAN has not stolen.

Reported-by: syzbot+ca3710b4c40cdd61aa72@syzkaller.appspotmail.com
 1.161  03-Aug-2022  chs for platforms which define NKMEMPAGES_MAX_UNLIMITED, set nkmempages
high enough to allow the kernel to map all of RAM into kmem,
so that free physical pages rather than kernel virtual space is
the limiting factor in allocating kernel memory. this gives ZFS
more flexibility in tuning how much memory to use for its ARC cache.
 1.160  13-Mar-2021  skrll Consistently use %#jx instead of 0x%jx or just %jx in UVMHIST_LOG formats
 1.159  09-Jul-2020  skrll branches: 1.159.2;
Consistently use UVMHIST(__func__)

Convert UVMHIST_{CALLED,LOG} into UVMHIST_CALLARGS
 1.158  08-Jul-2020  skrll Trailing whitespace
 1.157  14-Mar-2020  ad Make page waits (WANTED vs BUSY) interlocked by pg->interlock. Gets RW
locks out of the equation for sleep/wakeup, and allows observing+waiting
for busy pages when holding only a read lock. Proposed on tech-kern.
 1.156  24-Feb-2020  rin 0x%#x --> %#x for non-external codes.
Also, stop mixing up 0x%x and %#x in single files as far as possible.
 1.155  23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.154  08-Feb-2020  maxv Retire KLEAK.

KLEAK was a nice feature and served its purpose; it allowed us to detect
dozens of info leaks on the kernel->userland boundary, and thanks to it we
tackled a good part of the infoleak problem 1.5 years ago.

Nowadays however, we have kMSan, which can detect uninitialized memory in
the kernel. kMSan supersedes KLEAK: it can detect what KLEAK was able to
detect, but in addition, (1) it operates in all of the kernel and not just
the kernel->userland boundary, (2) it requires no user interaction, and (3)
it is deterministic and not statistical.

That makes kMSan the feature of choice to detect info leaks nowadays;
people interested in detecting info leaks should boot a kMSan kernel and
just wait for the magic to happen.

KLEAK was a good ride, and a fun project, but now is time for it to go.

Discussed with several people, including Thomas Barabosch.
 1.153  20-Jan-2020  skrll Another #define protection.

PMAP_ALLOC_POOLPAGE expects PMAP_{,UN}MAP_POOLPAGE to be defined
 1.152  14-Dec-2019  ad branches: 1.152.2;
Merge from yamt-pagecache: use radixtree for page lookup.

rbtree page lookup was introduced during the NetBSD 5.0 development cycle to
bypass lock contention problems with the (then) global page hash, and was a
temporary solution to allow us to make progress. radixtree is the intended
replacement.

Ok yamt@.
 1.151  13-Dec-2019  ad Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.150  01-Dec-2019  uwe Add missing #include <sys/atomic.h>
 1.149  01-Dec-2019  ad Minor correction to previous.
 1.148  01-Dec-2019  ad - Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static.
- A bit more __cacheline_aligned on mutexes.
 1.147  14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is significant, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
 1.146  02-Dec-2018  maxv Introduce KLEAK, a new feature that can detect kernel information leaks.

It works by tainting memory sources with marker values, letting the data
travel through the kernel, and scanning the kernel<->user frontier for
these marker values. Combined with compiler instrumentation and rotation
of the markers, it is able to yield relevant results with little effort.

We taint the pools and the stack, and scan copyout/copyoutstr. KLEAK is
supported on amd64 only for now, but it is not complicated to add more
architectures (just a matter of having the address of .text, and a stack
unwinder).

A userland tool is provided that allows executing a command in rounds
and monitor the leaks generated all the while.

KLEAK already detected directly 12 kernel info leaks, and prompted changes
that in total fixed 25+ leaks.

Based on an idea developed jointly with Thomas Barabosch (of Fraunhofer
FKIE).
 1.145  04-Nov-2018  mlelstv PMAP_MAP_POOLPAGE must not fail. Trigger assertion here instead of
panic later from failing PR_WAITOK memory allocations.
 1.144  28-Oct-2017  pgoyette branches: 1.144.2; 1.144.4;
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
 1.143  01-Jun-2017  chs branches: 1.143.2;
remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
 1.142  19-Mar-2017  riastradh __diagused police
 1.141  27-Jul-2016  maxv branches: 1.141.2;
Use UVM_PROT_ALL only if UVM_KMF_EXEC is given as argument. Otherwise, if
UVM_KMF_PAGEABLE is also given as argument, only the VA is allocated and
UVM waits for the page to fault before kentering it. When kentering it, it
will use the UVM_PROT_ flag that was passed to uvm_map; which means that it
will kenter it as RWX.

With this change, the number of RWX pages in the amd64 kernel reaches
strictly zero.
 1.140  20-Jul-2016  maxv Introduce uvm_km_protect.
 1.139  06-Feb-2015  maxv branches: 1.139.2;
Kill kmeminit().
 1.138  29-Jan-2013  para branches: 1.138.12; 1.138.14;
bring file up to date for previous vmem changes.
 1.137  26-Jan-2013  para revert previous commit not yet fully functional, sorry
 1.136  26-Jan-2013  para make vmem(9) ready to be used early during bootstrap to replace extent(9).
pass memory for vmem structs into the initialization functions and
do away with the static pools for this.
factor out the vmem internal structures into a private header.
remove special bootstrapping of the kmem_va_arena as all necessary memory
comes from pool_allocator_meta which is fully operational at this point.
 1.135  07-Sep-2012  para branches: 1.135.2;
call pmap_growkernel once after the kmem_arena is created
to make the pmap cover its address space
assert on the growth in uvm_km_kmem_alloc

for the 3rd uvm_map_entry uvm_map_prepare will grow the kernel,
but we might call into uvm_km_kmem_alloc through imports to
the kmem_meta_arena earlier

while here guard uvm_km_va_starved_p from kmem_arena not yet created

thanks for tracking this down to everyone involved
 1.134  04-Sep-2012  matt Remove locking since it isn't needed. As soon as the 2nd uvm_map_entry in kernel_map
is created, uvm_map_prepare will call pmap_growkernel and the pmap_growkernel call in
uvm_km_mem_alloc will never be called again.
 1.133  03-Sep-2012  matt Switch to a spin lock (uvm_kentry_lock) which, fortunately, was sitting there
unused.
 1.132  03-Sep-2012  matt Cleanup comment. Change panic to KASSERTMSG.
Use kernel_map->misc_lock to make sure we don't call pmap_growkernel
concurrently and possibly mess up uvm_maxkaddr.
 1.131  03-Sep-2012  matt Shut up gcc printf warning.
 1.130  03-Sep-2012  matt Don't try grow the entire kmem space but just do as needed in uvm_km_kmem_alloc
 1.129  03-Sep-2012  matt Fix a bug where the kernel was never grown to accomodate the kmem VA space
since that happens before the kernel_map is set.
 1.128  09-Jul-2012  matt Convert a KASSERT to a KASSERTMSG. Expand one KASSERTSG a little bit.
 1.127  03-Jun-2012  rmind Improve the wording slightly.
 1.126  02-Jun-2012  para add some description about the vmem arenas, how they stack up and their purpose
 1.125  13-Apr-2012  yamt uvm_km_kmem_alloc: don't hardcode kmem_va_arena
 1.124  12-Mar-2012  bouyer uvm_km_pgremove_intrsafe(): properly compute the size to pmap_kremove()
(do not truncate it to the first __PGRM_BATCH pages per batch): if we were
given a sparse mapping, we could leave mappings in place.
Note that this doesn't seem to be a problem right now: I added a KASSERT
in my private tree to see if uvm_km_pgremove_intrsafe() would use a
too short size, and it didn't fire.
 1.123  25-Feb-2012  rmind uvm_km_kmem_alloc: return ENOMEM on failure in PMAP_MAP_POOLPAGE case.
 1.122  20-Feb-2012  bouyer When using uvm_km_pgremove_intrsafe() make sure mappings are removed
before returning the pages to the free pool. Otherwise, under Xen,
a page which still has a writable mapping could be allocated for
a PDP by another CPU and the hypervisor would refuse it (this is
PR port-xen/45975).
For this, move the pmap_kremove() calls inside uvm_km_pgremove_intrsafe(),
and do pmap_kremove()/uvm_pagefree() in batch of (at most) 16 entries
(as suggested by Chuck Silvers on tech-kern@, see also
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012727.html and
followups).
 1.121  19-Feb-2012  rmind Remove VM_MAP_INTRSAFE and related code. Not used since the "kmem changes".
 1.120  10-Feb-2012  para branches: 1.120.2;
proper sizing of kmem_arena on different ports

PR port-i386/45946: Kernel locks up in VMEM system
 1.119  04-Feb-2012  para improve sizing of kmem_arena now that more allocations are made from it
don't enforce limits if not required

ok: riz@
 1.118  03-Feb-2012  matt Always allocate the kmem region. Add UVMHIST support. Approved by releng.
 1.117  02-Feb-2012  para - bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)
 1.116  01-Feb-2012  para allocate uareas and buffers from kernel_map again
add code to drain pools if kmem_arena runs out of space
 1.115  01-Feb-2012  matt Use right UVM_xxx_COLORMATCH flag (even both use the same value).
 1.114  31-Jan-2012  matt Deal with case when kmembase == kmemstart.
Use KASSERTMSG for a few KASSERTs
Make sure to match the color of the VA when we are allocating a physical page.
 1.113  29-Jan-2012  para size kmem_arena more sanely for small memory machines
 1.112  27-Jan-2012  para extending vmem(9) to be able to allocated resources for it's own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
 1.111  01-Sep-2011  matt branches: 1.111.2; 1.111.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
 1.110  05-Jul-2011  yamt - fix a use-after-free bug in uvm_km_free.
(after uvm_km_pgremove frees pages, the following pmap_remove touches them.)
- acquire the object lock for operations on pmap_kernel as it can actually be
raced with P->V operations. eg. pagedaemon.
 1.109  12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.108  02-Feb-2011  chuck branches: 1.108.2;
update license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
 1.107  04-Jan-2011  matt branches: 1.107.2; 1.107.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.
 1.106  14-May-2010  cegger Move PMAP_KMPAGE to be used in pmap_kenter_pa flags argument.
'Looks good to me' gimpy@
 1.105  08-Feb-2010  joerg branches: 1.105.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.
 1.104  07-Nov-2009  cegger branches: 1.104.2;
Add a flags argument to pmap_kenter_pa(9).
Patch showed on tech-kern@ http://mail-index.netbsd.org/tech-kern/2009/11/04/msg006434.html
No objections.
 1.103  13-Dec-2008  ad It's easier for kernel reserve pages to be consumed because the pagedaemon
serves as less of a barrier these days. Restrict provision of kernel reserve
pages to kmem and one of these cases:

- doing a NOWAIT allocation
- caller is a realtime thread
- caller is a kernel thread
- explicitly requested, for example by the pmap
 1.102  01-Dec-2008  ad PR port-amd64/32816 amd64 can not load lkms

Change some assertions to partially allow for VM_MAP_IS_KERNEL(map) where
map is outside the range of kernel_map.
 1.101  04-Aug-2008  pooka branches: 1.101.2; 1.101.4;
the most karmic commit of all: fix tyop in comment
 1.100  16-Jul-2008  matt Add PMAP_KMPAGE flag for pmap_kenter_pa. This allows pmaps to know that
the page being entered is being for the kernel memory allocator. Such pages
should have no references and don't need bookkeeping.
 1.99  24-Mar-2008  yamt branches: 1.99.4; 1.99.6; 1.99.8; 1.99.10;
remove a redundant pmap_update and add a comment instead.
 1.98  23-Feb-2008  chris Add some more missing pmap_update()s following pmap_kremove()s.
 1.97  02-Jan-2008  ad branches: 1.97.2; 1.97.6;
Merge vmlocking2 to head.
 1.96  21-Jul-2007  ad branches: 1.96.6; 1.96.12; 1.96.14; 1.96.16; 1.96.18; 1.96.22;
Fix DEBUG build.
 1.95  21-Jul-2007  ad Merge unobtrusive locking changes from the vmlocking branch.
 1.94  12-Mar-2007  ad branches: 1.94.8;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.93  21-Feb-2007  thorpej branches: 1.93.4;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.92  01-Nov-2006  yamt branches: 1.92.4;
remove some __unused from function parameters.
 1.91  12-Oct-2006  uwe More __unused (in cpp conditionals not touched by i386).
 1.90  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.89  05-Jul-2006  drochner branches: 1.89.4; 1.89.6;
Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.
 1.88  25-May-2006  yamt branches: 1.88.2;
move wait points for kva from upper layers to vm_map. PR/33185 #1.

XXX there is a concern about interaction with kva fragmentation.
see: http://mail-index.NetBSD.org/tech-kern/2006/05/11/0000.html
 1.87  03-May-2006  yamt branches: 1.87.2;
uvm_km_suballoc: consider kva overhead of "kmapent".
fixes PR/31275 (me) and PR/32287 (Christian Biere).
 1.86  05-Apr-2006  yamt uvm_km_pgremove/uvm_km_pgremove_intrsafe: fix assertions.
 1.85  17-Mar-2006  yamt uvm_km_check_empty: fix an assertion.
 1.84  11-Dec-2005  christos branches: 1.84.4; 1.84.6; 1.84.8; 1.84.10; 1.84.12;
merge ktrace-lwp.
 1.83  27-Jun-2005  thorpej branches: 1.83.2;
Use ANSI function decls.
 1.82  29-May-2005  christos avoid shadow variables.
remove unneeded casts.
 1.81  20-Apr-2005  simonb Use a cast to (long long) and 0x%llx to print out a paddr_t instead
of casting to (void *). Fixes compile problems with 64-bit paddr_t
on 32-bit platforms.
 1.80  12-Apr-2005  yamt fix unreasonably frequent "killed: out of swap" on systems which have
little or no swap.
- even on a severe swap shortage, if we have some amount of file-backed pages,
don't bother to kill processes.
- if all pages in the queue are likely to be reactivated, just give up
page type balancing rather than spinning unnecessarily.
 1.79  01-Apr-2005  yamt unwrap short lines.
 1.78  01-Apr-2005  yamt merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.
 1.77  26-Feb-2005  perry branches: 1.77.2;
nuke trailing whitespace
 1.76  13-Jan-2005  yamt branches: 1.76.2; 1.76.4;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.
 1.75  12-Jan-2005  yamt don't reserve (uvm_mapent_reserve) entries for malloc/pool backends
because it isn't necessary or safe.
reported and tested by Denis Lagno. PR/28897.
 1.74  05-Jan-2005  yamt km_vacache_alloc: UVM_PROT_ALL rather than UVM_PROT_NONE
so that uvm_kernacc works. PR/28861. (FUKAUMI Naoki)
 1.73  03-Jan-2005  yamt km_vacache_alloc: specify va hint correctly rather than
using stack garbage. PR/28845.
 1.72  01-Jan-2005  yamt in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.
 1.71  01-Jan-2005  yamt introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.
 1.70  01-Jan-2005  yamt for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.
 1.69  24-Mar-2004  junyoung Drop trailing spaces.
 1.68  10-Feb-2004  matt Back out the changes in
http://mail-index.netbsd.org/source-changes/2004/01/29/0027.html
since they don't really fix the problem.

Incorpate one fix: Mark uvm_map_entry's that were created with
UVM_FLAG_NOMERGE so that they will not be used as future merge
candidates.
 1.67  29-Jan-2004  yamt - split uvm_map() into two functions for the followings.
- for in-kernel maps, disable map entry merging so that
unmap operations won't block. (workaround for PR/24039)
- for in-kernel maps, allocate kva for vm_map_entry from
the map itself and eliminate MAX_KMAPENT and
uvm_map_entry_kmem_pool.
 1.66  18-Dec-2003  pk * Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.
 1.65  18-Dec-2003  pk Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.
 1.64  28-Aug-2003  pk When retiring a swap device with marked bad blocks on it we should update
the `# swap page in use' and `# swap page only' counters. However, at the
time of swap device removal we can no longer figure out how many of the
bad swap pages are actually also `swap only' pages.

So, on swap I/O errors arrange things to not include the bad swap pages in
the `swpgonly' counter as follows: uvm_swap_markbad() decrements `swpgonly'
by the number of bad pages, and the various VM object deallocation routines
do not decrement `swpgonly' for swap slots marked as SWSLOT_BAD.
 1.63  11-Aug-2003  pk Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.
 1.62  10-May-2003  thorpej branches: 1.62.2;
Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.
 1.61  08-May-2003  thorpej Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
this giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).
 1.60  30-Nov-2002  bouyer Change uvm_km_kmemalloc() to accept flag UVM_KMF_NOWAIT and pass it to
uvm_map(). Change uvm_map() to honor UVM_KMF_NOWAIT. For this, change
amap_extend() to take a flags parameter instead of just boolean for
direction, and introduce AMAP_EXTEND_FORWARDS and AMAP_EXTEND_NOWAIT flags
(AMAP_EXTEND_BACKWARDS is still defined as 0x0, to keep the code easier to
read).
Add a flag parameter to uvm_mapent_alloc().
This solves a problem where a pool_get(PR_NOWAIT) could trigger a pool_get(PR_WAITOK)
in uvm_mapent_alloc().
Thanks to Chuck Silvers, enami tsugutomo, Andrew Brown and Jason R Thorpe
for feedback.
 1.59  05-Oct-2002  oster Garbage collect some leftover (and unneeded) code. OK'ed by chs.
 1.58  15-Sep-2002  chs add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.
 1.57  14-Aug-2002  thorpej Don't pass VM_PROT_EXEC to pmap_kenter_pa().
 1.56  07-Mar-2002  thorpej branches: 1.56.2; 1.56.6; 1.56.8;
If the bootstrapping process didn't actually use any KVA space, don't
reserve size of 0 in kernel_map.

From OpenBSD.
 1.55  10-Nov-2001  lukem add RCSIDs, and in some cases, slightly cleanup #include order
 1.54  07-Nov-2001  chs only acquire the lock for swpgonly if we actually need to adjust it.
 1.53  06-Nov-2001  chs several changes prompted by loaning problems:
- fix the loaned case in uvm_pagefree().
- redo uvmexp.swpgonly accounting to work with page loaning.
add an assertion before each place we adjust uvmexp.swpgonly.
- fix uvm_km_pgremove() to always free any swap space associated with
the range being removed.
- get rid of UVM_LOAN_WIRED flag. instead, we just make sure that
pages loaned to the kernel are never on the page queues.
this allows us to assert that pages are not loaned and wired
at the same time.
- add yet more assertions.
 1.52  15-Sep-2001  chs branches: 1.52.2;
a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to e.g. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.51  10-Sep-2001  chris Update pmap_update to now take the updated pmap as an argument.
This will allow improvements to the pmaps so that they can more easily defer expensive operations, e.g. TLB/cache flushes, until the last possible moment.

Currently this is a no-op on most platforms, so they should see no difference.

Reviewed by Jason.
 1.50  26-Jun-2001  thorpej branches: 1.50.2; 1.50.4;
Reduce some complexity in the fault path -- Rather than maintaining
an spl-protected "interrupt safe map" list, simply require that callers
of uvm_fault() never call us in interrupt context (MD code must make
the assertion), and check for interrupt-safe maps in uvmfault_lookup()
before we lock the map.
 1.49  02-Jun-2001  chs replace vm_map{,_entry}_t with struct vm_map{,_entry} *.
 1.48  26-May-2001  chs replace {simple_,}lock{_data,}_t with struct {simple,}lock {,*}.
 1.47  25-May-2001  chs remove trailing whitespace.
 1.46  24-Apr-2001  thorpej Sprinkle pmap_update() calls after calls to:
- pmap_enter()
- pmap_remove()
- pmap_protect()
- pmap_kenter_pa()
- pmap_kremove()
as described in pmap(9).

These calls are relatively conservative. It may be possible to
optimize these a little more.
 1.45  12-Apr-2001  thorpej Add a __predict_true() to an extremely common case.
 1.44  12-Apr-2001  thorpej In uvm_km_kmemalloc(), use the correct size for the uvm_unmap()
call if the allocation fails.

Problem pointed out by Alfred Perlstein <bright@wintelcom.net>,
who found a similar bug in FreeBSD.
 1.43  15-Mar-2001  chs eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
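[Editor's note: the rev 1.43 mapping above can be sketched as a small userland C table. This is purely illustrative; the actual change substituted E* codes for KERN_* throughout the tree, and the `kern_to_errno()` helper and the Mach-style enum values here are hypothetical stand-ins.]

```c
#include <errno.h>

/* Hypothetical stand-ins for the old Mach KERN_* constants. */
enum {
	KERN_SUCCESS = 0,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE
};

/* Illustrative table of the rev 1.43 translation; the real commit had
 * no such function and simply replaced the constants in place. */
static int
kern_to_errno(int kern)
{
	switch (kern) {
	case KERN_SUCCESS:            return 0;
	case KERN_INVALID_ADDRESS:    return EFAULT;
	case KERN_PROTECTION_FAILURE: return EACCES;
	case KERN_NO_SPACE:           return ENOMEM;
	case KERN_INVALID_ARGUMENT:   return EINVAL;
	case KERN_RESOURCE_SHORTAGE:  return ENOMEM;
	default:                      return EINVAL; /* KERN_FAILURE: mostly KASSERTs */
	}
}
```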
 1.42  14-Jan-2001  thorpej branches: 1.42.2;
splimp() -> splvm()
 1.41  27-Nov-2000  nisimura Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for the MIPS R4000's paired TLB entry design.
 1.40  24-Nov-2000  chs cleanup: use queue.h macros and KASSERT().
 1.39  13-Sep-2000  thorpej Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
 1.38  24-Jul-2000  jeffs Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.
 1.37  27-Jun-2000  mrg remove include of <vm/vm.h>
 1.36  26-Jun-2000  mrg remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.
 1.35  08-May-2000  thorpej branches: 1.35.4;
__predict_false() out-of-resource conditions and DIAGNOSTIC error checks.
 1.34  11-Jan-2000  chs add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor
 1.33  13-Nov-1999  thorpej Change the pmap_enter() API slightly; pmap_enter() now returns an error
value (KERN_SUCCESS or KERN_RESOURCE_SHORTAGE) indicating if it succeeded
or failed. Change the `wired' and `access_type' arguments to a single
`flags' argument, which includes the access type, and flags:

PMAP_WIRED the old `wired' boolean
PMAP_CANFAIL pmap_enter() is allowed to fail

If PMAP_CANFAIL is not specified, the pmap should behave as it always
has in the face of a drastic resource shortage: fall over dead.

Change the fault handler to deal with failure (which indicates resource
shortage) by unlocking everything, waiting for the pagedaemon to free
more memory, then retrying the fault.
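[Editor's note: the rev 1.33 contract — pmap_enter() may fail only when PMAP_CANFAIL is passed, and the fault handler then waits for the pagedaemon and retries — can be modeled in userland C. All names and flag values below (`pmap_enter_model`, `fault_retry_model`, the 0x10/0x20 bits) are illustrative, not the kernel's.]

```c
#include <stdbool.h>

#define KERN_SUCCESS           0
#define KERN_RESOURCE_SHORTAGE 6
#define PMAP_WIRED   0x10	/* stand-in flag values */
#define PMAP_CANFAIL 0x20

static bool pmap_short_of_resources;	/* simulated drastic shortage */
static int  pagedaemon_wakeups;		/* counts simulated waits */

/* With PMAP_CANFAIL, a shortage is reported to the caller; without it,
 * the real pmap "falls over dead" (this model just succeeds). */
static int
pmap_enter_model(int flags)
{
	if (pmap_short_of_resources && (flags & PMAP_CANFAIL))
		return KERN_RESOURCE_SHORTAGE;
	return KERN_SUCCESS;
}

/* Fault-handler shape: on shortage, wait for the pagedaemon to free
 * memory (modeled by clearing the flag), then retry the mapping. */
static int
fault_retry_model(void)
{
	int error;

	while ((error = pmap_enter_model(PMAP_CANFAIL)) != KERN_SUCCESS) {
		pagedaemon_wakeups++;			/* uvm_wait() analogue */
		pmap_short_of_resources = false;	/* memory was freed */
	}
	return error;
}
```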
 1.32  12-Sep-1999  chs branches: 1.32.2; 1.32.4; 1.32.8;
eliminate the PMAP_NEW option by making it required for all ports.
ports which previously had no support for PMAP_NEW now implement
the pmap_k* interfaces as wrappers around the non-k versions.
 1.31  22-Jul-1999  thorpej Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.
 1.30  22-Jul-1999  thorpej 0 -> FALSE in a few places.
 1.29  18-Jul-1999  chs allow uvm_km_alloc_poolpage1() to use kernel-reserve pages.
 1.28  17-Jul-1999  thorpej Garbage-collect uvm_km_get(); nothing actually uses it.
 1.27  04-Jun-1999  thorpej Keep interrupt-safe maps on an additional queue. In uvm_fault(), if we're
looking up a kernel address, check to see if the address is on this
"interrupt-safe" list. If so, return failure immediately. This prevents
a locking screw if a page fault is taken on an interrupt-safe map in or
out of interrupt context.
 1.26  26-May-1999  thorpej Wired kernel mappings are wired; pass VM_PROT_READ|VM_PROT_WRITE for
access_type to pmap_enter() to ensure that when these mappings are accessed,
possibly in interrupt context, that they won't cause mod/ref emulation
page faults.
 1.25  26-May-1999  thorpej Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.
 1.24  25-May-1999  thorpej Define a new kernel object type, "intrsafe", which are used for objects
which can be used in an interrupt context. Use pmap_kenter*() and
pmap_kremove() only for mappings owned by these objects.

Fixes some locking protocol issues related to MP support, and eliminates
all of the pmap_enter vs. pmap_kremove inconsistencies.
 1.23  11-Apr-1999  chs add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.
 1.22  26-Mar-1999  mycroft branches: 1.22.2;
Add a new `access type' argument to pmap_enter(). This indicates what type of
memory access a mapping was caused by. This is passed through from uvm_fault()
and udv_fault(), and in most other cases is 0.
The pmap module may use this to preset R/M information. On MMUs which require
R/M emulation, the implementation may preset the bits and avoid taking another
fault. On MMUs which keep R/M information in hardware, the implementation may
preset its cached bits to speed up the next call to pmap_is_modified() or
pmap_is_referenced().
 1.21  26-Mar-1999  chs add uvmexp.swpgonly and use it to detect out-of-swap conditions.
 1.20  25-Mar-1999  mrg remove now >1 year old pre-release message.
 1.19  24-Mar-1999  cgd after discussion with chuck, nuke pgo_attach from uvm_pagerops
 1.18  18-Oct-1998  chs branches: 1.18.2;
shift by PAGE_SHIFT instead of multiplying or dividing by PAGE_SIZE.
 1.17  11-Oct-1998  chuck remove unused share map code from UVM:
- update calls to uvm_unmap_remove/uvm_unmap (mainonly boolean arg
has been removed)
- replace UVM_ET_ISMAP checks with UVM_ET_ISSUBMAP checks
 1.16  28-Aug-1998  thorpej Add a couple of comments about how the pool page allocator functions
can be called with a map that doesn't require spl protection.
 1.15  28-Aug-1998  thorpej Add a waitok boolean argument to the VM system's pool page allocator backend.
 1.14  13-Aug-1998  eeh Merge paddr_t changes into the main branch.
 1.13  09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.12  01-Aug-1998  thorpej We need to be able to specify a uvm_object to the pool page allocator, too.
 1.11  31-Jul-1998  thorpej Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.
 1.10  24-Jul-1998  thorpej branches: 1.10.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.
 1.9  09-Jun-1998  chs correct counting for uvmexp.wired:
only pages explicitly wired by a user process should be counted.
 1.8  09-Mar-1998  mrg KNF.
 1.7  24-Feb-1998  chuck be consistent about offsets in kernel objects. vm_map_min(kernel_map)
should always be the base [fixes problem on m68k detected by jason thorpe]

add comments to uvm_km.c explaining kernel memory management in more detail
 1.6  10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.5  08-Feb-1998  thorpej Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.
 1.4  07-Feb-1998  mrg restore rcsids
 1.3  07-Feb-1998  chs convert kernel_object to an aobj.
in uvm_km_pgremove(), free swapslots if the object is an aobj.
in uvm_km_kmemalloc(), mark pages as wired and count them.
 1.2  06-Feb-1998  thorpej RCS ID police.
 1.1  05-Feb-1998  mrg branches: 1.1.1;
Initial revision
 1.1.1.1  05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the UVM kernel code portion.


this will be KNF'd shortly. :-)
 1.10.2.2  08-Aug-1998  eeh Revert cdevsw mmap routines to return int.
 1.10.2.1  30-Jul-1998  eeh Split vm_offset_t and vm_size_t into paddr_t, psize_t, vaddr_t, and vsize_t.
 1.18.2.2  25-Feb-1999  chs thread_wakeup() -> wakeup().
 1.18.2.1  09-Nov-1998  chs initial snapshot. lots left to do.
 1.22.2.1  16-Apr-1999  chs branches: 1.22.2.1.2;
pull up 1.22 -> 1.23:
add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.
 1.22.2.1.2.3  02-Aug-1999  thorpej Update from trunk.
 1.22.2.1.2.2  21-Jun-1999  thorpej Sync w/ -current.
 1.22.2.1.2.1  07-Jun-1999  chs merge everything from chs-ubc branch.
 1.32.8.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.32.4.1  15-Nov-1999  fvdl Sync with -current
 1.32.2.5  21-Apr-2001  bouyer Sync with HEAD
 1.32.2.4  27-Mar-2001  bouyer Sync with HEAD.
 1.32.2.3  18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.32.2.2  08-Dec-2000  bouyer Sync with HEAD.
 1.32.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.35.4.1  23-Apr-2001  he Pull up revision 1.44 (requested by thorpej):
Use correct size for uvm_unmap() in error case of uvm_km_kmemalloc().
 1.42.2.10  11-Dec-2002  thorpej Sync with HEAD.
 1.42.2.9  18-Oct-2002  nathanw Catch up to -current.
 1.42.2.8  17-Sep-2002  nathanw Catch up to -current.
 1.42.2.7  27-Aug-2002  nathanw Catch up to -current.
 1.42.2.6  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.42.2.5  14-Nov-2001  nathanw Catch up to -current.
 1.42.2.4  21-Sep-2001  nathanw Catch up to -current.
 1.42.2.3  24-Aug-2001  nathanw Catch up with -current.
 1.42.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.42.2.1  09-Apr-2001  nathanw Catch up with -current.
 1.50.4.1  01-Oct-2001  fvdl Catch up with -current.
 1.50.2.5  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.50.2.4  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.50.2.3  16-Mar-2002  jdolecek Catch up with -current.
 1.50.2.2  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.50.2.1  13-Sep-2001  thorpej Update the kqueue branch to HEAD.
 1.52.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.56.8.2  02-Jun-2003  tron Pull up revision 1.58 (requested by skrll):
add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.
 1.56.8.1  18-Nov-2002  he Pull up revision 1.57 (requested by thorpej in ticket #675):
Don't pass VM_PROT_EXEC to pmap_kenter_pa().
 1.56.6.1  29-Aug-2002  gehenna catch up with -current.
 1.56.2.1  11-Mar-2002  thorpej Convert swap_syscall_lock and uvm.swap_data_lock to adaptive mutexes,
and rename them appropriately.
 1.62.2.7  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.62.2.6  01-Apr-2005  skrll Sync with HEAD.
 1.62.2.5  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.62.2.4  17-Jan-2005  skrll Sync with HEAD.
 1.62.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.62.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.62.2.1  03-Aug-2004  skrll Sync with HEAD
 1.76.4.7  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.76.4.6  18-Feb-2005  chs move a UVMHIST_LOG to avoid an uninitialized variable.
add more info to a debug message.
 1.76.4.5  18-Feb-2005  yamt whitespace, comments, panic messages. no functional changes.
 1.76.4.4  16-Feb-2005  yamt remove redundant trunc_page/round_page.
 1.76.4.3  31-Jan-2005  yamt uvm_km_free: uvm_km_pgremove_intrsafe and pmap_kremove only when UVM_KMF_WIRED.
 1.76.4.2  25-Jan-2005  yamt - don't use uvm_object or managed mappings for wired allocations.
(eg. malloc(9))
- simplify uvm_km_* apis.
 1.76.4.1  25-Jan-2005  yamt remove some compatibility functions.
 1.76.2.1  29-Apr-2005  kent sync with -current
 1.77.2.1  06-Dec-2005  riz Apply patch (requested by yamt in ticket #1015):
sys/uvm/uvm_glue.c: patch
sys/uvm/uvm_km.c: patch
- correct a return value of uvm_km_valloc1 in the case of failure.
- do waitok allocation for uvm_uarea_alloc so that it won't fail on
temporary memory shortage.
 1.83.2.7  24-Mar-2008  yamt sync with head.
 1.83.2.6  27-Feb-2008  yamt sync with head.
 1.83.2.5  21-Jan-2008  yamt sync with head
 1.83.2.4  03-Sep-2007  yamt sync with head.
 1.83.2.3  26-Feb-2007  yamt sync with head.
 1.83.2.2  30-Dec-2006  yamt sync with head.
 1.83.2.1  21-Jun-2006  yamt sync with head.
 1.84.12.2  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.84.12.1  28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.84.10.2  11-May-2006  elad sync with head
 1.84.10.1  19-Apr-2006  elad oops - *really* sync to head this time.
 1.84.8.5  11-Aug-2006  yamt sync with head
 1.84.8.4  26-Jun-2006  yamt sync with head.
 1.84.8.3  24-May-2006  yamt sync with head.
 1.84.8.2  11-Apr-2006  yamt sync with head
 1.84.8.1  01-Apr-2006  yamt sync with head.
 1.84.6.2  01-Jun-2006  kardel Sync with head.
 1.84.6.1  22-Apr-2006  simonb Sync with head.
 1.84.4.1  09-Sep-2006  rpaulo sync with head
 1.87.2.1  19-Jun-2006  chap Sync with head.
 1.88.2.1  13-Jul-2006  gdamore Merge from HEAD.
 1.89.6.2  10-Dec-2006  yamt sync with head.
 1.89.6.1  22-Oct-2006  yamt sync with head
 1.89.4.1  18-Nov-2006  ad Sync with head.
 1.92.4.2  24-Mar-2007  yamt sync with head.
 1.92.4.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.93.4.5  18-Sep-2007  ad Undo previous. Other threads can allocate with the map locked, which could
cause PR_NOWAIT allocations to wait long term.
 1.93.4.4  18-Sep-2007  ad Don't use UVM_KMF_TRYLOCK for pool allocations when PR_NOWAIT is specified.
 1.93.4.3  21-Mar-2007  ad - Replace more simple_locks, and fix up in a few places.
- Use condition variables.
- LOCK_ASSERT -> KASSERT.
 1.93.4.2  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.93.4.1  13-Mar-2007  ad Sync with head.
 1.94.8.1  15-Aug-2007  skrll Sync with HEAD.
 1.96.22.2  21-Jul-2007  ad Fix DEBUG build.
 1.96.22.1  21-Jul-2007  ad file uvm_km.c was added on branch matt-mips64 on 2007-07-21 20:53:00 +0000
 1.96.18.1  02-Jan-2008  bouyer Sync with HEAD
 1.96.16.1  10-Dec-2007  yamt - separate kernel va allocation (kernel_va_arena) from
in-kernel fault handling (kernel_map).
- add vmem bootstrap code. vmem doesn't rely on malloc anymore.
- make kmem_alloc interrupt-safe.
- kill kmem_map. make malloc a wrapper of kmem_alloc.
 1.96.14.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.96.12.1  18-Feb-2008  mjf Sync with HEAD.
 1.96.6.2  23-Mar-2008  matt sync with HEAD
 1.96.6.1  09-Jan-2008  matt sync with HEAD
 1.97.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.97.6.2  28-Sep-2008  mjf Sync with HEAD.
 1.97.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.97.2.1  24-Mar-2008  keiichi sync with head.
 1.99.10.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.99.10.1  19-Oct-2008  haad Sync with HEAD.
 1.99.8.1  18-Jul-2008  simonb Sync with head.
 1.99.6.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.99.4.3  11-Aug-2010  yamt sync with head.
 1.99.4.2  11-Mar-2010  yamt sync with head
 1.99.4.1  04-May-2009  yamt sync with head.
 1.101.4.2  19-Apr-2009  snj branches: 1.101.4.2.4;
Pull up following revision(s) (requested by mrg in ticket #708):
sys/uvm/uvm_km.c: revision 1.102
sys/uvm/uvm_km.h: revision 1.18
sys/uvm/uvm_map.c: revision 1.264
PR port-amd64/32816 amd64 can not load lkms
Change some assertions to partially allow for VM_MAP_IS_KERNEL(map) where
map is outside the range of kernel_map.
 1.101.4.1  27-Dec-2008  snj Pull up following revision(s) (requested by bouyer in ticket #211):
sys/uvm/uvm_km.c: revision 1.103
sys/uvm/uvm_map.c: revision 1.265
sys/uvm/uvm_page.c: revision 1.141
It's easier for kernel reserve pages to be consumed because the pagedaemon
serves as less of a barrier these days. Restrict provision of kernel reserve
pages to kmem and one of these cases:
- doing a NOWAIT allocation
- caller is a realtime thread
- caller is a kernel thread
- explicitly requested, for example by the pmap
 1.101.4.2.4.11  12-Apr-2012  matt Apply colormask to get a valid color.
 1.101.4.2.4.10  12-Apr-2012  matt Separate object-less anon pages out of the active list if there is no swap
device. Make uvm_reclaimable and uvm.*estimatable understand colors and
kmem allocations.
 1.101.4.2.4.9  29-Feb-2012  matt Improve UVM_PAGE_TRKOWN.
Add more asserts to uvm_page.
 1.101.4.2.4.8  14-Feb-2012  matt Add more KASSERTs (more! more! more!).
When returning page to the free pool, make sure to dequeue the pages before
hand or free page queue corruption will happen.
 1.101.4.2.4.7  10-Feb-2012  matt Place allocated kmem pages on a kmem_pageq. This makes it easy for crash
dump code to find them.
 1.101.4.2.4.6  09-Feb-2012  matt Major changes to uvm.
Support multiple collections (groups) of free pages and run the page
reclamation algorithm on each group independently.
 1.101.4.2.4.5  03-Jun-2011  matt Restore $NetBSD$
 1.101.4.2.4.4  25-May-2011  matt Make uvm_map recognize UVM_FLAG_COLORMATCH which tells uvm_map that the
'align' argument specifies the starting color of the KVA range to be returned.

When calling uvm_km_alloc with UVM_KMF_VAONLY, also specify the starting
color of the kva range returned (UMV_KMF_COLORMATCH) and pass those to
uvm_map.

In uvm_pglistalloc, make sure the pages being returned have sequentially
advancing colors (so they can be mapped in a contiguous address range).
Add a few missing UVM_FLAG_COLORMATCH flags to uvm_pagealloc calls.

Make the socket and pipe loan color-safe.

Make the mips pmap enforce strict page color (color(VA) == color(PA)).
 1.101.4.2.4.3  06-Feb-2010  matt Allow uvm_km_alloc to allocate from a specific vm freelist if the port wants
it to.
 1.101.4.2.4.2  26-Jan-2010  matt Pass hints to uvm_pagealloc* to get it to use the right page color rather
than guess the right page color.
 1.101.4.2.4.1  09-Jan-2010  matt If PMAP_ALLOC_POOLPAGE is defined use it instead of uvm_pagealloc
 1.101.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.104.2.8  17-Aug-2010  uebayasi Sync with HEAD.
 1.104.2.7  08-Jul-2010  uebayasi Clean up.
 1.104.2.6  07-Jul-2010  uebayasi Clean up; merge options DIRECT_PAGE into options XIP.
 1.104.2.5  06-Jul-2010  uebayasi Directly allocate zero'ed vm_page for XIP unallocated blocks, instead
of abusing pool page. Move the code to XIP vnode pager in genfs_io.c.
 1.104.2.4  31-May-2010  uebayasi Redefine "device page"; device pages are pages of
device memory. Pages which don't have vm_page (== can't be used for
generic use), but whose PV are tracked, are called "direct pages" from
now.
 1.104.2.3  30-Apr-2010  uebayasi Sync with HEAD.
 1.104.2.2  23-Feb-2010  uebayasi Don't forget opt_device_page.h.
 1.104.2.1  10-Feb-2010  uebayasi Initial attempt to implement uvm_pageofzero_xip(), which returns a pointer
to a single read-only zeroed page. This is meant to be used for XIP now.
Only compile tested.
 1.105.2.5  05-Mar-2011  rmind sync with head
 1.105.2.4  02-Jul-2010  rmind Undo 1.105.2.2 revision, note that uvm_km_pgremove_intrsafe() extracts the
mapping, improve comments.
 1.105.2.3  30-May-2010  rmind sync with head
 1.105.2.2  17-Mar-2010  rmind Reorganise UVM locking to protect P->V state and serialise pmap(9)
operations on the same page(s) by always locking their owner. Hence
lock order: "vmpage"-lock -> pmap-lock.

Patch, proposed on tech-kern@, from Andrew Doran.
 1.105.2.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.107.4.1  08-Feb-2011  bouyer Sync with HEAD
 1.107.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.108.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.111.6.6  02-Jun-2012  mrg sync to latest -current.
 1.111.6.5  29-Apr-2012  mrg sync to latest -current.
 1.111.6.4  05-Apr-2012  mrg sync to latest -current.
 1.111.6.3  04-Mar-2012  mrg sync to latest -current.
 1.111.6.2  24-Feb-2012  mrg sync to -current.
 1.111.6.1  18-Feb-2012  mrg merge to -current.
 1.111.2.5  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.111.2.4  30-Oct-2012  yamt sync with head
 1.111.2.3  18-Apr-2012  yamt byebye VM_MAP_INTRSAFE
 1.111.2.2  17-Apr-2012  yamt sync with head
 1.111.2.1  02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.120.2.4  25-Nov-2013  bouyer Pull up following revision(s) (requested by para in ticket #989):
sys/uvm/uvm_km.c: revision 1.125
uvm_km_kmem_alloc: don't hardcode kmem_va_arena
 1.120.2.3  07-Sep-2012  riz branches: 1.120.2.3.2; 1.120.2.3.4;
Pull up following revision(s) (requested by para in ticket #547):
sys/uvm/uvm_map.c: revision 1.320
sys/uvm/uvm_map.c: revision 1.321
sys/uvm/uvm_map.c: revision 1.322
sys/uvm/uvm_km.c: revision 1.130
sys/uvm/uvm_km.c: revision 1.131
sys/uvm/uvm_km.c: revision 1.132
sys/uvm/uvm_km.c: revision 1.133
sys/uvm/uvm_km.c: revision 1.134
sys/uvm/uvm_km.c: revision 1.135
sys/uvm/uvm_km.c: revision 1.129
Fix a bug where the kernel was never grown to accommodate the kmem VA space
since that happens before the kernel_map is set.
Don't try to grow the entire kmem space but just do as needed in
uvm_km_kmem_alloc
Shut up gcc printf warning.
Cleanup comment. Change panic to KASSERTMSG.
Use kernel_map->misc_lock to make sure we don't call pmap_growkernel
concurrently and possibly mess up uvm_maxkaddr.
Switch to a spin lock (uvm_kentry_lock) which, fortunately, was
sitting there unused.
Remove locking since it isn't needed. As soon as the 2nd uvm_map_entry
in kernel_map is created, uvm_map_prepare will call pmap_growkernel,
and the pmap_growkernel call in uvm_km_kmem_alloc will never be
called again.
Call pmap_growkernel once after the kmem_arena is created
to make the pmap cover its address space,
and assert on the growth in uvm_km_kmem_alloc:
for the 3rd uvm_map_entry, uvm_map_prepare will grow the kernel,
but we might call into uvm_km_kmem_alloc through imports to
the kmem_meta_arena earlier.
While here, guard uvm_km_va_starved_p against the kmem_arena not yet
being created.
Thanks to everyone involved for tracking this down.
 1.120.2.2  17-Mar-2012  bouyer branches: 1.120.2.2.2;
Pull up following revision(s) (requested by rmind in ticket #113):
sys/uvm/uvm_km.c: revision 1.123
uvm_km_kmem_alloc: return ENOMEM on failure in PMAP_MAP_POOLPAGE case.
 1.120.2.1  22-Feb-2012  riz Pull up following revision(s) (requested by bouyer in ticket #29):
sys/arch/xen/x86/x86_xpmap.c: revision 1.39
sys/arch/xen/include/hypervisor.h: revision 1.37
sys/arch/xen/include/intr.h: revision 1.34
sys/arch/xen/x86/xen_ipi.c: revision 1.10
sys/arch/x86/x86/cpu.c: revision 1.97
sys/arch/x86/include/cpu.h: revision 1.48
sys/uvm/uvm_map.c: revision 1.315
sys/arch/x86/x86/pmap.c: revision 1.165
sys/arch/xen/x86/cpu.c: revision 1.81
sys/arch/x86/x86/pmap.c: revision 1.167
sys/arch/xen/x86/cpu.c: revision 1.82
sys/arch/x86/x86/pmap.c: revision 1.168
sys/arch/xen/x86/xen_pmap.c: revision 1.17
sys/uvm/uvm_km.c: revision 1.122
sys/uvm/uvm_kmguard.c: revision 1.10
sys/arch/x86/include/pmap.h: revision 1.50
Apply patch proposed in PR port-xen/45975 (this does not solve the exact
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunately, all addresses disassemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.
2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.
To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.
to fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), with the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.
While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
When using uvm_km_pgremove_intrsafe() make sure mappings are removed
before returning the pages to the free pool. Otherwise, under Xen,
a page which still has a writable mapping could be allocated for
a PDP by another CPU and the hypervisor would refuse it (this is
PR port-xen/45975).
For this, move the pmap_kremove() calls inside uvm_km_pgremove_intrsafe(),
and do pmap_kremove()/uvm_pagefree() in batch of (at most) 16 entries
(as suggested by Chuck Silvers on tech-kern@, see also
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012727.html and
followups).
Avoid early use of xen_kpm_sync(); locks are not available at this time.
Don't call cpu_init() twice.
Makes LOCKDEBUG kernels boot again
Revert pmap_pte_flush() -> xpq_flush_queue() in previous.
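[Editor's note: the batching scheme pulled up above — pmap_kremove() a chunk of at most 16 pages, then uvm_pagefree() that same chunk, so no page ever reaches the free pool while still mapped — can be sketched as a userland model. The `page_model` struct and `pgremove_intrsafe_model()` name are illustrative stand-ins, not the kernel's code.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH 16	/* chunk size from the pulled-up change */

/* One simulated page; Xen's complaint was a page entering the free
 * pool while a writable mapping to it still existed. */
struct page_model {
	bool mapped;
	bool freed;
};

/* Unmap a chunk first (pmap_kremove() analogue), then free the same
 * chunk (uvm_pagefree() analogue), BATCH pages at a time. */
static void
pgremove_intrsafe_model(struct page_model *pg, size_t npages)
{
	for (size_t i = 0; i < npages; i += BATCH) {
		size_t n = (npages - i < BATCH) ? npages - i : BATCH;
		for (size_t j = 0; j < n; j++)
			pg[i + j].mapped = false;
		for (size_t j = 0; j < n; j++) {
			/* invariant: never free a still-mapped page */
			assert(!pg[i + j].mapped);
			pg[i + j].freed = true;
		}
	}
}
```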
 1.120.2.3.4.1  25-Nov-2013  bouyer Pull up following revision(s) (requested by para in ticket #989):
sys/uvm/uvm_km.c: revision 1.125
uvm_km_kmem_alloc: don't hardcode kmem_va_arena
 1.120.2.3.2.1  25-Nov-2013  bouyer Pull up following revision(s) (requested by para in ticket #989):
sys/uvm/uvm_km.c: revision 1.125
uvm_km_kmem_alloc: don't hardcode kmem_va_arena
 1.120.2.2.2.1  01-Nov-2012  matt sync with netbsd-6-0-RELEASE.
 1.135.2.2  03-Dec-2017  jdolecek update from HEAD
 1.135.2.1  25-Feb-2013  tls resync with head
 1.138.14.3  28-Aug-2017  skrll Sync with HEAD
 1.138.14.2  05-Oct-2016  skrll Sync with HEAD
 1.138.14.1  06-Apr-2015  skrll Sync with HEAD
 1.138.12.1  25-Mar-2015  snj Pull up following revision(s) (requested by maxv in ticket #617):
sys/kern/kern_malloc.c: revision 1.144, 1.145
sys/kern/kern_pmf.c: revision 1.37
sys/rump/librump/rumpkern/rump.c: revision 1.316
sys/uvm/uvm_extern.h: revision 1.193
sys/uvm/uvm_km.c: revision 1.139
Don't include <uvm/uvm_extern.h>
--
Kill kmeminit().
--
Remove this MALLOC_DEFINE (M_PMF unused).
 1.139.2.3  20-Mar-2017  pgoyette Sync with HEAD
 1.139.2.2  06-Aug-2016  pgoyette Sync with HEAD
 1.139.2.1  26-Jul-2016  pgoyette Sync with HEAD
 1.141.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.143.2.1  02-Nov-2017  snj Pull up following revision(s) (requested by pgoyette in ticket #335):
share/man/man9/kernhist.9: 1.5-1.8
sys/arch/acorn26/acorn26/pmap.c: 1.39
sys/arch/arm/arm32/fault.c: 1.105 via patch
sys/arch/arm/arm32/pmap.c: 1.350, 1.359
sys/arch/arm/broadcom/bcm2835_bsc.c: 1.7
sys/arch/arm/omap/if_cpsw.c: 1.20
sys/arch/arm/omap/tiotg.c: 1.7
sys/arch/evbarm/conf/RPI2_INSTALL: 1.3
sys/dev/ic/sl811hs.c: 1.98
sys/dev/usb/ehci.c: 1.256
sys/dev/usb/if_axe.c: 1.83
sys/dev/usb/motg.c: 1.18
sys/dev/usb/ohci.c: 1.274
sys/dev/usb/ucom.c: 1.119
sys/dev/usb/uhci.c: 1.277
sys/dev/usb/uhub.c: 1.137
sys/dev/usb/umass.c: 1.160-1.162
sys/dev/usb/umass_quirks.c: 1.100
sys/dev/usb/umass_scsipi.c: 1.55
sys/dev/usb/usb.c: 1.168
sys/dev/usb/usb_mem.c: 1.70
sys/dev/usb/usb_subr.c: 1.221
sys/dev/usb/usbdi.c: 1.175
sys/dev/usb/usbdi_util.c: 1.67-1.70
sys/dev/usb/usbroothub.c: 1.3
sys/dev/usb/xhci.c: 1.75
sys/external/bsd/drm2/dist/drm/i915/i915_gem.c: 1.34
sys/kern/kern_history.c: 1.15
sys/kern/kern_xxx.c: 1.74
sys/kern/vfs_bio.c: 1.275-1.276
sys/miscfs/genfs/genfs_io.c: 1.71
sys/sys/kernhist.h: 1.21
sys/ufs/ffs/ffs_balloc.c: 1.63
sys/ufs/lfs/lfs_vfsops.c: 1.361
sys/ufs/lfs/ulfs_inode.c: 1.21
sys/ufs/lfs/ulfs_vnops.c: 1.52
sys/ufs/ufs/ufs_inode.c: 1.102
sys/ufs/ufs/ufs_vnops.c: 1.239
sys/uvm/pmap/pmap.c: 1.37-1.39
sys/uvm/pmap/pmap_tlb.c: 1.22
sys/uvm/uvm_amap.c: 1.108
sys/uvm/uvm_anon.c: 1.64
sys/uvm/uvm_aobj.c: 1.126
sys/uvm/uvm_bio.c: 1.91
sys/uvm/uvm_device.c: 1.66
sys/uvm/uvm_fault.c: 1.201
sys/uvm/uvm_km.c: 1.144
sys/uvm/uvm_loan.c: 1.85
sys/uvm/uvm_map.c: 1.353
sys/uvm/uvm_page.c: 1.194
sys/uvm/uvm_pager.c: 1.111
sys/uvm/uvm_pdaemon.c: 1.109
sys/uvm/uvm_swap.c: 1.175
sys/uvm/uvm_vnode.c: 1.103
usr.bin/vmstat/vmstat.c: 1.219
Reorder to test for null before null deref in debug code
--
Reorder to test for null before null deref in debug code
--
KNF
--
No need for '\n' in UVMHIST_LOG
--
normalise a BIOHIST log message
--
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...
(As proposed on tech-kern@ with additional changes and enhancements.)
Details of changes:
* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)
* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.
* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.
* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."
* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.
* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).
* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).
* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.
* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.
[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".
[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
--
For some reason this single kernel seems to have outgrown its declared
size as a result of the kernhist(9) changes. Bump the size.
XXX The amount of increase may be excessive - anyone with more detailed
XXX knowledge please feel free to further adjust the value appropriately.
--
Missed one cast of pointer --> uintptr_t in previous kernhist(9) commit
--
And yet another one. :(
--
Use correct mark-up for NetBSD version.
--
More improvements in grammar and readability.
--
Remove a stray '"' (obvious typo) and add a couple of casts that are
probably needed.
--
And replace an instance of "%p" conversion with "%#jx"
--
Whitespace fix. Give Bl tag table a width. Fix Xr.
 1.144.4.2  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.144.4.1  10-Jun-2019  christos Sync with HEAD
 1.144.2.2  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.144.2.1  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.152.2.2  29-Feb-2020  ad Sync with head.
 1.152.2.1  25-Jan-2020  ad Sync with head.
 1.159.2.1  03-Apr-2021  thorpej Sync with HEAD.
 1.162.4.1  15-Dec-2024  martin Pull up following revision(s) (requested by chs in ticket #1027):

sys/uvm/uvm_km.c: revision 1.166

kmem: improve behavior when using all of physical memory as kmem

On systems where kmem does not need to be limited by kernel virtual
space (essentially 64-bit platforms), we currently try to size the
"kmem" space to be big enough for all of physical memory to be
allocated as kmem, which really means that we will always run short of
physical memory before we run out of kernel virtual space. However
this does not take into account that uvm_km_va_starved_p() starts
reporting that we are low on kmem virtual space when we have used 90%
of it, in an attempt to avoid kmem space becoming too fragmented,
which means on large memory systems we will still start reacting to
being short of virtual space when there is plenty of physical memory
still available. Fix this by overallocating the kmem space by a
factor of 10/9 so that we always run low on physical memory first,
as we want.
