History log of /src/sys/uvm/uvm_page.h
Revision  Date  Author  Comments
 1.109  20-Dec-2020  skrll Support __HAVE_PMAP_PV_TRACK in sys/uvm/pmap based pmaps (aka common pmap)
 1.108  20-Dec-2020  skrll Remove VM_MD_TO_PAGE that was accidentally committed in 1.106. It's going
to be readded with the code that uses it
 1.107  07-Oct-2020  chs branches: 1.107.2;
Add a new, more aggressive allocator for uvm_pglistalloc() to allocate
contiguous physical pages, and try this new allocator if the existing
one fails. The existing contig allocator only tries to allocate pages
that are already free, which works fine shortly after boot but rarely
works after the system has been up for a while. The new allocator uses
the pagedaemon to evict pages from memory in the hope that this will
free up a range of pages that satisfies the constraints of the request.
This should help with things like plugging in a USB device, which often
fails for some USB controllers because they can't get contiguous memory.
 1.106  20-Sep-2020  skrll G/C uvm_pagezerocheck
 1.105  14-Jun-2020  ad Remove PG_ZERO. It worked brilliantly on x86 machines from the mid-90s but
having spent an age experimenting with it over the last 6 months on various
machines and with different use cases it's always either break-even or a
slight net loss for me.
 1.104  24-May-2020  ad Add uvm_pagewanted_p(): return true if someone is waiting on the page and
assert caller has correct lock to observe that.
 1.103  17-May-2020  ad Start trying to reduce cache misses on vm_page during fault processing.

- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter. Mark
pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages. In
all likelihood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
don't activate centre pages unless uvmpdpol says it's useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
 1.102  17-Mar-2020  ad Tweak the March 14th change to make page waits interlocked by pg->interlock.
Remove unneeded changes and only deal with the PQ_WANTED flag, to exclude
possible bugs.
 1.101  16-Mar-2020  rin Include <sys/rwlock.h> for krwlock_t required by uvm_pagewait().
 1.100  14-Mar-2020  ad Make page waits (WANTED vs BUSY) interlocked by pg->interlock. Gets RW
locks out of the equation for sleep/wakeup, and allows observing+waiting
for busy pages when holding only a read lock. Proposed on tech-kern.
 1.99  06-Mar-2020  riastradh Include "opt_uvm_page_trkown.h" for UVM_PAGE_TRKOWN.
 1.98  23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.97  21-Jan-2020  ad uvmpdpol_pageactive(): the change to not re-activate recently activated
pages worked great with uvm_pageqlock, but it doesn't buy anything any more,
because now the busy pages are likely in a per-CPU queue somewhere waiting
to be processed, and changing the intent on those queued pages costs next
to nothing. Remove this and get back all the bits in pg->pqflags.
 1.96  15-Jan-2020  ad Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
 1.95  10-Jan-2020  ad UVM_PAGE_TREE_PENALTY isn't used any more.
 1.94  09-Jan-2020  ad Use __SHIFTIN()/__SHIFTOUT(). Suggested by riastradh@.
 1.93  31-Dec-2019  ad branches: 1.93.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
 1.92  31-Dec-2019  ad struct vm_page: cluster fields most heavily used by the page allocator and
uvmpdpol at the start of the structure, so that while under global lock we
need only touch one cache line for each vm_page. There is still the problem
of vm_page not being aligned, but this seems to drop lock wait time for
(a modified) uvmpdpol and the allocator by 20-30% in a quick test.
 1.91  31-Dec-2019  ad Rename uvm_page_locked_p() -> uvm_page_owner_locked_p()
 1.90  27-Dec-2019  ad vm_page: Now that listq is gone, give the pagedaemon its own private
TAILQ_ENTRY, so that update of page replacement state can be made
asynchronous/lazy. No functional change.
 1.89  27-Dec-2019  ad Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.
 1.88  21-Dec-2019  ad - Rename VM_PGCOLOR_BUCKET() to VM_PGCOLOR(). I want to reuse "bucket" for
something else soon and TBH it matches what this macro does better.

- Add inlines to set/get locator values in the unused lower bits of
pg->phys_addr. Begin by using it to cache the freelist index, because
computing it is expensive and that shows up during profiling. Discussed
on tech-kern.
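
The locator idea in 1.88 can be illustrated with a minimal sketch. It assumes 4 KiB pages (so the low 12 bits of a page-aligned physical address are free) and a 4-bit freelist index; every `sk_`/`SK_` name here is hypothetical and is not the uvm_page.h interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so a page-aligned address has 12 zero low bits. */
#define SK_PAGE_SHIFT    12
#define SK_PAGE_MASK     ((uint64_t)((1u << SK_PAGE_SHIFT) - 1))
/* Assumption: 4 of those bits are enough to hold a freelist index. */
#define SK_FREELIST_BITS 4
#define SK_FREELIST_MASK ((uint64_t)((1u << SK_FREELIST_BITS) - 1))

/* Hypothetical stand-in for struct vm_page. */
struct sk_page {
	uint64_t phys_addr;	/* page-aligned address plus packed locator bits */
};

/* Recover the real physical address by masking off the locator bits. */
static inline uint64_t
sk_page_to_phys(const struct sk_page *pg)
{
	return pg->phys_addr & ~SK_PAGE_MASK;
}

/* Read the cached freelist index from the low bits. */
static inline unsigned
sk_page_get_freelist(const struct sk_page *pg)
{
	return (unsigned)(pg->phys_addr & SK_FREELIST_MASK);
}

/* Store a freelist index in the low bits without disturbing the address. */
static inline void
sk_page_set_freelist(struct sk_page *pg, unsigned fl)
{
	assert(fl <= SK_FREELIST_MASK);
	pg->phys_addr = (pg->phys_addr & ~SK_FREELIST_MASK) | fl;
}
```

Caching the index this way trades a mask on every address read for not having to recompute the freelist, which is the cost the commit message says showed up in profiling.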
 1.87  15-Dec-2019  ad Merge from yamt-pagecache:

- do gang lookup of pages using radixtree.
- remove now unused uvm_object::uo_memq and vm_page::listq.queue.
 1.86  14-Dec-2019  ad Merge from yamt-pagecache: use radixtree for page lookup.

rbtree page lookup was introduced during the NetBSD 5.0 development cycle to
bypass lock contention problems with the (then) global page hash, and was a
temporary solution to allow us to make progress. radixtree is the intended
replacement.

Ok yamt@.
 1.85  13-Dec-2019  ad Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.84  07-Jan-2019  jdolecek branches: 1.84.4;
add sysctl to easily set ubc_direct

PR kern/53124
 1.83  19-May-2018  jdolecek branches: 1.83.2;
add experimental new function uvm_direct_process(), to allow read/writes
of contents of uvm pages without mapping them into kernel, using
direct map or moral equivalent; pmaps supporting the interface need
to provide pmap_direct_process() and define PMAP_DIRECT

implement the new interface for amd64; I hear alpha and mips might be relatively
easy to add too, but I lack the knowledge

part of resolution for PR kern/53124
 1.82  14-Nov-2017  mrg branches: 1.82.2;
remove duplicate prototype.
 1.81  23-Dec-2016  cherry "Make NetBSD great again!"

Introduce uvm_hotplug(9) to the kernel.

Many thanks, in no particular order to:

TNF, for funding the project.

Chuck Silvers - for multiple API reviews and feedback.
Nick Hudson - for testing on multiple architectures and bugfix patches.
Everyone who helped with boot testing.

KeK (http://www.kek.org.in) for hosting the primary developers.
 1.80  23-Mar-2015  riastradh branches: 1.80.2;
Call these `identities', not `life states'.
 1.79  21-Mar-2015  riastradh No, PQ_ANON is set only if owned by anon, not if loaned to anon.
 1.78  21-Mar-2015  riastradh Address O->A loan case in comments, pointed out by chs@.
 1.77  21-Mar-2015  riastradh Elaborate on locking scheme and vm_page states.
 1.76  25-Oct-2013  martin branches: 1.76.6;
Optimize out VM_PHYSMEM_PTR_SWAP on architectures that have VM_PHYSSEG_MAX = 1
(hard to address two different array entries there w/o invoking undefined
behaviour, and newer compilers complain about it).
 1.75  05-May-2012  rmind branches: 1.75.2; 1.75.4;
Describe PG_ flags (for struct vm_page). Reviewed by yamt@.
 1.74  28-Jan-2012  rmind Improve description on struct vm_page and explain locking a little bit more.
 1.73  12-Jun-2011  rmind branches: 1.73.2; 1.73.6;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.72  19-May-2011  yamt branches: 1.72.2;
g/c unused function prototypes
 1.71  02-Feb-2011  chuck update license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
 1.70  18-Jan-2011  matt branches: 1.70.2;
Improve the efficiency of searching for a contiguous set of free pages.
 1.69  26-Nov-2010  uebayasi branches: 1.69.2;
Put back VM_PAGE_TO_MD(); pointed out by skrll@, thanks.
 1.68  25-Nov-2010  uebayasi Revert vm_physseg allocation changes. A report says that it causes
panics when used with mplayer in heavy load.
 1.67  14-Nov-2010  uebayasi Be a little more friendly to dynamic physical segment registration.

Maintain an array of pointer to struct vm_physseg, instead of struct
array. So that VM subsystem can take its pointer safely. Pointer
to this struct will replace raw paddr_t usage in the future.

Dynamic removal is not supported yet.

Only MD data structure changes, no kernel bump needed.

Tested on i386, amd64, powerpc/ibm40x, arm11.
 1.66  12-Nov-2010  uebayasi Put VM_PAGE_TO_MD() definition in one place. No functional changes.
 1.65  12-Nov-2010  uebayasi Abstraction fix; move physical address -> per-page metadata (struct
vm_page *) "reverse" lookup code from uvm_page.h to uvm_page.c, to
help migration to not do that.

Likewise move per-page metadata (struct vm_page *) -> physical
address "forward" conversion code into *.c too. This is called
only by low-layer VM and MD code.
 1.64  12-Nov-2010  uebayasi Abstraction fix; move physical address -> physical segment "reverse"
lookup code from uvm_page.h to uvm_page.c.

This code is used by some pmaps to lookup per-page state (PV) from
per-segment metadata (struct vm_physseg). This is not needed if
UVM looks up physical segment once in fault handler, then directly
passes it to pmap. This change helps transition to that model.

The only users of vm_physseg_find() are pmap_motorola.c and
powerpc/ibm4xx/pmap.c.

Tested By: Compiling and running powerpc/ibm4xx/pmap.c
(evbppc/conf/OPENBLOCKS266)
 1.63  10-Nov-2010  uebayasi Use more VM_PHYSMEM_*() accessors. No functional changes.
 1.62  10-Nov-2010  uebayasi Prepare vm_physmem[] -> (*vm_physmem)[] migration, so that physical
segments can be changed at run-time. Pointers are easier to update.
 1.61  25-Sep-2010  matt Rename rb.h to rbtree.h, as it is more appropriate (c.f. ptree.h). Also
helps find code that hasn't been updated to use the new rbtree API.
 1.60  29-Jul-2010  hannken Add vm page flag PG_MARKER and use it to tag dummy marker pages
in genfs_do_putpages() and uao_put().
Use 'v_uobj.uo_npages' to check for an empty memq.
Put some assertions where these marker pages may not appear.

Ok: YAMAMOTO Takashi <yamt@netbsd.org>
 1.59  06-Feb-2010  uebayasi branches: 1.59.2; 1.59.4;
__inline -> inline
 1.58  06-Feb-2010  uebayasi Make vm_physseg lookup routines take the target vm_physseg. This is for the
coming "managed" device segments.
 1.57  18-Aug-2009  thorpej Add a real API for testing if a page is a managed page, and adjust callers
to stop relying on vm_physseg_find() for this purpose.
 1.56  16-Jan-2009  yamt - g/c stale function prototypes.
- rename UVM_PAGE_HASH_PENALTY to UVM_PAGE_TREE_PENALTY.
 1.55  04-Jun-2008  ad branches: 1.55.6; 1.55.14; 1.55.18;
Replace the global vm_page hash with a per vm_object rbtree.
Proposed on tech-kern@.
 1.54  04-Jun-2008  ad - vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
 1.53  02-Jun-2008  ad uvm_pageidlezero:

- Use high and low water marks to try and reduce power consumption.
- Do trylock on uvm_fpageqlock, and bail if we can't get it.
- Only run on one CPU at a time.
 1.52  27-Feb-2008  matt branches: 1.52.2; 1.52.4; 1.52.6;
Convert two inlines from old-style-definitions to ansi.
 1.51  27-Feb-2008  ad Minor corrections to comments.
 1.50  02-Jan-2008  ad branches: 1.50.2; 1.50.6;
Merge vmlocking2 to head.
 1.49  21-Jul-2007  ad branches: 1.49.6; 1.49.12; 1.49.14; 1.49.18; 1.49.22;
Merge unobtrusive locking changes from the vmlocking branch.
 1.48  14-Apr-2007  perseant branches: 1.48.2;
Track lwp as well as proc owner with UVM_PAGE_TRKOWN
 1.47  21-Feb-2007  thorpej branches: 1.47.4; 1.47.6;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.46  15-Sep-2006  yamt branches: 1.46.6;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy
 1.45  06-Apr-2006  uebayasi branches: 1.45.8;
Update comment to match reality (vm_physmemseg -> vm_physseg).
 1.44  16-Feb-2006  perry branches: 1.44.2; 1.44.4; 1.44.6;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.43  11-Feb-2006  yamt remove the following options. no objections on tech-kern@.

UVM_PAGER_INLINE
UVM_AMAP_INLINE
UVM_PAGE_INLINE
UVM_MAP_INLINE
 1.42  24-Dec-2005  perry branches: 1.42.2; 1.42.4; 1.42.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.41  29-Nov-2005  yamt read-ahead statistics.
 1.40  04-Jun-2005  chs branches: 1.40.2; 1.40.8;
adapt to const changes.
 1.39  07-Oct-2004  yamt g/c stale declarations of page queues.
 1.38  12-May-2004  yamt add assertions.
 1.37  24-Mar-2004  junyoung Nuke __P().
 1.36  10-Nov-2003  rearnsha In vm_physseg_find, use u_int for start, len and try when doing a
binary search. Avoids the need for signed division by 2. Approved
by thorpej.
 1.35  03-Nov-2003  yamt add a DEBUG check if freed PG_ZERO pages are really zero-filled.
 1.34  10-May-2003  thorpej branches: 1.34.2;
Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.
 1.33  08-May-2003  thorpej Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).
 1.32  08-Nov-2002  enami s/than than/than/.
 1.31  15-Sep-2001  chs branches: 1.31.6;
a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.30  25-Jul-2001  thorpej branches: 1.30.2;
Back out previous -- christos needs to update his lint(1).
 1.29  25-Jul-2001  christos fix non-portable bitmap warning.
 1.28  22-Jul-2001  wiz seperate -> separate
 1.27  28-Jun-2001  thorpej branches: 1.27.2;
Rather than using u_shorts, use u_ints and bitfields in the vm_page. This
provides us more flexibility with pageq-locked fields, and clarifies the
locking semantics for platforms which cannot address shorts.

From Ross Harvey.
 1.26  25-May-2001  chs remove trailing whitespace.
 1.25  16-May-2001  ross Expand on the locking notes comment with a XXX warning about u_short fields.
 1.24  02-May-2001  thorpej Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
 1.23  01-May-2001  thorpej Garbage-collect a comment that has not been applicable since Mach.
 1.22  01-May-2001  thorpej Per discussion w/ chuck and chuck, restructure the md page stuff
to use a structure called "vm_page_md", and use __HAVE_VM_PAGE_MD
and __HAVE_PMAP_PHYSSEG.
 1.21  29-Apr-2001  thorpej Add a VM_MDPAGE_MEMBERS macro that defines pmap-specific data for
each vm_page structure. Add a VM_MDPAGE_INIT() macro to init this
data when pages are initialized by UVM. These macros are mandatory,
but ports may #define them to nothing if they are not needed/used.

This deprecates struct pmap_physseg. As a transitional measure,
allow a port to #define PMAP_PHYSSEG so that it can continue to
use it until its pmap is converted to use VM_MDPAGE_MEMBERS.

Use all this stuff to eliminate a lot of extra work in the Alpha
pmap module (it's smaller and faster now). Changes to other pmap
modules will follow.
 1.20  29-Apr-2001  thorpej Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
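
A minimal sketch of round-robin bucket selection as described in 1.20 follows. It assumes a fixed upper bound on buckets and that an empty bucket is simply skipped in favour of the next one; the skipping rule and all `sk_`/`SK_` names are assumptions for illustration, not the committed code.

```c
#include <assert.h>

/* Assumption: a small compile-time upper bound on the number of buckets. */
#define SK_MAXCOLORS 8

/* Hypothetical per-system color state. */
struct sk_colorbins {
	unsigned ncolors;		/* number of buckets in use */
	unsigned next;			/* bucket to try first next time */
	unsigned nfree[SK_MAXCOLORS];	/* free pages per bucket */
};

/*
 * Take one page, starting at the bucket after the one used last and
 * walking forward until a non-empty bucket is found.
 * Returns the color used, or -1 if every bucket is empty.
 */
static int
sk_alloc_color(struct sk_colorbins *cb)
{
	assert(cb->ncolors > 0 && cb->ncolors <= SK_MAXCOLORS);
	for (unsigned i = 0; i < cb->ncolors; i++) {
		unsigned c = (cb->next + i) % cb->ncolors;

		if (cb->nfree[c] > 0) {
			cb->nfree[c]--;
			cb->next = (c + 1) % cb->ncolors;
			return (int)c;
		}
	}
	return -1;
}
```

Advancing `next` on every successful allocation is what spreads consecutive allocations across cache colors.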
 1.19  28-Dec-2000  chs branches: 1.19.2;
remove some more leftovers from Mach.
 1.18  27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.17  03-Oct-2000  mrg clean up a comment.
 1.16  27-Jun-2000  mrg more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.
 1.15  24-Apr-2000  thorpej Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
 1.14  26-Mar-2000  kleink Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.
 1.13  21-Jun-1999  thorpej branches: 1.13.2;
Protect prototypes, certain macros, and inlines from userland.
 1.12  24-May-1999  thorpej - Change uvm_{lock,unlock}_fpageq() to return/take the previous interrupt
level directly, instead of making the caller wrap the calls in
splimp()/splx().
- Add a comment documenting that interrupts that cause memory allocation
must be blocked while the free page queue is locked.

Since interrupts must be blocked while this lock is asserted, tying them
together like this helps to prevent mistakes.
 1.11  25-Mar-1999  mrg branches: 1.11.4;
remove now >1 year old pre-release message.
 1.10  13-Aug-1998  eeh Merge paddr_t changes into the main branch.
 1.9  08-Jul-1998  thorpej branches: 1.9.2;
Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
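
The three strategies described in 1.9 can be sketched as below. This is an illustration only: free lists are modelled as plain counters, index 0 is assumed to be the highest-priority list, and every `sk_`/`SK_` name is hypothetical rather than the real uvm_pagealloc_strat() interface.

```c
#include <assert.h>

/* Assumption: three free lists, index 0 being the highest priority. */
#define SK_NFREELIST 3

enum sk_strat { SK_STRAT_NORMAL, SK_STRAT_ONLY, SK_STRAT_FALLBACK };

/* Take one page from list fl; return fl on success, -1 if it is empty. */
static int
sk_take_from(int nfree[], int fl)
{
	if (nfree[fl] > 0) {
		nfree[fl]--;
		return fl;
	}
	return -1;
}

/* normal: walk the lists from high to low priority, take the first hit. */
static int
sk_alloc_normal(int nfree[])
{
	for (int fl = 0; fl < SK_NFREELIST; fl++) {
		if (sk_take_from(nfree, fl) >= 0)
			return fl;
	}
	return -1;
}

/*
 * Returns the index of the list the page came from, or -1 on failure.
 * free_list is ignored for the normal strategy.
 */
static int
sk_pagealloc_strat(int nfree[], enum sk_strat strat, int free_list)
{
	int fl;

	switch (strat) {
	case SK_STRAT_NORMAL:
		return sk_alloc_normal(nfree);
	case SK_STRAT_ONLY:
	case SK_STRAT_FALLBACK:
		assert(free_list >= 0 && free_list < SK_NFREELIST);
		fl = sk_take_from(nfree, free_list);
		/* only: report failure; fallback: retry as normal. */
		if (fl >= 0 || strat == SK_STRAT_ONLY)
			return fl;
		return sk_alloc_normal(nfree);
	}
	return -1;
}
```

With list 0 empty and list 1 holding pages, an `only` request for list 0 fails, while a `fallback` request for list 0 is satisfied from list 1.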
 1.8  28-May-1998  chuck unstatic uvm_page_physload so pmap modules can use it too.
as requested by Eduardo E. Horvath
 1.7  22-Mar-1998  chuck remove tmpwire arg from uvm_pagewire() -- it isn't needed anymore.
noted by chuck s.
 1.6  09-Mar-1998  mrg KNF.
 1.5  10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.4  10-Feb-1998  perry add/cleanup multiple inclusion protection.
 1.3  07-Feb-1998  mrg restore rcsids
 1.2  06-Feb-1998  thorpej RCS ID police.
 1.1  05-Feb-1998  mrg branches: 1.1.1;
Initial revision
 1.1.1.1  05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the UVM kernel code portion.


this will be KNF'd shortly. :-)
 1.9.2.1  30-Jul-1998  eeh Split vm_offset_t and vm_size_t into paddr_t, psize_t, vaddr_t, and vsize_t.
 1.11.4.4  09-Aug-1999  chs create a new type "voff_t" for uvm_object offsets
and define it to be "off_t". also, remove pgo_asyncget().
 1.11.4.3  31-Jul-1999  chs add uvm_page_unbusy() to simplify dropping PG_BUSY.
 1.11.4.2  01-Jul-1999  thorpej Sync w/ -current.
 1.11.4.1  21-Jun-1999  thorpej Sync w/ -current.
 1.13.2.3  05-Jan-2001  bouyer Sync with HEAD
 1.13.2.2  08-Dec-2000  bouyer Sync with HEAD.
 1.13.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.19.2.4  11-Nov-2002  nathanw Catch up to -current
 1.19.2.3  21-Sep-2001  nathanw Catch up to -current.
 1.19.2.2  24-Aug-2001  nathanw Catch up with -current.
 1.19.2.1  21-Jun-2001  nathanw Catch up to -current.
 1.27.2.2  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.27.2.1  03-Aug-2001  lukem update to -current
 1.30.2.1  01-Oct-2001  fvdl Catch up with -current.
 1.31.6.2  12-Mar-2002  thorpej Make pageqlock an adaptive mutex, and rename it to pageq_mutex.
 1.31.6.1  12-Mar-2002  thorpej Convert the fpageqlock to a spin mutex at IPL_VM and rename it
to fpageq_mutex.
 1.34.2.6  11-Dec-2005  christos Sync with head.
 1.34.2.5  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.34.2.4  19-Oct-2004  skrll Sync with HEAD
 1.34.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.34.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.34.2.1  03-Aug-2004  skrll Sync with HEAD
 1.40.8.1  29-Nov-2005  yamt sync with head.
 1.40.2.6  17-Mar-2008  yamt sync with head.
 1.40.2.5  21-Jan-2008  yamt sync with head
 1.40.2.4  03-Sep-2007  yamt sync with head.
 1.40.2.3  26-Feb-2007  yamt sync with head.
 1.40.2.2  30-Dec-2006  yamt sync with head.
 1.40.2.1  21-Jun-2006  yamt sync with head.
 1.42.6.1  22-Apr-2006  simonb Sync with head.
 1.42.4.1  09-Sep-2006  rpaulo sync with head
 1.42.2.1  18-Feb-2006  yamt sync with head.
 1.44.6.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.44.4.1  19-Apr-2006  elad oops - *really* sync to head this time.
 1.44.2.3  11-Apr-2006  yamt sync with head
 1.44.2.2  12-Mar-2006  yamt - change the way to account read-ahead stats.
- fix UVM_PQFLAGBITS.
 1.44.2.1  05-Mar-2006  yamt separate page replacement policy from the rest of kernel.
 1.45.8.1  18-Nov-2006  ad Sync with head.
 1.46.6.2  15-Apr-2007  yamt sync with head.
 1.46.6.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.47.6.1  11-Jul-2007  mjf Sync with head.
 1.47.4.2  08-Jun-2007  ad Sync with head.
 1.47.4.1  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.48.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.49.22.2  21-Jul-2007  ad Merge unobtrusive locking changes from the vmlocking branch.
 1.49.22.1  21-Jul-2007  ad file uvm_page.h was added on branch matt-mips64 on 2007-07-21 19:21:56 +0000
 1.49.18.1  02-Jan-2008  bouyer Sync with HEAD
 1.49.14.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.49.12.1  18-Feb-2008  mjf Sync with HEAD.
 1.49.6.2  23-Mar-2008  matt sync with HEAD
 1.49.6.1  09-Jan-2008  matt sync with HEAD
 1.50.6.4  17-Jan-2009  mjf Sync with HEAD.
 1.50.6.3  05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.50.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.50.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.50.2.1  24-Mar-2008  keiichi sync with head.
 1.52.6.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.52.4.5  09-Oct-2010  yamt sync with head
 1.52.4.4  11-Aug-2010  yamt sync with head.
 1.52.4.3  11-Mar-2010  yamt sync with head
 1.52.4.2  19-Aug-2009  yamt sync with head.
 1.52.4.1  04-May-2009  yamt sync with head.
 1.52.2.2  17-Jun-2008  yamt sync with head.
 1.52.2.1  04-Jun-2008  yamt sync with head
 1.55.18.1  14-Oct-2011  matt Add VM_PHYSMEM_PTR and VM_PAGE_TO_MD macros from -current.
 1.55.14.9  29-Feb-2012  matt Improve UVM_PAGE_TRKOWN.
Add more asserts to uvm_page.
 1.55.14.8  16-Feb-2012  matt Track the victims selected by the pagedaemon and what happens to them.
Keep a hint for what page group has the most free pages for a given color.
 1.55.14.7  13-Feb-2012  matt Use separate pending and paging tailq entries.
Add a queue check routine to validate the queues aren't corrupt.
 1.55.14.6  09-Feb-2012  matt Major changes to uvm.
Support multiple collections (groups) of free pages and run the page
reclamation algorithm on each group independently.
 1.55.14.5  03-Jun-2011  matt Restore $NetBSD$
 1.55.14.4  03-Jun-2011  matt Rework page free lists to be sorted by color first rather than free_list.
Keep a per-color PGFL_* counter in each page free list.
Minor cleanups.
 1.55.14.3  25-May-2011  matt Make uvm_map recognize UVM_FLAG_COLORMATCH which tells uvm_map that the
'align' argument specifies the starting color of the KVA range to be returned.

When calling uvm_km_alloc with UVM_KMF_VAONLY, also specify the starting
color of the kva range returned (UVM_KMF_COLORMATCH) and pass those to
uvm_map.

In uvm_pglistalloc, make sure the pages being returned have sequentially
advancing colors (so they can be mapped in a contiguous address range).
Add a few missing UVM_FLAG_COLORMATCH flags to uvm_pagealloc calls.

Make the socket and pipe loan color-safe.

Make the mips pmap enforce strict page color (color(VA) == color(PA)).
 1.55.14.2  29-Apr-2011  matt Add macros from current (VM_PAGE_TO_MD, VM_PHYSMEM_PTR, VM_PHYSMEM_PTR_SWAP)
 1.55.14.1  23-Jan-2010  matt Add a start_hint to vm_physseg so when allocating pages, we can skip
forward over pages that are probably still allocated.
 1.55.6.1  19-Jan-2009  skrll Sync with HEAD.
 1.59.4.3  31-May-2011  rmind sync with head
 1.59.4.2  05-Mar-2011  rmind sync with head
 1.59.4.1  17-Mar-2010  rmind Reorganise UVM locking to protect P->V state and serialise pmap(9)
operations on the same page(s) by always locking their owner. Hence
lock order: "vmpage"-lock -> pmap-lock.

Patch, proposed on tech-kern@, from Andrew Doran.
 1.59.2.37  21-Nov-2010  uebayasi Rename PGO_ZERO as PGO_HOLE, and s/uvm_page_zeropage/uvm_page_holepage/.
 1.59.2.36  15-Nov-2010  uebayasi Move zero-page into a common place, in the hope that it's shared
for other purposes.

According to Chuck Silvers, zero-page mappings don't need to be
explicitly unmapped in putpages(). Follow that advice.
 1.59.2.35  12-Nov-2010  uebayasi Move MD member in struct vm_physseg to the tail, in case this struct
can be shared among architectures with only difference of the MD
part.
 1.59.2.34  10-Nov-2010  uebayasi Fix thinko; make vm_physseg ptr swap really work.
 1.59.2.33  04-Nov-2010  uebayasi Split physical device segment pages from "managed" to "managed
device". Cache that information as a flag PG_DEVICE so that callers
don't need to walk physsegs every time.

Remove PQ_FIXED, which means that page daemon doesn't need to know
device segment pages at all. But still fault handlers need to know
them.

I think this is what I can do best now.
 1.59.2.32  27-Oct-2010  uebayasi Unconditionally provide device page segment data structures and
functions as suggested by Chuck Silvers.

(Memory and device segments are being merged soon.)
 1.59.2.31  22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.59.2.30  17-Aug-2010  uebayasi Sync with HEAD.
 1.59.2.29  17-Aug-2010  uebayasi Collect garbage.
 1.59.2.28  11-Aug-2010  uebayasi s/vm_physseg_find_direct/vm_physseg_find_device/
 1.59.2.27  22-Jul-2010  uebayasi s/PG_XIP/PQ_FIXED/, meaning that the fault handler sees XIP pages as
"fixed", and doesn't pass them to paging activity.

("XIP" is a vnode specific knowledge. It was wrong that the fault
handler had to know such a special thing.)
 1.59.2.26  15-Jul-2010  uebayasi Rename PG_DIRECT to PG_XIP. PG_XIP is marked to XIP vnode pages.
 1.59.2.25  08-Jul-2010  uebayasi One more missing s/DIRECT_PAGE/XIP/.
 1.59.2.24  08-Jul-2010  uebayasi Whitespace.
 1.59.2.23  07-Jul-2010  uebayasi To simplify things, revert global vm_page_md hash and allocate struct
vm_page [] for XIP physical segments.
 1.59.2.22  31-May-2010  uebayasi Redefine "device page": device pages are pages of
device memory. Pages which don't have vm_page (== can't be used for
generic use), but whose PV are tracked, are called "direct pages" from
now.
 1.59.2.21  31-May-2010  uebayasi Revert partial "phys_addr" removal code. This change is independent of
XIP, and will be done later.
 1.59.2.20  29-Apr-2010  uebayasi "int free_list" (VM_FREELIST_*) is specific to struct vm_page (memory
page). Handle it only in memory physseg parts.

Record device page's properties in struct vm_physseg for future uses.
For example, a framebuffer that is capable of some accelerated bus access
(e.g. write-combining) should register its capability through "int
flags".
 1.59.2.19  28-Apr-2010  uebayasi Manage struct vm_physseg as a list, which means that struct vm_physseg
objects don't move when a segment is added / removed.
 1.59.2.18  28-Apr-2010  uebayasi Always use struct vm_physseg *vm_physmem_ptrs[] in MD code.
 1.59.2.17  27-Apr-2010  uebayasi Maintain not only arrays of struct vm_physseg, but also arrays of pointers
to struct vm_physseg. This is needed:

- to make the array change dynamically (unload), and

- to make the struct vm_physseg * object to be passed to device drivers as
a cookie of a managed physical segment.
 1.59.2.16  27-Apr-2010  uebayasi Sort.
 1.59.2.15  23-Feb-2010  uebayasi Put back vm_page::phys_addr for now, because removing it involves some random
parts in the tree. I'll revisit this after merging the branch.
 1.59.2.14  23-Feb-2010  uebayasi Make struct vm_page * -> struct vm_page_md * lookup a real function and
hide its internals. Won't cause much performance loss because results are
usually cached by callers.
 1.59.2.13  23-Feb-2010  uebayasi Introduce uvm_page_physload_device(). This registers a physical address
range of a device, similar to uvm_page_physload() for memories. For now,
this is supposed to be called by MD code. We have to consider the design
when we'll manage mmap'able character devices.

Expose paddr_t -> struct vm_page * conversion function for device pages,
uvm_phys_to_vm_page_device(). This will be called by the XIP vnode pager,
because it knows whether a given vnode is a device page (and its physical
address base). Don't look up device segments, but directly make a
cookie.
 1.59.2.12  12-Feb-2010  uebayasi Typo.
 1.59.2.11  12-Feb-2010  uebayasi Enable the newly added VM_PAGE_TO_MD() only #ifdef __HAVE_VM_PAGE_MD.
Pointed out by mrg@.
 1.59.2.10  10-Feb-2010  uebayasi Fix previous again & use VM_PAGE_TO_MD() where appropriate.
 1.59.2.9  10-Feb-2010  uebayasi Oops fix a typo. (My lapdog's k/b is dying.)
 1.59.2.8  10-Feb-2010  uebayasi Introduce VM_PAGE_TO_MD(); lookup vm_page_md from a given vm_page.
 1.59.2.7  10-Feb-2010  uebayasi Initial MD per-page data (struct vm_page_md) lookup code for XIP'able device
pages. Compile tested only.

Always define uvm_pageisdevice_p(). Always false if kernel is !DEVICE_PAGE.
 1.59.2.6  09-Feb-2010  uebayasi Implement device page struct vm_page * handling.
 1.59.2.5  09-Feb-2010  uebayasi Define vm_physdev / vm_nphysdev, physical address segment data for managed
device pages.
 1.59.2.4  09-Feb-2010  uebayasi vm_nphysseg -> vm_nphysmem
 1.59.2.3  09-Feb-2010  uebayasi Kill vm_page::phys_addr.
 1.59.2.2  08-Feb-2010  uebayasi Make vm_physseg lookup into a real function.
 1.59.2.1  08-Feb-2010  uebayasi Make vm_physseg::lastpg exclusive end.
 1.69.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.70.2.1  08-Feb-2011  bouyer Sync with HEAD
 1.72.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.73.6.2  02-Jun-2012  mrg sync to latest -current.
 1.73.6.1  18-Feb-2012  mrg merge to -current.
 1.73.2.12  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.73.2.11  23-May-2012  yamt sync with head.
 1.73.2.10  17-Apr-2012  yamt sync with head
 1.73.2.9  17-Feb-2012  yamt byebye PG_HOLE as it turned out to be unnecessary.
 1.73.2.8  30-Nov-2011  yamt make lfs another pager specific flag so that it won't be affected by
an nfs hack in genfs.
 1.73.2.7  20-Nov-2011  yamt - fix page loaning XXX make O->A loaning further
- add some statistics
 1.73.2.6  18-Nov-2011  yamt - use mutex obj for pageable object
- add a function to wait for a mutex obj being available
- replace some "livelock" kpauses with it
 1.73.2.5  14-Nov-2011  yamt remove now unused UVM_PAGE_TREE_PENALTY
 1.73.2.4  13-Nov-2011  yamt cache UVM_OBJ_IS_VNODE in pqflags
 1.73.2.3  11-Nov-2011  yamt - track the number of clean/dirty/unknown pages in the system.
- g/c PG_MARKER
 1.73.2.2  06-Nov-2011  yamt remove pg->listq and uobj->memq
 1.73.2.1  02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.75.4.1  18-May-2014  rmind sync with head
 1.75.2.2  03-Dec-2017  jdolecek update from HEAD
 1.75.2.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.76.6.2  05-Feb-2017  skrll Sync with HEAD
 1.76.6.1  06-Apr-2015  skrll Sync with HEAD
 1.80.2.1  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.82.2.2  18-Jan-2019  pgoyette Synch with HEAD
 1.82.2.1  21-May-2018  pgoyette Sync with HEAD
 1.83.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.83.2.1  10-Jun-2019  christos Sync with HEAD
 1.84.4.1  13-May-2020  martin Pull up following revision(s) (requested by chs in ticket #906):

sys/uvm/uvm_page.h: revision 1.99

Include "opt_uvm_page_trkown.h" for UVM_PAGE_TRKOWN.
 1.93.2.3  29-Feb-2020  ad Sync with head.
 1.93.2.2  25-Jan-2020  ad Sync with head.
 1.93.2.1  17-Jan-2020  ad Sync with head.
 1.107.2.1  03-Jan-2021  thorpej Sync w/ HEAD.
