History log of /src/sys/arch/amd64/include
Revision  Date  Author  Comments
 1.25 30-Nov-2024  christos Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.24 04-Nov-2024  christos Undo previous lwp.h change.
 1.23 03-Nov-2024  christos Split __lwp_getprivate_fast and __lwp_*tcb from mcontext.h into a separate
lwp.h file.
 1.22 31-Oct-2018  maxv branches: 1.22.36;
Revert my kasan addition in this makefile, it looks like it causes
asan.h to be installed, while we don't want it to be.
 1.21 31-Oct-2018  maxv Move the MI parts of KASAN into kern/subr_asan.c. This file includes
machine/asan.h, which contains the MD functions. We use an include rather
than a plain C file, because we want GCC to optimize/inline some functions
into one single block.

The amd64 MD parts of KASAN are moved accordingly.

The naming convention we use is:

kasan_*
a generic kasan object, declared in subr_asan.c
kasan_md_*
an MD kasan object, declared in machine/asan.h, and used
in subr_asan.c
__md_*
an MD object, declared in machine/asan.h, and not used
outside

Overall this makes it easier to add KASAN support on more architectures.

Discussed with several people.
 1.20 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.19 27-Feb-2016  tls branches: 1.19.16; 1.19.18;
Add cpu_rng, a framework for simple on-CPU random number generators.
 1.18 23-Jul-2014  alnsn branches: 1.18.4;
Rename sljitarch.h to sljit_machdep.h.
 1.17 18-Feb-2014  dsl branches: 1.17.2;
Copy fpu.h to release
 1.16 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.15 05-Nov-2012  alnsn Build sljit test when MKSLJIT != no and set MKSLJIT to yes on amd64 and i386.
 1.14 08-Aug-2012  drochner branches: 1.14.2;
on x86, <machine/cpufunc.h> only pulls in <x86/cpufunc.h>. The latter
is not installed to userland and no one missed it, so the former ones
can not be useful either. Don't install them.
 1.13 17-Jul-2011  joerg branches: 1.13.2;
Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
 1.12 17-Jul-2011  dyoung On amd64, good-bye <machine/bus.h>.

Up next: update set lists.
 1.11 31-Jul-2010  joerg Add support for fenv.h interface for i386 and amd64.

Submitted by Stathis Kamperis as part of GSoC 2010 and ported from
FreeBSD.
 1.10 04-Jan-2008  dsl branches: 1.10.10; 1.10.24; 1.10.30; 1.10.32;
Change the way that the trap/intr/syscall frames and the __gregset_t[]
indexes are defined so that only a single list of the registers is used.
The code no longer relies on the two structures matching.
There should be no binary change.
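
The "single list of the registers" idea above can be sketched with an X-macro. This is only an illustration of the technique, not the actual header: the register subset and the names DEMO_REG_LIST, struct demo_frame, DEMO_REG_* and demo_frame_to_gregs are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single list of registers: (index name, field name). */
#define DEMO_REG_LIST(X) \
	X(RDI, rdi)      \
	X(RSI, rsi)      \
	X(RDX, rdx)      \
	X(RCX, rcx)

/* Expand the list once into gregset-style array indexes ... */
#define DEMO_ENUM(idx, field) DEMO_REG_##idx,
enum { DEMO_REG_LIST(DEMO_ENUM) DEMO_NGREG };

/* ... and once into the frame layout, so the two cannot drift apart. */
#define DEMO_FIELD(idx, field) uint64_t tf_##field;
struct demo_frame { DEMO_REG_LIST(DEMO_FIELD) };

static_assert(DEMO_NGREG == 4, "one index per listed register");

/* Copy by name, so the code does not rely on the two layouts matching. */
#define DEMO_COPY(idx, field) gregs[DEMO_REG_##idx] = tf->tf_##field;
static void
demo_frame_to_gregs(const struct demo_frame *tf, uint64_t gregs[DEMO_NGREG])
{
	DEMO_REG_LIST(DEMO_COPY)
}
```

Adding a register to the list updates the indexes, the structure and the copy routine together.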
 1.9 20-Dec-2007  ad - Make __cpu_simple_lock and similar real functions and patch at runtime.
- Remove old x86 atomic ops.
- Drop text alignment back to 16 on i386 (really, this time).
- Minor cleanup.
 1.8 09-Feb-2007  ad branches: 1.8.12; 1.8.24; 1.8.30; 1.8.32; 1.8.36;
Merge newlock2 to head.
 1.7 09-Feb-2006  manu branches: 1.7.14;
Add initial (but unfinished) COMPAT_LINUX32 for amd64. This is good enough so
that the i386 license manager part of amd64 version of Fluent works.

While I'm here, add SysV IPC to COMPAT_LINUX/amd64
 1.6 04-Feb-2006  jmmv Revert yesterday's change that attempted to fix the detection of the
boot device when using a Multiboot boot loader. It couldn't work because
these boot loaders do not pass a checksum of the disk so matchbiosdisk()
cannot really find any matches. I should have gone to sleep before
committing...

Found by xtraeme@.
 1.5 03-Feb-2006  jmmv branches: 1.5.2;
When booting an i386 kernel with Multiboot, properly detect the boot device
by looking it up in the x86_alldisks table (instead of trying to match it
to 'wd*' manually).

In order to do this, move the cpu_rootconf function from x86 common code
to amd64 and i386 specific one. This way, i386 can do an extra step (call
the appropriate Multiboot code) in the appropriate place (after
x86_matchbiosdisks and before findroot()).
 1.4 11-Dec-2005  christos branches: 1.4.2; 1.4.4;
merge ktrace-lwp.
 1.3 02-Jul-2004  drochner branches: 1.3.12;
add a <machine/joystick.h> which just includes the new common one
 1.2 08-May-2004  kleink Factor out W{CHAR,INT}_{MAX,MIN} into their own header file.
 1.1 26-Apr-2003  fvdl branches: 1.1.2; 1.1.4;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.4.1 05-Jul-2004  he Pull up revision 1.3 (requested by drochner in ticket #605):
Add a <machine/joystick.h> which here is just a copy
of the i386 one.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.12.3 21-Jan-2008  yamt sync with head
 1.3.12.2 26-Feb-2007  yamt sync with head.
 1.3.12.1 21-Jun-2006  yamt sync with head.
 1.4.4.1 09-Sep-2006  rpaulo sync with head
 1.4.2.1 18-Feb-2006  yamt sync with head.
 1.5.2.1 22-Apr-2006  simonb Sync with head.
 1.7.14.1 24-Oct-2006  ad Compile fixes
 1.8.36.2 08-Jan-2008  bouyer Sync with HEAD
 1.8.36.1 02-Jan-2008  bouyer Sync with HEAD
 1.8.32.1 26-Dec-2007  ad Sync with head.
 1.8.30.1 18-Feb-2008  mjf Sync with HEAD.
 1.8.24.1 09-Jan-2008  matt sync with HEAD
 1.8.12.1 18-Apr-2007  thorpej Convert i386 and amd64 to the new atomic ops API.
 1.10.32.1 05-Mar-2011  rmind sync with head
 1.10.30.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.10.24.2 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.10.24.1 24-Oct-2010  jym Sync with HEAD
 1.10.10.1 11-Aug-2010  yamt sync with head.
 1.13.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.13.2.1 30-Oct-2012  yamt sync with head
 1.14.2.3 03-Dec-2017  jdolecek update from HEAD
 1.14.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.14.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.17.2.1 10-Aug-2014  tls Rebase.
 1.18.4.1 19-Mar-2016  skrll Sync with HEAD
 1.19.18.1 10-Jun-2019  christos Sync with HEAD
 1.19.16.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.19.16.1 28-Jul-2018  pgoyette Sync with HEAD
 1.22.36.1 02-Aug-2025  perseant Sync with HEAD
 1.5 19-Oct-2014  joerg Disable SSE and AVX for kernel modules too.
 1.4 11-Jun-2012  chs branches: 1.4.2; 1.4.14;
make dtrace work on amd64.
 1.3 27-Nov-2009  pooka branches: 1.3.4; 1.3.8; 1.3.10; 1.3.18; 1.3.24;
Move -mcmodel=kernel CFLAGS from bsd.klinks.mk to amd64/include/Makefile.inc
to avoid having the kernel toolchain flags split over a billion different
files.
 1.2 25-Nov-2009  tron Enable SSP (Stack Smash Protection) in x86 kernels by default (except
in i386 *TINY kernels). The NetBSD/i386 "ALL" kernel is unconditionally
compiled with SSP enabled.

Change approved by the core team.
 1.1 11-Nov-2009  haad branches: 1.1.2; 1.1.4;
Build kernel modules with -mno-red-zone like the kernel is built. This fixes
frequent panics in amd64 zfs module. This should also fix problem reported
by Nicolas Joly in:

http://mail-index.netbsd.org/port-amd64/2008/12/09/msg000646.html

Thanks to cube@ for his help with this.
 1.1.4.2 13-Nov-2009  sborrill Pull up the following revision(s) (requested by cube in ticket #1140):
sys/arch/amd64/include/Makefile.inc: revision 1.1

Build kernel modules with -mno-red-zone like the kernel is built. This fixes
frequent panics in amd64 zfs module, plus other reported problems.
 1.1.4.1 11-Nov-2009  sborrill file Makefile.inc was added on branch netbsd-5-0 on 2009-11-13 20:42:49 +0000
 1.1.2.2 13-Nov-2009  sborrill Pull up the following revision(s) (requested by cube in ticket #1140):
sys/arch/amd64/include/Makefile.inc: revision 1.1

Build kernel modules with -mno-red-zone like the kernel is built. This fixes
frequent panics in amd64 zfs module, plus other reported problems.
 1.1.2.1 11-Nov-2009  sborrill file Makefile.inc was added on branch netbsd-5 on 2009-11-13 20:38:48 +0000
 1.3.24.2 10-Nov-2014  msaitoh Pull up following revision(s) (requested by joerg in ticket #1172):
sys/arch/amd64/include/Makefile.inc: revision 1.5
sys/arch/i386/include/Makefile.inc: revision 1.3 via patch
Disable SSE and AVX for kernel modules too.
 1.3.24.1 22-Nov-2012  riz Pull up following revision(s) (requested by chs in ticket #690):
external/cddl/osnet/dev/dtrace/amd64/dtrace_isa.c: revision 1.4
sys/arch/amd64/include/Makefile.inc: revision 1.4
sys/arch/amd64/include/pmap.h: revision 1.33
external/cddl/osnet/dev/dtrace/amd64/dtrace_subr.c: revision 1.6
sys/arch/amd64/include/asm.h: revision 1.15
sys/arch/amd64/amd64/genassym.cf: revision 1.51
external/cddl/osnet/dev/dtrace/amd64/dtrace_asm.S: revision 1.4
make dtrace work on amd64.
allow more space for modules.
 1.3.18.1 30-Oct-2012  yamt sync with head
 1.3.10.2 24-Oct-2010  jym Sync with HEAD
 1.3.10.1 27-Nov-2009  jym file Makefile.inc was added on branch jym-xensuspend on 2010-10-24 22:47:52 +0000
 1.3.8.2 21-Apr-2010  matt sync to netbsd-5
 1.3.8.1 27-Nov-2009  matt file Makefile.inc was added on branch matt-nb5-mips64 on 2010-04-21 00:33:53 +0000
 1.3.4.2 11-Mar-2010  yamt sync with head
 1.3.4.1 27-Nov-2009  yamt file Makefile.inc was added on branch yamt-nfs-mp on 2010-03-11 15:01:59 +0000
 1.4.14.1 19-Oct-2014  martin Pull up following revision(s) (requested by joerg in ticket #152):
sys/arch/amd64/include/Makefile.inc: revision 1.5
sys/arch/i386/include/Makefile.inc: revision 1.3
Disable SSE and AVX for kernel modules too.
 1.4.2.1 03-Dec-2017  jdolecek update from HEAD
 1.6 24-May-2008  jmcneill MI implementation of AcpiAcquireGlobalLock and AcpiReleaseGlobalLock.
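
For readers unfamiliar with the Global Lock protocol, a portable C version can be sketched roughly as follows. This is a minimal illustration using C11 atomics and the pending/owned bit layout from the ACPI specification; it is not the code from this commit, and the demo_* names and DEMO_GL_* constants are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_GL_PENDING 0x1u /* bit 0: another party is waiting */
#define DEMO_GL_OWNED   0x2u /* bit 1: the lock is currently held */

static_assert((DEMO_GL_PENDING & DEMO_GL_OWNED) == 0, "flag bits are distinct");

/* Try to take the lock; returns true if acquired, false if the caller must wait. */
static bool
demo_acquire_global_lock(_Atomic uint32_t *lock)
{
	uint32_t old = atomic_load(lock), next;

	do {
		/* Always mark owned; mark pending only if it was already owned. */
		next = (old & ~DEMO_GL_PENDING) | DEMO_GL_OWNED;
		if (old & DEMO_GL_OWNED)
			next |= DEMO_GL_PENDING;
	} while (!atomic_compare_exchange_weak(lock, &old, next));

	return (next & DEMO_GL_PENDING) == 0;
}

/* Drop the lock; returns true if a waiter has to be signalled. */
static bool
demo_release_global_lock(_Atomic uint32_t *lock)
{
	uint32_t old = atomic_load(lock), next;

	do {
		next = old & ~(DEMO_GL_PENDING | DEMO_GL_OWNED);
	} while (!atomic_compare_exchange_weak(lock, &old, next));

	return (old & DEMO_GL_PENDING) != 0;
}
```

The compare-and-swap loop is what replaces the hand-written assembly macros in a sketch like this; the retry happens only when the lock word changed between the load and the update.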
 1.5 11-Apr-2008  jmcneill branches: 1.5.2; 1.5.4; 1.5.6;
Revert previous.
 1.4 11-Apr-2008  jmcneill Remove code erroneously copied from i386 ACPI_ACQUIRE_GLOBAL_LOCK.
 1.3 18-Feb-2008  scw branches: 1.3.6;
Re-sync the (broken) amd64 ACPI Global Lock macros with their i386
counterparts.

As discussed on port-amd64, these will be replaced by portable C code
in the near future.
 1.2 24-Dec-2005  perry branches: 1.2.24; 1.2.40; 1.2.50; 1.2.56;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.1 11-May-2003  fvdl branches: 1.1.18;
ACPI support. Wakeup code still to be done.
 1.1.18.2 27-Feb-2008  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.2.56.1 18-Feb-2008  mjf Sync with HEAD.
 1.2.50.1 23-Mar-2008  matt sync with HEAD
 1.2.40.1 03-Jun-2008  skrll Sync with netbsd-4.
 1.2.24.1 22-Feb-2008  bouyer Pull up following revision(s) (requested by scw in ticket #1077):
sys/arch/amd64/include/acpi_func.h: revision 1.3
Re-sync the (broken) amd64 ACPI Global Lock macros with their i386
counterparts.
 1.3.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.6.2 23-Jun-2008  wrstuden Remove files removed on branch. Updating using patch has its
drawbacks. :-)
 1.5.6.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.5.4.1 04-May-2009  yamt sync with head.
 1.5.2.1 04-Jun-2008  yamt sync with head
 1.1 11-May-2003  fvdl ACPI support. Wakeup code still to be done.
 1.11 07-May-2019  kamil Switch all users (except ia64) of custom machine/ansi.h to common_ansi.h

Deduplicate the code among ports and pull definitions of types
directly from a compiler.

This fixes miscompilation of certain programs that instruct compilers
to generate code for different types. This bug has been detected with
-fshort-wchar in EFI firmware.

Proposed and discussed on a mailing list (twice).

Itanium uses custom !ELF fallback switch, temporarily leave it as it is.
 1.10 17-Jul-2011  joerg branches: 1.10.54;
Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
 1.9 27-Mar-2010  tnozaki 1. {wctype,wctrans,mbstate}_t: switch MD to MI like other
libc implementation (such as *BSD and glibc2).

2. don't typedef void * wc{type,trans}_t, suggested by soda@-san.
it may pass through compiler type check, it's harmful.
so i introduce dummy struct __tag_wc{type,trans}_t(iconv_t already does).

no ABI change was made.
 1.8 11-Jan-2009  christos branches: 1.8.2; 1.8.4; 1.8.6;
merge christos-time_t
 1.7 26-Oct-2008  mrg branches: 1.7.2;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 17-Oct-2007  garbled branches: 1.6.16; 1.6.18; 1.6.22; 1.6.28;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.5 03-Sep-2007  drochner clean up some definitions around rune_t which are not needed anymore
 1.4 04-Oct-2006  tnozaki branches: 1.4.8; 1.4.16; 1.4.22; 1.4.26; 1.4.28;
fix gcc -Werror -Wmissing-braces problem
mbstate_t(this is opaque object)'s initializer should be ``{ 0 }'',
so changed 1st field of union from character array to integer.
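
The effect of that change can be sketched as below. The type is a hypothetical stand-in modelled on the description in this entry (integer member first, opaque byte storage second); the names demo_mbstate_t and demo_state_initial and the 128-byte size are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opaque state object. */
typedef union {
	int64_t demo_align;      /* scalar first: `{ 0 }` initializes this member */
	char    demo_bytes[128]; /* opaque storage */
} demo_mbstate_t;

static_assert(sizeof(demo_mbstate_t) == 128, "storage size is unchanged");

/*
 * With the char array as the first member, `{ 0 }` initializes an aggregate
 * without inner braces, which is what -Wmissing-braces complained about
 * according to the entry above. With a scalar first, `{ 0 }` is exact.
 */
static demo_mbstate_t
demo_state_initial(void)
{
	demo_mbstate_t st = { 0 };
	return st;
}
```

Only the order of the union members changes in this sketch, so size and alignment stay the same.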
 1.3 11-Dec-2005  christos branches: 1.3.20; 1.3.22;
merge ktrace-lwp.
 1.2 07-Aug-2003  agc branches: 1.2.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.16.2 27-Oct-2007  yamt sync with head.
 1.2.16.1 30-Dec-2006  yamt sync with head.
 1.3.22.1 22-Oct-2006  yamt sync with head
 1.3.20.1 18-Nov-2006  ad Sync with head.
 1.4.28.1 06-Nov-2007  matt sync with HEAD
 1.4.26.1 02-Oct-2007  joerg Sync with HEAD.
 1.4.22.1 10-Sep-2007  skrll Sync with HEAD.
 1.4.16.1 03-Oct-2007  garbled Sync with HEAD
 1.4.8.1 09-Oct-2007  ad Sync with head.
 1.6.28.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.22.2 11-Aug-2010  yamt sync with head.
 1.6.22.1 04-May-2009  yamt sync with head.
 1.6.18.5 04-Jan-2009  christos oops, missed ptrdiff_t.
 1.6.18.4 04-Jan-2009  christos revert clock_t and size_t changes.
 1.6.18.3 01-Nov-2008  christos Sync with head.
 1.6.18.2 30-Mar-2008  christos time_t is now __int64_t
 1.6.18.1 29-Mar-2008  christos Welcome to the time_t=long long dev_t=uint64_t branch.
 1.6.16.1 17-Jan-2009  mjf Sync with HEAD.
 1.7.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.8.6.1 30-May-2010  rmind sync with head
 1.8.4.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.8.2.2 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.8.2.1 24-Oct-2010  jym Sync with HEAD
 1.10.54.1 10-Jun-2019  christos Sync with HEAD
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.12 13-Sep-2022  riastradh amd64/asan.h, amd64/msan.h: Add include guards.
 1.11 22-Aug-2022  hannken Sprinkle "#include <machine/pmap_private.h>", kernel ALL/amd64
compiles again.
 1.10 20-Aug-2022  riastradh x86: Split bootspace out of x86/pmap.h into new x86/bootspace.h.
 1.9 10-Sep-2020  maxv kasan: fix the copyright notices
 1.8 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.7 23-Jun-2020  maxv Rename __MD_CANONICAL_BASE -> __MD_KERNMEM_BASE for clarity.
 1.6 02-May-2020  maxv Call kasan_early_init earlier, to unbreak KASAN after the recent RNG
changes. Will also prevent further trouble.
 1.5 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.4 15-Apr-2020  maxv Use large pages for the kASan shadow, same as kMSan.
 1.3 09-Mar-2019  maxv branches: 1.3.4; 1.3.12;
Start replacing the x86 PTE bits.
 1.2 04-Feb-2019  maxv Add more symbols to the unwinder, in case we get a KASAN message inside
an exception handler.
 1.1 31-Oct-2018  maxv branches: 1.1.2;
Move the MI parts of KASAN into kern/subr_asan.c. This file includes
machine/asan.h, which contains the MD functions. We use an include rather
than a plain C file, because we want GCC to optimize/inline some functions
into one single block.

The amd64 MD parts of KASAN are moved accordingly.

The naming convention we use is:

kasan_*
a generic kasan object, declared in subr_asan.c
kasan_md_*
an MD kasan object, declared in machine/asan.h, and used
in subr_asan.c
__md_*
an MD object, declared in machine/asan.h, and not used
outside

Overall this makes it easier to add KASAN support on more architectures.

Discussed with several people.
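
The layering implied by the naming convention can be sketched as follows. Everything here is hypothetical (names, addresses, and the scale shift of 3, which is the usual ASan granularity and is assumed rather than taken from this log); the double-underscore prefix is used only to mirror the kernel convention.

```c
#include <assert.h>
#include <stdint.h>

/* MD layer: would sit in machine/asan.h. Values are made up. */
#define __MD_DEMO_KERNMEM_BASE 0xffff800000000000ULL /* __md_*: not used outside */
#define __MD_DEMO_SHADOW_BASE  0xffff900000000000ULL
#define KASAN_DEMO_SCALE_SHIFT 3 /* one shadow byte per 8 bytes of memory */

/* kasan_md_*: declared by the MD header, called from the MI code. */
static inline uint64_t
kasan_md_demo_addr_to_shad(uint64_t addr)
{
	return __MD_DEMO_SHADOW_BASE +
	    ((addr - __MD_DEMO_KERNMEM_BASE) >> KASAN_DEMO_SCALE_SHIFT);
}

/* MI layer: would sit in subr_asan.c, which includes the header above. */
static inline uint64_t
kasan_demo_shadow_size(uint64_t start, uint64_t len)
{
	assert(len > 0);
	uint64_t first = kasan_md_demo_addr_to_shad(start);
	uint64_t last = kasan_md_demo_addr_to_shad(start + len - 1);
	return last - first + 1;
}
```

Because the MD function is a static inline pulled in by an include, the compiler is free to inline it into the MI caller, which matches the stated reason for using a header instead of a separate C file.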
 1.1.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.2.1 31-Oct-2018  pgoyette file asan.h was added on branch pgoyette-compat on 2018-11-26 01:52:17 +0000
 1.3.12.1 20-Apr-2020  bouyer Sync with HEAD
 1.3.4.3 21-Apr-2020  martin Sync with HEAD
 1.3.4.2 10-Jun-2019  christos Sync with HEAD
 1.3.4.1 09-Mar-2019  christos file asan.h was added on branch phil-wifi on 2019-06-10 22:05:47 +0000
 1.24 05-Jan-2025  riastradh x86 machine/asm.h: Avoid juxtaposition for concatenation.

clang asm doesn't seem to like it. Instead of `.asciz "foo" "bar"',
do `.ascii "foo"; .asciz "bar"'.

PR toolchain/58960: Missing support for _NETBSD_REVISIONID on various
ports
 1.23 09-Jun-2024  riastradh branches: 1.23.2;
amd64/asm.h: Respect _NETBSD_REVISIONID.
 1.22 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.21 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.20 17-Apr-2020  joerg Mark the .ident section as mergeable string section to avoid redundant
entries.
 1.19 22-May-2014  uebayasi branches: 1.19.28; 1.19.38;
Indent.
 1.18 12-Sep-2013  joerg branches: 1.18.2;
Pass PICFLAGS down to cc-as-as and use __PIC__ to decide if it is small
vs big PIC mode. Retire -DPIC and -DBIGPIC.
 1.17 22-Jun-2013  uebayasi branches: 1.17.2;
Define IDTVEC_END(), from i386/asm.h.
 1.16 21-Jun-2013  uebayasi Add END(y) as i386/asm.h does.
 1.15 11-Jun-2012  chs branches: 1.15.2;
make dtrace work on amd64.
 1.14 20-Dec-2010  joerg branches: 1.14.8; 1.14.14;
Consistently use .gnu.warning with .pushsection and .popsection on all
architectures instead of obsolete STABS frames for linker warnings.
 1.13 26-Oct-2008  mrg branches: 1.13.8; 1.13.16;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.12 01-Oct-2008  joerg Make RCSID() use section .ident like the C version and i386's asm does.
OK ad@
 1.11 20-Dec-2007  ad branches: 1.11.6; 1.11.10; 1.11.12; 1.11.16;
- Make __cpu_simple_lock and similar real functions and patch at runtime.
- Remove old x86 atomic ops.
- Drop text alignment back to 16 on i386 (really, this time).
- Minor cleanup.
 1.10 29-Nov-2007  ad branches: 1.10.2; 1.10.6;
Drop text alignment back to 16 - the usual size of blocks the instruction
decoder works with.
 1.9 17-Oct-2007  garbled branches: 1.9.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.8 29-Aug-2007  ad Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.7 09-Feb-2007  ad branches: 1.7.6; 1.7.14; 1.7.18; 1.7.22; 1.7.24;
Merge newlock2 to head.
 1.6 05-Sep-2006  ad branches: 1.6.2;
Add an SPLLOWER() macro.
 1.5 20-Jan-2006  christos branches: 1.5.2; 1.5.6;
Add a STRONG_ALIAS macro
 1.4 11-Dec-2005  christos branches: 1.4.2;
merge ktrace-lwp.
 1.3 07-Aug-2003  agc branches: 1.3.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 02-May-2003  yamt branches: 1.2.2;
set symbol to be a function using .type directive in IDTVEC macro
so that ddb backtrace can pick them up after recent ksyms changes.

suggested by Matt Thomas on tech-kern.
ok'ed by Frank van der Linden.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.16.6 21-Jan-2008  yamt sync with head
 1.3.16.5 07-Dec-2007  yamt sync with head
 1.3.16.4 03-Sep-2007  yamt sync with head.
 1.3.16.3 26-Feb-2007  yamt sync with head.
 1.3.16.2 30-Dec-2006  yamt sync with head.
 1.3.16.1 21-Jun-2006  yamt sync with head.
 1.4.2.1 01-Feb-2006  yamt sync with head.
 1.5.6.1 14-Sep-2006  yamt sync with head.
 1.5.2.1 09-Sep-2006  rpaulo sync with head
 1.6.2.3 06-Feb-2007  ad Remove now unused SPLLOWER() macro.
 1.6.2.2 27-Jan-2007  ad If running on a PPro or later, at boot patch in versions of spllower() and
similar that use cmpxchg8b instead of cli/sti. Cuts the clock cycles for
splx() by a factor of ~6 on the P4, and ~3 on the PIII when bracketed by
serializing instructions (and hopefully more when not).
 1.6.2.1 29-Dec-2006  ad Checkpoint work in progress.
 1.7.24.2 09-Jan-2008  matt sync with HEAD
 1.7.24.1 06-Nov-2007  matt sync with HEAD
 1.7.22.2 03-Dec-2007  joerg Sync with HEAD.
 1.7.22.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.7.18.1 03-Sep-2007  skrll Sync with HEAD.
 1.7.14.1 03-Oct-2007  garbled Sync with HEAD
 1.7.6.2 03-Dec-2007  ad Sync with HEAD.
 1.7.6.1 09-Oct-2007  ad Sync with head.
 1.9.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.9.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.10.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.10.2.1 26-Dec-2007  ad Sync with head.
 1.11.16.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.11.16.1 19-Oct-2008  haad Sync with HEAD.
 1.11.12.1 10-Oct-2008  skrll Sync with HEAD.
 1.11.10.1 04-May-2009  yamt sync with head.
 1.11.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.11.6.1 05-Oct-2008  mjf Sync with HEAD.
 1.13.16.1 05-Mar-2011  rmind sync with head
 1.13.8.1 10-Jan-2011  jym Sync with HEAD
 1.14.14.1 22-Nov-2012  riz Pull up following revision(s) (requested by chs in ticket #690):
external/cddl/osnet/dev/dtrace/amd64/dtrace_isa.c: revision 1.4
sys/arch/amd64/include/Makefile.inc: revision 1.4
sys/arch/amd64/include/pmap.h: revision 1.33
external/cddl/osnet/dev/dtrace/amd64/dtrace_subr.c: revision 1.6
sys/arch/amd64/include/asm.h: revision 1.15
sys/arch/amd64/amd64/genassym.cf: revision 1.51
external/cddl/osnet/dev/dtrace/amd64/dtrace_asm.S: revision 1.4
make dtrace work on amd64.
allow more space for modules.
 1.14.8.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.14.8.1 30-Oct-2012  yamt sync with head
 1.15.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.15.2.1 23-Jun-2013  tls resync from head
 1.17.2.1 18-May-2014  rmind sync with head
 1.18.2.1 10-Aug-2014  tls Rebase.
 1.19.38.1 20-Apr-2020  bouyer Sync with HEAD
 1.19.28.1 21-Apr-2020  martin Sync with HEAD
 1.23.2.1 02-Aug-2025  perseant Sync with HEAD
 1.10 20-Dec-2007  ad - Make __cpu_simple_lock and similar real functions and patch at runtime.
- Remove old x86 atomic ops.
- Drop text alignment back to 16 on i386 (really, this time).
- Minor cleanup.
 1.9 17-Oct-2007  garbled branches: 1.9.2; 1.9.4; 1.9.8;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.8 03-Oct-2007  ad Fix the ifdefs to match i386.
 1.7 03-Oct-2007  ad Make the atomics inline unless !__GNUC__.
 1.6 29-Aug-2007  ad branches: 1.6.2;
Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.5 09-Feb-2007  ad branches: 1.5.6; 1.5.12; 1.5.14; 1.5.18; 1.5.22; 1.5.24;
Merge newlock2 to head.
 1.4 28-Dec-2005  perry branches: 1.4.20;
inline -> __inline
 1.3 24-Dec-2005  perry __asm__ -> __asm
__const__ -> const
__inline__ -> inline
__volatile__ -> volatile
 1.2 24-Dec-2005  perry Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.1 26-Apr-2003  fvdl branches: 1.1.18;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.18.5 21-Jan-2008  yamt sync with head
 1.1.18.4 27-Oct-2007  yamt sync with head.
 1.1.18.3 03-Sep-2007  yamt sync with head.
 1.1.18.2 26-Feb-2007  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.4.20.1 29-Dec-2006  ad Checkpoint work in progress.
 1.5.24.2 23-Mar-2008  matt sync with HEAD
 1.5.24.1 06-Nov-2007  matt sync with HEAD
 1.5.22.2 04-Oct-2007  joerg Sync with HEAD.
 1.5.22.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.5.18.1 03-Sep-2007  skrll Sync with HEAD.
 1.5.14.2 16-Oct-2007  garbled Sync with HEAD
 1.5.14.1 03-Oct-2007  garbled Sync with HEAD
 1.5.12.1 18-Apr-2007  thorpej Convert i386 and amd64 to the new atomic ops API.
 1.5.6.2 09-Oct-2007  ad Sync with head.
 1.5.6.1 23-Aug-2007  ad Fix x86_atomic_setbits_u64, which was cheerfully setting every bit except
the one specified!
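
The log does not show the faulty code, but the intended semantics of a "set bits" primitive are easy to state. The sketch below uses C11 atomics in place of the original x86 assembly; demo_atomic_setbits_u64 and demo_atomic_clearbits_u64 are hypothetical names. One plausible reading of the bug is that the mask was inverted before being ORed in, which is an assumption on my part.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Set exactly the requested bits: OR the mask in, never its complement. */
static void
demo_atomic_setbits_u64(_Atomic uint64_t *p, uint64_t bits)
{
	assert(p != NULL);
	atomic_fetch_or(p, bits);
}

/* The inverse operation, for contrast: the complement belongs here, with AND. */
static void
demo_atomic_clearbits_u64(_Atomic uint64_t *p, uint64_t bits)
{
	assert(p != NULL);
	atomic_fetch_and(p, ~bits);
}
```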
 1.6.2.1 06-Oct-2007  yamt sync with head.
 1.9.8.1 02-Jan-2008  bouyer Sync with HEAD
 1.9.4.1 26-Dec-2007  ad Sync with head.
 1.9.2.1 27-Dec-2007  mjf Sync with HEAD.
 1.3 18-Oct-2011  dyoung Define some optional routines that will help device_register() to
register ISA & PCI devices. Add stub implementations of the routines.
 1.2 04-Feb-2006  jmmv Revert yesterday's change that attempted to fix the detection of the
boot device when using a Multiboot boot loader. It couldn't work because
these boot loaders do not pass a checksum of the disk so matchbiosdisk()
cannot really find any matches. I should have gone to sleep before
committing...

Found by xtraeme@.
 1.1 03-Feb-2006  jmmv branches: 1.1.2;
When booting an i386 kernel with Multiboot, properly detect the boot device
by looking it up in the x86_alldisks table (instead of trying to match it
to 'wd*' manually).

In order to do this, move the cpu_rootconf function from x86 common code
to amd64 and i386 specific one. This way, i386 can do an extra step (call
the appropriate Multiboot code) in the appropriate place (after
x86_matchbiosdisks and before findroot()).
 1.1.2.1 22-Apr-2006  simonb Sync with head.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.3 26-Oct-2008  mrg put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.2 31-Jan-2006  dsl branches: 1.2.72; 1.2.76; 1.2.82;
Change sys/arch/xxx/include/bswap.h to #include machine/byte_swap.h then
sys/bswap.h in order to pick up the MD inline routines and the constant
folding definitions in the right order.
Code can include either sys/bswap.h or machine/bswap.h with the same effect.
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.30;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.30.1 01-Feb-2006  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.2.82.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.2.76.1 04-May-2009  yamt sync with head.
 1.2.72.1 17-Jan-2009  mjf Sync with HEAD.
 1.2 17-Jul-2011  dyoung On amd64, good-bye <machine/bus.h>.

Up next: update set lists.
 1.1 26-Apr-2003  fvdl branches: 1.1.122;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.122.1 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.1 01-Jul-2011  dyoung branches: 1.1.2;
Per discussion at
<http://mail-index.netbsd.org/tech-kern/2010/04/02/msg007941.html>,
divide each machine's bus.h into bus_defs.h (constants & data types)
and bus_funcs.h (macro implementations of bus_space(9) routines and MD
prototypes).

Note that some bus_space(9) routines' implementation will move to .c
files from inline subroutines or macros in .h files.

I've only made the split for machine architectures where there is PCI.
All of the non-PCI-having architectures will require a similar split.

These #include files are not referenced by any (committed) Makefiles or
header files, yet. Changes to Makefiles, to <sys/bus.h>, and to some
more machine-dependent files will dribble in before I throw the switch.
 1.1.2.2 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.1.2.1 01-Jul-2011  jym file bus_defs.h was added on branch jym-xensuspend on 2011-08-27 15:59:48 +0000
 1.1 01-Jul-2011  dyoung branches: 1.1.2;
Per discussion at
<http://mail-index.netbsd.org/tech-kern/2010/04/02/msg007941.html>,
divide each machine's bus.h into bus_defs.h (constants & data types)
and bus_funcs.h (macro implementations of bus_space(9) routines and MD
prototypes).

Note that some bus_space(9) routines' implementation will move to .c
files from inline subroutines or macros in .h files.

I've only made the split for machine architectures where there is PCI.
All of the non-PCI-having architectures will require a similar split.

These #include files are not referenced by any (committed) Makefiles or
header files, yet. Changes to Makefiles, to <sys/bus.h>, and to some
more machine-dependent files will dribble in before I throw the switch.
 1.1.2.2 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.1.2.1 01-Jul-2011  jym file bus_funcs.h was added on branch jym-xensuspend on 2011-08-27 15:59:48 +0000
 1.2 11-Dec-2005  christos merge ktrace-lwp.
 1.1 16-Apr-2005  yamt branches: 1.1.2; 1.1.4; 1.1.12;
add files which i forgot to add with arch/x86/x86/bus_dma.c rev.1.21.
 1.1.12.2 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.12.1 16-Apr-2005  skrll file bus_private.h was added on branch ktrace-lwp on 2005-11-10 13:51:35 +0000
 1.1.4.2 29-Apr-2005  kent sync with -current
 1.1.4.1 16-Apr-2005  kent file bus_private.h was added on branch kent-audio2 on 2005-04-29 11:28:00 +0000
 1.1.2.2 21-Apr-2005  tron Pull up revision 1.1 (requested by yamt in ticket #175):
add files which i forgot to add with arch/x86/x86/bus_dma.c rev.1.21.
 1.1.2.1 16-Apr-2005  tron file bus_private.h was added on branch netbsd-3 on 2005-04-21 18:43:01 +0000
 1.8 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.7 14-Jan-2010  joerg Provide inline assembly version of bswap64.
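As a rough illustration of what an inline assembly bswap64 can look like, here is a minimal sketch. It is not the byte_swap.h definition: the name `sketch_bswap64`, the GCC-style extended asm, and the portable fallback are all assumptions for this example.

```c
#include <stdint.h>

/* Hypothetical name; not the definition from byte_swap.h. */
static inline uint64_t
sketch_bswap64(uint64_t x)
{
#if defined(__GNUC__) && defined(__x86_64__)
	/* bswap reverses the byte order of a 64-bit register in place. */
	__asm__("bswap %0" : "+r" (x));
	return x;
#else
	/* Portable fallback: swap bytes, then 16-bit pairs, then halves. */
	x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
	    ((x >> 8) & 0x00ff00ff00ff00ffULL);
	x = ((x & 0x0000ffff0000ffffULL) << 16) |
	    ((x >> 16) & 0x0000ffff0000ffffULL);
	return (x << 32) | (x >> 32);
#endif
}
```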
 1.6 26-Oct-2008  mrg branches: 1.6.4; 1.6.8; 1.6.12;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.5 28-Apr-2008  martin branches: 1.5.6;
Remove clause 3 and 4 from TNF licenses
 1.4 30-Jan-2006  dsl branches: 1.4.72; 1.4.74; 1.4.76;
Move all the stuff that detects bswapxx(constant) into the MI sys/bswap.h
Put the minimum to define the required inline assembler or C into the MD files.
NB: there may be some fallout from this!
 1.3 28-Dec-2005  perry branches: 1.3.2;
inline -> __inline
 1.2 24-Dec-2005  perry Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.1 26-Apr-2003  fvdl branches: 1.1.18;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.3.2.1 01-Feb-2006  yamt sync with head.
 1.4.76.3 11-Mar-2010  yamt sync with head
 1.4.76.2 04-May-2009  yamt sync with head.
 1.4.76.1 16-May-2008  yamt sync with head.
 1.4.74.1 18-May-2008  yamt sync with head.
 1.4.72.2 17-Jan-2009  mjf Sync with HEAD.
 1.4.72.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.12.1 21-Apr-2010  matt sync to netbsd-5
 1.6.8.1 24-Oct-2010  jym Sync with HEAD
 1.6.4.1 16-Jan-2010  bouyer Pull up following revision(s) (requested by joerg in ticket #1245):
sys/arch/amd64/include/byte_swap.h: revision 1.7
Provide inline assembly version of bswap64.
 1.3 20-Jan-2012  joerg Change CMSG_SPACE and CMSG_LEN to provide Integer Constant Expressions
again. This was changed in sys/socket.h r1.51 to work around fallout
from the IPv6 aux data migration. It broke the historic ABI on some
platforms. This commit restores compatibility for netbsd32 code on such
platforms and provides a template for future changes to the CMSG_*
alignment. Revert PCC/Clang workarounds in postfix and tmux.
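To show why it matters that these macros are Integer Constant Expressions, here is a minimal sketch. The header layout, the alignment value, and every `SKETCH_`/`sketch_` name are hypothetical stand-ins, not the definitions from sys/socket.h. The point is only that a macro built from `sizeof` and constant arithmetic can size a file-scope array, which a macro calling a function at run time could not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the control message header. */
struct sketch_cmsghdr {
	uint32_t len;
	uint32_t level;
	uint32_t type;
};

/* Assumed alignment; the real value is platform-dependent. */
#define SKETCH_ALIGNBYTES	(sizeof(uint64_t) - 1)
#define SKETCH_ALIGN(n)		(((n) + SKETCH_ALIGNBYTES) & ~SKETCH_ALIGNBYTES)
#define SKETCH_CMSG_SPACE(l)	\
	(SKETCH_ALIGN(sizeof(struct sketch_cmsghdr)) + SKETCH_ALIGN(l))

/* Legal at file scope only because the macro is a constant expression. */
static char sketch_buf[SKETCH_CMSG_SPACE(sizeof(uint32_t))];
```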
 1.2 26-Oct-2008  mrg branches: 1.2.28; 1.2.32;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.1 26-Apr-2003  fvdl branches: 1.1.104; 1.1.108; 1.1.114;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.114.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.108.1 04-May-2009  yamt sync with head.
 1.1.104.1 17-Jan-2009  mjf Sync with HEAD.
 1.2.32.1 18-Feb-2012  mrg merge to -current.
 1.2.28.1 17-Apr-2012  yamt sync with head
 1.72 04-Sep-2023  mrg x86: avoid annoying GCC 12 bounds check in curcpu() and curlwp().

these functions read %gs and return a pointer at an offset from this
value (the current cpu, or lwp pointers), and GCC is complaining that
they're accessing an array cpu_info[0] (ie, zero length, no storage.)

several attempts to work around it have failed, and because of the
asm volatile nature of this, it seems very unlikely a compiler would
take this and do something wrong with it.
 1.71 09-Apr-2023  riastradh amd64: Make curlwp and curcpu() flushable.

The only effect of the `volatile' qualifier on an asm block with
outputs is to force the instructions to appear in the generated code,
even if the outputs end up being unused. Since these instructions
have no (architectural) side effects -- provided %gs is set
correctly, which must be the case here -- there's no need for the
volatile qualifier, so nix it.
 1.70 02-Nov-2021  ryo In order to prevent _mcount() from being recursively called when built with COPTS=-O0,
sprinkle `__always_inline' to make _mcount() be generated as a single function.
 1.69 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.68 17-Mar-2020  maxv Add a redzone between the pcb and the stack. Sent to port-amd64@.
 1.67 08-Dec-2019  maxv Use the inlines; it is actually fine, since the compiler drops the inlines
if the caller is kmsan-instrumented, forcing a white-listing of the memory
access.
 1.66 21-Nov-2019  ad mi_userret(): take care of calling preempt(), set spc_curpriority directly,
and remove MD code that does the same.
 1.65 14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
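The orig encoding described above (a memory type code plus a compressed pointer in one uint32_t cell) can be sketched as follows. The commit message does not give the bit layout, so the 3-bit/29-bit split, the base-relative compression, and all `sketch_` names are assumptions for illustration, not the kMSan implementation.

```c
#include <stdint.h>

/* Hypothetical layout: 3-bit type code, 29-bit compressed pointer. */
#define SKETCH_ORIG_TYPE_SHIFT	29
#define SKETCH_ORIG_PTR_MASK	((UINT32_C(1) << SKETCH_ORIG_TYPE_SHIFT) - 1)

enum sketch_orig_type { SKETCH_STACK = 1, SKETCH_POOL = 2 };

/* Pack a type code and an offset from a known base into one cell. */
static inline uint32_t
sketch_orig_encode(enum sketch_orig_type type, uintptr_t ptr, uintptr_t base)
{
	uint32_t off = (uint32_t)(ptr - base) & SKETCH_ORIG_PTR_MASK;

	return ((uint32_t)type << SKETCH_ORIG_TYPE_SHIFT) | off;
}

/* Recover the type code from the top bits of the cell. */
static inline enum sketch_orig_type
sketch_orig_get_type(uint32_t cell)
{
	return (enum sketch_orig_type)(cell >> SKETCH_ORIG_TYPE_SHIFT);
}

/* Recover the full pointer by adding the offset back to the base. */
static inline uintptr_t
sketch_orig_ptr(uint32_t cell, uintptr_t base)
{
	return base + (cell & SKETCH_ORIG_PTR_MASK);
}
```

Because the type and the pointer share the cell, no side table is needed to describe each cell, which matches the memory argument made in the message.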
 1.64 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.63 18-Nov-2018  cherry On Xen, copy just the bits we need from the trapframe for hardclock(9)
and statclock(9).

Currently, the macros that use the trapframe are:
CLKF_USERMODE()
CLKF_PC()
CLKF_INTR()

Of these, CLKF_INTR() already ignores the frame and uses the ci_idepth
variable to do its job.

Convert the two remaining ones to do this, but only for XEN.
 1.62 16-Mar-2018  maxv branches: 1.62.2;
Remove the prototypes for cpu_uarea_*, I removed these functions two
minutes ago.
 1.61 17-Sep-2017  maxv branches: 1.61.2;
Remove the second argument from USERMODE and KERNELMODE, it is unused
now that we don't have vm86 anymore.
 1.60 21-Jan-2012  chs branches: 1.60.6; 1.60.40;
allocate uareas contiguously and access them via the direct map.
 1.59 30-Dec-2008  pooka branches: 1.59.14; 1.59.18;
_LKM -> _MODULE
 1.58 26-Oct-2008  mrg branches: 1.58.2;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.57 22-May-2008  ad branches: 1.57.4;
Mark x86_curlwp() with __attribute__ ((const)), so gcc can CSE it and know
that it does not clobber global data.
 1.56 11-May-2008  ad Wrap stuff in #ifdef _KERNEL
 1.55 11-May-2008  ad Share cpu.h between the x86 ports.
 1.54 11-May-2008  ad Simplify x86 identcpu code, and share between i386/amd64.
 1.53 10-May-2008  ad Improve x86 tsc handling:

- Ditch the cross-CPU calibration stuff. It didn't work properly, and it's
near impossible to synchronize the CPUs in a running system, because bus
traffic will interfere with any calibration attempt, messing up the
timings.

- Only enable the TSC on CPUs where we are sure it does not drift. If we are
on a known good CPU, give the TSC high timecounter quality, making it the
default.

- When booting CPUs, detect TSC skew and account for it. Most Intel MP
systems have synchronized counters, but that need not be true if the
system has a complicated bus structure. As far as I know, AMD systems
do not have synchronized TSCs and so we need to handle skew.

- While an AP is waiting to be set running, try and make the TSC drift by
entering a reduced power state. If we detect drift, ensure that the TSC
does not get a high timecounter quality. This should not happen and is
only for safety.

- Make cpu_counter() stuff LKM safe.
 1.52 09-May-2008  joerg Make cpu_idle a macro calling a function pointer on x86.
Select the Xen idle routine for Xen, mwait if supported by the CPU and
it is not AMD and halt otherwise. As reported by Christoph Egger,
AMD Barcelona keeps the CPU in C0 state with MWAIT, contrary to HLT,
which uses C1 and therefore much less power.
 1.51 30-Apr-2008  ad branches: 1.51.2;
Avoid unneeded AST faults.
 1.50 28-Apr-2008  ad Add support for kernel preemption to the i386 and amd64 ports. Notes:

- I have seen one isolated panic in the x86 pmap, but otherwise i386
seems stable with preemption enabled.

- amd64 is missing the FPU handling changes and it's not yet safe to
enable it there.

- The usual level for kern.sched.kpreempt_pri will be 128 once enabled
by default. For testing, setting it to 0 helps to shake out bugs.
 1.49 24-Apr-2008  ad branches: 1.49.2;
- Give ci_want_resched a single cache line, and align. This is for monitor/
mwait. At least one errata sheet from Intel notes that a single line
should be used.
- Align cc_microtime.
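Giving a field its own cache line can be sketched with C11 alignment specifiers. This is an illustration only: the struct, the neighbouring members, and the 64-byte line size are assumptions, and the real cpu_info layout is not shown in this log.

```c
#include <stdalign.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE	64	/* assumed line size */

struct sketch_cpu_info {
	int	ci_other;		/* unrelated field */
	/* Start the monitored word on a line boundary ... */
	alignas(SKETCH_CACHE_LINE) volatile int ci_want_resched;
	/* ... and push the next member onto the following line. */
	alignas(SKETCH_CACHE_LINE) int ci_next;
};
```

With this layout no other member shares the line that monitor/mwait would watch, so unrelated stores would not wake the waiter.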
 1.48 23-Apr-2008  he Ensure that offsetof() is in scope by including <sys/systm.h>.
Fixes build problem found while building swapnetbsd.o for XEN3_DOM0.
 1.47 21-Apr-2008  cegger Access Xen's vcpu info structure per-CPU.
Tested on i386 and amd64 (both dom0 and domU) by me.
Xen2 tested (both dom0 and domU) by bouyer.
OK bouyer
 1.46 16-Apr-2008  cegger branches: 1.46.2;
use POSIX integer types
 1.45 27-Feb-2008  xtraeme Remove CTL_MACHDEP_NAMES, it's not used anywhere.

Ok by martin@.
 1.44 22-Jan-2008  joerg branches: 1.44.2; 1.44.6;
GC i8254_microtime.
 1.43 05-Jan-2008  yamt remove no longer necessary cpu_maxproc.
 1.42 05-Jan-2008  yamt - make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.41 05-Jan-2008  yamt g/c unused members
 1.40 05-Jan-2008  yamt g/c ci_idle_pcb_paddr
 1.39 01-Jan-2008  yamt try to detect processor resource sharing topologies. ie. package/core/smt IDs.
 1.38 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.37 22-Dec-2007  dsl Define 'struct intrframe' in terms of 'struct trapframe' since the two are
assumed to match by a lot of code (including that which saves the regs).
This only slightly reduces the number of places the trapframe register
layout is defined.
 1.36 18-Dec-2007  joerg Add new IPI for saving CPU state explicitly, share high-level part of
ACPI wakeup code and teach it how to start the APs again. As a side
effect the CPU_START interface allows choosing between different
bootstrap codes more easily now.
 1.35 09-Dec-2007  jmcneill branches: 1.35.2;
Merge jmcneill-pm branch.
 1.34 03-Dec-2007  joerg branches: 1.34.2;
Add a CPU local timer based on the LAPIC. This is consistently faster
than TSC, but doesn't suffer from SpeedStep as TSC does.

The default quality is higher than HPET for UP, but -100 for
MULTIPROCESSOR as it needs CPU local state which doesn't exist yet.
 1.33 22-Nov-2007  bouyer branches: 1.33.2;
Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.32 12-Nov-2007  ad - cpu_vendor was both an int and char[] on amd64 - fix it.
- Run the errata check/patch on all CPUs, not just the boot processor.
 1.31 29-Oct-2007  ad branches: 1.31.2;
Mark cpu_info::ci_tlbstate volatile to ensure that the compiler doesn't
reorder accesses to it. It's updated from the TLB IPI handlers and we don't
block those, so the order in which things are read/updated is important.
 1.30 26-Oct-2007  joerg Match delay/DELAY on x86 with delay(9). It takes an unsigned int as
argument. Use this and replace the inline assembly (mul + div using the
64bit intermediate result) with normal 32bit multiplication and
division. The compiler can turn the division into a multiplication and
shift, making it even cheaper than the original assembly. For extremely
long delays, just use 64bit arithmetic.
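The approach described above can be sketched roughly as follows. This is not the committed code: the function name, the timer frequency constant, and the exact overflow test are assumptions for illustration. It only shows the pattern of using plain 32-bit arithmetic when the product fits and widening to 64 bits otherwise.

```c
#include <stdint.h>

#define SKETCH_TIMER_HZ	1193182u	/* assumed timer frequency */

/* Convert microseconds to timer ticks (hypothetical helper). */
static inline uint64_t
sketch_us_to_ticks(unsigned int us)
{
	/* Common case: the product fits in 32 bits. */
	if (us <= UINT32_MAX / SKETCH_TIMER_HZ)
		return (uint32_t)us * SKETCH_TIMER_HZ / 1000000u;
	/* Extremely long delays: fall back to 64-bit arithmetic. */
	return (uint64_t)us * SKETCH_TIMER_HZ / 1000000u;
}
```

Since the divisor is a compile-time constant, a compiler is free to replace the division with a multiplication and a shift, which is the saving the message refers to.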
 1.29 18-Oct-2007  yamt merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.28 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.27 26-Sep-2007  ad branches: 1.27.2;
x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.26 25-Sep-2007  ad ci_astpending is no more.
 1.25 29-Aug-2007  ad branches: 1.25.2;
Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.24 21-May-2007  fvdl branches: 1.24.4; 1.24.8; 1.24.10;
Revert fs/gs changes until I figure out issues with them.
 1.23 17-May-2007  yamt merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.22 11-May-2007  fvdl Don't save/restore %fs and %gs in trapframe. The kernel won't touch them.
Instead, save/restore them on context switch. For 32bit processes, save/restore
the selector values only, for 64bit processes, save/restore the appropriate
MSRs. Iff the defaults have been changed.
 1.21 21-Mar-2007  xtraeme branches: 1.21.4;
- Remove ci_msr_rvalue, it's not useful anymore as yamt@ pointed out.
- Remove completely debug from msr_ipifuncs, now it's known to work.
 1.20 20-Mar-2007  xtraeme MSR read and write IPI handlers for x86. A MSR will be read or written
in all CPUs available in the system. This adds another member
to struct cpu_info, ci_msr_rvalue; it will contain the value of the MSR
in a previous operation.

Tested with clockmod in UP and SMP by me, tested with est in SMP
by Daniel Carosone and Michael Van Elst.

Ok'ed by Andrew Doran and Matthew R. Green.
 1.19 16-Mar-2007  xtraeme struct cpu_info: add a ci_feature2_flags member.
identcpu: print extended cpuid features with ci_feature2_flags.

"Looks good" by christos and njoly.
 1.18 16-Mar-2007  xtraeme Remove __P(), remove k8_powernow_init proto... it was moved to
x86/include/powernow.h a long time ago.
 1.17 12-Mar-2007  ad branches: 1.17.2; 1.17.4;
Include sys/simplelock.h, not sys/lock.h.
 1.16 05-Mar-2007  drochner branches: 1.16.2;
clean up how cpus and ioapics are attached at the mainbus:
Separate "cpubus" and "ioapicbus" -- while they share a common "address
space" (the apic id), the kernel doesn't use this fact. Different
data are passed to cpus and apics, which caused some ugly polymorphism. This
also saves the special "submatch" functions needed to distinguish cpus
and ioapics for autoconf. (And it makes that "apid" locators wired
in the kernel configuration are honored now; this allows one to dumb down
an mp box to singleprocessor by userconfig.)
Print "apid" locators in the buses "print" function "as everyone does",
so the per-port cpu drivers don't need to do it.
Being here, constify "struct cpu_functions" and g/c the unused MP_PICMODE
flag.
 1.15 04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.14 16-Feb-2007  ad branches: 1.14.2;
Remove spllowersoftclock() and CLKF_BASEPRI(), and always dispatch callouts
via a soft interrupt. In the near future, softclock will be run from process
context.
 1.13 09-Feb-2007  ad Merge newlock2 to head.
 1.12 06-Aug-2006  xtraeme branches: 1.12.4; 1.12.8; 1.12.10;
AMD PowerNow!/Cool`n'Quiet driver for NetBSD/amd64,
adapted from OpenBSD.

Tested on a few machines:

http://bigbird.dohd.org:3021/NetBSD/dmesg
http://www.bsd.org.il/netbsd/acpi/dmesg

Thanks to cube, elad and others for testing and fixes.

Enabled by default on GENERIC.
 1.11 07-Jun-2006  kardel convert to timecounters (from branch simonb-timecounters)
 1.10 06-Mar-2006  cube branches: 1.10.6;
delay() is gone, so don't declare it. That way other parts of code that
use a variable named delay (say, netinet6/in6.c) won't shadow something
that doesn't exist anyway.
 1.9 24-Dec-2005  perry branches: 1.9.4; 1.9.6; 1.9.8;
bare asm -> __asm
 1.8 11-Dec-2005  christos merge ktrace-lwp.
 1.7 11-Aug-2005  cube Change all archs that did:

#define clockframe somethingelse

to:

struct clockframe {
struct somethingelse cf_se;
};

and change access macros accordingly.

That means that, at least for that very issue, things will not go
ka-boomy if you don't have the actual definition of struct clockframe
before including systm.h.
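The benefit described above can be sketched as follows: with a real struct tag, MI code can declare prototypes against an incomplete type and only the MD side needs the full definition. The frame layout, the `sketch_` names, and the access macro are hypothetical; only `struct clockframe` and the `cf_se` member come from the message.

```c
/* Seen by MI code: a forward declaration is enough for prototypes. */
struct clockframe;
int sketch_clkf_usermode(const struct clockframe *);

/* MD side (hypothetical frame layout, for illustration only). */
struct sketch_frame {
	unsigned long	sf_pc;
	int		sf_usermode;
};

struct clockframe {
	struct sketch_frame cf_se;
};

/* Access macros go through the wrapped member. */
#define SKETCH_CLKF_PC(cf)	((cf)->cf_se.sf_pc)

int
sketch_clkf_usermode(const struct clockframe *cf)
{
	return cf->cf_se.sf_usermode;
}
```

With the old `#define clockframe somethingelse` form, the prototype at the top would silently refer to a different type depending on whether the macro had been defined yet.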
 1.6 25-Sep-2004  yamt branches: 1.6.12;
don't expose cpu_info and friends to userland.
 1.5 25-Sep-2004  yamt fix a typo in a comment.
 1.4 22-Sep-2004  yamt move some per-cpu data definitions to MI place so that they can be modified
without touching all ports. discussed on tech-kern@.
 1.3 30-Dec-2003  pk Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.5 19-Oct-2004  skrll Sync with HEAD
 1.1.2.4 24-Sep-2004  skrll Sync with HEAD.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.12.10 17-Mar-2008  yamt sync with head.
 1.6.12.9 04-Feb-2008  yamt sync with head.
 1.6.12.8 21-Jan-2008  yamt sync with head
 1.6.12.7 07-Dec-2007  yamt sync with head
 1.6.12.6 15-Nov-2007  yamt sync with head.
 1.6.12.5 27-Oct-2007  yamt sync with head.
 1.6.12.4 03-Sep-2007  yamt sync with head.
 1.6.12.3 26-Feb-2007  yamt sync with head.
 1.6.12.2 30-Dec-2006  yamt sync with head.
 1.6.12.1 21-Jun-2006  yamt sync with head.
 1.9.8.3 11-Aug-2006  yamt sync with head
 1.9.8.2 26-Jun-2006  yamt sync with head.
 1.9.8.1 13-Mar-2006  yamt sync with head.
 1.9.6.2 30-Apr-2006  kardel initrtclock now gets the frequency as an argument
 1.9.6.1 22-Apr-2006  simonb Sync with head.
 1.9.4.1 09-Sep-2006  rpaulo sync with head
 1.10.6.1 19-Jun-2006  chap Sync with head.
 1.12.10.1 03-Sep-2007  wrstuden Sync w/ NetBSD-4-RC_1
 1.12.8.1 05-Jun-2007  bouyer Pull up following revision(s) (requested by xtraeme in ticket 702):
sys/arch/amd64/amd64/identcpu.c patch
sys/arch/amd64/include/cpu.h patch
sys/arch/x86/include/cputypes.h 1.1
Print all extended features for Intel EM64T CPUs on amd64.
 1.12.4.4 27-Jan-2007  ad If running on a PPro or later, at boot patch in versions of spllower() and
similar that use cmpxchg8b instead of cli/sti. Cuts the clock cycles for
splx() by a factor of ~6 on the P4, and ~3 on the PIII when bracketed by
serializing instructions (and hopefully more when not).
 1.12.4.3 11-Jan-2007  ad Checkpoint work in progress.
 1.12.4.2 17-Nov-2006  ad Checkpoint work in progress.
 1.12.4.1 20-Oct-2006  ad - Make ASTs per-LWP.
- The signal stack and signal mask will be per-LWP shortly.

need_resched(), need_proftick(), signotify():

- Prefix with cpu_
- Make per-LWP.
- Send an IPI if the LWP is on another CPU.
 1.14.2.8 17-May-2007  yamt sync with head.
 1.14.2.7 28-Mar-2007  skrll More merge botch fixes.
 1.14.2.6 28-Mar-2007  skrll Remove committed conflicts.
 1.14.2.5 24-Mar-2007  yamt sync with head.
 1.14.2.4 23-Mar-2007  skrll Remove switch_exit declaration.
 1.14.2.3 17-Mar-2007  rmind Backport lock.h split into the simplelock.h and other #include changes
from HEAD. This fixes the problems with circular includes.
 1.14.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.14.2.1 03-Mar-2007  yamt adapt amd64.

XXX changes in identcpu.c are the minimum for MONITOR.
XXX identcpu.c should be shared with i386.
 1.16.2.9 03-Dec-2007  ad Sync with HEAD.
 1.16.2.8 23-Oct-2007  ad Sync with head.
 1.16.2.7 09-Oct-2007  ad Sync with head.
 1.16.2.6 23-Aug-2007  ad Fix some more bugs.
 1.16.2.5 23-Aug-2007  ad Merged x86 cpu.c.
 1.16.2.4 21-Aug-2007  ad amd64 changes, as yet untested:

- Adapt to vmlocking branch.
- Apply TLB shootdown and pv allocation changes to the pmap.
- Make it build.
 1.16.2.3 27-May-2007  ad Sync with head.
 1.16.2.2 10-Apr-2007  ad Sync with head.
 1.16.2.1 13-Mar-2007  ad Sync with head.
 1.17.4.1 18-Mar-2007  reinoud First attempt to bring branch in sync with HEAD
 1.17.2.1 11-Jul-2007  mjf Sync with head.
 1.21.4.2 03-Oct-2007  garbled Sync with HEAD
 1.21.4.1 22-May-2007  matt Update to HEAD.
 1.24.10.3 23-Mar-2008  matt sync with HEAD
 1.24.10.2 09-Jan-2008  matt sync with HEAD
 1.24.10.1 06-Nov-2007  matt sync with HEAD
 1.24.8.9 09-Dec-2007  jmcneill Sync with HEAD.
 1.24.8.8 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.24.8.7 14-Nov-2007  joerg Sync with HEAD.
 1.24.8.6 29-Oct-2007  joerg Sync with HEAD.
 1.24.8.5 28-Oct-2007  joerg Sync with HEAD.
 1.24.8.4 28-Oct-2007  joerg Make the reset of FS/GS base in cpu_init_msrs optional. We don't want
that in the ACPI resume path.
 1.24.8.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.24.8.2 02-Oct-2007  joerg Sync with HEAD.
 1.24.8.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.24.4.1 03-Sep-2007  skrll Sync with HEAD.
 1.25.2.2 06-Oct-2007  yamt sync with head.
 1.25.2.1 30-Sep-2007  yamt implement deferred pmap switching for amd64, and make amd64 use
x86 shared pmap code. it makes several i386 pmap improvements available
to amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
 1.27.2.4 13-Nov-2007  bouyer catch up with changes in HEAD.
 1.27.2.3 13-Nov-2007  bouyer Sync with HEAD
 1.27.2.2 25-Oct-2007  bouyer Sync with HEAD.
 1.27.2.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue !)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow merging xenamd64 back into xen or amd64.
 1.31.2.4 18-Feb-2008  mjf Sync with HEAD.
 1.31.2.3 27-Dec-2007  mjf Sync with HEAD.
 1.31.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.31.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.33.2.2 26-Dec-2007  ad Sync with head.
 1.33.2.1 08-Dec-2007  ad Sync with head.
 1.34.2.1 11-Dec-2007  yamt sync with head.
 1.35.2.3 23-Jan-2008  bouyer Sync with HEAD.
 1.35.2.2 08-Jan-2008  bouyer Sync with HEAD
 1.35.2.1 02-Jan-2008  bouyer Sync with HEAD
 1.44.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.44.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.44.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.44.2.1 24-Mar-2008  keiichi sync with head.
 1.46.2.2 04-Jun-2008  yamt sync with head
 1.46.2.1 18-May-2008  yamt sync with head.
 1.49.2.2 04-May-2009  yamt sync with head.
 1.49.2.1 16-May-2008  yamt sync with head.
 1.51.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.57.4.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.58.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.59.18.1 18-Feb-2012  mrg merge to -current.
 1.59.14.1 17-Apr-2012  yamt sync with head
 1.60.40.1 17-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #637:

sys/arch/amd64/amd64/process_machdep.c 1.33,1.34,1.35 (patch)
sys/arch/amd64/include/types.h 1.55 (patch)
sys/arch/x86/x86/vm_machdep.c 1.33 (patch)

- Reduce the number of places where segment register faults can
occur.
- Remove __HAVE_CPU_UAREA_ROUTINES.
 1.60.6.1 03-Dec-2017  jdolecek update from HEAD
 1.61.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.61.2.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.62.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.62.2.1 10-Jun-2019  christos Sync with HEAD
 1.7 10-May-2008  ad Merge cpu_counter.h.
 1.6 28-Apr-2008  martin branches: 1.6.2;
Remove clause 3 and 4 from TNF licenses
 1.5 17-Oct-2007  garbled branches: 1.5.16; 1.5.18; 1.5.20;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 07-Jul-2007  tsutsui branches: 1.4.10;
Move x86 common cpu_counter functions into <x86/cpu_counter.h>.
 1.3 16-Feb-2006  perry branches: 1.3.24; 1.3.26; 1.3.32;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.2 24-Dec-2005  perry branches: 1.2.2; 1.2.4; 1.2.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.1 26-Apr-2003  fvdl branches: 1.1.18;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.18.1 03-Sep-2007  yamt sync with head.
 1.2.6.1 22-Apr-2006  simonb Sync with head.
 1.2.4.1 09-Sep-2006  rpaulo sync with head
 1.2.2.1 18-Feb-2006  yamt sync with head.
 1.3.32.1 03-Oct-2007  garbled Sync with HEAD
 1.3.26.1 11-Jul-2007  mjf Sync with head.
 1.3.24.1 15-Jul-2007  ad Sync with head.
 1.4.10.1 06-Nov-2007  matt sync with HEAD
 1.5.20.1 16-May-2008  yamt sync with head.
 1.5.18.1 18-May-2008  yamt sync with head.
 1.5.16.1 02-Jun-2008  mjf Sync with HEAD.
 1.6.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.1 27-Feb-2016  tls branches: 1.1.2; 1.1.18;
Add cpu_rng, a framework for simple on-CPU random number generators.
 1.1.18.2 03-Dec-2017  jdolecek update from HEAD
 1.1.18.1 27-Feb-2016  jdolecek file cpu_rng.h was added on branch tls-maxphys on 2017-12-03 11:35:47 +0000
 1.1.2.2 19-Mar-2016  skrll Sync with HEAD
 1.1.2.1 27-Feb-2016  skrll file cpu_rng.h was added on branch nick-nhusb on 2016-03-19 11:29:54 +0000
 1.20 30-Nov-2020  bouyer Introduce smap_enable()/smap_disable() functions, to be used from
C code.
 1.19 17-Oct-2007  garbled branches: 1.19.120;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.18 26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.17 21-May-2007  fvdl branches: 1.17.8; 1.17.10; 1.17.12;
Revert fs/gs changes until I figure out issues with them.
 1.16 11-May-2007  fvdl Don't save/restore %fs and %gs in trapframe. The kernel won't touch them.
Instead, save/restore them on context switch. For 32bit processes, save/restore
the selector values only, for 64bit processes, save/restore the appropriate
MSRs. Iff the defaults have been changed.
 1.15 04-Mar-2007  christos branches: 1.15.2; 1.15.4; 1.15.10;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.14 09-Feb-2007  ad branches: 1.14.2;
Merge newlock2 to head.
 1.13 14-Jan-2007  ad .. but only if _KERNEL is defined.
 1.12 14-Jan-2007  ad On second thought, implement x86_pause() as a regular function. The small
delay from the call is useful for spinlock backoff.
 1.11 12-Jan-2007  ad x86_pause(): do issue the 'pause' instruction, for EM64T CPUs.
 1.10 01-Jan-2007  ad Report on and where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.9 26-Aug-2006  ad branches: 1.9.2;
Add x86_sfence(), x86_mfence().
 1.8 19-Aug-2006  dsl Fix the amd64 INSTALL kernel (builds of machdep.c with -Os and -O3).
Load the idt with non-random data.
 1.7 28-Dec-2005  perry branches: 1.7.4; 1.7.8; 1.7.18;
inline -> __inline
 1.6 24-Dec-2005  perry Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.5 11-Dec-2005  christos merge ktrace-lwp.
 1.4 14-Jan-2004  yamt branches: 1.4.14; 1.4.16;
issue memory read barrier for BUS_DMASYNC_POSTREAD operation.
PR/21665 from Stephan Uphoff.
 1.3 08-May-2003  fvdl branches: 1.3.2;
Move x86_pause() out of ifdef _KERNEL.
 1.2 08-May-2003  fvdl Add x86_pause() inline function, containing the "pause" instruction
for i386, and nothing for amd64. Sprinkle it in various spinloops,
as recommended by Intel.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.3.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.3.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.3.2.1 03-Aug-2004  skrll Sync with HEAD
 1.4.16.5 27-Oct-2007  yamt sync with head.
 1.4.16.4 03-Sep-2007  yamt sync with head.
 1.4.16.3 26-Feb-2007  yamt sync with head.
 1.4.16.2 30-Dec-2006  yamt sync with head.
 1.4.16.1 21-Jun-2006  yamt sync with head.
 1.4.14.1 28-Sep-2008  jdc Pull up revisions:
sys/arch/amd64/include/cpufunc.h patch
sys/arch/i386/include/cpufunc.h patch
sys/arch/x86/x86/bus_dma.c 1.45 via patch
requested by bouyer in ticket 1945.
 1.7.18.1 27-Aug-2006  riz Pull up following revision(s) (requested by dsl in ticket #68):
sys/arch/amd64/include/cpufunc.h: revision 1.8
Fix the amd64 INSTALL kernel (builds of machdep.c with -Os and -O3).
Load the idt with non-random data.
 1.7.8.1 03-Sep-2006  yamt sync with head.
 1.7.4.1 09-Sep-2006  rpaulo sync with head
 1.9.2.4 01-Feb-2007  ad Sync with head.
 1.9.2.3 27-Jan-2007  ad If running on a PPro or later, at boot patch in versions of spllower() and
similar that use cmpxchg8b instead of cli/sti. Cuts the clock cycles for
splx() by a factor of ~6 on the P4, and ~3 on the PIII when bracketed by
serializing instructions (and hopefully more when not).
 1.9.2.2 12-Jan-2007  ad Sync with head.
 1.9.2.1 29-Dec-2006  ad Checkpoint work in progress.
 1.14.2.2 17-May-2007  yamt sync with head.
 1.14.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.15.10.2 03-Oct-2007  garbled Sync with HEAD
 1.15.10.1 22-May-2007  matt Update to HEAD.
 1.15.4.1 11-Jul-2007  mjf Sync with head.
 1.15.2.1 09-Oct-2007  ad Sync with head.
 1.17.12.1 06-Oct-2007  yamt sync with head.
 1.17.10.1 06-Nov-2007  matt sync with HEAD
 1.17.8.1 02-Oct-2007  joerg Sync with HEAD.
 1.19.120.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.2 09-Feb-2007  ad branches: 1.2.4;
Merge newlock2 to head.
 1.1 12-Jan-2007  ad branches: 1.1.2;
file cputypes.h was initially added on branch newlock2.
 1.1.2.1 12-Jan-2007  ad Sync with head.
 1.2.4.2 26-Feb-2007  yamt sync with head.
 1.2.4.1 09-Feb-2007  yamt file cputypes.h was added on branch yamt-lazymbuf on 2007-02-26 09:05:43 +0000
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.5 04-Dec-2024  alnsn PTE_BASE is defined in <machine/pmap_private.h>.
 1.4 10-Sep-2020  maxv branches: 1.4.26;
kcsan: fix the copyright notices
 1.3 08-Nov-2019  maxv branches: 1.3.8;
Exclude the PTE space from KCSAN, since there the same VA can point to
different PAs.
 1.2 06-Nov-2019  maxv Change kcsan_md_is_avail() to always return true; I was testing with
interrupts disabled as debugging. Change the delay/sample parameters
to have better fluidity.
 1.1 05-Nov-2019  maxv Add Kernel Concurrency Sanitizer (kCSan) support. This sanitizer allows us
to detect race conditions at runtime. It is a variation of TSan that is
easy to implement and more suited to kernel internals, albeit theoretically
less precise than TSan's happens-before.

We do basically two things:

- On every KCSAN_NACCESSES (=2000) memory accesses, we create a cell
describing the access, and delay the calling CPU (10ms).

- On all memory accesses, we verify if the memory we're reading/writing
is referenced in a cell already.

The combination of the two means that, if for example cpu0 does a read that
is selected and cpu1 does a write at the same address, kCSan will fire,
because cpu1's write collides with cpu0's read cell.

The coverage of the instrumentation is the same as that of kASan. Also, the
code is organized in a way similar to kASan, so it is easy to add support
for more architectures than amd64. kCSan is compatible with KCOV.

Reviewed by Kamil.
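
The two steps described in revision 1.1 above can be modelled in user space. This is only a sketch under stated assumptions, not the code in csan.h or subr_csan.c: every model_* name is hypothetical, there is one cell per CPU, the delay is omitted, and the rule that two reads never collide is an assumption (the log does not say how read/read pairs are treated).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NCPU      2     /* hypothetical CPU count for the model */
#define MODEL_NACCESSES 2000  /* sampling period quoted in the log */

/* One cell per CPU describing the access that CPU is currently "holding". */
struct model_cell {
	uintptr_t addr;  /* start address of the sampled access */
	size_t    size;  /* length in bytes; 0 means the cell is empty */
	bool      write; /* true if the sampled access was a write */
};

static struct model_cell model_cells[MODEL_NCPU];

/* Step 1a: decide whether this access is the one in MODEL_NACCESSES. */
static bool
model_should_sample(unsigned *count)
{
	if (++*count < MODEL_NACCESSES)
		return false;
	*count = 0;
	return true;
}

/* Step 1b: publish a cell; the real sanitizer would now delay the CPU. */
static void
model_cell_set(int cpu, uintptr_t addr, size_t size, bool write)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	model_cells[cpu].addr = addr;
	model_cells[cpu].size = size;
	model_cells[cpu].write = write;
}

/* Drop the cell once the delay is over. */
static void
model_cell_clear(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	model_cells[cpu].size = 0;
}

/* Step 2: does this access overlap a cell published by another CPU? */
static bool
model_access_races(int cpu, uintptr_t addr, size_t size, bool write)
{
	for (int i = 0; i < MODEL_NCPU; i++) {
		const struct model_cell *c = &model_cells[i];

		if (i == cpu || c->size == 0)
			continue;
		if (!write && !c->write)
			continue; /* assumption: read/read is not a race */
		if (addr < c->addr + c->size && c->addr < addr + size)
			return true;
	}
	return false;
}
```

In this model, cpu0 publishing a read cell and cpu1 then writing inside the same range makes model_access_races() return true, which mirrors the cpu0/cpu1 example in the log.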
 1.3.8.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.3.8.1 08-Nov-2019  martin file csan.h was added on branch phil-wifi on 2020-04-13 08:03:30 +0000
 1.4.26.1 02-Aug-2025  perseant Sync with HEAD
 1.17 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.16 06-Nov-2017  christos Cleanup and clarify the ELFSIZE mess:

We now have 2 variables automatically set in elf_machdep.h:

ARCH_ELFSIZE: the size for userland binaries
KERN_ELFSIZE: the size for the kernel binaries

DB_ELFSIZE has been deleted and KERN_ELFSIZE should have always the
same values DB_ELFSIZE used to have.

In sys/exec_elf.h, if ELFSIZE is not set, it is set to KERN_ELFSIZE
for the kernel and ARCH_ELFSIZE for userland. These defaults should
eliminate the need for most manual ELFSIZE setting.
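
The defaulting rule described in revision 1.16 above can be sketched with the preprocessor. This is an illustration only, not the text of sys/exec_elf.h or elf_machdep.h: the value 64 for both macros is an assumption for a 64-bit port, and model_elfsize() is a hypothetical helper added so the result can be inspected.

```c
#include <assert.h>

/* Assumed values for a 64-bit port; the real ones live in elf_machdep.h. */
#define ARCH_ELFSIZE 64 /* size for userland binaries */
#define KERN_ELFSIZE 64 /* size for kernel binaries */

/* Pick a default only when the includer has not set ELFSIZE itself. */
#ifndef ELFSIZE
# ifdef _KERNEL
#  define ELFSIZE KERN_ELFSIZE
# else
#  define ELFSIZE ARCH_ELFSIZE
# endif
#endif

/* Hypothetical helper: report which ELF size this translation unit got. */
static int
model_elfsize(void)
{
	return ELFSIZE;
}
```

A file that needs the other size would still define ELFSIZE before the include, which is the manual setting the commit aims to make rare.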
 1.15 26-Jul-2015  mrg properly copy regs for kgdb, and define the number of registers properly.
from openbsd via Vicente Chaves and PR port-amd64/50091.
 1.14 17-Oct-2013  christos branches: 1.14.6;
add missing _
 1.13 17-Oct-2013  christos we need to return something here.
 1.12 17-Oct-2013  christos use the parameter for instruction macros
 1.11 26-May-2011  joerg branches: 1.11.4; 1.11.14; 1.11.18;
Introduce DDB_EXPR_FMT and replace the logic around DB_EXPR_T_IS_QUAD.
 1.10 10-Apr-2011  christos Merge db_trace for x86. From: Vladimir Kirillov proger at wilab dot org dot ua
 1.9 14-Mar-2009  dsl branches: 1.9.4; 1.9.6;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.8 11-Mar-2009  yamt add a missing _KERNEL_OPT ifdef.
 1.7 26-Oct-2008  mrg branches: 1.7.2; 1.7.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 21-Feb-2007  thorpej branches: 1.6.42; 1.6.46; 1.6.52;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.5 26-Jun-2006  christos branches: 1.5.10;
define PC_ADVANCE to avoid a LHS cast which makes gcc4 unhappy.
From Kurt Schreiner
 1.4 01-Apr-2006  cherry branches: 1.4.4;
closes: PR kern/32359

modifies machine/db_machdep.h: BKPT_SET(inst) to BKPT_SET(inst, addr) for all archs, i.e. passes the
breakpoint address as well.

Patch from cherry@mahiti.org
 1.3 23-Jun-2003  martin branches: 1.3.18; 1.3.32; 1.3.34; 1.3.36; 1.3.38; 1.3.40;
Make sure to include opt_foo.h if a defflag option FOO is used.
 1.2 29-Apr-2003  scw Add a BKPT_ADDR() macro which gives MD code a chance to munge a
breakpoint address before it's used. Currently a no-op on all but sh5.

This is useful on sh5, for example, to mask off the instruction
type encoding in the bottom two address bits, and makes it possible
to do "db> break $rXX" instead of manually munging the address.
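
As a sketch of the idea in revision 1.2 above: the two macros below contrast a no-op hook with one that masks off the bottom two address bits. Both macro names are hypothetical; the real hook is BKPT_ADDR() in each port's db_machdep.h, and the masked form is only an assumption about what an sh5-style definition might look like.

```c
#include <assert.h>
#include <stdint.h>

/* No-op form: the breakpoint address is used unchanged. */
#define MODEL_BKPT_ADDR_NOOP(addr)   (addr)

/* Masking form: clear the two low bits that encode the instruction type. */
#define MODEL_BKPT_ADDR_MASKED(addr) ((addr) & ~(uintptr_t)0x3)

/* Hypothetical helper showing where MD code would apply the hook. */
static uintptr_t
model_breakpoint_target(uintptr_t requested)
{
	return MODEL_BKPT_ADDR_MASKED(requested);
}
```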
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.3.40.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.3.38.1 19-Apr-2006  elad sync with head - hopefully this will work
 1.3.36.2 11-Aug-2006  yamt sync with head
 1.3.36.1 11-Apr-2006  yamt sync with head
 1.3.34.1 22-Apr-2006  simonb Sync with head.
 1.3.32.1 09-Sep-2006  rpaulo sync with head
 1.3.18.3 26-Feb-2007  yamt sync with head.
 1.3.18.2 30-Dec-2006  yamt sync with head.
 1.3.18.1 21-Jun-2006  yamt sync with head.
 1.4.4.1 13-Jul-2006  gdamore Merge from HEAD.
 1.5.10.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.6.52.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.46.1 04-May-2009  yamt sync with head.
 1.6.42.1 17-Jan-2009  mjf Sync with HEAD.
 1.7.8.4 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.7.8.3 02-May-2011  jym Sync with head.
 1.7.8.2 01-Nov-2009  jym Sync with HEAD.
 1.7.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.7.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.9.6.1 06-Jun-2011  jruoho Sync with HEAD.
 1.9.4.2 31-May-2011  rmind sync with head
 1.9.4.1 21-Apr-2011  rmind sync with head
 1.11.18.1 18-May-2014  rmind sync with head
 1.11.14.2 03-Dec-2017  jdolecek update from HEAD
 1.11.14.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.11.4.1 22-May-2014  yamt sync with head.

for reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.14.6.1 22-Sep-2015  skrll Sync with HEAD
 1.10 30-Aug-2011  bouyer Add getlabelusesmbr(), as proposed in
http://mail-index.netbsd.org/tech-userlevel/2011/08/25/msg005404.html
This is used by disk tools such as disklabel(8) to dynamically decide whether
the underlying platform uses a disklabel-in-mbr-partition or not
(instead of using a compile-time list of ports).
getlabelusesmbr() reads the sysctl kern.labelusesmbr, which takes its value from the
machdep #define LABELUSESMBR.
For evbmips, make LABELUSESMBR 1 if the platform uses pmon
as bootloader, and 0 (the previous value) otherwise.
 1.9 23-Nov-2009  pooka If cpu_disklabel includes struct dkbad, define __HAVE_DISKLABEL_DKBAD.
This allows use of subr_disk_mbr on all archs. Default to it for
the rump disk component. No functional change for regular kernels.
(The other option would've been to include dkbad in disklabels
everywhere, but arguably this approach has less possible side-effects,
especially given that wedges and related magic will take over the
world any second now).
 1.8 28-Oct-2008  mrg branches: 1.8.6;
if HAVE_NBTOOL_CONFIG_H is defined, also use the contents of this
file, not <i386/disklabel.h>.

XXX: the way tools/disklabel builds is gross.
 1.7 26-Oct-2008  mrg branches: 1.7.2;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 11-Dec-2005  christos branches: 1.6.74; 1.6.78; 1.6.84;
merge ktrace-lwp.
 1.5 12-Jun-2005  dyoung Make disklabel(8) and fdisk(8) into "host tools", last step: build
and install ${TOOLDIR}/bin/${MACHINE_GNU_PLATFORM}-disklabel,
${TOOLDIR}/bin/${MACHINE_GNU_PLATFORM}-fdisk by "reaching over" to
the sources in ${NETBSDSRCDIR}/sbin/{disklabel fdisk}/.

To avoid clashes with a build-host's header files, especially on
*BSD, the host-tools versions of fdisk and disklabel search for
#includes such as disklabel.h, disklabel_acorn.h, disklabel_gpt.h,
and bootinfo.h in a new #includes namespace, nbinclude/. That is,
they #include <nbinclude/sys/disklabel.h>, <nbinclude/machine/disklabel.h>,
<nbinclude/sparc64/disklabel.h>, instead of <sys/disklabel.h> and
such. I have also updated the system headers to #include from
nbinclude/-space when HAVE_NBTOOL_CONFIG_H is #defined.
 1.4 08-Oct-2003  lukem Overhaul MBR handling (part 1):

<sys/bootblock.h>:
* Added definitions for the Master Boot Record (MBR) used by
a variety of systems (primarily i386), including the format
of the BIOS Parameter Block (BPB).
This information was cribbed from a variety of sources
including <sys/disklabel_mbr.h> which this is a superset of.

As part of this, some data structure elements and #defines
were renamed to be more "namespace friendly" and consistent
with other bootblocks and MBR documentation.
Update all uses of the old names to the new names.

<sys/disklabel_mbr.h>:
* Deprecated in favor of <sys/bootblock.h> (the latter is more
"host tool" friendly).

amd64 & i386:
* Renamed /usr/mdec/bootxx_dosfs to /usr/mdec/bootxx_msdos, to
be consistent with the naming convention of the msdosfs tools.

* Removed /usr/mdec/bootxx_ufs, as it's equivalent to bootxx_ffsv1
and it's confusing to have two functionally equivalent bootblocks,
especially given that "ufs" has multiple meanings (it could be
a synonym for "ffs", or the group of ffs/lfs/ext2fs file systems).

* Rework pbr.S (the first sector of bootxx_*):
+ Ensure that BPB (bytes 11..89) and the partition table
(bytes 446..509) do not contain code.
+ Add support for booting from FAT partitions if BOOT_FROM_FAT
is defined. (Only set for bootxx_msdos).
+ Remove "dummy" partition 3; if people want to installboot(8)
these to the start of the disk they can use fdisk(8) to
create a real MBR partition table...
+ Compile with TERSE_ERROR so it fits because of the above.
Whilst this is less user friendly, I feel it's important
to have a valid partition table and BPB in the MBR/PBR.

* Renamed /usr/mdec/biosboot to /usr/mdec/boot, to be consistent
with other platforms.

* Enable SUPPORT_DOSFS in /usr/mdec/boot (stage2), so that
we can boot off FAT partitions.

* Crank version of /usr/mdec/boot to 3.1, and fix some of the other
entries in the version file.

installboot(8) (i386):
* Read the existing MBR of the filesystem and retain the BIOS
Parameter Block (BPB) in bytes 11..89 and the MBR partition
table in bytes 446..509. (Previously installboot(8) would
trash those two sections of the MBR.)

mbrlabel(8):
* Use sys/lib/libkern/xlat_mbr_fstype.c instead of homegrown code
to map the MBR partition type to the NetBSD disklabel type.


Test built "make release" for i386, and new bootblocks verified to work
(even off FAT!).
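
The installboot(8) item in revision 1.4 above keeps two byte ranges of the existing first sector when new boot code is written. A minimal sketch of that merge follows, using the byte offsets quoted in the log; the function and macro names are hypothetical and this is not the installboot source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SECTOR_SIZE  512
#define MODEL_BPB_START    11  /* first BPB byte, per the log */
#define MODEL_BPB_END      89  /* last BPB byte, inclusive */
#define MODEL_PTABLE_START 446 /* first partition table byte */
#define MODEL_PTABLE_END   509 /* last partition table byte, inclusive */

/*
 * Build the sector to write: start from the new boot code, then copy
 * the BPB and the MBR partition table back from the sector on disk.
 */
static void
model_merge_sector(uint8_t *out, const uint8_t *newboot, const uint8_t *ondisk)
{
	assert(out != NULL && newboot != NULL && ondisk != NULL);

	memcpy(out, newboot, MODEL_SECTOR_SIZE);
	memcpy(out + MODEL_BPB_START, ondisk + MODEL_BPB_START,
	    MODEL_BPB_END - MODEL_BPB_START + 1);
	memcpy(out + MODEL_PTABLE_START, ondisk + MODEL_PTABLE_START,
	    MODEL_PTABLE_END - MODEL_PTABLE_START + 1);
}
```

Bytes 510..511 come from the new boot code in this sketch, so the boot signature is whatever the new bootblock carries.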
 1.3 04-Aug-2003  dsl mbr partition stuff isn't saved here (or anywhere else) anymore.
 1.2 10-May-2003  thorpej branches: 1.2.2;
Remove redundant bounds_check_with_label() prototype.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.84.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.78.2 11-Mar-2010  yamt sync with head
 1.6.78.1 04-May-2009  yamt sync with head.
 1.6.74.1 17-Jan-2009  mjf Sync with HEAD.
 1.7.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.8.6.1 24-Oct-2010  jym Sync with HEAD
 1.1 24-Sep-2022  riastradh x86: Support EFI runtime services.

This creates a special pmap, efi_runtime_pmap, which avoids setting
PTE_U but allows mappings to lie in what would normally be user VM --
this way we don't fall afoul of SMAP/SMEP when executing EFI runtime
services from CPL 0. SVS does not apply to the EFI runtime pmap.

The mechanism is intended to work with either physical addressing or
virtual addressing; currently the bootloader does physical addressing
but in principle it could be modified to do virtual addressing
instead, if it allocated virtual pages, assigned them in the memory
map, and issued RT->SetVirtualAddressMap.

Not sure pmap_activate_sync and pmap_deactivate_sync are correct,
need more review from an x86 wizard.

If this causes fallout, it can be disabled temporarily without
reverting anything by just making efi_runtime_init return immediately
without doing anything, or by removing options EFI_RUNTIME.

amd64-only for now pending type fixes and testing on i386.
 1.6 06-Nov-2017  christos Cleanup and clarify the ELFSIZE mess:

We now have 2 variables automatically set in elf_machdep.h:

ARCH_ELFSIZE: the size for userland binaries
KERN_ELFSIZE: the size for the kernel binaries

DB_ELFSIZE has been deleted and KERN_ELFSIZE should have always the
same values DB_ELFSIZE used to have.

In sys/exec_elf.h, if ELFSIZE is not set, it is set to KERN_ELFSIZE
for the kernel and ARCH_ELFSIZE for userland. These defaults should
eliminate the need for most manual ELFSIZE setting.
 1.5 02-Feb-2016  christos Add more relocation constants
 1.4 18-Mar-2010  cegger branches: 1.4.18; 1.4.36;
buildfix: invert comparison to get the 64bit defines by default.
Fixes 'i386/elf_machdep.h: No such file or directory error' when compiling
amd64 toolchain on OSX.
 1.3 30-May-2009  skrll branches: 1.3.2; 1.3.4;
Add TLS relocation definitions.
 1.2 26-Oct-2008  mrg branches: 1.2.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.1 26-Apr-2003  fvdl branches: 1.1.104; 1.1.108; 1.1.114;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.114.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.108.3 11-Aug-2010  yamt sync with head.
 1.1.108.2 20-Jun-2009  yamt sync with head
 1.1.108.1 04-May-2009  yamt sync with head.
 1.1.104.1 17-Jan-2009  mjf Sync with HEAD.
 1.2.8.3 24-Oct-2010  jym Sync with HEAD
 1.2.8.2 01-Nov-2009  jym Sync with HEAD.
 1.2.8.1 31-May-2009  jym Sync with HEAD.
 1.3.4.1 30-May-2010  rmind sync with head
 1.3.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.4.36.1 19-Mar-2016  skrll Sync with HEAD
 1.4.18.1 03-Dec-2017  jdolecek update from HEAD
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.4 30-Jan-2006  dsl Move the definitions of ntohl() and friends into sys/endian.h where they
are defined in terms of bswap32() and bswap16().
This makes the definition be in the same place for all systems regardless
of creed^Wendianness.
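
Revision 1.4 above says ntohl() and friends are defined in terms of bswap32() and bswap16(). The sketch below shows that relationship with hypothetical model_* names. Unlike sys/endian.h, which is assumed here to select the definition at compile time, this version probes the byte order at run time so it works unchanged on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t
model_bswap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	    ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Reverse the two bytes of a 16-bit value. */
static uint16_t
model_bswap16(uint16_t x)
{
	return (uint16_t)((x >> 8) | (x << 8));
}

/* Run-time byte order probe (an assumption made for portability). */
static int
model_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}

/* Host to network order: swap on little-endian hosts, identity otherwise. */
static uint32_t
model_htonl(uint32_t x)
{
	return model_is_little_endian() ? model_bswap32(x) : x;
}

static uint16_t
model_htons(uint16_t x)
{
	return model_is_little_endian() ? model_bswap16(x) : x;
}
```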
 1.3 11-Dec-2005  christos branches: 1.3.2;
merge ktrace-lwp.
 1.2 10-Jun-2004  kleink branches: 1.2.12;
Reflect <sys/endian.h> rev. 1.4: make htonl() et al. arguments and
results uint{16,32}_t. Noted by Ian Zagorskih.
 1.1 26-Apr-2003  fvdl branches: 1.1.2; 1.1.4;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.4.1 14-Jun-2004  tron Pull up revision 1.2 (requested by kleink in ticket #467):
Reflect <sys/endian.h> rev. 1.4: make htonl() et al. arguments and
results uint{16,32}_t. Noted by Ian Zagorskih.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.12.1 21-Jun-2006  yamt sync with head.
 1.3.2.1 01-Feb-2006  yamt sync with head.
 1.3 12-Feb-2014  dsl Add definitions of the default control words directly to this file
instead of pulling the kernel definition of the fpu (etc) into
userspace programs.
I've included machine/fenv.h into x86/cpu.c to ensure the duplicated
definitions stay in step.
The default control words are now the hardware defaults.
XXX: Anyone care to explain the differences between the i386 and amd64
versions of this file?
 1.2 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.1 31-Jul-2010  joerg branches: 1.1.2; 1.1.4; 1.1.6; 1.1.12; 1.1.16; 1.1.26; 1.1.30;
Add support for fenv.h interface for i386 and amd64.

Submitted by Stathis Kamperis as part of GSoC 2010 and ported from
FreeBSD.
 1.1.30.1 18-May-2014  rmind sync with head
 1.1.26.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.16.1 22-May-2014  yamt sync with head.

for reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.1.12.2 05-Mar-2011  rmind sync with head
 1.1.12.1 31-Jul-2010  rmind file fenv.h was added on branch rmind-uvmplock on 2011-03-05 20:49:16 +0000
 1.1.6.2 24-Oct-2010  jym Sync with HEAD
 1.1.6.1 31-Jul-2010  jym file fenv.h was added on branch jym-xensuspend on 2010-10-24 22:47:52 +0000
 1.1.4.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.1.4.1 31-Jul-2010  uebayasi file fenv.h was added on branch uebayasi-xip on 2010-08-17 06:43:53 +0000
 1.1.2.2 11-Aug-2010  yamt sync with head.
 1.1.2.1 31-Jul-2010  yamt file fenv.h was added on branch yamt-nfs-mp on 2010-08-11 22:51:34 +0000
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.14 18-Feb-2014  dsl It seems that firefox includes machine/fpu.h on amd64.
Add the file back so that the firefox source doesn't have to depend
on the version of netbsd it is being compiled for.
(The i386 version doesn't play the same games in its SIGFPE handler.)
 1.13 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.12 07-Feb-2014  dsl Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.11 11-Dec-2013  dsl Remove the fields that were used to save the i387 fp state on interrupt.
They were written but never read.
Possibly they should be saved for 32 bit processes, but that might be a relic
from real i387 where the fpu was actually asynchronous.
 1.10 01-Dec-2013  christos revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.9 11-Nov-2013  joerg NetBSD 6.99.26: Switch i386 and amd64 to the x87 default control word
as initial value for new processes. This means that long double
computations get the expected 63bit mantissa. Binaries tagged as
compiled for 6.99.25 and older get the old value.

Add a simple test case to ensure that double and long double computation
are working correctly.
 1.8 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.7 31-Dec-2012  dsl branches: 1.7.2;
Move the two fields used to save some i387 state on the last fpu trap
into their own sub-structure of the pcb (from 'struct savefpu').
They only (seem) to be used in some code that generates core dumps
for 32bit processes (code that might be broken as well!).
'struct savefpu' is now identical to 'struct fxsave64'. One (or both)
needs extending to support AVX - might need to be dynamically sized.
Removed all the __aligned(16) except for the one in struct pcb itself.
Only the copy used for the fsave instruction need be aligned.
 1.6 15-Dec-2012  dsl Add the offsets and comments for the members of 'struct fxsave64'.
Split the 'fx_unused2' field into its reserved and available halves.
The latter could be used by the kernel software (cpu won't read/write it).
Remove the __padded from 'struct fxsave64', everything is aligned.
Add a CTASSERT that the size is correct (512).
Remove the unused 'struct oldfsave'.
Everything still builds.
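
The size check mentioned in revision 1.6 above can be illustrated with a standard C11 static assertion in place of the kernel's CTASSERT. The layout follows the 512-byte FXSAVE area in 64-bit mode, with the last 96 bytes split into reserved and available halves as the log describes. The struct and field names are hypothetical, not those of the NetBSD header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 64-bit FXSAVE area; comments give byte offsets. */
struct model_fxsave64 {
	uint16_t fx_cw;            /*   0: x87 control word */
	uint16_t fx_sw;            /*   2: x87 status word */
	uint8_t  fx_tw;            /*   4: abridged tag word */
	uint8_t  fx_zero;          /*   5: reserved */
	uint16_t fx_opcode;        /*   6: last x87 opcode */
	uint64_t fx_ip;            /*   8: last instruction pointer */
	uint64_t fx_dp;            /*  16: last data pointer */
	uint32_t fx_mxcsr;         /*  24: SSE control/status */
	uint32_t fx_mxcsr_mask;    /*  28: valid MXCSR bits */
	uint8_t  fx_87_ac[8][16];  /*  32: x87/MMX registers */
	uint8_t  fx_xmm[16][16];   /* 160: XMM registers */
	uint8_t  fx_rsvd[48];      /* 416: reserved by the hardware */
	uint8_t  fx_avail[48];     /* 464: available to software */
};

/* Compile-time checks, standing in for the CTASSERT named in the log. */
static_assert(sizeof(struct model_fxsave64) == 512,
    "FXSAVE area must be exactly 512 bytes");
static_assert(offsetof(struct model_fxsave64, fx_avail) == 464,
    "software-available area starts at byte 464");
```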
 1.5 16-Apr-2008  cegger branches: 1.5.38; 1.5.48;
use POSIX integer types
 1.4 15-Jan-2008  joerg branches: 1.4.6;
Introduce optional cpu_offline_md to execute MD actions at the end of
cpu_offline. Use this on amd64/i386 to force a FPU save. As this was
triggered by npxsave_cpu/fpusave_cpu not working for a different CPU,
remove the cpu_info argument and adjust npxsave_*/fpusave_* to use bool
for the save.

OK ad@
 1.3 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.2 27-Nov-2007  christos branches: 1.2.2; 1.2.6;
Add aligned(16) in savefpu like the i386 port has. Suggested by Matthias
Drochner.
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.60; 1.1.78; 1.1.80; 1.1.86;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.86.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.86.1 08-Dec-2007  mjf Sync with HEAD.
 1.1.80.2 23-Mar-2008  matt sync with HEAD
 1.1.80.1 09-Jan-2008  matt sync with HEAD
 1.1.78.1 03-Dec-2007  joerg Sync with HEAD.
 1.1.60.1 03-Dec-2007  ad Sync with HEAD.
 1.1.18.2 21-Jan-2008  yamt sync with head
 1.1.18.1 07-Dec-2007  yamt sync with head
 1.2.6.2 19-Jan-2008  bouyer Sync with HEAD
 1.2.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.2.2.1 26-Dec-2007  ad Sync with head.
 1.4.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.48.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.5.48.1 25-Feb-2013  tls resync with head
 1.5.38.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.5.38.1 23-Jan-2013  yamt sync with head
 1.7.2.1 18-May-2014  rmind sync with head
 1.23 27-Jun-2025  andvar Grammar and spelling fixes, mainly in comments. A few in documentation,
logging, test description, and SCSI ASC/ASCQ assignment descriptions.
 1.22 14-Feb-2019  cherry branches: 1.22.36;
Welcome XENPVHVM mode.

It is UP only, has xbd(4) and xennet(4) as PV drivers.

The console is com0 at isa and the native portion is very
rudimentary AT architecture, so is probably suboptimal to
run without PV support.
 1.21 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.20 19-Nov-2018  kre Fix editing screwup in previous... noted by Rin Okuyama (thanks!)
 1.19 19-Nov-2018  kre Hide differences between i386 and amd64 interrupt frames so XEN does
not need to know there is one. Hopefully unbreak i386 build.
 1.18 14-Jun-2017  chs branches: 1.18.4; 1.18.6;
add an lwp_trapframe() interface to return an LWP's user trapframe.
needed by dtrace.
 1.17 20-Feb-2014  dsl branches: 1.17.6;
This doesn't need fpu.h, but should include ucontext.h
 1.16 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.15 26-Oct-2008  mrg branches: 1.15.28; 1.15.38; 1.15.44;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.14 28-Apr-2008  martin branches: 1.14.6;
Remove clause 3 and 4 from TNF licenses
 1.13 04-Jan-2008  dsl branches: 1.13.6; 1.13.8; 1.13.10;
Change the way that the trap/intr/syscall frames and the __gregset_t[]
indexes are defined so that only a single list of the registers is used.
The code no longer relies on the two structures matching.
There should be no binary change.
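
One common way to get "a single list of the registers" is the X-macro pattern, sketched below. This is an assumption about the general technique, not a copy of the header: the register list is shortened, and every `SKETCH_`/`sketch_` name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical, shortened register list. Each entry: index name, field name. */
#define SKETCH_REGS(X) \
	X(RDI, rdi) \
	X(RSI, rsi) \
	X(RDX, rdx) \
	X(RCX, rcx)

/* Expansion 1: the frame structure, one 64-bit slot per register. */
struct sketch_frame {
#define X(idx, field) uint64_t tf_##field;
	SKETCH_REGS(X)
#undef X
};

/* Expansion 2: the register-set indexes, in the same order. */
enum {
#define X(idx, field) SKETCH_REG_##idx,
	SKETCH_REGS(X)
#undef X
	SKETCH_NGREG
};

/*
 * Expansion 3: copy by name, so the code keeps working even if the
 * structure order and the index order are later allowed to differ.
 */
void
sketch_frame_to_gregs(const struct sketch_frame *tf,
    uint64_t gregs[SKETCH_NGREG])
{
#define X(idx, field) gregs[SKETCH_REG_##idx] = tf->tf_##field;
	SKETCH_REGS(X)
#undef X
}
```

Adding a register then means editing one list; the structure, the indexes, and the copy routine all follow.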
 1.12 22-Dec-2007  dsl Define 'struct intrframe' in terms of 'struct trapframe' since the two are
assumed to match by a lot of code (including that which saves the regs).
This only slightly reduces the number of places the trapframe register
layout is defined.
 1.11 21-Dec-2007  dsl Create the trap/syscall frame space for all the registers in one go.
Use the trap-frame offsets (TF_foo) for all references to the registers.
Sort the saving of the GP registers into the same order as the trap frame
because consecutive memory accesses are likely to be faster.
 1.10 17-Oct-2007  garbled branches: 1.10.2; 1.10.4; 1.10.8;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.9 21-May-2007  skrll branches: 1.9.10;
Correct comment - it's cpu_switchto now.
 1.8 17-May-2007  yamt merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.7 11-Dec-2005  christos branches: 1.7.26; 1.7.30; 1.7.32; 1.7.38;
merge ktrace-lwp.
 1.6 28-Mar-2004  drochner branches: 1.6.16;
We should ensure stack alignment _after_ subtracting sizeof(sigframe).
Should fix PR bin/24948 by Wolfgang S. Rupprecht.
(nuke getframe() completely because its interface doesn't support this,
and it is used in one place only anyway)
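
The ordering issue in this entry can be shown with plain integer arithmetic. The sketch below assumes a 16-byte alignment requirement and uses hypothetical function names; it does not reproduce the actual signal-delivery code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STACK_ALIGN ((uintptr_t)16)

/*
 * Align first, then subtract: the result is aligned only if the frame
 * size happens to be a multiple of the alignment.
 */
uintptr_t
sketch_frame_align_first(uintptr_t sp, size_t framesize)
{
	return (sp & ~(SKETCH_STACK_ALIGN - 1)) - framesize;
}

/*
 * Subtract first, then align: the result is always aligned and still
 * leaves at least framesize bytes below the original stack pointer.
 */
uintptr_t
sketch_frame_align_last(uintptr_t sp, size_t framesize)
{
	return (sp - framesize) & ~(SKETCH_STACK_ALIGN - 1);
}
```

With a frame size such as 40, which is not a multiple of 16, only the second variant yields an aligned address.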
 1.5 25-Mar-2004  drochner only accept signal trampoline version 2, and remove "struct sigcontext"
 1.4 13-Oct-2003  fvdl Define all frame members as unsigned, to avoid any possibility of
sign extension on these values.
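
A small illustration of the hazard this entry avoids, using hypothetical helper names: widening a signed 32-bit value copies the sign bit into the upper half, while widening an unsigned one zero-fills it.

```c
#include <stdint.h>

/* Widening a signed value replicates the sign bit into the upper 32 bits. */
uint64_t
sketch_widen_signed(int32_t v)
{
	return (uint64_t)(int64_t)v;
}

/* Widening an unsigned value zero-extends, so the bit pattern is kept. */
uint64_t
sketch_widen_unsigned(uint32_t v)
{
	return (uint64_t)v;
}
```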
 1.3 06-Oct-2003  fvdl SIGINFO support.
Todo: 32bit compat support (COMPAT_NETBSD32 will not compile right now,
as it won't on other platforms).
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.16.2 21-Jan-2008  yamt sync with head
 1.6.16.1 03-Sep-2007  yamt sync with head.
 1.7.38.1 22-May-2007  matt Update to HEAD.
 1.7.32.1 11-Jul-2007  mjf Sync with head.
 1.7.30.1 27-May-2007  ad Sync with head.
 1.7.26.1 03-Mar-2007  yamt adapt amd64.

XXX changes in identcpu.c are the minimum for MONITOR.
XXX identcpu.c should be shared with i386.
 1.9.10.2 09-Jan-2008  matt sync with HEAD
 1.9.10.1 06-Nov-2007  matt sync with HEAD
 1.10.8.2 08-Jan-2008  bouyer Sync with HEAD
 1.10.8.1 02-Jan-2008  bouyer Sync with HEAD
 1.10.4.1 26-Dec-2007  ad Sync with head.
 1.10.2.1 18-Feb-2008  mjf Sync with HEAD.
 1.13.10.2 04-May-2009  yamt sync with head.
 1.13.10.1 16-May-2008  yamt sync with head.
 1.13.8.1 18-May-2008  yamt sync with head.
 1.13.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.13.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.14.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.15.44.1 18-May-2014  rmind sync with head
 1.15.38.2 03-Dec-2017  jdolecek update from HEAD
 1.15.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.15.28.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.17.6.1 28-Aug-2017  skrll Sync with HEAD
 1.18.6.1 10-Jun-2019  christos Sync with HEAD
 1.18.4.1 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.22.36.1 02-Aug-2025  perseant Sync with HEAD
 1.8 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.7 26-Apr-2015  mrg remove any pathname for gdb's amd64nbsd-tdep.c, so it doesn't really
matter where it lives. PR#49859.
 1.6 26-Apr-2015  christos PR/49859: Kamil Rytarowski: Invalid path to amd64nbsd-tdep.c in a comment
 1.5 27-Nov-2014  uebayasi branches: 1.5.2;
Improve grep'ability..
 1.4 06-Feb-2008  dsl branches: 1.4.2; 1.4.56;
Update comment about 'struct reg' and __greg alignment, and clarify
the comment about the syscall/trap entry substituting %r10 for %rcx.
 1.3 05-Jan-2008  dsl branches: 1.3.2; 1.3.4; 1.3.6;
Reorder the amd64 trapframe (swap rcx/r10 and add 4 spare slots after r9).
This allows the syscall code to pass the syscall args directly from the
trapframe instead of copying them to a separate structure.
It is still possible that some lurking code still assumes that
'struct trapframe', 'struct mcontext' and 'struct reg' all have the
registers in the same order, but I've fixed enough of them to get gdb working.
 1.2 04-Jan-2008  dsl Repeat after me, don't edit files (even to update comments) between test
build and committing.
 1.1 04-Jan-2008  dsl Change the way that the trap/intr/syscall frames and the __gregset_t[]
indexes are defined so that only a single list of the registers is used.
The code no longer relies on the two structures matching.
There should be no binary change.
 1.3.6.3 11-Feb-2008  yamt sync with head.
 1.3.6.2 21-Jan-2008  yamt sync with head
 1.3.6.1 05-Jan-2008  yamt file frame_regs.h was added on branch yamt-lazymbuf on 2008-01-21 09:35:24 +0000
 1.3.4.3 23-Mar-2008  matt sync with HEAD
 1.3.4.2 09-Jan-2008  matt sync with HEAD
 1.3.4.1 05-Jan-2008  matt file frame_regs.h was added on branch matt-armv6 on 2008-01-09 01:44:53 +0000
 1.3.2.2 08-Jan-2008  bouyer Sync with HEAD
 1.3.2.1 05-Jan-2008  bouyer file frame_regs.h was added on branch bouyer-xeni386 on 2008-01-08 22:09:19 +0000
 1.4.56.1 03-Dec-2017  jdolecek update from HEAD
 1.4.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.4.2.1 06-Feb-2008  mjf file frame_regs.h was added on branch mjf-devfs on 2008-02-18 21:04:21 +0000
 1.5.2.1 06-Jun-2015  skrll Sync with HEAD
 1.55 30-Jul-2022  riastradh x86: Eliminate mfence hotpatch for membar_sync.

The more-compatible LOCK ADD $0,-N(%rsp) turns out to be cheaper
than MFENCE anyway. Let's save some space and maintenance and rip
out the hotpatching for it.
 1.54 09-Apr-2022  riastradh x86: Every load is a load-acquire, so membar_consumer is a noop.

lfence is only needed for MD logic, such as operations on I/O memory
rather than normal cacheable memory, or special instructions like
RDTSC -- never for MI synchronization between threads/CPUs. No need
for hot-patching to do lfence here.

(The x86_lfence function might reasonably be patched on i386 to do
lfence for MD logic, but it isn't now and this doesn't change that.)
 1.53 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.52 19-Jul-2020  maxv Revert most of ad's movs/stos change. Instead do something a lot simpler: declare
svs_quad_copy() used by SVS only, with no need for instrumentation, because
SVS is disabled when sanitizers are on.
 1.51 21-Jun-2020  bouyer Fix comment
 1.50 01-Jun-2020  ad Reported-by: syzbot+6dd5a230d19f0cbc7814@syzkaller.appspotmail.com

Instrument STOS/MOVS for KMSAN to unbreak it.
 1.49 26-Apr-2020  maxv Use the hotpatch framework for LFENCE/MFENCE.
 1.48 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.47 17-Nov-2019  maxv branches: 1.47.6;
Disable KCOV - by raising the interrupt level - in the TLB IPI handler,
because this is only noise.
 1.46 14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
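
The packing of a type code and a compressed pointer into one uint32_t cell can be sketched as follows. The bit split (4 bits of type, 28 bits of pointer) and the "offset from a base address" compression are assumptions made for illustration; the commit message does not state the real widths or scheme, and all `sk_` names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical split: top 4 bits hold the type, low 28 bits the pointer. */
#define SK_PTR_BITS 28
#define SK_PTR_MASK ((UINT32_C(1) << SK_PTR_BITS) - 1)

/* Hypothetical memory-type codes. */
enum sk_type { SK_TYPE_STACK = 1, SK_TYPE_POOL = 2 };

/*
 * Compress ptr as an offset from base and pack it with the type code.
 * Offsets that do not fit in 28 bits are truncated in this sketch.
 */
uint32_t
sk_orig_encode(unsigned type, uintptr_t ptr, uintptr_t base)
{
	uint32_t off = (uint32_t)((ptr - base) & SK_PTR_MASK);

	return ((uint32_t)type << SK_PTR_BITS) | off;
}

/* Recover the type code from a cell. */
unsigned
sk_orig_type(uint32_t orig)
{
	return orig >> SK_PTR_BITS;
}

/* Recover the full pointer from a cell, given the same base. */
uintptr_t
sk_orig_ptr(uint32_t orig, uintptr_t base)
{
	return base + (orig & SK_PTR_MASK);
}
```

The point of such an encoding is that the origin metadata fits in the cell itself, so no side table is needed per tracked word.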

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
 1.45 12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.44 18-May-2019  maxv Two changes in the CPU mitigations:

* Micro-optimize: put every mitigation in the same branch. This removes
two branches in each exc/int return path, and removes all branches in
the syscall return path.

* Modify the SpectreV2 mitigation to be compatible with SpectreV4. I
recently realized that both couldn't be enabled at the same time on
Intel. This is because initially, when there was just SpectreV2, we
could reset the whole IA32_SPEC_CTRL MSR. But then Intel added another
bit in it for SpectreV4, so it isn't right to reset it entirely
anymore. SSBD needs to stay.
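
The second change can be illustrated on a plain integer standing in for the IA32_SPEC_CTRL value. The bit positions (IBRS is bit 0, SSBD is bit 2) are the architectural ones; the function names are hypothetical, and the sketch performs no real MSR access.

```c
#include <stdint.h>

#define SK_SPEC_CTRL_IBRS (UINT64_C(1) << 0) /* SpectreV2 mitigation bit */
#define SK_SPEC_CTRL_SSBD (UINT64_C(1) << 2) /* SpectreV4 mitigation bit */

/* Entering the kernel: set IBRS, leave every other bit alone. */
uint64_t
sk_spec_ctrl_enter(uint64_t msr)
{
	return msr | SK_SPEC_CTRL_IBRS;
}

/* Old-style leave: reset the whole register. This also drops SSBD. */
uint64_t
sk_spec_ctrl_leave_reset(uint64_t msr)
{
	(void)msr;
	return 0;
}

/* Compatible leave: clear only IBRS, so SSBD stays enabled. */
uint64_t
sk_spec_ctrl_leave_keep_ssbd(uint64_t msr)
{
	return msr & ~SK_SPEC_CTRL_IBRS;
}
```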
 1.43 14-May-2019  maxv Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).

It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
 1.42 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.41 12-Aug-2018  maxv Move the PCPU area from slot 384 to slot 510, to avoid creating too much
fragmentation in the slot space (384 is in the middle of the kernel half
of the VA).
 1.40 13-Jul-2018  martin Provide empty SVS_ENTER_NMI/SVS_LEAVE_NMI for kernels w/o options SVS
 1.39 12-Jul-2018  maxv Handle NMIs correctly when SVS is enabled. We store the kernel's CR3 at the
top of the NMI stack, and we unconditionally switch to it, because we don't
know with which page tables we received the NMI. Hotpatch the whole thing as
usual.

This restores the ability to use PMCs on Intel CPUs.
 1.38 28-Mar-2018  maxv branches: 1.38.2;
Add the IBRS mitigation for SpectreV2 on amd64.

Different operations are performed during context transitions:

user->kernel: IBRS <- 1
kernel->user: IBRS <- 0

And during context switches:

user->user: IBPB <- 0
kernel->user: IBPB <- 0
[user->kernel:IBPB <- 0 this one may not be needed]

We use two macros, IBRS_ENTER and IBRS_LEAVE, to set the IBRS bit. The
thing is hotpatched for better performance, like SVS.

The idea is that IBRS is a "privileged" bit, which is set to 1 in kernel
mode and 0 in user mode. To protect the branch predictor between user
processes (which are of the same privilege), we use the IBPB barrier.

The Intel manual also talks about (MWAIT/HLT)+HyperThreading, and says
that when using either of the two instructions IBRS must be disabled for
better performance on the core. I'm not totally sure about this part, so
I'm not adding it now.

IBRS is available only when the Intel microcode update is applied. The
mitigation must be enabled manually with machdep.spectreV2.mitigated.

Tested by msaitoh a week ago (but I adapted a few things since). Probably
more changes to come.
 1.37 25-Feb-2018  maxv branches: 1.37.2;
Remove INTRENTRY_L, it's not used anymore.
 1.36 22-Feb-2018  maxv Make the machdep.svs_enabled sysctl writable, and add the kernel code
needed to disable SVS at runtime.

We set 'svs_enabled' to false, and hotpatch the kernel entry/exit points
to eliminate the context switch code.

We need to make sure there is no remote CPU that is executing the code we
are hotpatching. So we use two barriers:

* After the first one each CPU is guaranteed to be executing in
svs_disable_cpu with interrupts disabled (this way it can't leave this
place).

* After the second one it is guaranteed that SVS is disabled, so we flush
the cache, enable interrupts and continue execution normally.

Between the two barriers, cpu0 will disable SVS (svs_enabled=false and
hotpatch), and each CPU will restore the generic syscall entry point.

Three notes:

* We should call svs_pgg_update(true) afterwards, to put back PG_G on
the kernel pages (for better performance). This will be done in another
commit.

* The fact that we disable interrupts does not prevent us from receiving
an NMI, and it would be problematic. So we need to add some code to
verify that PMCs are disabled before hotpatching. This will be done
in another commit.

* In svs_disable() we expect each CPU to be online. We need to add a
check to make sure they indeed are.

The sysctl allows only a 1->0 transition. There is no point in doing 0->1
transitions anyway, and it would be complicated to implement because we
need to re-synchronize the CPU user page tables with the current ones (we
lost track of them in the last 1->0 transition).
 1.35 22-Feb-2018  maxv Add a dynamic detection for SVS.

The SVS_* macros are now compiled as skip-noopt. When the system boots, if
the cpu is from Intel, they are hotpatched to their real content.
Typically:

jmp 1f
int3
int3
int3
... int3 ...
1:

gets hotpatched to:

movq SVS_UTLS+UTLS_KPDIRPA,%rax
movq %rax,%cr3
movq CPUVAR(KRSP0),%rsp

These two chunks of code being of the exact same size. We put int3 (0xCC)
to make sure we never execute there.
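
The placeholder layout above can be sketched as a byte buffer. This assumes the jump is encoded as a two-byte short jump (opcode 0xEB followed by an 8-bit displacement), which the commit message does not spell out; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill buf[0..len) with a short jump over the rest of the area, padded
 * with int3 (0xCC). The displacement counts bytes after the two-byte
 * jump, so it is len - 2 and must fit in a positive signed byte.
 * Returns 0 on success, -1 if the area cannot hold such a stub.
 */
int
sk_make_skip_stub(uint8_t *buf, size_t len)
{
	if (len < 2 || len - 2 > 127)
		return -1;
	buf[0] = 0xEB;               /* jmp rel8 */
	buf[1] = (uint8_t)(len - 2); /* land just past the padded area */
	memset(buf + 2, 0xCC, len - 2);
	return 0;
}
```

Since the stub and the real sequence occupy the same number of bytes, replacing one with the other never shifts the code that follows.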

In the non-SVS (ie non-Intel) case, all it costs is one jump. Given that
the SVS_* macros are small, this jump will likely leave us in the same
icache line, so it's pretty fast.

The syscall entry point is special, because there we use a scratch uint64_t
not in curcpu but in the UTLS page, and it's difficult to hotpatch this
properly. So instead of hotpatching we declare the entry point as an ASM
macro, and define two functions: syscall and syscall_svs, the latter being
the one used in the SVS case.

While here 'syscall' is optimized not to contain an SVS_ENTER - this way
we don't even need to do a jump on the non-SVS case.

When adding pages in the user page tables, make sure we don't have PG_G,
now that it's dynamic.

A read-only sysctl is added, machdep.svs_enabled, that tells whether the
kernel uses SVS or not.

More changes to come, svs_init() is not very clean.
 1.34 27-Jan-2018  maxv Put the default %cs value in INTR_RECURSE_HWFRAME. Pushing an immediate
costs less than reading the %cs register and pushing its value. This
value is not allowed to be != GSEL(GCODE_SEL,SEL_KPL) in all cases.
 1.33 27-Jan-2018  maxv Declare and use INTR_RECURSE_ENTRY, an optimized version of INTRENTRY.
When processing deferred interrupts, we are always entering the new
handler in kernel mode, so there is no point performing the userland
checks.

Saves several instructions.
 1.32 27-Jan-2018  maxv Remove DO_DEFERRED_SWITCH and DO_DEFERRED_SWITCH_RETRY, unused.
 1.31 21-Jan-2018  maxv Unmap the kernel from userland in SVS, and leave only the needed
trampolines. As explained below, SVS should now completely mitigate
Meltdown on GENERIC kernels, even though it needs some more tweaking
for GENERIC_KASLR.

Until now the kernel entry points looked like:

FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
... handle interrupt ...
INTRFASTEXIT
END(intr)

With this change they are split and become:

FUNC(handle)
... handle interrupt ...
INTRFASTEXIT
END(handle)

TEXT_USER_BEGIN
FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
jmp handle
END(intr)
TEXT_USER_END

A new section is introduced, .text.user, that contains minimal kernel
entry/exit points. In order to choose what to put in this section, two
macros are introduced, TEXT_USER_BEGIN and TEXT_USER_END.

The section is mapped in userland with normal 4K pages.

In GENERIC, the section is 4K-page-aligned and embedded in .text, which
is mapped with large pages. That is to say, when an interrupt comes in,
the CPU has the user page tables loaded and executes the 'intr' functions
on 4K pages; after calling SVS_ENTER (in INTRENTRY) these 4K pages become
2MB large pages, and remain so when executing in kernel mode.

In GENERIC_KASLR, the section is 4K-page-aligned and independent from the
other kernel texts. The prekern just picks it up and maps it at a random
address.

In GENERIC, SVS should now completely mitigate Meltdown: what we put in
.text.user is not secret.

In GENERIC_KASLR, SVS would have to be improved a bit more: the
'jmp handle' instruction is actually secret, since it leaks the address
of the section we are jumping into. By exploiting Meltdown on Intel, this
theoretically allows a local user to reconstruct the address of the first
text section. But given that our KASLR produces several texts, and that
each section is not correlated with the others, the level of protection
KASLR provides is still good.
 1.30 20-Jan-2018  maxv Use .pushsection/.popsection, we will soon embed macros in several layers
of nested sections.
 1.29 18-Jan-2018  maxv Unmap the kernel heap from the user page tables (SVS).

This implementation is optimized and organized in such a way that we
don't need to copy the kernel stack to a safe place during user<->kernel
transitions. We create two VAs that point to the same physical page; one
will be mapped in userland and is offset in order to contain only the
trapframe, the other is mapped in the kernel and maps the entire stack.

Sent on tech-kern@ a week ago.
 1.28 11-Jan-2018  maxv Declare new SVS_* variants: SVS_ENTER_NOSTACK and SVS_LEAVE_NOSTACK. Use
SVS_ENTER_NOSTACK in the syscall entry point, and put it before the code
that touches curlwp. (curlwp is located in the direct map.)

Then, disable __HAVE_CPU_UAREA_ROUTINES (to be removed later). This moves
the kernel stack into pmap_kernel(), and not the direct map. That's a
change I've always wanted to make: because of the direct map we can't add
a redzone on the stack, and basically, a stack overflow can go very far
in memory without being detected (as far as erasing all of the system's
memory).

Finally, unmap the direct map from userland.
 1.27 07-Jan-2018  maxv Add a new option, SVS (for Separate Virtual Space), that unmaps kernel
pages when running in userland. For now, only the PTE area is unmapped.

Sent on tech-kern@.
 1.26 07-Jan-2018  maxv Switch x86_retpatch[] -> HOTPATCH().
 1.25 07-Jan-2018  maxv Switch x86_lockpatch[] -> HOTPATCH().
 1.24 07-Jan-2018  maxv Implement a real hotpatch feature.

Define a HOTPATCH() macro, that puts a label and additional information
in the new .rodata.hotpatch kernel section. In patch.c, scan the section
and patch what needs to be. Now it is possible to hotpatch the content of
a macro.
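
The scan-and-patch step can be sketched in user space with an ordinary array standing in for the kernel section. The descriptor layout and all `sk_` names are assumptions; the real table lives in a linker section and the real patcher must also deal with write protection and other CPUs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: what a HOTPATCH()-style macro might record. */
struct sk_hotpatch {
	uint8_t  name; /* which patch this site belongs to */
	uint8_t  size; /* number of bytes reserved at the site */
	uint8_t *addr; /* address of the site */
};

/*
 * Walk the table and overwrite every site registered under `name` with
 * the replacement bytes. Sites whose reserved size differs are skipped
 * rather than partially patched. Returns the number of sites patched.
 */
size_t
sk_hotpatch_apply(const struct sk_hotpatch *tab, size_t n, uint8_t name,
    const uint8_t *bytes, size_t size)
{
	size_t i, patched = 0;

	for (i = 0; i < n; i++) {
		if (tab[i].name != name || tab[i].size != size)
			continue;
		memcpy(tab[i].addr, bytes, size);
		patched++;
	}
	return patched;
}
```

Because each site is recorded by address rather than by function, code emitted from inside a macro can be patched like any other.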

SMAP is switched to use this new system; this saves a call+ret in each
kernel entry/exit point.

Many other operating systems do the same.
 1.23 17-Oct-2017  maxv Have the cpu clear PSL_D automatically when entering the kernel via a
syscall. Then, don't clear PSL_D and PSL_AC in the syscall entry point,
they are now both cleared by the cpu (faster). However they still need to
be manually cleared in the interrupt/trap entry points.
 1.22 17-Oct-2017  maxv Add support for SMAP on amd64.

PSL_AC is cleared from %rflags in each kernel entry point. In the copy
sections, a copy window is opened and the kernel can touch userland
pages. This window is closed when the kernel is done, either at the end
of the copy sections or in the fault-recover functions.

This implementation is not optimized yet, due to the fact that INTRENTRY
is a macro, and we can't hotpatch macros.

Sent on tech-kern@ a month or two ago, tested on a Kabylake.
 1.21 15-Sep-2017  maxv Declare INTRFASTEXIT as a function, so that there is only one iretq in the
kernel. Then, check %rip against the address of this iretq instead of
disassembling (%rip) - which could fault again, or point at some random
address which happens to contain the iretq opcode. The same is true for gs
below, but I'll fix that in another commit.
 1.20 15-Jul-2012  dsl branches: 1.20.2; 1.20.32;
Rename MDP_IRET to MDL_IRET since it is an lwp flag, not a proc one.
Add an MDL_COMPAT32 flag to the lwp's md_flags, set it for 32bit lwps
and use it to force 'return to user' with iret (as is done when
MDL_IRET is set).
Split the iret/sysret code paths much later.
Remove all the replicated code for 32bit system calls - which was only
needed so that iret was always used.
frameasm.h for XEN contains '#define swapgs'; while XEN probably never
needs swapgs, this is likely to be confusing.
Add a SWAPGS which is a nop on XEN and swapgs otherwise.
(I've not yet checked all the swapgs in files that include frameasm.h)
Simple x86 programs still work.
Hijack 6.99.9 kernel bump (needed for compat32 modules)
 1.19 17-May-2012  dsl Let the user of INTRENTRY_L() place a label on the 'swapgs' used
when faulting from user space.
 1.18 07-May-2012  dsl Add a ';' that got deleted in a slight tidyup.
 1.17 07-May-2012  dsl Move all the XEN differences to a single conditional.
Merge the XEN/non-XEN versions of INTRFASTEXIT and
INTR_RECURSE_HWFRAME by using extra defines.
Split INTRENTRY so that code can insert extra instructions
inside user/kernel conditional.
 1.16 10-Aug-2011  cherry branches: 1.16.2; 1.16.6; 1.16.8;
Correct offset calculation for ci
 1.15 12-Jan-2011  joerg branches: 1.15.6;
Allow use of traditional CPP to be set on a per platform base in sys.mk.
Honour this for dependency processing in bsd.dep.mk. Switch i386 and
amd64 assembly to use ISO C90 preprocessor concat and drop the
-traditional-cpp on this platform.
 1.14 07-Jul-2010  chs add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.13 21-Nov-2008  ad branches: 1.13.4; 1.13.6; 1.13.8;
PR port-amd64/39991 modules/compat_linux: build fix
 1.12 21-Apr-2008  cegger branches: 1.12.2; 1.12.8; 1.12.10; 1.12.12; 1.12.14; 1.12.18;
Access Xen's vcpu info structure per-CPU.
Tested on i386 and amd64 (both dom0 and domU) by me.
Xen2 tested (both dom0 and domU) by bouyer.
OK bouyer
 1.11 29-Feb-2008  yamt branches: 1.11.2;
don't bother to check curlwp==NULL.
 1.10 21-Dec-2007  dsl branches: 1.10.2; 1.10.6;
Create the trap/syscall frame space for all the registers in one go.
Use the trap-frame offsets (TF_foo) for all references to the registers.
Sort the saving of the GP registers into the same order as the trap frame
because consecutive memory accesses are likely to be faster.
 1.9 21-Dec-2007  dsl Change the xen CLI() and STI() defines to only use one scratch register.
As well as saving an instruction, in one place it saves a push/pop pair.
 1.8 22-Nov-2007  bouyer branches: 1.8.2; 1.8.6;
Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.7 14-Nov-2007  ad Clear the direction flag on entry to the kernel.
 1.6 18-Oct-2007  yamt branches: 1.6.2;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.5 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 21-May-2007  fvdl branches: 1.4.8; 1.4.10; 1.4.12; 1.4.14;
Revert fs/gs changes until I figure out issues with them.
 1.3 11-May-2007  fvdl Don't save/restore %fs and %gs in trapframe. The kernel won't touch them.
Instead, save/restore them on context switch. For 32bit processes, save/restore
the selector values only, for 64bit processes, save/restore the appropriate
MSRs. Iff the defaults have been changed.
 1.2 09-Feb-2007  ad branches: 1.2.2; 1.2.6; 1.2.8; 1.2.14;
Merge newlock2 to head.
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.48;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.48.1 20-Oct-2006  ad Make ASTs per-LWP.
 1.1.18.6 17-Mar-2008  yamt sync with head.
 1.1.18.5 21-Jan-2008  yamt sync with head
 1.1.18.4 07-Dec-2007  yamt sync with head
 1.1.18.3 15-Nov-2007  yamt sync with head.
 1.1.18.2 27-Oct-2007  yamt sync with head.
 1.1.18.1 26-Feb-2007  yamt sync with head.
 1.2.14.1 22-May-2007  matt Update to HEAD.
 1.2.8.1 11-Jul-2007  mjf Sync with head.
 1.2.6.3 03-Dec-2007  ad Sync with HEAD.
 1.2.6.2 03-Dec-2007  ad Sync with HEAD.
 1.2.6.1 23-Oct-2007  ad Sync with head.
 1.2.2.1 17-May-2007  yamt sync with head.
 1.4.14.4 18-Nov-2007  bouyer Sync with HEAD
 1.4.14.3 25-Oct-2007  bouyer Finish sync with HEAD. Especially use the new x86 pmap for xenamd64.
For this:
- rename pmap_pte_set() to pmap_pte_testset()
- make pmap_pte_set() a function or macro for non-atomic PTE write
- define and use pmap_pa2pte()/pmap_pte2pa() to read/write PTE entries
- define pmap_pte_flush() which is a nop in x86 case, and flush the
MMUops queue in the Xen case
 1.4.14.2 18-Oct-2007  bouyer Explicitly set the flag argument of HYPERVISOR_iret to 0.
 1.4.14.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue!)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow merging xenamd64 back to xen or amd64.
 1.4.12.1 30-Sep-2007  yamt implement deferred pmap switching for amd64, and make amd64 use
x86 shared pmap code. it makes several i386 pmap improvements available
to amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
 1.4.10.3 23-Mar-2008  matt sync with HEAD
 1.4.10.2 09-Jan-2008  matt sync with HEAD
 1.4.10.1 06-Nov-2007  matt sync with HEAD
 1.4.8.3 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.4.8.2 14-Nov-2007  joerg Sync with HEAD.
 1.4.8.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.6.2.3 27-Dec-2007  mjf Sync with HEAD.
 1.6.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.6.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.8.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.8.2.1 26-Dec-2007  ad Sync with head.
 1.10.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.10.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.10.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.10.2.1 24-Mar-2008  keiichi sync with head.
 1.11.2.1 18-May-2008  yamt sync with head.
 1.12.18.1 12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.14.1 12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.12.1 12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.10.1 19-Jan-2009  skrll Sync with HEAD.
 1.12.8.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.12.2.2 11-Aug-2010  yamt sync with head.
 1.12.2.1 04-May-2009  yamt sync with head.
 1.13.8.1 05-Mar-2011  rmind sync with head
 1.13.6.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.13.4.3 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.13.4.2 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.13.4.1 24-Oct-2010  jym Sync with HEAD
 1.15.6.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.16.8.1 03-Jun-2012  jdc Pull up revisions:
src/sys/arch/amd64/include/frameasm.h revision 1.17-1.19
src/sys/arch/amd64/amd64/vector.S revision 1.40-1.41
src/sys/arch/amd64/amd64/trap.c revision 1.71
(requested by dsl in ticket #280).

Move all the XEN differences to a single conditional.
Merge the XEN/non-XEN versions of INTRFASTEXIT and
INTR_RECURSE_HWFRAME by using extra defines.
Split INTRENTRY so that code can insert extra instructions
inside user/kernel conditional.

Add a ';' that got deleted in a slight tidyup.

Rejig the way TRAP() and ZTRAP() are defined and add Z/TRAP_NJ() that
excludes the 'jmp alltraps'.
Use the _NJ versions for trap entries with non-standard code.
Move all the KDTRACE_HOOKS code into a single block inside the
IDTVEC(trap03) code. This removes a mis-predicted branch from every
trap when KDTRACE_HOOKS are enabled.
Add a few blank lines, need some comments as well :-)
No functional changes intended.

Let the user of INTRENTRY_L() place a label on the 'swapgs' used
when faulting from user space.

If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
 1.16.6.1 02-Jun-2012  mrg sync to latest -current.
 1.16.2.2 30-Oct-2012  yamt sync with head
 1.16.2.1 23-May-2012  yamt sync with head.
 1.20.32.4 14-May-2019  martin Pull up following revision(s) (requested by maxv in ticket #1269):

sys/arch/amd64/amd64/locore.S: revision 1.181 (adapted)
sys/arch/amd64/amd64/amd64_trap.S: revision 1.47 (adapted)
sys/arch/x86/include/specialreg.h: revision 1.144 (adapted)
sys/arch/amd64/include/frameasm.h: revision 1.43 (adapted)
sys/arch/x86/x86/spectre.c: revision 1.27 (adapted)

Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).
It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
 1.20.32.3 14-Apr-2018  martin Pullup the following revisions via patch, requested by maxv in ticket #748:

sys/arch/amd64/amd64/copy.S 1.29 (adapted, via patch)
sys/arch/amd64/amd64/amd64_trap.S 1.16,1.19 (partial) (via patch)
sys/arch/amd64/amd64/trap.c 1.102,1.106 (partial),1.110 (via patch)
sys/arch/amd64/include/frameasm.h 1.22,1.24 (via patch)
sys/arch/x86/x86/cpu.c 1.137 (via patch)
sys/arch/x86/x86/patch.c 1.23,1.26 (partial) (via patch)

Backport of SMAP support.
 1.20.32.2 22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.20.32.1 07-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in ticket #610:

sys/arch/amd64/amd64/amd64_trap.S 1.8,1.10,1.12 (partial),1.13-1.15,
1.19 (partial),1.20,1.21,1.22,1.24
(via patch)
sys/arch/amd64/amd64/locore.S 1.129 (partial),1.132 (via patch)
sys/arch/amd64/amd64/trap.c 1.97 (partial),1.111 (via patch)
sys/arch/amd64/amd64/vector.S 1.54,1.55 (via patch)
sys/arch/amd64/include/frameasm.h 1.21,1.23 (via patch)
sys/arch/x86/x86/cpu.c 1.138 (via patch)
sys/arch/xen/conf/Makefile.xen 1.45 (via patch)

Rename and reorder several things in amd64_trap.S.
Compile amd64_trap.S as a file.
Introduce nmitrap and doubletrap.
Have the CPU clear PSL_D automatically in the syscall entry point.
 1.20.2.1 03-Dec-2017  jdolecek update from HEAD
 1.37.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.37.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.37.2.1 30-Mar-2018  pgoyette Resolve conflicts between branch and HEAD
 1.38.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.38.2.1 10-Jun-2019  christos Sync with HEAD
 1.47.6.1 11-Apr-2020  bouyer Include ci_isources[] for XenPV too.
Adjust spllower() to XenPV needs, and switch XenPV to the native spllower().
Remove xen_spllower().
 1.14 30-Apr-2021  christos Merge the x86 gdt function and constant definitions
 1.13 30-Apr-2021  christos Bump MAX_USERLDT_SIZE to the max size (wastes some memory). wine needs more
than PAGE_SIZE and fails spuriously.
XXX: Note the duplicate definition hacks. Should really create <x86/gdt.h>,
put just the constants there and unify them.
This would also avoid the hack in: src/tests/lib/libi386/t_user_ldt.c#46
 1.12 25-Apr-2020  bouyer branches: 1.12.6;
Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.11 24-Apr-2020  maxv Give the ldt a fixed size of one page (512 slots), and drop the variable-
sized mechanism that was too complex.

This fixes a race between USER_LDT and SVS: during context switches, the
way SVS installs the new ldt relies on the ldt pointer AND the ldt size,
but both cannot be accessed atomically at the same time.
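
The "one page (512 slots)" figure above can be checked with a little arithmetic. The sketch below assumes a 4096-byte page and 8-byte descriptor slots; the EX_* names and helper functions are hypothetical and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not the kernel's definitions. */
#define EX_PAGE_SIZE ((size_t)4096)   /* assumed page size in bytes */
typedef uint64_t ex_ldt_slot_t;       /* assumed 8-byte descriptor slot */

/* Number of slots that fit in one page. */
#define EX_LDT_SLOTS (EX_PAGE_SIZE / sizeof(ex_ldt_slot_t))

/* Slot count of a fixed-size, one-page table. */
size_t ex_ldt_slots(void)
{
    return EX_LDT_SLOTS;
}

/* Byte size of the table; a constant, so only the pointer varies. */
size_t ex_ldt_bytes(void)
{
    return EX_LDT_SLOTS * sizeof(ex_ldt_slot_t);
}
```

With the size fixed at compile time, a reader of the table only has to load one value (the pointer), which is the property the log message relies on.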
 1.10 08-Feb-2017  maxv branches: 1.10.24;
Remove gdt_reload_cpu. GDTR takes a VA as base, and in our x86
implementation this VA is per-cpu and does not change; there is therefore
no need to remotely reload GDTR.
 1.9 08-Feb-2017  maxv Localify, add a comment and merge some others.
 1.8 20-Aug-2016  maxv branches: 1.8.2;
Make this area compile, even if we don't support USER_LDT on amd64.
 1.7 07-Jul-2010  chs branches: 1.7.18; 1.7.36; 1.7.40;
add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.6 14-Mar-2009  dsl branches: 1.6.2; 1.6.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.5 28-Apr-2008  martin branches: 1.5.8; 1.5.14;
Remove clause 3 and 4 from TNF licenses
 1.4 05-Jan-2008  yamt branches: 1.4.6; 1.4.8; 1.4.10;
- make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.3 11-Dec-2005  christos branches: 1.3.50; 1.3.56; 1.3.64;
merge ktrace-lwp.
 1.2 16-Jun-2004  fvdl branches: 1.2.12;
When converting GDT length units from segment structures to bytes for the
amd64 port, I converted MINGDTSIZ wrongly; it was not page aligned, causing
gdt_grow to corrupt the GDT. Fix this, and remove the extraneous definitions
of the sizes from gdt.c.

From OpenBSD.
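
As an illustration of the alignment requirement described above, here is a minimal sketch of rounding a byte count up to a page multiple. It assumes a 4096-byte page; ex_round_page and ex_is_page_aligned are hypothetical helpers, not the macros used in the actual header.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; must be a power of two for the mask trick below. */
#define EX_PAGE_SIZE ((size_t)4096)

/* Round a byte count up to the next multiple of the page size. */
size_t ex_round_page(size_t nbytes)
{
    return (nbytes + EX_PAGE_SIZE - 1) & ~(EX_PAGE_SIZE - 1);
}

/* Return 1 if the byte count is an exact multiple of the page size. */
int ex_is_page_aligned(size_t nbytes)
{
    return (nbytes & (EX_PAGE_SIZE - 1)) == 0;
}
```

A size expressed as "number of descriptors times descriptor size" is generally not a page multiple, so code that grows a table in page-sized steps needs a check or a round-up like this first.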
 1.1 26-Apr-2003  fvdl branches: 1.1.2; 1.1.4;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.4.1 17-Jun-2004  tron Pull up revision 1.2 (requested by fvdl in ticket #506):
When converting GDT length units from segment structures to bytes for the
amd64 port, I converted MINGDTSIZ wrongly; it was not page aligned, causing
gdt_grow to corrupt the GDT. Fix this, and remove the extraneous definitions
of the sizes from gdt.c.
From OpenBSD.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.12.1 21-Jan-2008  yamt sync with head
 1.3.64.1 08-Jan-2008  bouyer Sync with HEAD
 1.3.56.1 18-Feb-2008  mjf Sync with HEAD.
 1.3.50.1 09-Jan-2008  matt sync with HEAD
 1.4.10.3 11-Aug-2010  yamt sync with head.
 1.4.10.2 04-May-2009  yamt sync with head.
 1.4.10.1 16-May-2008  yamt sync with head.
 1.4.8.1 18-May-2008  yamt sync with head.
 1.4.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.14.3 24-Oct-2010  jym Sync with HEAD
 1.5.14.2 01-Nov-2009  jym Sync with HEAD.
 1.5.14.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.5.8.1 28-Apr-2009  skrll Sync with HEAD.
 1.6.4.1 05-Mar-2011  rmind sync with head
 1.6.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.7.40.1 20-Mar-2017  pgoyette Sync with HEAD
 1.7.36.2 28-Aug-2017  skrll Sync with HEAD
 1.7.36.1 05-Oct-2016  skrll Sync with HEAD
 1.7.18.1 03-Dec-2017  jdolecek update from HEAD
 1.8.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.10.24.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.12.6.1 13-May-2021  thorpej Sync with HEAD.
 1.10 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.9 13-Nov-2017  nakayama Don't write a 1 to the read only RIRR bit in the IOAPIC redirection
register to fix "tlp0: filter setup and transmit timeout" observed
on Hyper-V VMs with the Legacy Network Adapter.

From OpenBSD via PR kern/49323:

https://marc.info/?l=openbsd-cvs&m=146718035432599&w=2

| Modified files:
| sys/arch/amd64/amd64: ioapic.c
| sys/arch/amd64/include: i82093reg.h
|
| Log message:
| Don't write a 1 to the RIRR bit in the IOAPIC redirection register. This bit
| is R/O, and although it should not matter what value is written there,
| Hyper-V's emulated IOAPIC interprets a write of 1 in some unexpected way and
| subsequently blocks interrupt delivery. This primarily manifests itself as
| de(4) timeouts when using Hyper-V VMs with the "Legacy Network Adapter"
| interface.

Tested both amd64 and i386 on Client Hyper-V on Windows 10.
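
The fix described above amounts to masking out the read-only bit before a read-modify-write of the redirection entry. The sketch below is an assumption-laden illustration: it places Remote IRR at bit 14 of the low 32 bits of the entry (as in the 82093AA layout), and the macro and function names are hypothetical rather than taken from i82093reg.h or ioapic.c.

```c
#include <assert.h>
#include <stdint.h>

/* Remote IRR, assumed to be bit 14 of the low redirection dword (read-only). */
#define EX_IOAPIC_REDLO_RIRR (UINT32_C(1) << 14)

/*
 * Given the value just read from the low redirection dword, return the
 * value that is safe to write back: identical except that the read-only
 * Remote IRR bit is always written as 0.
 */
uint32_t ex_redlo_for_write(uint32_t redlo_read)
{
    return redlo_read & ~EX_IOAPIC_REDLO_RIRR;
}
```

All other bits (vector, trigger mode, mask) pass through unchanged, so the helper can sit in front of any write path without altering its behavior on hardware that ignores writes to the bit.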
 1.8 23-May-2017  nonaka branches: 1.8.2;
x86: Add preliminary x2APIC support.

x2APIC is used only when x2APIC is enabled in BIOS/UEFI.
LAPIC ID is not supported above 256.
 1.7 25-Nov-2016  maxv Move the virtual address of the LAPIC page out of the data segment on amd64
and i386. The old design was error-prone, and it didn't allow us to map the
data segment with large pages.

Now, the VA is allocated dynamically in the pmap bootstrap code, and entered
manually later. We go from using &local_apic to using *local_apic_va, and we
therefore need one more level of indirection in the asm code.

Discussed on tech-kern.
 1.6 11-Aug-2016  maxv Use absolute addressing mode, just like the rest.
 1.5 03-Jul-2008  drochner branches: 1.5.40; 1.5.58; 1.5.60; 1.5.62; 1.5.64; 1.5.68;
Remove "struct device" from "struct pic", where it was only real
for ioapics and faked up for others. Add it to "struct ioapic_softc"
for now, until device/softc get split.
This required all typecasts between "struct pic" and "struct ioapic_softc"
to be replaced, I hope I got them all.
functionally tested on i386, compile-tested on xen, untested on amd64
 1.4 27-Apr-2008  skd branches: 1.4.2; 1.4.4;
Fix pic locking to mirror simple_lock primitives.
 1.3 11-May-2003  fvdl branches: 1.3.104; 1.3.106; 1.3.108;
Reselect the ioapic register for each read or write.
 1.2 04-May-2003  fvdl Follow i386, and mask deferred level-triggered interrupts at the ioapic.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.3.108.2 04-May-2009  yamt sync with head.
 1.3.108.1 16-May-2008  yamt sync with head.
 1.3.106.1 18-May-2008  yamt sync with head.
 1.3.104.2 28-Sep-2008  mjf Sync with HEAD.
 1.3.104.1 02-Jun-2008  mjf Sync with HEAD.
 1.4.4.1 03-Jul-2008  simonb Sync with head.
 1.4.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.5.68.1 03-Jan-2018  snj Pull up following revision(s) (requested by nakayama in ticket #1527):
sys/arch/amd64/include/i82093reg.h: revision 1.9
sys/arch/i386/include/i82093reg.h: revision 1.11
sys/arch/x86/x86/ioapic.c: revision 1.54
Don't write a 1 to the read only RIRR bit in the IOAPIC redirection
register to fix "tlp0: filter setup and transmit timeout" observed
on Hyper-V VMs with the Legacy Network Adapter.
From OpenBSD via PR kern/49323:
https://marc.info/?l=openbsd-cvs&m=146718035432599&w=2
 1.5.64.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.5.62.1 03-Jan-2018  snj Pull up following revision(s) (requested by nakayama in ticket #1527):
sys/arch/amd64/include/i82093reg.h: revision 1.9
sys/arch/i386/include/i82093reg.h: revision 1.11
sys/arch/x86/x86/ioapic.c: revision 1.54
Don't write a 1 to the read only RIRR bit in the IOAPIC redirection
register to fix "tlp0: filter setup and transmit timeout" observed
on Hyper-V VMs with the Legacy Network Adapter.
From OpenBSD via PR kern/49323:
https://marc.info/?l=openbsd-cvs&m=146718035432599&w=2
 1.5.60.3 28-Aug-2017  skrll Sync with HEAD
 1.5.60.2 05-Dec-2016  skrll Sync with HEAD
 1.5.60.1 05-Oct-2016  skrll Sync with HEAD
 1.5.58.1 03-Jan-2018  snj Pull up following revision(s) (requested by nakayama in ticket #1527):
sys/arch/amd64/include/i82093reg.h: revision 1.9
sys/arch/i386/include/i82093reg.h: revision 1.11
sys/arch/x86/x86/ioapic.c: revision 1.54
Don't write a 1 to the read only RIRR bit in the IOAPIC redirection
register to fix "tlp0: filter setup and transmit timeout" observed
on Hyper-V VMs with the Legacy Network Adapter.
From OpenBSD via PR kern/49323:
https://marc.info/?l=openbsd-cvs&m=146718035432599&w=2
 1.5.40.1 03-Dec-2017  jdolecek update from HEAD
 1.8.2.1 21-Nov-2017  martin Pull up following revision(s) (requested by nakayama in ticket #359):
sys/arch/amd64/include/i82093reg.h: revision 1.9
sys/arch/x86/x86/ioapic.c: revision 1.54
sys/arch/i386/include/i82093reg.h: revision 1.11
Don't write a 1 to the read only RIRR bit in the IOAPIC redirection
register to fix "tlp0: filter setup and transmit timeout" observed
on Hyper-V VMs with the Legacy Network Adapter.
From OpenBSD via PR kern/49323:
https://marc.info/?l=openbsd-cvs&m=146718035432599&w=2
Modified files:
sys/arch/amd64/amd64: ioapic.c
sys/arch/amd64/include: i82093reg.h
Log message:
Don't write a 1 to the RIRR bit in the IOAPIC redirection register. This bit
is R/O, and although it should not matter what value is written there,
Hyper-V's emulated IOAPIC interprets a write of 1 in some unexpected way and
subsequently blocks interrupt delivery. This primarily manifests itself as
de(4) timeouts when using Hyper-V VMs with the "Legacy Network Adapter"
interface.
Tested both amd64 and i386 on Client Hyper-V on Windows 10.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.5 25-Jul-2014  joerg Add generic versions of machine/int_*.h for compilers providing
appropriate macros for all necessary types.
 1.4 29-May-2010  tnozaki branches: 1.4.18; 1.4.32;
fix wrong integer promotion rule (removed U suffix from UINT{8,16}_C).
see ISO/IEC 9899:1999 7.18.4.3.
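
The promotion rule referred to above can be observed from user code. The following sketch assumes a C11 compiler (for _Generic) and an int of at least 32 bits; the EX_IS_INT macro and the ex_* functions are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates to 1 if the expression has type int, 0 otherwise. */
#define EX_IS_INT(e) _Generic((e), int: 1, default: 0)

/* uint_least8_t promotes to int, so UINT8_C(...) should have type int. */
int ex_uint8_c_is_int(void)
{
    return EX_IS_INT(UINT8_C(200));
}

/* The same holds for UINT16_C when int is wider than 16 bits. */
int ex_uint16_c_is_int(void)
{
    return EX_IS_INT(UINT16_C(60000));
}

/* A constant with a U suffix is unsigned int, which is what the fix removed. */
int ex_u_suffix_is_int(void)
{
    return EX_IS_INT(200U);
}

/* With the correct type, the subtraction is done in int and goes negative. */
int ex_difference_is_negative(void)
{
    return (UINT8_C(1) - 2) < 0;
}
```

With a U-suffixed definition the last expression would be evaluated in unsigned arithmetic and could never be negative, which is the kind of surprise the change avoids.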
 1.3 26-Oct-2008  mrg branches: 1.3.8; 1.3.14; 1.3.16;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.2 28-Apr-2008  martin branches: 1.2.6;
Remove clause 3 and 4 from TNF licenses
 1.1 26-Apr-2003  fvdl branches: 1.1.104; 1.1.106; 1.1.108;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.108.3 11-Aug-2010  yamt sync with head.
 1.1.108.2 04-May-2009  yamt sync with head.
 1.1.108.1 16-May-2008  yamt sync with head.
 1.1.106.1 18-May-2008  yamt sync with head.
 1.1.104.2 17-Jan-2009  mjf Sync with HEAD.
 1.1.104.1 02-Jun-2008  mjf Sync with HEAD.
 1.2.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.3.16.1 30-May-2010  rmind sync with head
 1.3.14.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.3.8.1 24-Oct-2010  jym Sync with HEAD
 1.4.32.1 10-Aug-2014  tls Rebase.
 1.4.18.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.7 25-Jul-2014  joerg Add generic versions of machine/int_*.h for compilers providing
appropriate macros for all necessary types.
 1.6 26-Oct-2008  mrg branches: 1.6.38; 1.6.54;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.5 28-Apr-2008  martin branches: 1.5.6;
Remove clause 3 and 4 from TNF licenses
 1.4 11-Dec-2005  christos branches: 1.4.74; 1.4.76; 1.4.78;
merge ktrace-lwp.
 1.3 23-May-2004  kleink Change {u,}int_fast{8,16}_t to 32-bit types.

Note: While this is technically an ABI change I believe it is a
change that we can afford at this time (and to be pulled up to
2.0, which will be the first release for amd64). The types are
not widely used yet, and a survey of pkgsrc has not shown uses
that would be adversely affected by it.
 1.2 31-Aug-2003  fvdl branches: 1.2.2;
Update a few types and formats.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.2.1 25-May-2004  jmc Pullup rev 1.3 (requested by kleink in ticket #380)

Change {u,}int_fast{8,16}_t to 32-bit types.
 1.4.78.2 04-May-2009  yamt sync with head.
 1.4.78.1 16-May-2008  yamt sync with head.
 1.4.76.1 18-May-2008  yamt sync with head.
 1.4.74.2 17-Jan-2009  mjf Sync with HEAD.
 1.4.74.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.54.1 10-Aug-2014  tls Rebase.
 1.6.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.10 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.9 25-Jul-2014  joerg Add generic versions of machine/int_*.h for compilers providing
appropriate macros for all necessary types.
 1.8 27-Jan-2012  christos branches: 1.8.6; 1.8.20;
PR/45878: Nick Hudson: SIG_ATOMIC_{MAX,MIN} wrong for sig_atomic_t on amd64
sig_atomic_t is an int on amd64, put the proper limits there
 1.7 26-Oct-2008  mrg branches: 1.7.28; 1.7.32;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 28-Apr-2008  martin branches: 1.6.6;
Remove clause 3 and 4 from TNF licenses
 1.5 17-Oct-2007  garbled branches: 1.5.16; 1.5.18; 1.5.20;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to an evbppc target.
 1.4 31-Aug-2007  drochner Fix definitions of UCHAR_MAX/USHRT_MAX and related
types. C99 requires that these definitions promote to (signed/unsigned)
integer the same way as the types the definition is for. And since
unsigned char/short fit into an "int" on all our archs and thus promote
to signed int, the definitions must not be unsigned.
Fixes PR lib/31306 by Neil Booth.
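
A short sketch of why the signedness of these limits matters in practice. It assumes a C11 compiler and a platform where unsigned char and unsigned short both fit in int; the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Evaluates to 1 if the expression has type int, 0 otherwise. */
#define EX_IS_INT(e) _Generic((e), int: 1, default: 0)

/* unsigned char promotes to int here, so UCHAR_MAX should be an int. */
int ex_uchar_max_is_int(void)
{
    return EX_IS_INT(UCHAR_MAX);
}

/* Likewise for unsigned short and USHRT_MAX. */
int ex_ushrt_max_is_int(void)
{
    return EX_IS_INT(USHRT_MAX);
}

/*
 * With a signed definition this comparison is done in int and is true.
 * With an unsigned definition, -1 would be converted to a large unsigned
 * value and the comparison would be false.
 */
int ex_minus_one_below_uchar_max(void)
{
    return -1 < UCHAR_MAX;
}
```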
 1.3 11-Dec-2005  christos branches: 1.3.30; 1.3.38; 1.3.44; 1.3.48; 1.3.50;
merge ktrace-lwp.
 1.2 08-May-2004  kleink branches: 1.2.12;
Factor out W{CHAR,INT}_{MAX,MIN} into their own header file.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.12.1 03-Sep-2007  yamt sync with head.
 1.3.50.1 06-Nov-2007  matt sync with HEAD
 1.3.48.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.3.44.1 03-Sep-2007  skrll Sync with HEAD.
 1.3.38.1 03-Oct-2007  garbled Sync with HEAD
 1.3.30.1 09-Oct-2007  ad Sync with head.
 1.5.20.2 04-May-2009  yamt sync with head.
 1.5.20.1 16-May-2008  yamt sync with head.
 1.5.18.1 18-May-2008  yamt sync with head.
 1.5.16.2 17-Jan-2009  mjf Sync with HEAD.
 1.5.16.1 02-Jun-2008  mjf Sync with HEAD.
 1.6.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.7.32.1 18-Feb-2012  mrg merge to -current.
 1.7.28.1 17-Apr-2012  yamt sync with head
 1.8.20.1 10-Aug-2014  tls Rebase.
 1.8.6.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.8 25-Jul-2014  joerg Add generic versions of machine/int_*.h for compilers providing
appropriate macros for all necessary types.
 1.7 26-Oct-2008  mrg branches: 1.7.38; 1.7.54;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 28-Apr-2008  martin branches: 1.6.6;
Remove clause 3 and 4 from TNF licenses
 1.5 24-Dec-2005  perry branches: 1.5.74; 1.5.76; 1.5.78;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.4 11-Dec-2005  christos merge ktrace-lwp.
 1.3 23-May-2004  kleink branches: 1.3.12;
Change {u,}int_fast{8,16}_t to 32-bit types.

Note: While this is technically an ABI change I believe it is a
change that we can afford at this time (and to be pulled up to
2.0, which will be the first release for amd64). The types are
not widely used yet, and a survey of pkgsrc has not shown uses
that would be adversely affected by it.
 1.2 31-Aug-2003  fvdl branches: 1.2.2;
Update a few types and formats.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.2.1 25-May-2004  jmc Pullup rev 1.3 (requested by kleink in ticket #380)

Change {u,}int_fast{8,16}_t to 32-bit types.
 1.3.12.1 21-Jun-2006  yamt sync with head.
 1.5.78.2 04-May-2009  yamt sync with head.
 1.5.78.1 16-May-2008  yamt sync with head.
 1.5.76.1 18-May-2008  yamt sync with head.
 1.5.74.2 17-Jan-2009  mjf Sync with HEAD.
 1.5.74.1 02-Jun-2008  mjf Sync with HEAD.
 1.6.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.7.54.1 10-Aug-2014  tls Rebase.
 1.7.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.7 25-Jul-2014  joerg Add generic versions of machine/int_*.h for compilers providing
appropriate macros for all necessary types.
 1.6 26-Oct-2008  mrg branches: 1.6.38; 1.6.54;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.5 24-Dec-2005  perry branches: 1.5.74; 1.5.78; 1.5.84;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.4 11-Dec-2005  christos merge ktrace-lwp.
 1.3 25-May-2005  kleink branches: 1.3.2;
Include <sys/cdefs.h> for __signed; related to lib/30072.
 1.2 07-Aug-2003  agc branches: 1.2.6; 1.2.14;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.14.1 28-May-2005  tron Pull up revision 1.3 (requested by kleink in ticket #346):
Include <sys/cdefs.h> for __signed; related to lib/30072.
 1.2.6.1 29-May-2005  riz Pull up revision 1.3 (requested by kleink in ticket #1555):
Include <sys/cdefs.h> for __signed; related to lib/30072.
 1.3.2.1 21-Jun-2006  yamt sync with head.
 1.5.84.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.5.78.1 04-May-2009  yamt sync with head.
 1.5.74.1 17-Jan-2009  mjf Sync with HEAD.
 1.6.54.1 10-Aug-2014  tls Rebase.
 1.6.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.4 30-Apr-2020  bouyer Don't #include xen/intrdefs.h if !XEN.
Should fix third-party module builds (e.g. virtualbox)
 1.3 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.2 10-Aug-2011  cherry branches: 1.2.64;
Include Xen specific definitions.
 1.1 26-Apr-2003  fvdl branches: 1.1.122; 1.1.140;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.140.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.1.122.1 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.2.64.1 16-Apr-2020  bouyer Avoid overflow of ci_ipi_events[] in the PVHVM case (its size is
XEN_NIPIS but we use x86 IPIs): size XEN_NIPIS only for PV, and
CTASSERT that XEN_NIPIS <= X86_NIPI if we ever use Xen IPIs for
PVHVM.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2 11-Dec-2005  christos merge ktrace-lwp.
 1.1 02-Jul-2004  drochner branches: 1.1.2; 1.1.4;
add a <machine/joystick.h> which just includes the new common one
 1.1.4.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.4.3 18-Sep-2004  skrll Sync with HEAD.
 1.1.4.2 03-Aug-2004  skrll Sync with HEAD
 1.1.4.1 02-Jul-2004  skrll file joystick.h was added on branch ktrace-lwp on 2004-08-03 10:31:36 +0000
 1.1.2.2 05-Jul-2004  he Pull up revision 1.1 (new, via patch, requested by drochner in ticket #605):
Add a <machine/joystick.h> which here is just a copy
of the i386 one.
 1.1.2.1 02-Jul-2004  he file joystick.h was added on branch netbsd-2-0 on 2004-07-05 22:12:16 +0000
 1.3 25-Jun-2013  joerg Fix header guards.
 1.2 16-Apr-2008  cegger branches: 1.2.38; 1.2.48;
use POSIX integer types
 1.1 26-Apr-2003  fvdl branches: 1.1.104;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.104.1 02-Jun-2008  mjf Sync with HEAD.
 1.2.48.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.38.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.2 08-Feb-2020  maxv Retire KLEAK.

KLEAK was a nice feature and served its purpose; it allowed us to detect
dozens of info leaks on the kernel->userland boundary, and thanks to it we
tackled a good part of the infoleak problem 1.5 years ago.

Nowadays however, we have kMSan, which can detect uninitialized memory in
the kernel. kMSan supersedes KLEAK: it can detect what KLEAK was able to
detect, but in addition, (1) it operates in all of the kernel and not just
the kernel->userland boundary, (2) it requires no user interaction, and (3)
it is deterministic and not statistical.

That makes kMSan the feature of choice to detect info leaks nowadays;
people interested in detecting info leaks should boot a kMSan kernel and
just wait for the magic to happen.

KLEAK was a good ride, and a fun project, but now is time for it to go.

Discussed with several people, including Thomas Barabosch.
 1.1 02-Dec-2018  maxv branches: 1.1.2; 1.1.6; 1.1.10;
Introduce KLEAK, a new feature that can detect kernel information leaks.

It works by tainting memory sources with marker values, letting the data
travel through the kernel, and scanning the kernel<->user frontier for
these marker values. Combined with compiler instrumentation and rotation
of the markers, it is able to yield relevant results with little effort.

We taint the pools and the stack, and scan copyout/copyoutstr. KLEAK is
supported on amd64 only for now, but it is not complicated to add more
architectures (just a matter of having the address of .text, and a stack
unwinder).

A userland tool is provided that allows executing a command in rounds
and monitoring the leaks generated all the while.

KLEAK already detected directly 12 kernel info leaks, and prompted changes
that in total fixed 25+ leaks.

Based on an idea developed jointly with Thomas Barabosch (of Fraunhofer
FKIE).
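The taint-and-scan idea described above can be sketched in a few lines of userland C. This is a simplified illustration only: the `demo_*` names, the fixed marker byte, and the 16-byte reply buffer are assumptions, and the sketch omits the compiler instrumentation and marker rotation the real feature relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MARKER 0x41u /* hypothetical marker byte; the real markers rotate */

/* Taint a memory source: fill a fresh buffer with the marker value. */
static void demo_taint(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, DEMO_MARKER, len);
}

/* Scan data about to cross the kernel->user frontier for marker bytes. */
static size_t demo_scan(const unsigned char *buf, size_t len)
{
    size_t hits = 0;

    for (size_t i = 0; i < len; i++)
        if (buf[i] == DEMO_MARKER)
            hits++;
    return hits;
}

/*
 * Build a 16-byte reply and report how many marker bytes would be copied out.
 * With zero_first == 0 only two bytes are initialized, so the rest stay tainted.
 */
size_t demo_reply_hits(int zero_first)
{
    unsigned char reply[16];

    demo_taint(reply, sizeof reply);
    if (zero_first)
        memset(reply, 0, sizeof reply);
    memcpy(reply, "ok", 2);
    return demo_scan(reply, sizeof reply);
}
```

A non-zero return from `demo_reply_hits(0)` corresponds to a detected leak, while zeroing the buffer first makes the scan come back clean.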
 1.1.10.1 29-Feb-2020  ad Sync with head.
 1.1.6.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.6.2 10-Jun-2019  christos Sync with HEAD
 1.1.6.1 02-Dec-2018  christos file kleak.h was added on branch phil-wifi on 2019-06-10 22:05:47 +0000
 1.1.2.2 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.1 02-Dec-2018  pgoyette file kleak.h was added on branch pgoyette-compat on 2018-12-26 14:01:31 +0000
 1.15 21-Jan-2019  dholland Fix wrong scoping of {U,}LLONG_MAX.
PR 53298 from Roberto E. Vargas Caballero.
 1.14 21-Apr-2014  matt branches: 1.14.26; 1.14.28;
Since all our compilers support __DBL_* and __FLT_*, use them to define
{DBL,FLT}_{DIG,MIN,MAX}
 1.13 11-Apr-2013  christos branches: 1.13.4; 1.13.8;
add missing SSIZE_MIN
 1.12 28-Mar-2012  christos branches: 1.12.2;
- Normalize inclusion protection (remove)
- Move CHAR_{MIN,MAX} to a common file.
- Fix broken comments
 1.11 07-Jun-2010  tnozaki branches: 1.11.8; 1.11.12;
1. MB_LEN_MAX switch MD to MI.
2. unfortunately hppa's MB_LEN_MAX is defined incorrectly 6 instead of 32
so we have to add more setlocale(3) __RENAME func, __setlocale50.
3. move setlocale1.c and setlocale32.c to lib/libc/compat/locale/*
preparing for next libc major crunk.
4. bump libc minor version.
 1.10 26-Oct-2008  mrg branches: 1.10.8; 1.10.14; 1.10.16;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.9 17-Oct-2007  garbled branches: 1.9.16; 1.9.20; 1.9.26;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.8 31-Aug-2007  drochner Fix definitions of UCHAR_MAX/USHRT_MAX and related
types. C99 requires that these definitions promote to (signed/unsigned)
integer the same way as the types the definition is for. And since
unsigned char/short fit into an "int" on all our archs and thus promote
to signed int, the definitions must not be unsigned.
Fixes PR lib/31306 by Neil Booth.
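The promotion rule behind this fix can be checked with a small sketch. `DEMO_BAD_UCHAR_MAX` is a hypothetical unsigned definition used only for contrast; it is not taken from any header.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical incorrect definition with unsigned type, for contrast only. */
#define DEMO_BAD_UCHAR_MAX 255U

/* An unsigned char value promotes to int, so UCHAR_MAX must behave like an int. */
static_assert(-1 < UCHAR_MAX, "UCHAR_MAX must promote to signed int");

/* Yields 1: -1 stays negative when compared against an int-typed constant. */
int demo_good_compare(void) { return -1 < UCHAR_MAX; }

/* Yields 0: -1 is converted to a large unsigned value before the comparison. */
int demo_bad_compare(void) { return -1 < DEMO_BAD_UCHAR_MAX; }
```

With an unsigned definition, comparisons against negative values silently change meaning, which is the kind of behaviour the corrected definitions avoid.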
 1.7 11-Dec-2005  christos branches: 1.7.30; 1.7.38; 1.7.44; 1.7.48; 1.7.50;
merge ktrace-lwp.
 1.6 19-Sep-2003  fvdl branches: 1.6.16;
LONG_BIT should be 64. From Nicolas Joly.
 1.5 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.4 25-May-2003  fvdl branches: 1.4.2;
Correct a few maximum values.
 1.3 28-Apr-2003  bjh21 Add a new feature-test macro, _NETBSD_SOURCE. If this is defined
by the application, all NetBSD interfaces are made visible, even
if some other feature-test macro (like _POSIX_C_SOURCE) is defined.
<sys/featuretest.h> defines _NETBSD_SOURCE if none of _ANSI_SOURCE,
_POSIX_C_SOURCE and _XOPEN_SOURCE is defined, so as to preserve
existing behaviour.

This has two major advantages:
+ Programs that require non-POSIX facilities but define _POSIX_C_SOURCE
can trivially be overruled by putting -D_NETBSD_SOURCE in their CFLAGS.
+ It makes most of the #ifs simpler, in that they're all now ORs of the
various macros, rather than having checks for (!defined(_ANSI_SOURCE) ||
!defined(_POSIX_C_SOURCE) || !defined(_XOPEN_SOURCE)) all over the place.

I've tried not to change the semantics of the headers in any case where
_NETBSD_SOURCE wasn't defined, but there were some places where the
current semantics were clearly mad, and retaining them was harder than
correcting them. In particular, I've mostly normalised things so that
_ANSI_SOURCE gets you the smallest set of stuff, then _POSIX_C_SOURCE,
_XOPEN_SOURCE and _NETBSD_SOURCE in that order.

Tested by building for vax, encouraged by thorpej, and uncontested in
tech-userlevel for a week.
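A rough sketch of the scheme described above, using hypothetical `DEMO_*` macros in place of the real feature-test macros so that it does not interfere with the host's own headers. It models an application that requests POSIX but overrides that with the NetBSD-style macro.

```c
#include <assert.h>

/* Hypothetical stand-ins for _POSIX_C_SOURCE and _NETBSD_SOURCE. */
#define DEMO_POSIX_C_SOURCE 200112L /* the program asks for POSIX only ...   */
#define DEMO_NETBSD_SOURCE  1       /* ... but CFLAGS adds the override macro */

/* Featuretest step: if nothing was requested, make everything visible. */
#if !defined(DEMO_ANSI_SOURCE) && !defined(DEMO_POSIX_C_SOURCE) && \
    !defined(DEMO_XOPEN_SOURCE) && !defined(DEMO_NETBSD_SOURCE)
#define DEMO_NETBSD_SOURCE 1
#endif

/* Header-side check: a plain OR of the macros that expose the interface. */
#if defined(DEMO_XOPEN_SOURCE) || defined(DEMO_NETBSD_SOURCE)
#define DEMO_HAVE_EXTENSION 1
#else
#define DEMO_HAVE_EXTENSION 0
#endif

static_assert(DEMO_HAVE_EXTENSION == 1, "override should expose the extension");

/* Reports whether the (hypothetical) extension is visible to this program. */
int demo_extension_visible(void) { return DEMO_HAVE_EXTENSION; }
```

The header-side test is a positive OR rather than a chain of negated checks, which is the simplification the commit message points to.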
 1.2 27-Apr-2003  fvdl Fix *LONGMAX values.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.4.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.4.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.4.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.16.1 03-Sep-2007  yamt sync with head.
 1.7.50.1 06-Nov-2007  matt sync with HEAD
 1.7.48.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.7.44.1 03-Sep-2007  skrll Sync with HEAD.
 1.7.38.1 03-Oct-2007  garbled Sync with HEAD
 1.7.30.1 09-Oct-2007  ad Sync with head.
 1.9.26.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.9.20.2 11-Aug-2010  yamt sync with head.
 1.9.20.1 04-May-2009  yamt sync with head.
 1.9.16.1 17-Jan-2009  mjf Sync with HEAD.
 1.10.16.1 03-Jul-2010  rmind sync with head
 1.10.14.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.10.8.1 24-Oct-2010  jym Sync with HEAD
 1.11.12.1 05-Apr-2012  mrg sync to latest -current.
 1.11.8.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.11.8.1 17-Apr-2012  yamt sync with head
 1.12.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.12.2.1 23-Jun-2013  tls resync from head
 1.13.8.1 10-Aug-2014  tls Rebase.
 1.13.4.1 18-May-2014  rmind sync with head
 1.14.28.1 10-Jun-2019  christos Sync with HEAD
 1.14.26.1 26-Jan-2019  pgoyette Sync with HEAD
 1.3 18-Nov-2013  chs implement the *at() syscalls.
bring the unimplemented syscall list up to date.
 1.2 18-Nov-2011  christos branches: 1.2.10; 1.2.14;
include the new siginfo.h file
 1.1 09-Feb-2006  manu branches: 1.1.2; 1.1.10; 1.1.16; 1.1.22; 1.1.114;
Add initial (but unfinished) COMPAT_LINUX32 for amd64. This is good enough so
that the i386 license manager part of amd64 version of Fluent works.

While I'm here, add SysV IPC to COMPAT_LINUX/amd64
 1.1.114.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.1.114.1 17-Apr-2012  yamt sync with head
 1.1.22.2 09-Sep-2006  rpaulo sync with head
 1.1.22.1 09-Feb-2006  rpaulo file linux32_machdep.h was added on branch rpaulo-netinet-merge-pcb on 2006-09-09 02:37:18 +0000
 1.1.16.2 21-Jun-2006  yamt sync with head.
 1.1.16.1 09-Feb-2006  yamt file linux32_machdep.h was added on branch yamt-lazymbuf on 2006-06-21 14:48:25 +0000
 1.1.10.2 22-Apr-2006  simonb Sync with head.
 1.1.10.1 09-Feb-2006  simonb file linux32_machdep.h was added on branch simonb-timecounters on 2006-04-22 11:37:12 +0000
 1.1.2.2 18-Feb-2006  yamt sync with head.
 1.1.2.1 09-Feb-2006  yamt file linux32_machdep.h was added on branch yamt-uio_vmspace on 2006-02-18 15:38:31 +0000
 1.2.14.1 18-May-2014  rmind sync with head
 1.2.10.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.4 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.3 01-Oct-2007  ad Now that the bootblocks are the same, share loadfile_machdep.h between
amd64 and i386.
 1.2 25-Jan-2006  christos branches: 1.2.28; 1.2.36; 1.2.46; 1.2.48; 1.2.50;
free -> dealloc
unsigned -> size_t for alloc/dealloc
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.30;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.30.1 01-Feb-2006  yamt sync with head.
 1.1.18.2 27-Oct-2007  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.2.50.1 06-Oct-2007  yamt sync with head.
 1.2.48.1 06-Nov-2007  matt sync with HEAD
 1.2.46.1 02-Oct-2007  joerg Sync with HEAD.
 1.2.36.1 03-Oct-2007  garbled Sync with HEAD
 1.2.28.1 09-Oct-2007  ad Sync with head.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2 04-Nov-2024  christos Undo previous lwp.h change.
 1.1 03-Nov-2024  christos Split __lwp_getprivate_fast and __lwp_*tcb from mcontext.h into a separate
lwp.h file.
 1.1 30-Nov-2024  christos branches: 1.1.4;
Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.1.4.2 02-Aug-2025  perseant Sync with HEAD
 1.1.4.1 30-Nov-2024  perseant file lwp_private.h was added on branch perseant-exfatfs on 2025-08-02 05:55:24 +0000
 1.3 11-Dec-2005  christos merge ktrace-lwp.
 1.2 22-Oct-2003  kleink Use a common <machine/math.h> for amd64 and i386.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.24 30-Nov-2024  christos Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.23 04-Nov-2024  christos Undo previous lwp.h change.
 1.22 03-Nov-2024  christos Split __lwp_getprivate_fast and __lwp_*tcb from mcontext.h into a separate
lwp.h file.
 1.21 18-May-2024  thorpej branches: 1.21.2;
Clean up the <sys/ucontext.h> <-> <machine/mcontext.h> interface
a little:
- Define _UC_MD_BIT* constants for the available machine-dependent bits,
and use those constants to define the machine-dependent bits as well
as the machine-independent bits that have machine-dependent values.
- Explicitly generate an error if _UC_TLSBASE, _UC_SETSTACK, or
_UC_CLRSTACK are not defined by <machine/mcontext.h>.
 1.20 27-Dec-2019  kamil Harmonize the namespace of fast TLS base pointer getter functions

Protect __lwp_getprivate_fast() with _RTLD_SOURCE, _LIBC_SOURCE and
__LIBPTHREAD_SOURCE__.

Include in this namespace <sys/tcl.h> and use __BEGIN_DECLS/__END_DECLS
for the sake of consistency.
 1.19 15-Feb-2018  kamil branches: 1.19.4;
Introduce _UC_MACHINE_FP() as a macro

_UC_MACHINE_FP() is a helper macro to extract from mcontext a frame pointer.

Don't rely on this interface as a compiler might strip frame pointer or
optimize it making this interface unreliable.


For hppa assume a small frame context, for larger frames FP might be located
in a different register (4 instead of 3).

For ia64 there is no strict frame pointer, and registers might rotate.
Reuse 79 following:

./gcc/config/ia64/ia64.h:#define HARD_FRAME_POINTER_REGNUM LOC_REG (79)

Once ia64 matures, this should be revisited.

A macro can encapsulate a real function for extracting Frame Pointer on
more complex CPUs / ABIs.


For the remaining CPUs, reuse standard register as defined in appropriate ABI.

The direct users of this macro are LLVM and GCC with Sanitizers.

Proposed on tech-userlevel@.

Sponsored by <The NetBSD Foundation>
 1.18 12-May-2014  uebayasi branches: 1.18.20;
Comments.
 1.17 15-Feb-2014  dsl branches: 1.17.2;
Load and save the fpu registers (for copies to/from userspace) using
helper functions in arch/x86/x86/fpu.c
They (hopefully) ensure that we write to the entire buffer and don't load
values that might cause faults in kernel.
Also zero out the 'pad' field of the i386 mcontext fp area that I think
once contained the registers of any Weitek fpu.
Dunno why it wasn't part of the union.
Some of these copies could be removed if the code directly copied the save
area to/from userspace addresses.
 1.16 15-Dec-2012  dsl branches: 1.16.2;
Remove the incorrect comment about a (now deleted) pad field added
to make __fpregs be 16-byte aligned within ucontext_t.
I doubt that has been true for years!
Since the __fpregs field isn't accessed by fxsave it doesn't matter.
There is a lot of type fubar here, at least mark the char[] __aligned(8).
 1.15 21-May-2012  martin branches: 1.15.2;
Calling _lwp_create() with a bogus ucontext could trigger a kernel
assertion failure (and thus a crash in DIAGNOSTIC kernels). Independently
discovered by YAMAMOTO Takashi and Joel Sing.

To avoid this, introduce a cpu_mcontext_validate() function and move all
sanity checks from cpu_setmcontext() there. Also untangle the netbsd32
compat mess slightly and add a cpu_mcontext32_validate() cousin there.

Add an exhaustive atf test case, based partly on code from Joel Sing.

Should finally fix the remaining open part of PR kern/43903.
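The validate-then-apply split described above might look roughly like the following. Everything here is a hypothetical stand-in: `struct demo_mcontext`, `DEMO_USER_MAX`, and the specific range check are assumptions and do not reproduce the actual checks in cpu_mcontext_validate().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a machine context; not the real mcontext_t. */
struct demo_mcontext {
    uint64_t pc;
    uint64_t sp;
};

/* Hypothetical upper bound for user-space addresses. */
#define DEMO_USER_MAX UINT64_C(0x00007fffffffffff)

/* Pure check on untrusted input: 0 if acceptable, EINVAL otherwise. */
int demo_mcontext_validate(const struct demo_mcontext *mc)
{
    if (mc->pc > DEMO_USER_MAX || mc->sp > DEMO_USER_MAX)
        return EINVAL;
    return 0;
}

/* Apply the context only after validation; bad input yields an error, not a crash. */
int demo_setmcontext(struct demo_mcontext *cur, const struct demo_mcontext *mc)
{
    int error;

    assert(cur != NULL && mc != NULL); /* caller bug, not user-supplied data */
    error = demo_mcontext_validate(mc);
    if (error != 0)
        return error;
    *cur = *mc;
    return 0;
}
```

Keeping the check in its own function lets a caller reject a bogus context up front and return an error, instead of tripping an assertion later.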
 1.14 25-Feb-2011  joerg branches: 1.14.4; 1.14.8; 1.14.10;
Be nicer to software that insists on -ansi and use __inline.
 1.13 24-Feb-2011  joerg Allow storing and receiving the LWP private pointer via ucontext_t
on all platforms except VAX and IA64. Add fast access via register for
AMD64, i386 and SH3 ports. Use this fast access in libpthread to replace
the stack based pthread_self(). Implement skeleton support for Alpha,
HPPA, PowerPC, SPARC and SPARC64, but leave it disabled.

Ports that support this feature provide __HAVE____LWP_GETPRIVATE_FAST in
machine/types.h and a corresponding __lwp_getprivate_fast in
machine/mcontext.h.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
 1.12 23-Feb-2011  joerg Fix ucontext32_t on AMD64. Add optional compile time assertions for
ucontext_t and ucontext32_t to ensure that they don't change.
Provide the constants for AMD64, ARM, i386, and M68K.
 1.11 26-Oct-2008  mrg branches: 1.11.8; 1.11.16; 1.11.22; 1.11.24;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.10 28-Apr-2008  martin branches: 1.10.6;
Remove clause 3 and 4 from TNF licenses
 1.9 04-Jan-2008  dsl branches: 1.9.6; 1.9.8; 1.9.10;
Change the way that the trap/intr/syscall frames and the __gregset_t[]
indexes are defined so that only a single list of the registers is used.
The code no longer relies on the two structures matching.
There should be no binary change.
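One common way to derive both a structure and a set of array indexes from a single register list is the "X-macro" pattern, sketched below. The three-register list and all `demo_*`/`DEMO_*` names are hypothetical, and the log does not say whether the commit used exactly this form.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-register list; a real list would be much longer. */
#define DEMO_REGS(X) \
    X(RDI, rdi)      \
    X(RSI, rsi)      \
    X(RDX, rdx)

/* Index constants generated from the list. */
enum {
#define X(NAME, field) DEMO_REG_##NAME,
    DEMO_REGS(X)
#undef X
    DEMO_NREGS
};

/* Frame structure generated from the same list, in the same order. */
struct demo_frame {
#define X(NAME, field) unsigned long field;
    DEMO_REGS(X)
#undef X
};

/* Copy a frame into an index-addressed register array, field by field. */
void demo_frame_to_gregs(const struct demo_frame *f,
    unsigned long gregs[DEMO_NREGS])
{
    assert(f != NULL && gregs != NULL);
#define X(NAME, field) gregs[DEMO_REG_##NAME] = f->field;
    DEMO_REGS(X)
#undef X
}
```

Because the copy is generated per field, it does not depend on the structure and the array having matching layouts; adding a register means touching only the list.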
 1.8 29-Mar-2006  cube branches: 1.8.38; 1.8.44; 1.8.52;
Add the netbsd32 MD bits for sparc64 and amd64 to support SA.

Many thanks to all who helped for that little project, notably Martin
Husemann for teaching me a bit about the very special sparc64 world.
 1.7 11-Dec-2005  christos branches: 1.7.4; 1.7.6; 1.7.8; 1.7.10; 1.7.12;
merge ktrace-lwp.
 1.6 15-May-2005  fvdl branches: 1.6.2;
Optionally include saving and restoring the 64bit %gs and %fs base register
values in the PCB. Do this in pmap_activate for now (XXX not a good place
for it, but a convenient one).
 1.5 21-Oct-2004  fvdl Fix thread context switching to take the stack ABI into account.
From Wolfgang Solfrank.
 1.4 13-Oct-2003  fvdl branches: 1.4.2;
Define mcontext32_t (if COMPAT_NETBSD32).
 1.3 08-Oct-2003  thorpej Add some accessor macros for the ucontext:
* _UC_MACHINE_PC() - access the program counter
* _UC_MACHINE_INTRV() - access the integer return value register
* _UC_MACHINE_SET_PC() - set the program counter (this requires
special handling on some platforms).
 1.2 06-Oct-2003  fvdl SIGINFO support.
Todo: 32bit compat support (COMPAT_NETBSD32 will not compile right now,
as it won't on other platforms).
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.4 02-Nov-2004  skrll Sync with HEAD.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.4.2.1 12-Nov-2004  jmc Pullup rev 1.5 (requested by fvdl in ticket #956)

Fix thread context switching to take the stack ABI into account.
 1.6.2.2 21-Jan-2008  yamt sync with head
 1.6.2.1 21-Jun-2006  yamt sync with head.
 1.7.12.1 31-Mar-2006  tron Merge 2006-03-31 NetBSD-current into the "peter-altq" branch.
 1.7.10.1 19-Apr-2006  elad sync with head - hopefully this will work
 1.7.8.1 01-Apr-2006  yamt sync with head.
 1.7.6.1 22-Apr-2006  simonb Sync with head.
 1.7.4.1 09-Sep-2006  rpaulo sync with head
 1.8.52.1 08-Jan-2008  bouyer Sync with HEAD
 1.8.44.1 18-Feb-2008  mjf Sync with HEAD.
 1.8.38.1 09-Jan-2008  matt sync with HEAD
 1.9.10.2 04-May-2009  yamt sync with head.
 1.9.10.1 16-May-2008  yamt sync with head.
 1.9.8.1 18-May-2008  yamt sync with head.
 1.9.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.9.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.10.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.11.24.1 05-Mar-2011  bouyer Sync with HEAD
 1.11.22.1 06-Jun-2011  jruoho Sync with HEAD.
 1.11.16.1 05-Mar-2011  rmind sync with head
 1.11.8.1 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.14.10.1 21-May-2012  riz Pull up following revision(s) (requested by martin in ticket #274):
sys/arch/amd64/amd64/process_machdep.c: revision 1.20
sys/kern/sys_lwp.c: revision 1.54
sys/arch/sparc64/sparc64/machdep.c: revision 1.267
sys/arch/mips/mips/cpu_subr.c: revision 1.16
sys/arch/vax/vax/machdep.c: revision 1.188
sys/sys/lwp.h: revision 1.161
sys/arch/sparc64/sparc64/netbsd32_machdep.c: revision 1.98
sys/arch/alpha/alpha/machdep.c: revision 1.339
sys/compat/sys/ucontext.h: revision 1.6
sys/arch/hppa/hppa/hppa_machdep.c: revision 1.28
distrib/sets/lists/tests/mi: revision 1.469
sys/arch/powerpc/powerpc/sig_machdep.c: revision 1.42
tests/lib/libc/sys/t_lwp_create.c: revision 1.1
tests/lib/libc/sys/Makefile: revision 1.23
sys/arch/arm/arm/sig_machdep.c: revision 1.42
sys/arch/amd64/include/mcontext.h: revision 1.15
sys/arch/amd64/amd64/machdep.c: revision 1.183
sys/arch/sh3/sh3/sh3_machdep.c: revision 1.99
sys/arch/i386/i386/machdep.c: revision 1.727
sys/compat/netbsd32/netbsd32_lwp.c: revision 1.13
sys/arch/sparc/sparc/machdep.c: revision 1.319
sys/arch/amd64/amd64/netbsd32_machdep.c: revision 1.76
sys/arch/m68k/m68k/sig_machdep.c: revision 1.49
sys/sys/ucontext.h: revision 1.16
sys/arch/mips/mips/netbsd32_machdep.c: revision 1.9
lib/libc/sys/_lwp_create.2: revision 1.5
Calling _lwp_create() with a bogus ucontext could trigger a kernel
assertion failure (and thus a crash in DIAGNOSTIC kernels). Independently
discovered by YAMAMOTO Takashi and Joel Sing.
To avoid this, introduce a cpu_mcontext_validate() function and move all
sanity checks from cpu_setmcontext() there. Also untangle the netbsd32
compat mess slightly and add a cpu_mcontext32_validate() cousin there.
Add an exhaustive atf test case, based partly on code from Joel Sing.
Should finally fix the remaining open part of PR kern/43903.
 1.14.8.1 02-Jun-2012  mrg sync to latest -current.
 1.14.4.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.14.4.2 23-Jan-2013  yamt sync with head
 1.14.4.1 23-May-2012  yamt sync with head.
 1.15.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.15.2.1 25-Feb-2013  tls resync with head
 1.16.2.1 18-May-2014  rmind sync with head
 1.17.2.1 10-Aug-2014  tls Rebase.
 1.18.20.3 21-Mar-2018  martin Pull up the following, requested by kamil in ticket #552:

external/gpl3/gcc{.old}/dist/libsanitizer/asan/asan_linux.cc 1.4
sys/arch/aarch64/include/mcontext.h 1.2
sys/arch/alpha/include/mcontext.h 1.9
sys/arch/amd64/include/mcontext.h 1.19
sys/arch/arm/include/mcontext.h 1.19
sys/arch/hppa/include/mcontext.h 1.9
sys/arch/i386/include/mcontext.h 1.14
sys/arch/ia64/include/mcontext.h 1.6
sys/arch/m68k/include/mcontext.h 1.10
sys/arch/mips/include/mcontext.h 1.22
sys/arch/or1k/include/mcontext.h 1.2
sys/arch/powerpc/include/mcontext.h 1.18
sys/arch/riscv/include/mcontext.h 1.5
sys/arch/sh3/include/mcontext.h 1.11
sys/arch/sparc/include/mcontext.h 1.14-1.17
sys/arch/sparc64/include/mcontext.h 1.10
sys/arch/vax/include/mcontext.h 1.9
tests/lib/libc/sys/Makefile 1.50
tests/lib/libc/sys/t_ucontext.c 1.2-1.5
sys/arch/hppa/include/mcontext.h 1.10
sys/arch/ia64/include/mcontext.h 1.7

- Introduce _UC_MACHINE_FP(). _UC_MACHINE_FP() is a helper
macro to extract from mcontext a frame pointer.
- Add new tests in lib/libc/sys/t_ucontext:
* ucontext_sp (testing _UC_MACHINE_SP)
* ucontext_fp (testing _UC_MACHINE_FP)
* ucontext_pc (testing _UC_MACHINE_PC)
* ucontext_intrv (testing _UC_MACHINE_INTRV)

Add a dummy implementation of _UC_MACHINE_INTRV() for ia64.

Implement _UC_MACHINE_INTRV() for hppa.

Make the t_ucontext.c test more portable.

We now have _UC_MACHINE_FP.
 1.18.20.2 26-Feb-2018  snj revert ticket 552, which broke the build
 1.18.20.1 25-Feb-2018  snj Pull up following revision(s) (requested by kamil in ticket #552):
sys/arch/aarch64/include/mcontext.h: 1.2
sys/arch/alpha/include/mcontext.h: 1.9
sys/arch/amd64/include/mcontext.h: 1.19
sys/arch/arm/include/mcontext.h: 1.19
sys/arch/hppa/include/mcontext.h: 1.9
sys/arch/i386/include/mcontext.h: 1.14
sys/arch/ia64/include/mcontext.h: 1.6
sys/arch/m68k/include/mcontext.h: 1.10
sys/arch/mips/include/mcontext.h: 1.22
sys/arch/or1k/include/mcontext.h: 1.2
sys/arch/powerpc/include/mcontext.h: 1.18
sys/arch/riscv/include/mcontext.h: 1.5
sys/arch/sh3/include/mcontext.h: 1.11
sys/arch/sparc/include/mcontext.h: 1.14-1.17
sys/arch/sparc64/include/mcontext.h: 1.10
sys/arch/vax/include/mcontext.h: 1.9
tests/lib/libc/sys/Makefile: 1.50
tests/lib/libc/sys/t_ucontext.c: 1.2
Introduce _UC_MACHINE_FP() as a macro
_UC_MACHINE_FP() is a helper macro to extract from mcontext a frame pointer.
Don't rely on this interface as a compiler might strip frame pointer or
optimize it making this interface unreliable.
For hppa assume a small frame context, for larger frames FP might be located
in a different register (4 instead of 3).
For ia64 there is no strict frame pointer, and registers might rotate.
Reuse 79 following:
./gcc/config/ia64/ia64.h:#define HARD_FRAME_POINTER_REGNUM LOC_REG (79)
Once ia64 matures, this should be revisited.
A macro can encapsulate a real function for extracting Frame Pointer on
more complex CPUs / ABIs.
For the remaining CPUs, reuse standard register as defined in appropriate ABI.
The direct users of this macro are LLVM and GCC with Sanitizers.
Proposed on tech-userlevel@.
Sponsored by <The NetBSD Foundation>
--
Improve _UC_MACHINE_FP() for SPARC/SPARC64
Introduce a static inline function _uc_machine_fp() that contains improved
calculation of a frame pointer.
Algorithm:
uptr *stk_ptr;
# if defined (__arch64__)
stk_ptr = (uptr *) (*sp + 2047);
# else
stk_ptr = (uptr *) *sp;
# endif
*bp = stk_ptr[15];
Noted by <mrg>
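
The algorithm above can be sketched as a standalone C function. This is only an illustration, not the actual _uc_machine_fp() from mcontext.h: demo_saved_fp() and DEMO_STACK_BIAS are hypothetical names, and the 64-bit case is selected by a run-time flag instead of __arch64__ so that the sketch compiles on any host.

```c
#include <stdint.h>

/* The 64-bit case in the algorithm above adds 2047 to the raw %sp value. */
#define DEMO_STACK_BIAS 2047

/*
 * Hypothetical helper: given the raw stack pointer value, undo the bias
 * in the 64-bit case and read word 15 of the save area, exactly as the
 * algorithm quoted above does.
 */
static uintptr_t
demo_saved_fp(uintptr_t sp, int is_arch64)
{
	const uintptr_t *stk_ptr;

	if (is_arch64)
		stk_ptr = (const uintptr_t *)(sp + DEMO_STACK_BIAS);
	else
		stk_ptr = (const uintptr_t *)sp;
	return stk_ptr[15];
}
```

Adding the bias only once, in one place, is the point of the later fix that stops the offset from being applied twice.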
--
Make _UC_MACHINE_FP() compile again and fix it so that it does not add
the offset twice.
--
fix _UC_MACHINE32_FP() -- use 32 bit pointer value so that [15] is
the right offset. do this by using __greg32_t, which is only in
the sparc64 version, and these are only useful there, so move them.
--
Add new tests in lib/libc/sys/t_ucontext
New tests:
- ucontext_sp
- ucontext_fp
- ucontext_pc
- ucontext_intrv
They test respectively:
- _UC_MACHINE_SP
- _UC_MACHINE_FP
- _UC_MACHINE_PC
- _UC_MACHINE_INTRV
These tests attempt to access and print the values from ucontext, without
interpreting the values.
This is a follow up of the _UC_MACHINE_FP() introduction.
These tests use PRIxREGISTER, and must be built with -D_KERNTYPES.
Sponsored by <The NetBSD Foundation>
 1.19.4.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.21.2.1 02-Aug-2025  perseant Sync with HEAD
 1.1 11-May-2003  fvdl ACPI support. Wakeup code still to be done.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.8 13-Sep-2022  riastradh amd64/asan.h, amd64/msan.h: Add include guards.
 1.7 22-Aug-2022  riastradh amd64/msan.h: Fix includes for private pmap.
 1.6 18-Nov-2020  hannken Make this at least compile.
Looks like a missing part from "Round of uvm.h cleanup (2020-09-05 18:30)".
 1.5 09-Sep-2020  maxv branches: 1.5.2;
kmsan: update the copyright notices
 1.4 07-Jun-2020  christos make this compile.
 1.3 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.2 15-Apr-2020  maxv Use large pages for the kMSan shadows. This greatly improves performance,
and slightly reduces memory consumption.
 1.1 14-Nov-2019  maxv branches: 1.1.6; 1.1.8;
Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.
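
As a rough userland illustration of the "shad" idea (one shadow bit per bit of real memory, set to 1 while uninitialized), consider the sketch below. All names are hypothetical and the arrays merely stand in for kernel memory and its shadow; the real kMSan code is organized differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MEM_SIZE 64

static uint8_t demo_mem[DEMO_MEM_SIZE];   /* stands in for kernel memory */
static uint8_t demo_shad[DEMO_MEM_SIZE];  /* 1:1 shadow, one bit per bit */

/* Mark a range as uninitialized: every shadow bit is set to 1. */
static void
demo_mark_uninit(size_t off, size_t len)
{
	memset(&demo_shad[off], 0xFF, len);
}

/* Store a byte: the write initializes it, so its shadow bits are cleared. */
static void
demo_store(size_t off, uint8_t val)
{
	demo_mem[off] = val;
	demo_shad[off] = 0;
}

/* Return 1 if any bit in the range is still marked uninitialized. */
static int
demo_has_uninit(size_t off, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (demo_shad[off + i] != 0)
			return 1;
	return 0;
}
```

A check such as demo_has_uninit() corresponds to what would be run on a buffer before it leaves the system.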

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
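
The packing described above can be sketched as follows. The bit split (3 bits of type, 29 bits of compressed pointer), the base-relative compression and all demo_* names are assumptions made for illustration; the commit message does not state the actual layout.

```c
#include <stdint.h>

/* Assumed split: top 3 bits hold the memory type, low 29 bits the pointer. */
#define DEMO_TYPE_SHIFT	29
#define DEMO_PTR_MASK	((UINT32_C(1) << DEMO_TYPE_SHIFT) - 1)

enum demo_type { DEMO_STACK = 0, DEMO_POOL = 1, DEMO_KMEM = 2 };

/* Compress a pointer to an offset from a base, then pack it with the type. */
static uint32_t
demo_orig_encode(enum demo_type type, uintptr_t ptr, uintptr_t base)
{
	uint32_t off = (uint32_t)(ptr - base) & DEMO_PTR_MASK;

	return ((uint32_t)type << DEMO_TYPE_SHIFT) | off;
}

/* Recover the memory type from a packed cell. */
static enum demo_type
demo_orig_type(uint32_t orig)
{
	return (enum demo_type)(orig >> DEMO_TYPE_SHIFT);
}

/* Recover the full pointer from a packed cell and the same base. */
static uintptr_t
demo_orig_ptr(uint32_t orig, uintptr_t base)
{
	return base + (orig & DEMO_PTR_MASK);
}
```

With this assumed split, only offsets below 512MB from the base survive the round trip, which is why such a scheme needs pointers that live in a bounded region.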

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
 1.1.8.3 21-Apr-2020  martin Sync with HEAD
 1.1.8.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.8.1 14-Nov-2019  martin file msan.h was added on branch phil-wifi on 2020-04-13 08:03:30 +0000
 1.1.6.1 20-Apr-2020  bouyer Sync with HEAD
 1.5.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2 09-Feb-2007  ad branches: 1.2.4;
Merge newlock2 to head.
 1.1 10-Sep-2006  ad branches: 1.1.2;
file mutex.h was initially added on branch newlock2.
 1.1.2.4 29-Dec-2006  ad Checkpoint work in progress.
 1.1.2.3 24-Oct-2006  ad Compile fixes
 1.1.2.2 20-Oct-2006  ad - Don't need locked bus cycles on release from C code.
- Save an integer ID in the lock structures for LOCKDEBUG code.
 1.1.2.1 10-Sep-2006  ad Add updated locking primitives.
 1.2.4.2 26-Feb-2007  yamt sync with head.
 1.2.4.1 09-Feb-2007  yamt file mutex.h was added on branch yamt-lazymbuf on 2007-02-26 09:05:43 +0000
 1.25 27-Nov-2019  rin Add support for PT_[GS]ETXMMREGS requests for COMPAT_NETBSD32 on amd64.

For this purpose, PT_[GS]ETXMMREGS are added to amd64/ptrace.h. These
are intended for internal usage for COMPAT_NETBSD32, and therefore not
exposed to userland.

Thanks to kamil, mgorny, and pgoyette for their kind review!

XXX
pullup to netbsd-9
 1.24 26-Jun-2019  mgorny Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
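
The size-carrying 'struct iovec' convention can be illustrated with a small userland sketch. It does not call ptrace(2); struct demo_xstate and demo_copy_state() are hypothetical stand-ins that only show how an explicit buffer length allows partial reads when the structure grows.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a register structure that may grow over time. */
struct demo_xstate {
	unsigned long long rfbm;      /* which components are valid */
	unsigned char      regs[64];  /* placeholder register data */
};

/*
 * Copy at most iov_len bytes into the caller's buffer and return the
 * number of bytes copied. A caller built against a smaller structure
 * gets a truncated (partial) read instead of a buffer overflow.
 */
static size_t
demo_copy_state(const struct iovec *iov, const struct demo_xstate *src)
{
	size_t len = iov->iov_len < sizeof(*src) ? iov->iov_len : sizeof(*src);

	memcpy(iov->iov_base, src, len);
	return len;
}
```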
 1.23 04-Jun-2019  mgorny compat32: Translate userland PT_* request values into kernel codes

Currently, the compat32 passes PT_* request values to kernel functions
without translation. This works fine for low PT_* requests that happen
to have the same values both on i386 and amd64. However, for requests
higher than PT_SETFPREGS, the value passed from userland (matching i386
const) does not match the correct kernel (amd64) request. As a result,
e.g. when compat32 process calls PT_GETDBREGS, kernel actually processes
it as PT_SETSTEP.

To resolve this, introduce support for compat32 PT_* request
translation. The interface is based on PTRACE_TRANSLATE_REQUEST32 macro
that is defined to a mapping function on architectures needing it.
In case of amd64, this function maps userland i386 PT_* values into
appropriate amd64 PT_* values.

For the time being, the two additional PT_GETXMMREGS and PT_SETXMMREGS
requests are unsupported due to lack of matching free amd64 constant.
 1.22 23-Feb-2017  kamil branches: 1.22.14;
Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
recently in NetBSD-current, so its remnants are removed without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the
debug registers, to achieve this PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event
monitoring function is designed to be used
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- debugger is responsible to retrieve debug register state to distinguish
the exact debug register trap (DR6 is Status Register on x86)
- kernel must not remove debug register traps after triggering a trap event
a debugger is responsible to detach this trap with appropriate PT_SETDBREGS
call (DR7 is Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently other interface
PT_GET_SIGINFO/PT_SET_SIGINFO is required in netbsd32 compat code in order to
validate reliably PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.21 06-Feb-2017  maxv Add the USER_LDT sysarch options in netbsd32. We don't translate 'desc',
since if we ever implement USER_LDT we will only allow 8-byte-sized
entries, which have the same layout on amd64 and i386.
 1.20 19-Oct-2016  skrll branches: 1.20.2;
PR kern/51514: ptrace(2) fails for 32-bit process on 64-bit kernel

Updated from the original patch in the PR by me.
 1.19 07-Feb-2014  dsl branches: 1.19.6; 1.19.10;
Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.18 04-Jan-2014  dsl Remove __HAVE_PROCESS_XFPREGS and add the extra parameter for the size
of the fp save area to all the process_read_fpregs() and
process_write_fpregs() functions.
None of the functions have been modified to use the new parameters.
The size is set for all the writes, but some of the arch-specific reads
just pass NULL.
The amd64 (and i386) need variable sized fp register save areas in order
to support AVX and other enhanced register areas.
These functions are rarely called - so the extra argument won't matter.
 1.17 19-Feb-2012  rmind branches: 1.17.2; 1.17.4;
Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.
 1.16 15-Oct-2008  wrstuden branches: 1.16.28; 1.16.32;
Merge wrstuden-revivesa into HEAD.
 1.15 25-Dec-2007  perry branches: 1.15.6; 1.15.10; 1.15.12; 1.15.16;
Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.14 17-Oct-2007  garbled branches: 1.14.2; 1.14.4; 1.14.8;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.13 16-Sep-2007  dsl Define netbsd32_uint64 for 64bit integers with the alignment requirement
of the corresponding 32bit architecture.
Use it for the 64bit items in netbsd32_statvfs so that the structure
doesn't collect 8byte alignment (and 4 bytes of trailing padding).
This replaces the 'packed' attribute which wasn't architecture specific
and would cause massive overheads accessing every member of sparc64.
Should allow the MIPS64 port to DTRT.
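
The difference between an under-aligned typedef and a packed structure can be seen in a short sketch. It assumes GCC or Clang attribute syntax; the demo_* names are hypothetical and the structures are not the real netbsd32_statvfs.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer carrying the 4-byte alignment of the 32-bit ABI. */
typedef uint64_t demo_netbsd32_uint64 __attribute__((__aligned__(4)));

/* Layout matching the 32-bit ABI: 'big' directly follows 'small'. */
struct demo_compat32 {
	uint32_t             small;
	demo_netbsd32_uint64 big;
};

/*
 * Packed layout: same offsets, but the whole structure drops to
 * alignment 1, so a strict-alignment CPU may have to access every
 * member bytewise.
 */
struct demo_packed {
	uint32_t small;
	uint64_t big;
} __attribute__((__packed__));
```

Both structures are 12 bytes with 'big' at offset 4, but only the typedef version keeps 4-byte alignment for its members.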
 1.12 16-Mar-2007  dsl branches: 1.12.4; 1.12.12; 1.12.14;
Use NETBSD32PTR64() and NETBSD32PTR32() throughout.
 1.11 09-Feb-2007  ad branches: 1.11.2; 1.11.6; 1.11.8; 1.11.10;
Merge newlock2 to head.
 1.10 29-Mar-2006  cube branches: 1.10.8;
Add the netbsd32 MD bits for sparc64 and amd64 to support SA.

Many thanks to all who helped for that little project, notably Martin
Husemann for teaching me a bit about the very special sparc64 world.
 1.9 12-Mar-2006  cube branches: 1.9.2;
Support the generation of coredumps for 32-bits binaries under
COMPAT_NETBSD32. They haven't worked for 5 years.

Silently agreed by the tech-kern readers.

XXX sparc64 MD glue still lacking.
XXX The FPU registers on i386 are not dumped correctly, according to my
XXX tests. It shouldn't be much work for someone who has the slightest
XXX idea of how that stuff is supposed to be laid out on i386.
 1.8 11-Dec-2005  christos branches: 1.8.4; 1.8.6; 1.8.8; 1.8.10;
merge ktrace-lwp.
 1.7 27-Sep-2005  chs make this compile again.
 1.6 14-Sep-2005  chs need to include <compat/sys/ucontext.h> here.
 1.5 20-Feb-2004  drochner branches: 1.5.16;
provide a definition NETBSD32_MID_MACHINE which tells which a.out MID
to look for in 32-bit emulation
 1.4 13-Oct-2003  fvdl Define 32bit versions of signal frames and contexts.
 1.3 26-Sep-2003  christos move MI stuff to the MI include.
 1.2 26-Sep-2003  christos add catch up with const sigset_t *
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.16.5 21-Jan-2008  yamt sync with head
 1.5.16.4 27-Oct-2007  yamt sync with head.
 1.5.16.3 03-Sep-2007  yamt sync with head.
 1.5.16.2 26-Feb-2007  yamt sync with head.
 1.5.16.1 21-Jun-2006  yamt sync with head.
 1.8.10.1 19-Apr-2006  elad sync with head - hopefully this will work
 1.8.8.2 01-Apr-2006  yamt sync with head.
 1.8.8.1 13-Mar-2006  yamt sync with head.
 1.8.6.1 22-Apr-2006  simonb Sync with head.
 1.8.4.1 09-Sep-2006  rpaulo sync with head
 1.9.2.1 31-Mar-2006  tron Merge 2006-03-31 NetBSD-current into the "peter-altq" branch.
 1.10.8.1 01-Feb-2007  ad Remove definition of struct netbsd32_saframe.
 1.11.10.1 18-Mar-2007  reinoud First attempt to bring branch in sync with HEAD
 1.11.8.1 11-Jul-2007  mjf Sync with head.
 1.11.6.2 09-Oct-2007  ad Sync with head.
 1.11.6.1 10-Apr-2007  ad Sync with head.
 1.11.2.1 24-Mar-2007  yamt sync with head.
 1.12.14.2 09-Jan-2008  matt sync with HEAD
 1.12.14.1 06-Nov-2007  matt sync with HEAD
 1.12.12.1 02-Oct-2007  joerg Sync with HEAD.
 1.12.4.1 03-Oct-2007  garbled Sync with HEAD
 1.14.8.1 02-Jan-2008  bouyer Sync with HEAD
 1.14.4.1 26-Dec-2007  ad Sync with head.
 1.14.2.1 18-Feb-2008  mjf Sync with HEAD.
 1.15.16.1 19-Oct-2008  haad Sync with HEAD.
 1.15.12.1 28-Sep-2008  skrll Adapt the SA COMPAT_NETBSD32 stuff to this branch.
 1.15.10.1 04-May-2009  yamt sync with head.
 1.15.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.16.32.1 24-Feb-2012  mrg sync to -current.
 1.16.28.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.16.28.1 17-Apr-2012  yamt sync with head
 1.17.4.1 18-May-2014  rmind sync with head
 1.17.2.2 03-Dec-2017  jdolecek update from HEAD
 1.17.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.10.2 20-Mar-2017  pgoyette Sync with HEAD
 1.19.10.1 04-Nov-2016  pgoyette Sync with HEAD
 1.19.6.2 28-Aug-2017  skrll Sync with HEAD
 1.19.6.1 05-Dec-2016  skrll Sync with HEAD
 1.20.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.22.14.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.22.14.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.22.14.1 10-Jun-2019  christos Sync with HEAD
 1.42 27-Apr-2025  riastradh amd64/param.h: Fix KASAN/KMSAN build (and ALL build by extension).

Make UPAGES match what it was before my recent changes, as verified
by the __CTASSERT below, justifying the existence of the __CTASSERT.

As I recall, SVS is incompatible with KASAN/KMSAN, so it doesn't
contribute to the sum; presumably KASAN/KMSAN requires three pages,
though I'm not sure where this is documented. If it turns out this
accounting is wrong, we should fix it and cross-reference any relevant
constraints affecting the accounting. (But for now I'm just making
sure to restore the status quo of these definitions, which was my
intent all along with adding the __CTASSERT.)

Issue noted by hannken@.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
 1.41 24-Apr-2025  riastradh x86: Make sure esp is aligned when delivering signal.

While here, use STACK_ALIGNBYTES consistently for the alignment mask
(or STACK_ALIGNBYTES32 in amd64 for the compat32 alignment mask).

PR kern/59327: user stack pointer is not aligned properly
 1.40 24-Apr-2025  kre Skip __CTASSERT() when _STANDALONE

Unbreak the amd64 build (the assembler did not like those things!).

These __CTASSERT()'s aren't really useful in any case, and
should probably just be deleted - they simply check that the
arithmetic in the previous few lines produces the answers
expected.

If those answers were critical, then we shouldn't be computing
them in the first place, and should simply
#define UPAGES 8
(conditionally, replacing all the previous computation which
generates UPAGES, and the 3 __CTASSERT()'s.)

But I suspect that they're not critical, they just happen to be
the answers currently expected to be achieved. That's not something
that should be asserted to be true, it isn't a required fact, it
could easily be altered in one of the cases if needed, and everything
else should cope with that.
 1.39 24-Apr-2025  riastradh amd64/param.h: Make UPAGES definition clearer.

Break it down into subaccounts that are summed at the end to make it
clear how much each part is using, and how many pages are actually
reserved for the stack.

No functional change intended.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
 1.38 29-Jun-2020  jdolecek branches: 1.38.26;
increase UPAGES (used for lwp kernel stack) for SVS so that the
amount of actually usable kernel stack is the same for SVS and
non-SVS kernels (currently 12 KiB)

discussed with maxv@, part of investigation for PR kern/S55402
 1.37 17-Mar-2020  maxv Add a redzone between the pcb and the stack. Sent to port-amd64@.
 1.36 08-Feb-2020  maxv Retire KLEAK.

KLEAK was a nice feature and served its purpose; it allowed us to detect
dozens of info leaks on the kernel->userland boundary, and thanks to it we
tackled a good part of the infoleak problem 1.5 years ago.

Nowadays however, we have kMSan, which can detect uninitialized memory in
the kernel. kMSan supersedes KLEAK: it can detect what KLEAK was able to
detect, but in addition, (1) it operates in all of the kernel and not just
the kernel->userland boundary, (2) it requires no user interaction, and (3)
it is deterministic and not statistical.

That makes kMSan the feature of choice to detect info leaks nowadays;
people interested in detecting info leaks should boot a kMSan kernel and
just wait for the magic to happen.

KLEAK was a good ride, and a fun project, but now is time for it to go.

Discussed with several people, including Thomas Barabosch.
 1.35 22-Jan-2020  ad Move the UBC defaults into vmparam.h
 1.34 17-Jan-2020  ad Bump UBC_WINSHIFT & UBC_NWINS to more reasonable values for amd64.
 1.33 14-Nov-2019  maxv branches: 1.33.2;
Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
 1.32 28-Sep-2019  christos remove local version of mstohz() now that <sys/param.h> provides it.
 1.31 20-Aug-2019  riastradh New macro ALIGNED_POINTER_LOAD.

To be used with ALIGNED_POINTER(p,t) instead of writing *(const t *)p
directly. This way, on machines without strict alignment, we can use
memcpy to pacify sanitizers, while getting the same compiled code in
the end with a single (say) MOV instruction.
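
The memcpy-based load can be sketched as below. demo_load_u32() is a hypothetical helper for a single type, not the actual ALIGNED_POINTER_LOAD macro, whose definition is not shown in this log.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: load a uint32_t from a possibly unaligned address
 * without dereferencing a cast pointer. On machines without strict
 * alignment, compilers typically turn this memcpy into a single load.
 */
static uint32_t
demo_load_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```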
 1.30 16-Mar-2019  rin branches: 1.30.4;
Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
System V ABI in kernel level. This is because

(1) for LLDB, we want to bypass libc/csu (and therefore manual stack
alignment in _start), and

(2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.

Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as
PR port-amd64/54052.
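
For illustration, a mask of (16 - 1) is typically applied as shown below. The demo_* names are hypothetical and this is not the kernel's own macro; it only shows how a stack pointer is rounded down to a 16-byte boundary.

```c
#include <stdint.h>

#define DEMO_STACK_ALIGNBYTES	(16 - 1)	/* mask for 16-byte alignment */

/* Round a stack pointer down to a 16-byte boundary (the stack grows down). */
static uint64_t
demo_stack_align(uint64_t sp)
{
	return sp & ~(uint64_t)DEMO_STACK_ALIGNBYTES;
}
```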
 1.29 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.28 07-Jan-2019  jdolecek move DEV_BSIZE, DEV_BSHIFT out of MD param.h, they are same on all ports

also move BLKDEV_IOSIZE, MAXPHYS, but allow override since some ports
have different value (powerpc uses NBPG for BLKDEV_IOSIZE, sun2/sun3
have lower MAXPHYS)
 1.27 02-Dec-2018  maxv Introduce KLEAK, a new feature that can detect kernel information leaks.

It works by tainting memory sources with marker values, letting the data
travel through the kernel, and scanning the kernel<->user frontier for
these marker values. Combined with compiler instrumentation and rotation
of the markers, it is able to yield relevant results with little effort.

We taint the pools and the stack, and scan copyout/copyoutstr. KLEAK is
supported on amd64 only for now, but it is not complicated to add more
architectures (just a matter of having the address of .text, and a stack
unwinder).

A userland tool is provided that allows executing a command in rounds
and monitoring the leaks generated all the while.

KLEAK already detected directly 12 kernel info leaks, and prompted changes
that in total fixed 25+ leaks.

Based on an idea developed jointly with Thomas Barabosch (of Fraunhofer
FKIE).
 1.26 22-Aug-2018  maxv Add support for monitoring the stack with kASan. This allows us to detect
illegal memory accesses occurring there.

The compiler inlines a piece of code in each function that adds redzones
around the local variables and poisons them. The illegal accesses are then
detected using the usual kASan machinery.

The stack size is doubled, from 4 pages to 8 pages.

Several boot functions are marked with the __noasan flag, to prevent the
compiler from adding redzones in them (because we haven't yet initialized
kASan). The kasan_early_init function is called early at boot time to
quickly create the shadow for the current stack; after this is done, we
don't need __noasan anymore in the boot path.

We pass -fasan-shadow-offset=0xDFFF900000000000, because the compiler
wants to do
shad = shadow-offset + (addr >> 3)
and we do, in kasan_addr_to_shad
shad = KASAN_SHADOW_START + ((addr - CANONICAL_BASE) >> 3)
hence
shad = KASAN_SHADOW_START + (addr >> 3) - (CANONICAL_BASE >> 3)
= [KASAN_SHADOW_START - (CANONICAL_BASE >> 3)] + (addr >> 3)
implies
shadow-offset = KASAN_SHADOW_START - (CANONICAL_BASE >> 3)
= 0xFFFF800000000000 - (0xFFFF800000000000 >> 3)
= 0xDFFF900000000000
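
The derivation can be double-checked with a few lines of C. The constants are the ones substituted in the derivation above; the demo_* function names are hypothetical and only mirror the two formulas.

```c
#include <stdint.h>

#define DEMO_CANONICAL_BASE	UINT64_C(0xFFFF800000000000)
#define DEMO_SHADOW_START	UINT64_C(0xFFFF800000000000)

/* Offset handed to the compiler: START - (BASE >> 3). */
static uint64_t
demo_shadow_offset(void)
{
	return DEMO_SHADOW_START - (DEMO_CANONICAL_BASE >> 3);
}

/* Kernel-side formula, as in kasan_addr_to_shad. */
static uint64_t
demo_kernel_shad(uint64_t addr)
{
	return DEMO_SHADOW_START + ((addr - DEMO_CANONICAL_BASE) >> 3);
}

/* Compiler-side formula: shadow-offset + (addr >> 3). */
static uint64_t
demo_compiler_shad(uint64_t addr)
{
	return demo_shadow_offset() + (addr >> 3);
}
```

The two formulas agree because CANONICAL_BASE is a multiple of 8, so the shift distributes over the subtraction.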

In UVM, we add a kasan_free (that is not preceded by a kasan_alloc). We
don't add poisoned redzones ourselves, but all the functions we execute
do, so we need to manually clear the poison before freeing the stack.

With the help of Kamil for the makefile stuff.
 1.25 16-Mar-2018  maxv branches: 1.25.2;
Add one more page for the stack, to compensate for the fact that SVS's
stack switching mechanism consumes approximately one page.
 1.24 19-Feb-2018  sborrill branches: 1.24.2;
Double size of MSGBUFSIZE as existing value is not big enough to hold boot dmesg
on modern server-class hardware with lots of CPUs, etc.
 1.23 11-Jan-2018  maxv Initialize ist0 in cpu_init_tss. On amd64 this is the DDB stack, and it has
nothing to do with ci_intrstack. While here, style, and don't forget to
pass UVM_KMF_ZERO in uvm_km_alloc.
 1.22 14-Jun-2017  maxv Define MAXPHYSMEM globally.
 1.21 02-Feb-2017  maxv branches: 1.21.6;
Increase KERNTEXTOFF from 1MB to 2MB on amd64. [1MB; 2MB[ is now handled
by UVM, so there is no physical loss.

On amd64 we always remap the kernel text with 2MB pages, and because of the
1MB start address we were forced to map [0MB; 2MB[ inside the first large
page. The problem is, the lower half is used by UVM to allocate physical
pages, and it is possible that some of these could be used by userland. We
could end up with userland-controllable data mapped into the kernel text on
a privileged page, which is far from being a good idea from a security pov.

I am not fixing i386 yet, because the large page size depends on PAE, and
we probably don't want to have a text located at 4MB on low-memory systems.

(note: I didn't introduce this issue, it was already there when I came in)
 1.20 20-Jan-2017  maya increase max io mem on amd64. some devices need it.
 1.19 27-Oct-2015  mrg branches: 1.19.2; 1.19.4;
make sure MSGBUFSIZE can't expand strangely by using parens.
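
The kind of surprise that parentheses prevent can be shown with a hypothetical macro pair. The values below are made up for the example; the real MSGBUFSIZE definition is not reproduced in this log.

```c
/* Hypothetical values, chosen only to demonstrate operator precedence. */
#define DEMO_PAGE		4096
#define DEMO_SIZE_BARE		16 * DEMO_PAGE		/* fragile */
#define DEMO_SIZE_PARENS	(16 * DEMO_PAGE)	/* robust */

/* Expands to 65536L / 16 * 4096, which is evaluated left to right. */
static long
demo_ratio_bare(void)
{
	return 65536L / DEMO_SIZE_BARE;
}

/* Expands to 65536L / (16 * 4096). */
static long
demo_ratio_parens(void)
{
	return 65536L / DEMO_SIZE_PARENS;
}
```

The bare form yields 16777216 while the parenthesized form yields 1, although both macros look like "16 pages".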
 1.18 20-Apr-2012  rmind branches: 1.18.2; 1.18.14; 1.18.16;
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
 1.17 04-Feb-2012  para branches: 1.17.2;
improve sizing of kmem_arena now that more allocations are made from it
don't enforce limits if not required

ok: riz@
 1.16 24-Jan-2012  christos Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.
 1.15 20-Jan-2012  joerg Change CMSG_SPACE and CMSG_LEN to provide Integer Constant Expressions
again. This was changed in sys/socket.h r1.51 to work around fallout
from the IPv6 aux data migration. It broke the historic ABI on some
platforms. This commit restores compatibility for netbsd32 code on such
platforms and provides a template for future changes to the CMSG_*
alignment. Revert PCC/Clang workarounds in postfix and tmux.
 1.14 26-Jul-2011  yamt branches: 1.14.2; 1.14.6;
g/c round_pdr
 1.13 08-Feb-2010  joerg Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.
 1.12 11-Nov-2009  haad branches: 1.12.2;
Revert change which was not meant to be committed.
 1.11 11-Nov-2009  haad Build kernel modules with -mno-red-zone like the kernel is built. This fixes
frequent panics in amd64 zfs module. This should also fix problem reported
by Nicolas Joly in:

http://mail-index.netbsd.org/port-amd64/2008/12/09/msg000646.html

Thanks to cube@ for his help with this.
 1.10 20-Dec-2008  ad branches: 1.10.2;
- Kill NOREDZONE.
- Make the redzone conditional on DIAGNOSTIC.
- Give amd64 an additional page for the uarea. 2 is not enough.
 1.9 26-Oct-2008  mrg branches: 1.9.2; 1.9.4;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.8 08-Jan-2008  yamt branches: 1.8.6; 1.8.10; 1.8.16;
change the layout in u-area and reduce UPAGES.
 1.7 05-Jan-2008  yamt - make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.6 18-Oct-2007  yamt branches: 1.6.2; 1.6.8;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.5 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 13-Oct-2007  joerg branches: 1.4.2;
Bump default size of the message buffer from 16 KB to 32 KB.
This is large enough that boot -v on most systems fits into the
message buffer, which makes it easier for debugging.
 1.3 28-Aug-2006  yamt branches: 1.3.12; 1.3.20; 1.3.30; 1.3.32; 1.3.34;
- remove unused bdbtofsb.
- move the following macros from MD headers to sys/param.h.
ctod
dtoc
ctob
btoc
dbtob
btodb
 1.2 12-Feb-2006  chs branches: 1.2.2;
increase NKMEMPAGES_MAX_DEFAULT to 1 GB.
this allows lots more memory to be used for amaps, etc.
 1.1 26-Apr-2003  fvdl branches: 1.1.16; 1.1.18; 1.1.30; 1.1.32; 1.1.34;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.34.1 22-Apr-2006  simonb Sync with head.
 1.1.32.1 09-Sep-2006  rpaulo sync with head
 1.1.30.1 18-Feb-2006  yamt sync with head.
 1.1.18.4 21-Jan-2008  yamt sync with head
 1.1.18.3 27-Oct-2007  yamt sync with head.
 1.1.18.2 30-Dec-2006  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.1.16.1 14-Feb-2006  tron Pull up following revision(s) (requested by chs in ticket #1166):
sys/arch/amd64/include/param.h: revision 1.2
increase NKMEMPAGES_MAX_DEFAULT to 1 GB.
this allows lots more memory to be used for amaps, etc.
 1.2.2.1 03-Sep-2006  yamt sync with head.
 1.3.34.2 14-Oct-2007  yamt sync with head.
 1.3.34.1 07-Oct-2007  yamt remove some #ifdef _LOCORE and use genassym instead.
 1.3.32.2 09-Jan-2008  matt sync with HEAD
 1.3.32.1 06-Nov-2007  matt sync with HEAD
 1.3.30.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.3.20.1 16-Oct-2007  garbled Sync with HEAD
 1.3.12.1 23-Oct-2007  ad Sync with head.
 1.4.2.1 25-Oct-2007  bouyer Sync with HEAD.
 1.6.8.1 08-Jan-2008  bouyer Sync with HEAD
 1.6.2.1 18-Feb-2008  mjf Sync with HEAD.
 1.8.16.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.8.10.2 11-Mar-2010  yamt sync with head
 1.8.10.1 04-May-2009  yamt sync with head.
 1.8.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.9.4.1 16-Feb-2009  snj Pull up following revision(s) (requested by ad in ticket #355):
sys/arch/amd64/amd64/vm_machdep.c: revision 1.37
sys/arch/amd64/include/param.h: revision 1.10
- Kill NOREDZONE.
- Make the redzone conditional on DIAGNOSTIC.
- Give amd64 an additional page for the uarea. 2 is not enough.
 1.9.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.10.2.2 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.10.2.1 24-Oct-2010  jym Sync with HEAD
 1.12.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.14.6.2 29-Apr-2012  mrg sync to latest -current.
 1.14.6.1 18-Feb-2012  mrg merge to -current.
 1.14.2.2 23-May-2012  yamt sync with head.
 1.14.2.1 17-Apr-2012  yamt sync with head
 1.17.2.1 09-May-2012  riz Pull up following revision(s) (requested by rmind in ticket #202):
sys/arch/x86/include/cpuvar.h: revision 1.46
sys/arch/xen/include/xenpmap.h: revision 1.34
sys/arch/i386/include/param.h: revision 1.77
sys/arch/x86/x86/pmap_tlb.c: revision 1.5
sys/arch/x86/x86/pmap_tlb.c: revision 1.6
sys/arch/i386/i386/genassym.cf: revision 1.92
sys/arch/xen/x86/cpu.c: revision 1.91
sys/arch/x86/x86/pmap.c: revision 1.177
sys/arch/xen/x86/xen_pmap.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.31
sys/kern/subr_kcpuset.c: revision 1.5
sys/arch/amd64/include/param.h: revision 1.18
sys/sys/kcpuset.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.26
sys/arch/x86/x86/mtrr_i686.c: revision 1.27
sys/arch/xen/x86/x86_xpmap.c: revision 1.43
sys/arch/x86/x86/cpu.c: revision 1.98
sys/arch/amd64/amd64/mptramp.S: revision 1.14
sys/kern/sys_sched.c: revision 1.42
sys/arch/amd64/amd64/genassym.cf: revision 1.50
sys/arch/i386/i386/mptramp.S: revision 1.24
sys/arch/x86/include/pmap.h: revision 1.52
sys/arch/x86/include/cpu.h: revision 1.50
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
- pmap_tlb_shootdown: do not overwrite tp_cpumask with pm_cpus, but merge
like pm_kernel_cpus. Remove unnecessary intersection with kcpuset_running.
Do not reset tp_userpmap if pmap_kernel().
- Remove pmap_tlb_mailbox_t wrapping, which is pointless after recent changes.
- pmap_tlb_invalidate, pmap_tlb_intr: constify for packet structure.
i686_mtrr_init_first: handle the case when there are no variable-size MTRR
registers available (i686_mtrr_vcnt == 0).
 1.18.16.3 28-Aug-2017  skrll Sync with HEAD
 1.18.16.2 05-Feb-2017  skrll Sync with HEAD
 1.18.16.1 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.18.14.2 30-Mar-2019  bouyer Pull up following revision(s) (requested by rin in ticket #1687):
sys/arch/amd64/include/param.h: revision 1.30
Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
System V ABI in kernel level. This is because
(1) for LLDB, we want to bypass libc/csu (and therefore manual stack
alignment in _start), and
(2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.
Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as
PR port-amd64/54052.
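
As a rough illustration of what a mask of (16 - 1) means for a stack pointer: the helpers below are hypothetical and are not the kernel's STACK_ALIGN macro, and they assume a stack that grows downward.

```c
#include <assert.h>
#include <stdint.h>

/* Mask value from the message above: 16-byte alignment. */
#define EX_STACK_ALIGNBYTES	(16 - 1)

static_assert((EX_STACK_ALIGNBYTES & (EX_STACK_ALIGNBYTES + 1)) == 0,
    "mask must be a power of two minus one");

/* Hypothetical helper: round a downward-growing stack pointer down. */
static uintptr_t
ex_stack_align(uintptr_t sp)
{
	return sp & ~(uintptr_t)EX_STACK_ALIGNBYTES;
}

/* Nonzero if sp already sits on a 16-byte boundary. */
static int
ex_is_aligned(uintptr_t sp)
{
	return (sp & EX_STACK_ALIGNBYTES) == 0;
}
```

A pointer such as 0x1038 is 8-byte aligned but not 16-byte aligned; the helper rounds it down to 0x1030.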
 1.18.14.1 26-Mar-2017  snj Pull up following revision(s) (requested by maya in ticket #1375):
sys/arch/amd64/include/param.h: revision 1.20
sys/arch/i386/include/param.h: revision 1.80
sys/arch/x86/x86/bus_space.c: revision 1.39
increase max io mem on amd64. some devices need it.
 1.18.2.2 03-Dec-2017  jdolecek update from HEAD
 1.18.2.1 12-Sep-2012  tls Initial snapshot of work to eliminate 64K MAXPHYS. Basically works for
physio (I/O to raw devices); needs more doing to get it going with the
filesystems, but it shouldn't damage data.

All work's been done on amd64 so far. Not hard to add support to other
ports. If others want to pitch in, one very helpful thing would be to
sort out when and how IDE disks can do 128K or larger transfers, and
adjust the various PCI IDE (or at least ahcisata) drivers and wd.c
accordingly -- it would make testing much easier. Another very helpful
thing would be to implement a smart minphys() for RAIDframe along the
lines detailed in the MAXPHYS-NOTES file.
 1.19.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.19.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.21.6.4 29-Mar-2019  martin Pull up following revision(s) (requested by rin in ticket #1220):

sys/arch/amd64/include/param.h: revision 1.30

Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
System V ABI in kernel level. This is because

(1) for LLDB, we want to bypass libc/csu (and therefore manual stack
alignment in _start), and
(2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.

Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as

PR port-amd64/54052.
 1.21.6.3 11-Apr-2018  martin Pull up following revision(s) (requested by sborrill in ticket #736):

sys/arch/i386/include/param.h: revision 1.83
sys/arch/amd64/include/param.h: revision 1.24

Double size of MSGBUFSIZE as existing value is not big enough to hold
boot dmesg on modern server-class hardware with lots of CPUs, etc.
 1.21.6.2 22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.21.6.1 16-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in #635:

sys/arch/amd64/amd64/gdt.c 1.39-1.45 (patch)
sys/arch/amd64/amd64/amd64/machdep.c 1.284,1.287,1.288 (patch)
sys/arch/amd64/amd64/include/param.h 1.23 (patch)
sys/arch/amd64/include/types.h 1.53 (patch)
sys/arch/x86/include/cpu.h 1.87 (patch)
sys/arch/x86/include/pmap.h 1.73,1.74 (patch)
sys/arch/x86/x86/cpu.c 1.142 (patch)
sys/arch/x86/x86/intr.c 1.117 (partial),1.120 (patch)
sys/arch/x86/x86/pmap.c 1.276 (patch)

Initialize ist0 in cpu_init_tss.
Backport __HAVE_PCPU_AREA.
 1.24.2.4 18-Jan-2019  pgoyette Synch with HEAD
 1.24.2.3 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.24.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.24.2.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.25.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.25.2.1 10-Jun-2019  christos Sync with HEAD
 1.30.4.1 08-Dec-2019  martin Pull up following revision(s) (requested by riastradh in ticket #505):

common/lib/libc/hash/murmurhash/murmurhash.c: revision 1.7
common/lib/libc/hash/murmurhash/murmurhash.c: revision 1.8
sys/sys/param.h: revision 1.610
sys/arch/amd64/include/param.h: revision 1.31
sys/arch/i386/include/param.h: revision 1.85

New macro ALIGNED_POINTER_LOAD.

To be used with ALIGNED_POINTER(p,t) instead of writing *(const t *)p
directly. This way, on machines without strict alignment, we can use
memcpy to pacify sanitizers, while getting the same compiled code in
the end with a single (say) MOV instruction.

Fix byte order bug in murmurhash and pacify sanitizers.
add now required includes for memcpy prototypes analogue to other hash functions
(fix the build)
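
The memcpy idea described above can be sketched like this. ex_load_u32 is a hypothetical helper, not the ALIGNED_POINTER_LOAD macro itself, whose exact definition is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value from a possibly unaligned
 * address. Going through memcpy avoids dereferencing a misaligned
 * pointer; compilers commonly reduce it to a single load on machines
 * without strict alignment.
 */
static uint32_t
ex_load_u32(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

The value returned is in host byte order, exactly as a direct *(const uint32_t *)p would give on a machine that tolerates the unaligned access.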
 1.33.2.3 29-Feb-2020  ad Sync with head.
 1.33.2.2 25-Jan-2020  ad Sync with head.
 1.33.2.1 17-Jan-2020  ad Sync with head.
 1.38.26.1 02-Aug-2025  perseant Sync with HEAD
 1.35 28-Apr-2025  riastradh xen: Stop-gap FPU PCB fix; disable Intel AMX for now.

Since the custom cpu_uarea_alloc/free are disabled under XENPV,
nothing would initialize struct pcb::pcb_savefpu to point either to
struct pcb::pcb_savefpusmall, or to a separately allocated large area
on machines with Intel AMX TILECFG/TILEDATA requiring it. So the
memset in fpu_lwp_fork would crash on null pointer dereference:

[ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e
[ 1.0000030] fatal page fault in supervisor mode
[ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38
[ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0
kernel: page fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi)
memset() at netbsd:memset+0x2c
lwp_create() at netbsd:lwp_create+0x2f1
fork1() at netbsd:fork1+0x42c
main() at netbsd:main+0x44f

In order to support Intel AMX TILECFG/TILEDATA, or any other CPU
extensions that increase the XSAVE area beyond what fits in a single
page after struct pcb, we would need to enable the custom
cpu_uarea_alloc/free. Currently that would imply allocating stack
guard pages (`redzone') under XENPV; if there's some reason the stack
guard pages don't work, we could also push #ifdef XENPV conditionals
into cpu_uarea_alloc/free to cover the guard pages -- to be
considered.

PR kern/59371: Xen domU uvm_fault since FPU state allocation patch

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
 1.34 24-Apr-2025  kre offsetof() needs <stddef.h> (<sys/stddef.h>)

Include <sys/stddef.h> when offsetof() is to be used.

First step in fixing x86 builds.
 1.33 24-Apr-2025  riastradh amd64: Allocate FPU save state outside pcb if it's too large.

We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
state.

We only do this for user threads, and only on machines where it's
necessary, to avoid incurring much overhead. There is still a tiny
bit of overhead when saving and restoring the FPU state by using a
pointer indirection instead of arithmetic indirection for access to
struct pcb::pcb_savefpu, but this is probably a drop in the bucket
compared to the memory traffic incurred by the FPU state save/restore
anyway.

For now, these paths are mostly disabled on i386. We could enable
them but it will require either rewriting cpu_uarea_alloc/free for
i386, or adopting a guard page like amd64 does, which might be costly
and so should be undertaken only with some thought and care. And
since Intel AMX instructions only work in 64-bit mode, it's not
likely to be useful on i386.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu

These changes, as a side effect, may fix:

PR kern/57258: kthread_fpu_enter/exit problem

by making sure to allocate an FPU save space that is large enough to
guarantee fpu_kern_enter/leave work safely, instead of just using a
union savefpu object on the stack (which, at 576 bytes, may be too
small on some machines, particularly with AVX512 requiring ~2.5K).
(But we'll have to do some extra work with kthread_fpu_enter/exit_md
-- if we try doing them again on x86 -- to actually allocate the
separate pcb on these machines!)
 1.32 17-Mar-2020  maxv branches: 1.32.28;
Add a redzone between the pcb and the stack. Sent to port-amd64@.
 1.31 12-Oct-2019  christos disable CTASSERT for lint
 1.30 12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.29 26-Jul-2018  maxv Rework dbregs, to switch the registers during context switches, and not on
each user->kernel transition via userret. Reloads of DR6/DR7 are expensive
on both native and xen.
 1.28 31-Dec-2017  maxv branches: 1.28.2; 1.28.4;
gc unused
 1.27 31-Oct-2017  maxv Don't embed our own values in the reserved fields of the XSAVE area, it
really is a bad idea. Move them into the PCB.
 1.26 23-Feb-2017  kamil Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so it is removed entirely, without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the
debug registers, to achieve this PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event
monitoring function is designed to be used
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- debugger is responsible to retrieve debug register state to distinguish
the exact debug register trap (DR6 is Status Register on x86)
- kernel must not remove debug register traps after triggering a trap event
a debugger is responsible to detach this trap with appropriate PT_SETDBREGS
call (DR7 is Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently other interface
PT_GET_SIGINFO/PT_SET_SIGINFO is required in netbsd32 compat code in order to
validate reliably PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.25 20-Feb-2014  dsl branches: 1.25.6; 1.25.10; 1.25.14;
Move the amd64 and i386 pcb to the bottom of the uarea, and move the
kernel stack to the top.
Change the pcb layouts so that fpu save area is at the end and is
64byte aligned ready for xsave (saving the ymm registers).
Welcome to 6.99.32
 1.24 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.23 07-Feb-2014  dsl Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.22 19-Jan-2014  dsl Remove the unused 'struct md_coredump'.
 1.21 11-Dec-2013  dsl Remove the fields that were used to save the i387 fp state on interrupt.
They were written but never read.
Possibly they should be saved for 32 bit processes, but that might be a relic
from real i387 where the fpu was actually asynchronous.
 1.20 01-Dec-2013  christos revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.19 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.18 31-Dec-2012  dsl branches: 1.18.2;
Move the two fields used to save some i387 state on the last fpu trap
into their own sub-structure of the pcb (from 'struct savefpu').
They only (seem) to be used in some code that generates core dumps
for 32bit processes (code that might be broken as well!).
'struct savefpu' is now identical to 'struct fxsave64'. One (or both)
needs extending to support AVX - might need to be dynamically sized.
Removed all the __aligned(16) except for the one in struct pcb itself.
Only the copy used for the fsave instruction need be aligned.
 1.17 07-Jul-2010  chs branches: 1.17.8; 1.17.18;
add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.16 27-Oct-2009  rmind branches: 1.16.2; 1.16.4;
Make pcb_ldt_sel, in amd64, an unused field. Unlike in i386, it was
missed during clean-up of LDT handling.
 1.15 26-Oct-2008  mrg branches: 1.15.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.14 30-Apr-2008  ad branches: 1.14.6;
lcr0() was changed to take a u_long. pcb_cr0 was a 32-bit signed quantity.
It was being sign extended in cpu_hatch() (CR0_PG is always set), causing
systems to crash and reboot before going multiuser.
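
The sign-extension problem can be reproduced in isolation. This is a sketch with hypothetical names; uint64_t stands in for the 64-bit u_long of amd64, and the unsigned-to-signed conversion relies on the usual two's complement behavior of common compilers.

```c
#include <assert.h>
#include <stdint.h>

#define EX_CR0_PG	0x80000000u	/* paging enable, bit 31 of CR0 */

static_assert(EX_CR0_PG == (1u << 31), "CR0_PG is bit 31");

/* Widen a CR0 value that was stored in a 32-bit signed field. */
static uint64_t
ex_widen_signed(uint32_t cr0)
{
	int32_t stored = (int32_t)cr0;	/* negative whenever bit 31 is set */

	return (uint64_t)(int64_t)stored;	/* upper 32 bits become all ones */
}

/* Widen the same value from an unsigned field: upper bits stay zero. */
static uint64_t
ex_widen_unsigned(uint32_t cr0)
{
	return (uint64_t)cr0;
}
```

Because CR0_PG is always set, the signed path always sets the upper 32 bits of the widened value.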
 1.13 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.12 16-Apr-2008  cegger branches: 1.12.2; 1.12.4;
use POSIX integer types
 1.11 05-Jan-2008  yamt branches: 1.11.6;
- make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.10 27-Nov-2007  christos branches: 1.10.6;
Shuffle things around so that pcb_savefpu goes back to being aligned on a
16-byte boundary. Noted by Arto Huusko.
 1.9 26-Nov-2007  christos make cr2 64 bits. Requested by fvdl.
 1.8 24-Nov-2007  christos preserve cr2 on pcb for the benefit of linux emulation.
 1.7 17-Oct-2007  garbled branches: 1.7.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.6 17-May-2007  yamt branches: 1.6.8; 1.6.10;
merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.5 04-Mar-2007  christos branches: 1.5.2; 1.5.4; 1.5.10;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.4 11-Dec-2005  christos branches: 1.4.26;
merge ktrace-lwp.
 1.3 15-May-2005  fvdl branches: 1.3.2;
Optionally include saving and restoring the 64bit %gs and %fs base register
values in the PCB. Do this in pmap_activate for now (XXX not a good place
for it, but a convenient one).
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.2.3 21-Jan-2008  yamt sync with head
 1.3.2.2 07-Dec-2007  yamt sync with head
 1.3.2.1 03-Sep-2007  yamt sync with head.
 1.4.26.2 12-Mar-2007  rmind Sync with HEAD.
 1.4.26.1 03-Mar-2007  yamt adapt amd64.

XXX changes in identcpu.c are the minimum for MONITOR.
XXX identcpu.c should be shared with i386.
 1.5.10.1 22-May-2007  matt Update to HEAD.
 1.5.4.1 11-Jul-2007  mjf Sync with head.
 1.5.2.2 03-Dec-2007  ad Sync with HEAD.
 1.5.2.1 27-May-2007  ad Sync with head.
 1.6.10.2 09-Jan-2008  matt sync with HEAD
 1.6.10.1 06-Nov-2007  matt sync with HEAD
 1.6.8.1 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.7.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.7.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.10.6.1 08-Jan-2008  bouyer Sync with HEAD
 1.11.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.11.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.12.4.4 11-Aug-2010  yamt sync with head.
 1.12.4.3 11-Mar-2010  yamt sync with head
 1.12.4.2 04-May-2009  yamt sync with head.
 1.12.4.1 16-May-2008  yamt sync with head.
 1.12.2.1 18-May-2008  yamt sync with head.
 1.14.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.15.8.2 24-Oct-2010  jym Sync with HEAD
 1.15.8.1 01-Nov-2009  jym Sync with HEAD.
 1.16.4.1 05-Mar-2011  rmind sync with head
 1.16.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.17.18.3 03-Dec-2017  jdolecek update from HEAD
 1.17.18.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.17.18.1 25-Feb-2013  tls resync with head
 1.17.8.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.17.8.1 23-Jan-2013  yamt sync with head
 1.18.2.1 18-May-2014  rmind sync with head
 1.25.14.1 21-Apr-2017  bouyer Sync with HEAD
 1.25.10.1 20-Mar-2017  pgoyette Sync with HEAD
 1.25.6.1 28-Aug-2017  skrll Sync with HEAD
 1.28.4.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.28.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.28.4.1 10-Jun-2019  christos Sync with HEAD
 1.28.2.1 28-Jul-2018  pgoyette Sync with HEAD
 1.32.28.1 02-Aug-2025  perseant Sync with HEAD
 1.2 15-Jun-2003  fvdl Handle 64bit DMA addresses on PCI for platforms that can (currently only
enabled on amd64). Add a dmat64 field to various PCI attach structures,
and pass it down where needed. Implement a simple new function called
pci_dma64_available(pa) to test if 64bit DMA addresses may be used.
This returns 1 iff _PCI_HAVE_DMA64 is defined in <machine/pci_machdep.h>,
and there is more than 4G of memory.
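
The rule stated above (1 iff the macro is defined and there is more than 4G of memory) can be sketched as follows. This is not the real pci_dma64_available(pa): the EX_* names are hypothetical, and the memory size is passed in as a parameter purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for _PCI_HAVE_DMA64 from <machine/pci_machdep.h>. */
#define EX_PCI_HAVE_DMA64

#define EX_4G	(4ULL * 1024 * 1024 * 1024)

static_assert(EX_4G == 0x100000000ULL, "4G is 2^32 bytes");

/* Returns 1 only when the platform opts in and memory exceeds 4G. */
static int
ex_dma64_available(uint64_t physmem_bytes)
{
#ifdef EX_PCI_HAVE_DMA64
	return physmem_bytes > EX_4G;
#else
	(void)physmem_bytes;
	return 0;
#endif
}
```

A machine with exactly 4G of memory gets 0 here, since every physical address still fits in 32 bits.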
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.69 20-Aug-2022  riastradh x86: Move definition of struct pmap to pmap_private.h.

This makes pmap_resident_count and pmap_wired_count out-of-line
functions instead of inline. No functional change intended
otherwise.
 1.68 20-Aug-2022  riastradh x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9)
API, which loads of things in the kernel rely on, so changing x86
pmap internals no longer requires recompiling the entire kernel every
time.

Callers needing these internals must now use machine/pmap_private.h.
Note: This is not x86/pmap_private.h because it contains three parts:

1. CPU-specific (different for i386/amd64) definitions used by...

2. common definitions, including Xenisms like xpmap_ptetomach,
further used by...

3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h
for 2, and then defines 3. Maybe we should split that out into a new
pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must
include machine/pmap_private.h when previously uvm/uvm_pmap.h
polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h --
specifically the parts that are needed for several constants defined
in vmparam.h:

VM_MAXUSER_ADDRESS
VM_MAX_ADDRESS
VM_MAX_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64
too, just to keep things parallel.
 1.67 20-Aug-2022  riastradh x86: Move struct vm_page_md to common x86/pmap.h.
 1.66 15-May-2020  ad Revert previous after thinking about it. It was wrong, don't need to use
an atomic to clear a PTE or set initial version unless the circumstances
call for it.
 1.65 17-Mar-2020  ad Always set PTEs using atomics. There are too many assumptions to go wrong.
 1.64 14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.
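As a rough userland sketch of the two shadows described above (not the kernel's actual code; the helper names and the flat byte-array layout are assumptions made purely for illustration), one shadow bit per real bit and one origin cell per 4 bytes could be modelled like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: 'shad' has the same size as the tracked buffer. */

/* Mark 'len' bytes as uninitialized: every shadow bit is set to 1. */
static void
shad_mark_uninit(uint8_t *shad, size_t len)
{
	memset(shad, 0xff, len);
}

/* Mark 'len' bytes as initialized: every shadow bit is cleared. */
static void
shad_mark_init(uint8_t *shad, size_t len)
{
	memset(shad, 0x00, len);
}

/* True only if no bit in the range is still marked uninitialized. */
static bool
shad_is_init(const uint8_t *shad, size_t len)
{
	assert(shad != NULL);
	for (size_t i = 0; i < len; i++) {
		if (shad[i] != 0)
			return false;
	}
	return true;
}

/* Index of the 4-byte origin cell that covers byte offset 'off'. */
static size_t
orig_cell_index(size_t off)
{
	return off / sizeof(uint32_t);
}
```

A check such as the one done before copyout would then amount to calling shad_is_init() on the shadow of the outgoing buffer.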

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, i.e. we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
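To illustrate the idea of packing a type code and a compressed pointer into one uint32_t cell, here is a minimal sketch. The 3-bit/29-bit split, the enum values and the function names are assumptions chosen for the example, not the encoding actually used by the kernel; the pointer is "compressed" by storing it as an offset from a known base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 3 bits of memory type, 29 bits of pointer offset. */
#define ORIG_TYPE_SHIFT	29u
#define ORIG_PTR_MASK	((UINT32_C(1) << ORIG_TYPE_SHIFT) - 1)

/* Hypothetical type codes. */
enum orig_type { ORIG_STACK = 0, ORIG_POOL = 1, ORIG_KMEM = 2 };

/* Compress a pointer as an offset from 'base', then tag it with a type. */
static uint32_t
orig_encode(enum orig_type type, uintptr_t ptr, uintptr_t base)
{
	uintptr_t off = ptr - base;

	assert(off <= ORIG_PTR_MASK);	/* the offset must fit in 29 bits */
	return ((uint32_t)type << ORIG_TYPE_SHIFT) | (uint32_t)off;
}

/* Recover the type code from a cell. */
static enum orig_type
orig_type_of(uint32_t cell)
{
	return (enum orig_type)(cell >> ORIG_TYPE_SHIFT);
}

/* Recover the full pointer from a cell, given the same base. */
static uintptr_t
orig_ptr_of(uint32_t cell, uintptr_t base)
{
	return base + (cell & ORIG_PTR_MASK);
}
```

Because everything lives in the cell itself, no side table is needed to go from a cell back to a name or a .text address, which is the property the message above relies on.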

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so
architectures other than amd64 can be supported.
 1.63 01-Nov-2019  maxv Fix KUBSAN: the kernel size now exceeds the mapping limit, so bump the
limit.
 1.62 07-Aug-2019  maxv Add support for USER_LDT in SVS. This allows us to have both enabled at
the same time.

We allocate an LDT for each CPU in the GDT and map an area for it, in
addition to the default LDT already present. In context switches between
different processes, we choose between the default or the per-cpu LDT
selector: if the user set specific LDT entries, we memcpy them to the
per-cpu LDT and load the per-cpu selector.

Tested by Naveen Narayanan (with Wine on amd64).
 1.61 29-May-2019  maxv Add PCID support in SVS. This avoids TLB flushes during kernel<->user
transitions, which greatly reduces the performance penalty introduced by
SVS.

We use two ASIDs, 0 (kern) and 1 (user), and use invpcid to flush pages
in both ASIDs.

The read-only machdep.svs.pcid={0,1} sysctl is added, and indicates whether
SVS+PCID is in use.
 1.60 09-Mar-2019  maxv New software PTE bits.
 1.59 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.58 19-Nov-2018  maxv Introduce pl_pi, will be used soon.
 1.57 19-Nov-2018  maxv Rename 'mask' -> 'frame', we will use the real 'mask' soon.
 1.56 29-Aug-2018  maxv Remove the constants of the DMAP, they are unused, and move NL4_SLOT_DIRECT
into amd64/.
 1.55 20-Aug-2018  maxv Add support for kASan on amd64. Written by me, with some parts inspired
by Siddharth Muralee's initial work. This feature can detect several
kinds of memory bugs, and it's an excellent feature.

It can be enabled by uncommenting these three lines in GENERIC:

#makeoptions KASAN=1 # Kernel Address Sanitizer
#options KASAN
#no options SVS

The kernel is compiled without SVS, without DMAP and without PCPU area.
A shadow area is created at boot time, and it can cover the upper 128TB
of the address space. This area is populated gradually as we allocate
memory. With this design the memory consumption is kept at its lowest
level.

The compiler calls the __asan_* functions each time a memory access is
done. We verify whether this access is legal by looking at the shadow
area.

We declare our own special memcpy/memset/etc functions, because the
compiler's builtins don't add the __asan_* instrumentation.

Initially all the mappings are marked as valid. During dynamic
allocations, we add a redzone, which we mark as invalid. Any access on
it will trigger a kASan error message. Additionally, the compiler adds
a redzone on global variables, and we mark these redzones as invalid too.
The illegal-access detection works with a 1-byte granularity.
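The shadow lookup can be sketched as follows. This assumes the conventional ASan scheme of one shadow byte per 8 bytes of memory (the message above does not state the scale), and the macro and function names are hypothetical. With that scale, the upper 128TB of address space needs 16TB of shadow virtual space, and a partially valid 8-byte group is what gives the 1-byte granularity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Start of the upper 128TB half of the amd64 address space. */
#define CANONICAL_BASE		UINT64_C(0xFFFF800000000000)
/* Assumed scale: 1 shadow byte describes 8 bytes of real memory. */
#define SHADOW_SCALE_SHIFT	3

/* Offset, within the shadow area, of the byte describing 'va'. */
static uint64_t
shadow_index(uint64_t va)
{
	assert(va >= CANONICAL_BASE);
	return (va - CANONICAL_BASE) >> SHADOW_SCALE_SHIFT;
}

/*
 * Decide whether a 1-byte access at 'va' is legal, given its shadow byte:
 *   0      all 8 bytes of the group are valid
 *   1..7   only the first N bytes of the group are valid
 *   < 0    the whole group is invalid (e.g. a redzone)
 */
static bool
byte_access_ok(int8_t shad, uint64_t va)
{
	if (shad == 0)
		return true;
	if (shad < 0)
		return false;
	return (int8_t)(va & 7) < shad;
}
```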

For now, we cover three areas:

- global variables
- kmem_alloc-ated areas
- malloc-ated areas

More will come, but that's a good start.
 1.54 17-Aug-2018  maxv Remove big outdated comment, remove unused macros, remove XXX that has
nothing to do here, style.
 1.53 12-Aug-2018  maxv More ASLR: randomize the location of the PTE area. The PTE slot is not
created in locore anymore, but a little later; by using the already
entered L4 page, rather than the recursive slot itself (which doesn't
exist yet).

In the prekern we still map the slot - the prekern behaves as an external
locore -, because we need it as part of the randomization/relocation
work. The kernel then removes this slot, and regenerates a randomized
one.

Tested on GENERIC and GENERIC_KASLR, Xen doesn't have it and dom0 still
boots fine.
 1.52 12-Aug-2018  maxv Move the PTE area from slot 255 to slot 509. I've never understood why we
put it on 255; the "kernel" half of the VM space begins on slot 256, so
if anything, the PTE area should have been above it, not below.
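For reference, the arithmetic behind the slot numbers can be sketched as below. Each L4 slot covers 512GB (1 << 39), and addresses with bit 47 set must be sign-extended to be canonical; l4_slot_to_va() is a hypothetical helper written for this example, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define L4_SHIFT	39
#define NBPD_L4		(UINT64_C(1) << L4_SHIFT)	/* 512GB per L4 slot */

/* Base virtual address covered by a given L4 slot (0..511). */
static uint64_t
l4_slot_to_va(unsigned slot)
{
	uint64_t va;

	assert(slot < 512);
	va = (uint64_t)slot << L4_SHIFT;
	/* Sign-extend bit 47 so that the address is canonical. */
	if (va & (UINT64_C(1) << 47))
		va |= UINT64_C(0xFFFF000000000000);
	return va;
}
```

Slot 255 yields 0x00007F8000000000, the last slot of the lower half; slot 256 yields 0xFFFF800000000000, the first slot of the upper half; and slot 509 yields 0xFFFFFE8000000000, well inside the upper half.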

Virtually extend the user slots in slotspace, because we don't want
(randomized) kernel mappings to land on slot 255.

The prekern is updated accordingly.

Tested on GENERIC, GENERIC_KASLR and XEN3_DOM0.
 1.51 12-Aug-2018  maxv Introduce PDIR_SLOT_USERLIM, which indicates the limit of the user slots.
Use it instead of PDIR_SLOT_PTE when we just want to iterate over the
user slots. Also use it in SVS, I had hardcoded 255 because there was no
proper define (which there now is).
 1.50 12-Aug-2018  maxv Randomize the main memory on Xen, same as native. Tested on amd64-dom0.
 1.49 12-Aug-2018  maxv More ASLR: randomize the kernel main memory. VM_MIN_KERNEL_ADDRESS becomes
variable, and its location is chosen at boot time. There is room for
improvement, since for now we ask for an alignment of NBPD_L4.

This is enabled by default in GENERIC, but not in Xen. Tested extensively
on GENERIC and GENERIC_KASLR, XEN3_DOM0 still boots fine.
 1.48 27-Jul-2018  maxv Remove KERN_BASE, unused. It has always been wrong anyway, the value
should have been passed into VA_SIGN_NEG().
 1.47 25-Jul-2018  maxv Remove NPTECL, unused.
 1.46 19-May-2018  jdolecek branches: 1.46.2;
add experimental new function uvm_direct_process(), to allow reads/writes
of contents of uvm pages without mapping them into the kernel, using
direct map or moral equivalent; pmaps supporting the interface need
to provide pmap_direct_process() and define PMAP_DIRECT

implement the new interface for amd64; I hear alpha and mips might be relatively
easy to add too, but I lack the knowledge

part of resolution for PR kern/53124
 1.45 22-Feb-2018  maxv branches: 1.45.2;
Remove svs_pgg_update(). Instead of manually changing PG_G on each page,
we can disable the global-paging mechanism in %cr4 with CR4_PGE. Do that.

In addition, install CR4_PGE when SVS is disabled manually (via the
sysctl).

Now, doing "sysctl -w machdep.svs_enabled=0" restores the performance
completely, exactly as if SVS hadn't been enabled in the first place.
 1.44 22-Feb-2018  maxv Improve the SVS initialization.

Declare x86_patch_window_open() and x86_patch_window_close(), and globalify
x86_hotpatch().

Introduce svs_enable() in x86/svs.c, that does the SVS hotpatching.

Change svs_init() to take a bool. This function gets called twice: early,
when the system has just booted (and nothing is initialized), and later,
when at least pmap_kernel has been initialized.
 1.43 18-Feb-2018  maxv Add svs_enabled, which defaults to 'true' when SVS is compiled (no dynamic
detection yet).
 1.42 21-Jan-2018  maxv Increase the size of the initial mapping of the kernel. KASLR kernels are
bigger than their GENERIC counterparts, and the limit will soon be hit on
them.
 1.41 07-Jan-2018  maxv Add a new option, SVS (for Separate Virtual Space), that unmaps kernel
pages when running in userland. For now, only the PTE area is unmapped.

Sent on tech-kern@.
 1.40 17-Jun-2017  maxv Increase the kernel heap size from 512GB to 32TB, in such a way that it
is able to map the maximum amount of ram supported twice (16TB x 2).
 1.39 11-Nov-2016  maxv branches: 1.39.8;
Remove useless values, and explain where some others come from
 1.38 22-Jul-2016  maxv Remove pmap_prealloc_lowmem_ptps on amd64. This function creates levels in
the page tree so that the first 2MB of virtual memory can be kentered in
L1.

Strictly speaking, the kernel should never kenter a virtual page below
VM_MIN_KERNEL_ADDRESS, because then it wouldn't be available in userland.
It used to need the first 2MB in order to map the CPU trampoline and the
initial VAs used by the bootstrap code. Now, the CPU trampoline VA is
allocated with uvm_km_alloc and the VAs used by the bootstrap code are
allocated with pmap_bootstrap_valloc, and in either case the resulting VA
is above VM_MIN_KERNEL_ADDRESS.

The low levels in the page tree are therefore unused. By removing this
function, we are making sure no one will be tempted to map an area below
VM_MIN_KERNEL_ADDRESS in kernel mode, and particularly, we are making sure
NULL cannot be kentered.

In short, there is no way to map NULL in kernel mode anymore.
 1.37 21-May-2016  maxv branches: 1.37.2;
Explain where this value comes from.
 1.36 14-May-2016  maxv KNF so it appears aligned on NXR, and fix a comment.
 1.35 09-Jan-2015  riastradh Bump amd64 module map size to 32 MB.

For lack of anything better to do, after no progress in discussion on
the matter:

https://mail-index.netbsd.org/port-amd64/2014/08/22/msg002108.html

Needed in order to load the (solaris module needed by) dtrace module.
 1.34 30-Jun-2012  jym branches: 1.34.2; 1.34.14; 1.34.16;
Extend the xpmap API, as described in [1]. This change is mechanical and
avoids exposing the MD phys_to_machine/machine_to_phys tables directly.
Added:

- xpmap_ptom handles PFN (pseudo physical) to MFN (machine frame number)
translations, and is under control of the domain.
- xpmap_mtop is its counterpart (MFN to PFN), and is under control of
hypervisor.

xpmap_ptom_map() map a pseudo-phys address to a machine address
xpmap_ptom_unmap() unmap a pseudo-phys address (invalidation)
xpmap_ptom_isvalid() check for pseudo-phys address validity

The parameters are physical/machine addresses, like bus_dma/bus_space(9).
As x86 MFNs are tracked by u_long (Xen's choice) while machine addresses
can be 64-bit entities (PAE), use ptoa() to avoid truncation when bit
shifting by PAGE_SHIFT.
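The truncation issue can be reproduced in a few lines of portable C. The typedefs below are stand-ins picked for the example (a 32-bit frame number, as u_long is on i386, and a 64-bit machine address); the second function widens before shifting, in the style of a ptoa()-like macro.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12

/* Stand-ins for illustration only. */
typedef uint32_t mfn32_t;	/* frame number, 32 bits wide */
typedef uint64_t maddr_t;	/* machine address, 64 bits wide */

/* Buggy: the shift is done in 32-bit arithmetic, then widened. */
static maddr_t
mfn_to_addr_truncating(mfn32_t mfn)
{
	return (maddr_t)(mfn << PAGE_SHIFT);
}

/* Correct: widen to 64 bits first, then shift. */
static maddr_t
mfn_to_addr(mfn32_t mfn)
{
	return (maddr_t)mfn << PAGE_SHIFT;
}
```

For the frame at the 4GB boundary (0x100000), the first function returns 0 while the second returns 0x100000000.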

I kept the same namespace (xpmap_) to avoid code churn.

[1] http://mail-index.netbsd.org/port-xen/2009/05/09/msg004951.html

XXX will document ptoa/atop/trunc_page separately.
 1.33 11-Jun-2012  chs allow more space for modules.
 1.32 19-Feb-2012  cherry Removing remaining references to the alternate PTE space. Modify documentation appropriately
 1.31 19-Jan-2012  bouyer branches: 1.31.2;
pmap_pte_set() is not supposed to be atomic, so only raise IPL, no need to
take pte_lock
 1.30 15-Jan-2012  cherry for xen on amd64 PDP_BASE points to the per-cpu ci->ci_kpm_pdir copy of *pmap_kernel()*'s L4 pdir, which is an alias for ci->ci_kpm_pdir. This is unlike PAE, where PDP_BASE points to the per-pmap pm_pdir consisting of 4 pages, the last of which is the "shadow". This "shadow" is not used directly in an active pmap, since it duplicates the kernel space and, for PAE, xen disallows multiple cpus pointing to the same L3[3] page. Therefore, we use a per-cpu copy of the pmap_kernel() pdir's L3[3] page, ci->ci_pae_l3_pdir[3], while L3[0-2] point to the original pmap's pm_pdir[0 - 2]. Thus the "shadow" pdir only exists on i386 PAE. Note that on PAE, the recursive PDIR_SLOT_PTE is not per-cpu, and therefore cannot be made to point to per-cpu pdirs via (L4_BASE + PDIR_SLOT_PTE), unlike xen x86_64 where this is exactly the case.
 1.29 09-Jan-2012  cherry Make cross-cpu pte access MP safe.
XXX: review cases of use of pmap_set_pte() vs direct use of xpq_queue_pte_update()
 1.28 06-Nov-2011  cherry branches: 1.28.4;
[merging from cherry-xenmp] make pmap_kernel() shadow PMD per-cpu and MP aware.
 1.27 06-Nov-2011  cherry [merging from cherry-xenmp] Make the xen MMU op queue locking api private. Implement per-cpu queues.
 1.26 27-Aug-2011  christos branches: 1.26.2;
Implement sparse dumps for amd64 (copied from i386). Disabled for now via
sysctl.
XXX: most of the code can be merged.
 1.25 13-Aug-2011  cherry Add locking around ops to the hypervisor MMU "queue".
 1.24 01-Feb-2011  chuck branches: 1.24.2;
update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.
 1.23 14-Nov-2010  uebayasi branches: 1.23.2; 1.23.4;
Move struct vm_page_md definition from vmparam.h to pmap.h, because
it's used only by pmap. vmparam.h has definitions for wider
audience.

All GENERIC kernels build tested, except ia64.

powerpc/include/booke/vmparam.h has one too, but it has no pmap.h,
so it's left as is.
 1.22 26-Oct-2008  mrg branches: 1.22.8; 1.22.14; 1.22.16;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.21 23-Jan-2008  bouyer branches: 1.21.6; 1.21.10; 1.21.16;
Merge the bouyer-xeni386 branch. This brings in PAE support to NetBSD xeni386
(domU only). PAE support is enabled by 'options PAE', see the new XEN3PAE_DOMU
and INSTALL_XEN3PAE_DOMU kernel config files.

See the comments in arch/i386/include/{pte.h,pmap.h} to see how it works.
In short, we still handle it as a 2-level MMU, with the second level page
directory being 4 pages in size. pmap switching is done by switching the
L2 pages in the L3 entries, instead of loading %cr3. This is almost required
by Xen, which handles the last L2 page (the one mapping 0xc0000000 - 0xffffffff)
in a very special way. But this approach should also work for native PAE
support if ever supported (in fact, the pmap should almost support native
PAE, what's missing is bootstrap code in locore.S).
 1.20 20-Jan-2008  bouyer Make first argument of Xen's pmap_pte_cas() volatile, fix a warning
building pmap.c.
 1.19 13-Jan-2008  yamt add pmap_pte_cas.
 1.18 03-Jan-2008  ad Bump NKL2_KIMG_ENTRIES to allow for 20MB of kernel.

Well past time for an in-kernel linker...
 1.17 28-Nov-2007  ad branches: 1.17.6;
Remove remaining CPUCLASS_386 tests.
 1.16 28-Nov-2007  ad Use the new atomic ops.
 1.15 22-Nov-2007  bouyer Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.14 18-Oct-2007  yamt branches: 1.14.2;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.13 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.12 27-Sep-2007  ad branches: 1.12.2;
Only include machine/cpufunc.h if _KERNEL.
 1.11 29-Aug-2007  ad branches: 1.11.2;
Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.10 21-Feb-2007  thorpej branches: 1.10.4; 1.10.10; 1.10.12; 1.10.16; 1.10.20; 1.10.22;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.9 16-Feb-2006  perry branches: 1.9.20;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.8 24-Dec-2005  perry branches: 1.8.2; 1.8.4; 1.8.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.7 11-Dec-2005  christos merge ktrace-lwp.
 1.6 04-Jul-2005  blymn branches: 1.6.2;
Remove bogus external declaration for pdes, it appears not to be needed.
 1.5 08-Aug-2004  yamt kvtopte: use a correct base addr for LARGEPAGES.
 1.4 08-Aug-2004  yamt correct VAs in a comment.
 1.3 15-Jun-2004  fvdl Add a prototype for pmap_changeprot_local, a function that changes
protection for a page and doesn't care about TLB shootdowns.
 1.2 04-Jun-2004  sekiya Use the SPLAY_* macros. Copied from the i386 pmap, okay'ed by fvdl@
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.2 12-Aug-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.2.6 04-Feb-2008  yamt sync with head.
 1.6.2.5 21-Jan-2008  yamt sync with head
 1.6.2.4 07-Dec-2007  yamt sync with head
 1.6.2.3 27-Oct-2007  yamt sync with head.
 1.6.2.2 03-Sep-2007  yamt sync with head.
 1.6.2.1 26-Feb-2007  yamt sync with head.
 1.8.6.1 22-Apr-2006  simonb Sync with head.
 1.8.4.1 09-Sep-2006  rpaulo sync with head
 1.8.2.1 18-Feb-2006  yamt sync with head.
 1.9.20.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.10.22.3 23-Mar-2008  matt sync with HEAD
 1.10.22.2 09-Jan-2008  matt sync with HEAD
 1.10.22.1 06-Nov-2007  matt sync with HEAD
 1.10.20.6 03-Dec-2007  joerg Sync with HEAD.
 1.10.20.5 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.10.20.4 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.10.20.3 02-Oct-2007  joerg Sync with HEAD.
 1.10.20.2 10-Sep-2007  joerg Introduce pmap_init_tmp_pgtbl to build a temporary copy of the kernel
side page mapping and an identity mapping low page for use in real mode.
Switch MP bootstrap and i386 ACPI wakeup code to use it.
 1.10.20.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.10.16.1 03-Sep-2007  skrll Sync with HEAD.
 1.10.12.1 03-Oct-2007  garbled Sync with HEAD
 1.10.10.1 18-Apr-2007  thorpej Convert i386 and amd64 to the new atomic ops API.
 1.10.4.9 03-Dec-2007  ad Sync with HEAD.
 1.10.4.8 23-Oct-2007  ad Sync with head.
 1.10.4.7 12-Oct-2007  ad Remove unnecessary changes vs. head.
 1.10.4.6 09-Oct-2007  ad Sync with head.
 1.10.4.5 09-Oct-2007  ad Sync with head.
 1.10.4.4 09-Oct-2007  ad Sync with head.
 1.10.4.3 01-Sep-2007  ad Use pool_cache for allocating a few more types of objects.
 1.10.4.2 23-Aug-2007  ad Add pmap_pte_set, pmap_pte_setbits, pmap_pte_clearbits where missing.
 1.10.4.1 21-Aug-2007  ad amd64 changes, as yet untested:

- Adapt to vmlocking branch.
- Apply TLB shootdown and pv allocation changes to the pmap.
- Make it build.
 1.11.2.17 08-Oct-2007  yamt revive pmap_changeprot_local which has been removed mistakenly.
 1.11.2.16 08-Oct-2007  yamt fix the previous.
 1.11.2.15 08-Oct-2007  yamt merge some parts of x86 pmap.h.
 1.11.2.14 07-Oct-2007  yamt tweak assertions to reduce diffs between i386 and amd64.
 1.11.2.13 07-Oct-2007  yamt sync comments and whitespaces.
 1.11.2.12 07-Oct-2007  yamt rename PTDpaddr to PDPpaddr to match with i386.
(if you think it's a good idea to make gratuitous renames like this,
please do it for both of i386 and amd64 consistently.)
 1.11.2.11 07-Oct-2007  yamt sync with i386. remove some unused externs.
 1.11.2.10 07-Oct-2007  yamt remove some #ifdef _LOCORE and use genassym instead.
 1.11.2.9 07-Oct-2007  yamt sync with i386. no functional changes.
- kill __P
- ansify
 1.11.2.8 07-Oct-2007  yamt remove an unused definition.
 1.11.2.7 07-Oct-2007  yamt remove unused definitions.
 1.11.2.6 06-Oct-2007  yamt sync with head.
 1.11.2.5 04-Oct-2007  yamt remove LARGEPAGES option. always use large pages if available.
 1.11.2.4 30-Sep-2007  yamt implement deferred pmap switching for amd64, and make amd64 use
x86 shared pmap code. it makes several i386 pmap improvements available
to amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
 1.11.2.3 30-Sep-2007  yamt remove unused pmap_remove_record.
 1.11.2.2 30-Sep-2007  yamt - whitespace
- ptob -> x86_ptob
 1.11.2.1 29-Sep-2007  yamt u_int32_t -> uint32_t to reduce diffs from i386
 1.12.2.3 25-Oct-2007  bouyer Finish sync with HEAD. Especially use the new x86 pmap for xenamd64.
For this:
- rename pmap_pte_set() to pmap_pte_testset()
- make pmap_pte_set() a function or macro for non-atomic PTE write
- define and use pmap_pa2pte()/pmap_pte2pa() to read/write PTE entries
- define pmap_pte_flush() which is a nop in x86 case, and flush the
MMUops queue in the Xen case
 1.12.2.2 21-Oct-2007  bouyer Protect xpq_* usage with splvm()
Make sure xen_current_user_pgd really reflects what's in the hypervisor.
 1.12.2.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue!)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow to merge xenamd64 back to xen or amd64.
 1.14.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.14.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.17.6.5 20-Jan-2008  bouyer Sync with HEAD: make first argument of pmap_pte_cas() volatile.
 1.17.6.4 19-Jan-2008  bouyer Make things build again after sync with HEAD
 1.17.6.3 19-Jan-2008  bouyer Sync with HEAD
 1.17.6.2 13-Jan-2008  bouyer Work in progress on xeni386 PAE support:
Make xeni386 build with a 64bit paddr_t. For this vaddr_t vs paddr_t vs
pointers usages had to be clarified.
If 'options PAE' is present in a Xen3 kernel, switch paddr_t, pd_entry_t
and pt_entry_t to 64bits, and add the PAE entry in the __xen_guest ELF section.
 1.17.6.1 08-Jan-2008  bouyer Sync with HEAD
 1.21.16.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.21.10.1 04-May-2009  yamt sync with head.
 1.21.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.22.16.1 05-Mar-2011  rmind sync with head
 1.22.14.1 15-Nov-2010  uebayasi Sync with HEAD.
 1.22.8.3 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.22.8.2 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.22.8.1 10-Jan-2011  jym Sync with HEAD
 1.23.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.23.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.24.2.3 20-Sep-2011  cherry Remove the "xpq lock", since we have per-cpu mmu queues now. This may need further testing. Also add some preliminary locking around queue-ops in the network backend driver
 1.24.2.2 20-Aug-2011  cherry PAE MP support (preliminary), amd64 per-cpu L4 model redesigned, i386 pmap_pa_start/end fixup
 1.24.2.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.26.2.3 30-Oct-2012  yamt sync with head
 1.26.2.2 17-Apr-2012  yamt sync with head
 1.26.2.1 10-Nov-2011  yamt sync with head
 1.28.4.2 24-Feb-2012  mrg sync to -current.
 1.28.4.1 18-Feb-2012  mrg merge to -current.
 1.31.2.1 22-Nov-2012  riz Pull up following revision(s) (requested by chs in ticket #690):
external/cddl/osnet/dev/dtrace/amd64/dtrace_isa.c: revision 1.4
sys/arch/amd64/include/Makefile.inc: revision 1.4
sys/arch/amd64/include/pmap.h: revision 1.33
external/cddl/osnet/dev/dtrace/amd64/dtrace_subr.c: revision 1.6
sys/arch/amd64/include/asm.h: revision 1.15
sys/arch/amd64/amd64/genassym.cf: revision 1.51
external/cddl/osnet/dev/dtrace/amd64/dtrace_asm.S: revision 1.4
make dtrace work on amd64.
allow more space for modules.
 1.34.16.5 28-Aug-2017  skrll Sync with HEAD
 1.34.16.4 05-Dec-2016  skrll Sync with HEAD
 1.34.16.3 05-Oct-2016  skrll Sync with HEAD
 1.34.16.2 29-May-2016  skrll Sync with HEAD
 1.34.16.1 06-Apr-2015  skrll Sync with HEAD
 1.34.14.1 18-Mar-2015  snj Pull up following revision(s) (requested by riastradh in ticket #612):
sys/arch/amd64/include/pmap.h: revision 1.35
Bump amd64 module map size to 32 MB.
For lack of anything better to do, after no progress in discussion on
the matter:
https://mail-index.netbsd.org/port-amd64/2014/08/22/msg002108.html
Needed in order to load the (solaris module needed by) dtrace module.
 1.34.2.1 03-Dec-2017  jdolecek update from HEAD
 1.37.2.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.37.2.1 26-Jul-2016  pgoyette Sync with HEAD
 1.39.8.1 22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.45.2.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.45.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.45.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.45.2.1 21-May-2018  pgoyette Sync with HEAD
 1.46.2.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.46.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.46.2.1 10-Jun-2019  christos Sync with HEAD
 1.4 21-Aug-2022  riastradh x86: Move VA_SIGN_POS/NEG to machine/pte.h.

It's used by pl[1-4]_pi, also defined in machine/pte.h, and used in
libkvm without pmap_private.h.
 1.3 20-Aug-2022  riastradh {amd64,i386}/pmap_private.h: Fix minor whitespace issues.
 1.2 20-Aug-2022  riastradh x86: Move definition of struct pmap to pmap_private.h.

This makes pmap_resident_count and pmap_wired_count out-of-line
functions instead of inline. No functional change intended
otherwise.
 1.1 20-Aug-2022  riastradh x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9)
API, which loads of things in the kernel rely on, so changing x86
pmap internals no longer requires recompiling the entire kernel every
time.

Callers needing these internals must now use machine/pmap_private.h.
Note: This is not x86/pmap_private.h because it contains three parts:

1. CPU-specific (different for i386/amd64) definitions used by...

2. common definitions, including Xenisms like xpmap_ptetomach,
further used by...

3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h
for 2, and then defines 3. Maybe we should split that out into a new
pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must
include machine/pmap_private.h when previously uvm/uvm_pmap.h
polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h --
specifically the parts that are needed for several constants defined
in vmparam.h:

VM_MAXUSER_ADDRESS
VM_MAX_ADDRESS
VM_MAX_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64
too, just to keep things parallel.
 1.5 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.4 10-Mar-2017  maxv branches: 1.4.12; 1.4.14;
Move pmc.c into x86/, it can be shared with amd64.
 1.3 18-Feb-2017  maxv PERFCTRS -> PMC (not implemented anyway)
 1.2 20-Mar-2014  christos branches: 1.2.6; 1.2.10; 1.2.14;
make pmc compile with amd64
 1.1 26-Apr-2003  fvdl branches: 1.1.142; 1.1.152; 1.1.158;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.158.1 18-May-2014  rmind sync with head
 1.1.152.2 03-Dec-2017  jdolecek update from HEAD
 1.1.152.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.142.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.2.14.1 21-Apr-2017  bouyer Sync with HEAD
 1.2.10.1 20-Mar-2017  pgoyette Sync with HEAD
 1.2.6.1 28-Aug-2017  skrll Sync with HEAD
 1.4.14.1 10-Jun-2019  christos Sync with HEAD
 1.4.12.1 28-Jul-2018  pgoyette Sync with HEAD
 1.25 13-Jun-2020  ad Print a rate limited warning if the TSC timecounter goes backwards from the
viewpoint of any single LWP.
 1.24 13-Jan-2020  ad Remove now unused mdlwp fields md_gc_pmap and md_gc_ptp.
 1.23 12-Oct-2019  maxv branches: 1.23.2;
Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.22 25-Feb-2017  kamil branches: 1.22.14;
Garbage collect unneeded inclusion of <x86/dbregs.h> in <machine/proc.h>

This is left over after introduction of Debug Register accessors.
This interface replaced older watchpoint API.

Sponsored by <The NetBSD Foundation>
 1.21 23-Feb-2017  kamil Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so all traces of it are removed without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible for setting global watchpoints/breakpoints with the
debug registers; the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring
function is designed to be used for this
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- a debugger is responsible for retrieving the debug register state to
distinguish the exact debug register trap (DR6 is the Status Register on x86)
- the kernel must not remove debug register traps after triggering a trap event;
a debugger is responsible for detaching this trap with an appropriate
PT_SETDBREGS call (DR7 is the Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel
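The DR6/DR7 roles named in the list above can be illustrated with a small tracer-side sketch. The helper and macro names below are hypothetical (this commit deliberately ships no such defines); the bit positions follow the x86 architecture manuals: per-register local-enable bits in DR7 bits 0-7, 2-bit R/W and LEN fields starting at bit 16, and hit flags in DR6 bits 0-3.

```c
#include <assert.h>
#include <stdint.h>

/* Tracer-local helpers; names are hypothetical, layout is architectural. */
#define DR7_LOCAL_ENABLE(i) (1ULL << (2 * (i)))  /* L0..L3 */
#define DR7_RW_SHIFT(i)     (16 + 4 * (i))       /* R/W0..R/W3 */
#define DR7_LEN_SHIFT(i)    (18 + 4 * (i))       /* LEN0..LEN3 */

enum dr7_rw  { DR7_RW_EXEC = 0, DR7_RW_WRITE = 1, DR7_RW_IO = 2, DR7_RW_RDWR = 3 };
enum dr7_len { DR7_LEN_1 = 0, DR7_LEN_2 = 1, DR7_LEN_8 = 2, DR7_LEN_4 = 3 };

/* Return a DR7 value with debug register i (0..3) armed locally. */
static uint64_t
dr7_arm(uint64_t dr7, unsigned i, enum dr7_rw rw, enum dr7_len len)
{
    assert(i < 4);
    dr7 &= ~(0xFULL << DR7_RW_SHIFT(i));      /* clear old R/W and LEN */
    dr7 |= (uint64_t)rw << DR7_RW_SHIFT(i);
    dr7 |= (uint64_t)len << DR7_LEN_SHIFT(i);
    dr7 |= DR7_LOCAL_ENABLE(i);
    return dr7;
}

/* After a TRAP_DBREG, test whether debug register i reported the hit. */
static int
dr6_hit(uint64_t dr6, unsigned i)
{
    return (int)((dr6 >> i) & 1);
}
```

A tracer would presumably read the registers with PT_GETDBREGS, pass the DR7 slot through a helper like dr7_arm(), and write the result back with PT_SETDBREGS; that flow is an assumption and is not shown here.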

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently another interface,
PT_GET_SIGINFO/PT_SET_SIGINFO, is required in netbsd32 compat code in order to
reliably validate PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine-specific
register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.20 15-Dec-2016  kamil branches: 1.20.2;
Add support for hardware assisted watchpoints/breakpoints API in ptrace(2)

Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints

The ptrace_watchpoint structure contains MI and MD parts:

typedef struct ptrace_watchpoint {
    int         pw_index;   /* HW Watchpoint ID (count from 0) */
    lwpid_t     pw_lwpid;   /* LWP described */
    struct mdpw pw_md;      /* MD fields */
} ptrace_watchpoint_t;

For example amd64 defines MD as follows:
struct mdpw {
    void *md_address;
    int   md_condition;
    int   md_length;
};

These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.

Tested on amd64, initial support added for i386 and XEN.

Sponsored by <The NetBSD Foundation>
 1.19 20-Feb-2014  dsl branches: 1.19.6; 1.19.10;
Move the amd64 and i386 pcb to the bottom of the uarea, and move the
kernel stack to the top.
Change the pcb layouts so that fpu save area is at the end and is
64byte aligned ready for xsave (saving the ymm registers).
Welcome to 6.99.32
 1.18 15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.17 01-Dec-2013  christos revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.16 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.15 15-Jul-2012  dsl branches: 1.15.2; 1.15.4;
Rename MDP_IRET to MDL_IRET since it is an lwp flag, not a proc one.
Add an MDL_COMPAT32 flag to the lwp's md_flags, set it for 32bit lwps
and use it to force 'return to user' with iret (as is done when
MDL_IRET is set).
Split the iret/sysret code paths much later.
Remove all the replicated code for 32bit system calls - which was only
needed so that iret was always used.
frameasm.h for XEN contains '#define swapgs'; since XEN probably never
needs swapgs, this is likely to be confusing.
Add a SWAPGS which is a nop on XEN and swapgs otherwise.
(I've not yet checked all the swapgs in files that include frameasm.h)
Simple x86 programs still work.
Hijack 6.99.9 kernel bump (needed for compat32 modules)
 1.14 08-Jul-2012  dsl The MDP_USEDFPU (amd64 and sh3) and MDP_SSTEP (sh3) are lwp flags not
process ones, rename to MDL_xxx.
 1.13 14-Jan-2011  rmind branches: 1.13.8;
Retire struct user, remove sys/user.h inclusions. Note sys/user.h header
as obsolete. Remove USER_TO_UAREA/UAREA_TO_USER macros.

Various #include fixes and review by matt@.
 1.12 14-Mar-2009  dsl branches: 1.12.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.11 26-Oct-2008  mrg branches: 1.11.2; 1.11.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.10 05-Jun-2008  ad branches: 1.10.4;
pmap_remove_all() for x86. Also, always defer freeing ptps to pmap_update().
There may be a better way to do this, but for now this is simple and avoids
potential bugs.

Proposed on tech-kern and discussed with chs@.
 1.9 08-Jan-2008  yamt branches: 1.9.6; 1.9.8; 1.9.10; 1.9.12;
change the layout in u-area and reduce UPAGES.
 1.8 05-Jan-2008  yamt - make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.7 16-Nov-2007  skrll branches: 1.7.6;
s/proc/lwp/ in comment
 1.6 09-Feb-2007  ad branches: 1.6.6; 1.6.22; 1.6.24; 1.6.28; 1.6.30;
Merge newlock2 to head.
 1.5 24-Dec-2005  perry branches: 1.5.20;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.4 11-Dec-2005  christos merge ktrace-lwp.
 1.3 20-Aug-2003  fvdl branches: 1.3.16;
Pass pointers to frames from assembly, do not use the 'frame on stack
as argument passed by value' trick, as gcc 3.3.x makes (valid) assumptions
about the stack that will not be true. Costs 2 instructions per trap/syscall
on i386, 4 per interrupt for MP. One instruction per trap/syscall on amd64,
2 per interrupt for MP. I expect gcc 3.3.1 to make up for this by better
optimization (it'd better..)

While here, make amd64 compile again by using subr_mbr_disk.c
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.16.4 21-Jan-2008  yamt sync with head
 1.3.16.3 07-Dec-2007  yamt sync with head
 1.3.16.2 26-Feb-2007  yamt sync with head.
 1.3.16.1 21-Jun-2006  yamt sync with head.
 1.5.20.1 20-Oct-2006  ad Make ASTs per-LWP.
 1.6.30.2 18-Feb-2008  mjf Sync with HEAD.
 1.6.30.1 19-Nov-2007  mjf Sync with HEAD.
 1.6.28.1 18-Nov-2007  bouyer Sync with HEAD
 1.6.24.1 09-Jan-2008  matt sync with HEAD
 1.6.22.1 21-Nov-2007  joerg Sync with HEAD.
 1.6.6.1 03-Dec-2007  ad Sync with HEAD.
 1.7.6.1 08-Jan-2008  bouyer Sync with HEAD
 1.9.12.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.9.10.1 04-May-2009  yamt sync with head.
 1.9.8.1 17-Jun-2008  yamt sync with head.
 1.9.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.9.6.1 29-Jun-2008  mjf Sync with HEAD.
 1.10.4.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.11.8.3 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.11.8.2 01-Nov-2009  jym Sync with HEAD.
 1.11.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.11.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.12.4.1 05-Mar-2011  rmind sync with head
 1.13.8.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.13.8.1 30-Oct-2012  yamt sync with head
 1.15.4.1 18-May-2014  rmind sync with head
 1.15.2.2 03-Dec-2017  jdolecek update from HEAD
 1.15.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.10.2 20-Mar-2017  pgoyette Sync with HEAD
 1.19.10.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.19.6.2 28-Aug-2017  skrll Sync with HEAD
 1.19.6.1 05-Feb-2017  skrll Sync with HEAD
 1.20.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.22.14.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.22.14.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.23.2.1 17-Jan-2020  ad Sync with head.
 1.21 02-Nov-2021  ryo In order to prevent _mcount() from being recursively called when built with COPTS=-O0,
sprinkle `__always_inline' to make _mcount() be generated as a single function.
 1.20 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.19 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.18 11-Apr-2016  bouyer branches: 1.18.18;
x86_lfence() calls mcount(), so inline lfence instructions in mcount
helper functions
 1.17 10-Jan-2016  ryo __mcount_lock is moved from MD to MI,
because it is needed on all MULTIPROCESSOR archs, but it exists only on i386 and amd64.

ok christos@, on tech-kern@
 1.16 12-Sep-2013  joerg branches: 1.16.6;
Pass PICFLAGS down to cc-as-as and use __PIC__ to decide if it is small
vs big PIC mode. Retire -DPIC and -DBIGPIC.
 1.15 26-Oct-2008  mrg branches: 1.15.28; 1.15.38; 1.15.44;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.14 25-May-2008  chs branches: 1.14.4;
fix profiling compilation.
 1.13 21-Apr-2008  cegger branches: 1.13.2; 1.13.4;
Access Xen's vcpu info structure per-CPU.
Tested on i386 and amd64 (both dom0 and domU) by me.
Xen2 tested (both dom0 and domU) by bouyer.
OK bouyer
 1.12 20-Dec-2007  ad branches: 1.12.6; 1.12.8;
- Make __cpu_simple_lock and similar real functions and patch at runtime.
- Remove old x86 atomic ops.
- Drop text alignment back to 16 on i386 (really, this time).
- Minor cleanup.
 1.11 24-Nov-2007  bouyer branches: 1.11.2; 1.11.6;
Make Xen profiling kernels work.
XXX assembly still needs to be fixed for non-Xen kernels.
 1.10 17-Oct-2007  garbled branches: 1.10.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.9 27-Sep-2007  ad Sync with i386.
 1.8 26-Sep-2007  xtraeme Fix profiling kernels:

read_psl -> x86_read_psl
write_psl -> x86_write_psl
disable_intr -> x86_disable_intr
 1.7 09-Feb-2007  ad branches: 1.7.6; 1.7.12; 1.7.14; 1.7.22; 1.7.24; 1.7.26;
Merge newlock2 to head.
 1.6 03-Feb-2006  skrll branches: 1.6.16;
Make sure we generate the right call with -fPIC.
 1.5 11-Dec-2005  christos branches: 1.5.2; 1.5.4;
merge ktrace-lwp.
 1.4 22-Sep-2005  chs pull in changes from i386 profile.h:
- allow profiling of MP kernels, add a spinlock around the body of mcount().
fixes PR 31360.
- save and restore eflags instead of just doing cli/sti.
 1.3 28-Nov-2003  fvdl branches: 1.3.14; 1.3.16;
Define the mcount function in assembler, and have it save all registers
used for argument passing, plus %rax (used to pass the number of float
arguments to varargs functions), to avoid having it clobber caller-saved
registers. mcount is emitted "under the radar", so the compiler doesn't
know it should do this.

Change the kernel mcount entry/exit macros to use plain cli/sti, like on i386.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.16.5 21-Jan-2008  yamt sync with head
 1.3.16.4 07-Dec-2007  yamt sync with head
 1.3.16.3 27-Oct-2007  yamt sync with head.
 1.3.16.2 26-Feb-2007  yamt sync with head.
 1.3.16.1 21-Jun-2006  yamt sync with head.
 1.3.14.1 30-Sep-2005  tron Pull up following revision(s) (requested by chs in ticket #831):
sys/arch/amd64/include/profile.h: revision 1.4
pull in changes from i386 profile.h:
- allow profiling of MP kernels, add a spinlock around the body of mcount().
fixes PR 31360.
- save and restore eflags instead of just doing cli/sti.
 1.5.4.1 09-Sep-2006  rpaulo sync with head
 1.5.2.1 18-Feb-2006  yamt sync with head.
 1.6.16.1 06-Feb-2007  ad mcount(): fix entry so LOCKDEBUG+GPROF can be used together. Previously
it would recurse until eventually the machine triple faulted.
 1.7.26.1 06-Oct-2007  yamt sync with head.
 1.7.24.2 09-Jan-2008  matt sync with HEAD
 1.7.24.1 06-Nov-2007  matt sync with HEAD
 1.7.22.2 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.7.22.1 02-Oct-2007  joerg Sync with HEAD.
 1.7.14.1 03-Oct-2007  garbled Sync with HEAD
 1.7.12.1 18-Apr-2007  thorpej Convert i386 and amd64 to the new atomic ops API.
 1.7.6.3 03-Dec-2007  ad Sync with HEAD.
 1.7.6.2 03-Dec-2007  ad Sync with HEAD.
 1.7.6.1 09-Oct-2007  ad Sync with head.
 1.10.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.10.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.11.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.11.2.1 26-Dec-2007  ad Sync with head.
 1.12.8.2 04-Jun-2008  yamt sync with head
 1.12.8.1 18-May-2008  yamt sync with head.
 1.12.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.12.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.13.4.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.13.2.1 04-May-2009  yamt sync with head.
 1.14.4.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.15.44.1 18-May-2014  rmind sync with head
 1.15.38.2 03-Dec-2017  jdolecek update from HEAD
 1.15.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.15.28.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.16.6.2 22-Apr-2016  skrll Sync with HEAD
 1.16.6.1 19-Mar-2016  skrll Sync with HEAD
 1.18.18.1 10-Jun-2019  christos Sync with HEAD
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.17 21-Aug-2022  riastradh x86: Move VA_SIGN_POS/NEG to machine/pte.h.

It's used by pl[1-4]_pi, also defined in machine/pte.h, and used in
libkvm without pmap_private.h.
 1.16 20-Aug-2022  riastradh x86: Forbid using x86/pte.h directly; use machine/pte.h.

machine/pte.h already used outside sys/arch, so let's make it the
primary thing and make sure to use x86/pte.h only as a subroutine.
 1.15 20-Aug-2022  riastradh amd64/pte.h, i386/pte.h: Need sys/stdint.h for uintN_t.
 1.14 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.13 25-Apr-2020  maxv Switch to the new PTE naming. The old naming is now unused, remove it.
 1.12 09-Mar-2019  maxv branches: 1.12.10;
Start replacing the x86 PTE bits.
 1.11 07-Mar-2019  maxv Introduce a new set of PTE bits, with a different naming convention.

PG_V -> PTE_P /* Present */
PG_RW -> PTE_W /* Write */
PG_u -> PTE_U /* User */
PG_WT -> PTE_PWT /* Write-Through */
PG_N -> PTE_PCD /* Cache-Disable */
PG_U -> PTE_A /* Accessed */
PG_M -> PTE_D /* Dirty */
PG_PAT -> PTE_PAT /* PAT on 4KB Pages */
PG_PS -> PTE_PS /* Large Page Size */
PG_G -> PTE_G /* Global Translation */
PG_AVAIL1 -> PTE_AVL1 /* Ignored by Hardware */
PG_AVAIL2 -> PTE_AVL2 /* Ignored by Hardware */
PG_AVAIL3 -> PTE_AVL3 /* Ignored by Hardware */
PG_LGPAT -> PTE_LGPAT /* PAT on Large Pages */
PG_NX -> PTE_NX /* No Execute */

Until now we were using "PG_BIT". The "BIT" part of the naming did not
follow the x86 naming convention in the spec, and was very confusing. We
don't want the "PG_" part of it either, because UVM has similar flags
(ie PG_BUSY).
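As a rough illustration of how the new names read in use, here is a minimal sketch. The macro names come from the list above; the values are the architectural x86 bit positions rather than a copy of the header, and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural x86 PTE bits under the new naming (sketch, not the header). */
#define PTE_P   0x0000000000000001ULL /* Present */
#define PTE_W   0x0000000000000002ULL /* Write */
#define PTE_U   0x0000000000000004ULL /* User */
#define PTE_PWT 0x0000000000000008ULL /* Write-Through */
#define PTE_PCD 0x0000000000000010ULL /* Cache-Disable */
#define PTE_A   0x0000000000000020ULL /* Accessed */
#define PTE_D   0x0000000000000040ULL /* Dirty */
#define PTE_PAT 0x0000000000000080ULL /* PAT on 4KB Pages */
#define PTE_PS  0x0000000000000080ULL /* Large Page Size */
#define PTE_G   0x0000000000000100ULL /* Global Translation */
#define PTE_NX  0x8000000000000000ULL /* No Execute */

/* PAT (4KB entries) and PS (directory entries) share bit 7, which is
 * why large pages need a separate PAT bit (PTE_LGPAT in the list). */
static_assert(PTE_PAT == PTE_PS, "bit 7 is PAT or PS depending on level");

/* Hypothetical helper: is the page mapped writable for user mode? */
static inline int
pte_user_writable(uint64_t pte)
{
    const uint64_t need = PTE_P | PTE_W | PTE_U;
    return (pte & need) == need;
}

/* Hypothetical helper: is the page present and executable? */
static inline int
pte_executable(uint64_t pte)
{
    return (pte & PTE_P) != 0 && (pte & PTE_NX) == 0;
}
```

With the old names, PG_u (User) and PG_U (Accessed) differed only by letter case; PTE_U and PTE_A are harder to mix up in expressions like the ones above.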
 1.10 07-Mar-2019  maxv Drop PG_RO, PG_KR and PG_PROT, they are useless and create confusion.
 1.9 13-May-2016  maxv branches: 1.9.18;
KNF, so it appears aligned on NXR.
 1.8 24-Jul-2010  njoly branches: 1.8.18; 1.8.36;
Pull i386 pte.h on amd64 for 32bit compat.
 1.7 06-Jul-2010  cegger Turn PMAP_NOCACHE into MI flag.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.

hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.

x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.

Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html

No comments on this last version.
 1.6 26-Feb-2010  jym branches: 1.6.2;
Fixes regarding paddr_t/pd_entry_t types in MD x86 code, exposed by PAE:

- NBPD_* macros are set to the types that better match their architecture
(UL for i386 and amd64, ULL for i386 PAE) - will revisit when paddr_t is
set to 64 bits for i386 non-PAE.

- type fixes in printf/printk messages (Use PRIxPADDR when printing paddr_t
values, instead of %lx - paddr_t/pd_entry_t being 64 bits with PAE)

- remove casts that are no more needed now that Xen2 support has been dropped

Some fixes are from jmorse@ patches for PAE.

Compile + tested for i386 GENERIC and XEN3 kernels. Only compile tested for
amd64.

Reviewed by bouyer@.

See also http://mail-index.netbsd.org/tech-kern/2010/02/22/msg007373.html
 1.5 28-Jan-2010  mbalmer branches: 1.5.2;
Fix language.
 1.4 16-Apr-2008  cegger branches: 1.4.4; 1.4.18;
use POSIX integer types
 1.3 11-Dec-2005  christos branches: 1.3.74;
merge ktrace-lwp.
 1.2 19-Feb-2004  drochner use no-execute page permissions if supported
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.74.1 02-Jun-2008  mjf Sync with HEAD.
 1.4.18.1 24-Oct-2010  jym Sync with HEAD
 1.4.4.2 11-Aug-2010  yamt sync with head.
 1.4.4.1 11-Mar-2010  yamt sync with head
 1.5.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.5.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.6.2.1 05-Mar-2011  rmind sync with head
 1.8.36.1 29-May-2016  skrll Sync with HEAD
 1.8.18.1 03-Dec-2017  jdolecek update from HEAD
 1.9.18.1 10-Jun-2019  christos Sync with HEAD
 1.12.10.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.23 20-Nov-2023  simonb Note some large xstate stack objects what Somebody(tm) should look at
when they find some round tuits.
 1.22 30-May-2020  maxv Introduce PTRACE_REGS_ALIGN, and on x86, enforce a 16-byte alignment, due
to fpregs having fxsave which requires 16-byte alignment.

Reported-by: syzbot+f44d47e617ebf7fda081@syzkaller.appspotmail.com
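The alignment rule described above amounts to a pointer check before the buffer is handed to fxsave-style code. A minimal sketch, with a hypothetical constant and helper name standing in for whatever the kernel actually uses:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16-byte requirement described above. */
#define DEMO_REGS_ALIGN 16

static_assert((DEMO_REGS_ALIGN & (DEMO_REGS_ALIGN - 1)) == 0,
    "alignment must be a power of two for the mask test below");

/* Return nonzero if p is suitably aligned for an fxsave-style area. */
static int
demo_regs_aligned(const void *p)
{
    return ((uintptr_t)p & (DEMO_REGS_ALIGN - 1)) == 0;
}
```

A caller-supplied register buffer that fails such a check would be rejected instead of being passed on to code that assumes alignment.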
 1.21 08-Jan-2020  mgorny Include XSTATE note in x86 core dumps

Introduce a simple COREDUMP_MACHDEP_LWP_NOTES logic to provide machdep
API for injecting per-LWP notes into coredumps, and use it to append
PT_GETXSTATE note.

Since the XSTATE block uses the same format on i386 and amd64, the code
does not have to conditionalize between 32-bit and 64-bit ELF format
on that. However, it does need to distinguish between 32-bit and 64-bit
PT_* values. In order to do that, it reuses PT32_* constant already
present for ptrace(), and adds a matching PT64_GETXSTATE to satisfy
the cpp logic.
 1.20 02-Dec-2019  kamil branches: 1.20.2;
Define PT_GETXMMREGS and PT_SETXMMREGS in PT_MACHDEP_STRINGS/amd64
 1.19 27-Nov-2019  rin Add support for PT_[GS]ETXMMREGS requests for COMPAT_NETBSD32 on amd64.

For this purpose, PT_[GS]ETXMMREGS are added to amd64/ptrace.h. These
are intended for internal usage for COMPAT_NETBSD32, and therefore not
exposed to userland.

Thanks to kamil, mgorny, and pgoyette for their kind review!

XXX
pullup to netbsd-9
 1.18 27-Nov-2019  rin Rename process_machdep_validxstate() to process_machdep_validfpu(), as
this function will be used to check validity of XMM registers also.
 1.17 27-Nov-2019  rin Fix copy-paste in comment. No binary changes.
 1.16 26-Jun-2019  mgorny Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
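The forward-compatibility argument can be sketched in user space. The two struct layouts and the copy helper below are hypothetical stand-ins, not the real 'struct xstate'; only the xs_rfbm field name and the use of 'struct iovec' come from the text above. The point is that the copy is bounded by the caller-declared length, so an older, smaller caller buffer is never overrun.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec */

/* Hypothetical "old" layout that an existing binary was built against. */
struct demo_xstate_v1 {
    unsigned long long xs_rfbm;
    unsigned char xs_regs[32];
};

/* Hypothetical "new" layout: same prefix, extended at the end. */
struct demo_xstate_v2 {
    unsigned long long xs_rfbm;
    unsigned char xs_regs[32];
    unsigned char xs_more[64];
};

static_assert(offsetof(struct demo_xstate_v2, xs_regs) ==
    offsetof(struct demo_xstate_v1, xs_regs), "v2 must extend v1");

/* Copy out at most iov_len bytes; returns the number of bytes copied. */
static size_t
demo_copyout_xstate(const struct demo_xstate_v2 *ks, const struct iovec *iov)
{
    size_t n = iov->iov_len < sizeof(*ks) ? iov->iov_len : sizeof(*ks);

    memcpy(iov->iov_base, ks, n);
    return n;
}
```

An old caller that passes sizeof(struct demo_xstate_v1) as iov_len gets a partial read of the prefix it understands, while a new caller gets the whole structure.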
 1.15 18-Jun-2019  kamil Introduce PTRACE_REG_FP() a helper macro to retrieve the frame pointer

The macro is dummy for ia64 (the FP register is unknown and can change
freely) and sparc/sparc64 (not stored in struct reg).
 1.14 04-Jun-2019  mgorny compat32: Translate userland PT_* request values into kernel codes

Currently, the compat32 passes PT_* request values to kernel functions
without translation. This works fine for low PT_* requests that happen
to have the same values both on i386 and amd64. However, for requests
higher than PT_SETFPREGS, the value passed from userland (matching i386
const) does not match the correct kernel (amd64) request. As a result,
e.g. when compat32 process calls PT_GETDBREGS, kernel actually processes
it as PT_SETSTEP.

To resolve this, introduce support for compat32 PT_* request
translation. The interface is based on PTRACE_TRANSLATE_REQUEST32 macro
that is defined to a mapping function on architectures needing it.
In case of amd64, this function maps userland i386 PT_* values into
appropriate amd64 PT_* values.

For the time being, the two additional PT_GETXMMREGS and PT_SETXMMREGS
requests are unsupported due to lack of matching free amd64 constant.
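The translation idea can be sketched as a plain mapping function. The enum names and numbering below are illustrative assumptions, not the real header values; they are chosen only so that the i386 PT_GETDBREGS slot collides with the amd64 PT_SETSTEP slot, as in the example above, and so that the two XMM requests have no amd64 counterpart.

```c
#include <assert.h>

/* Illustrative i386 machine-dependent request order (assumed). */
enum demo_i386_req {
    I386_PT_STEP, I386_PT_GETREGS, I386_PT_SETREGS,
    I386_PT_GETFPREGS, I386_PT_SETFPREGS,
    I386_PT_GETXMMREGS, I386_PT_SETXMMREGS,
    I386_PT_GETDBREGS, I386_PT_SETDBREGS,
    I386_PT_SETSTEP, I386_PT_CLEARSTEP
};

/* Illustrative amd64 machine-dependent request order (assumed). */
enum demo_amd64_req {
    AMD64_PT_STEP, AMD64_PT_GETREGS, AMD64_PT_SETREGS,
    AMD64_PT_GETFPREGS, AMD64_PT_SETFPREGS,
    AMD64_PT_GETDBREGS, AMD64_PT_SETDBREGS,
    AMD64_PT_SETSTEP, AMD64_PT_CLEARSTEP
};

/* Without translation, a 32-bit PT_GETDBREGS would land on PT_SETSTEP. */
static_assert((int)I386_PT_GETDBREGS == (int)AMD64_PT_SETSTEP,
    "demonstrates the collision described in the log message");

/* Map an i386 request to its amd64 equivalent; -1 means unsupported. */
static int
demo_translate_request32(int req32)
{
    switch (req32) {
    case I386_PT_STEP:       return AMD64_PT_STEP;
    case I386_PT_GETREGS:    return AMD64_PT_GETREGS;
    case I386_PT_SETREGS:    return AMD64_PT_SETREGS;
    case I386_PT_GETFPREGS:  return AMD64_PT_GETFPREGS;
    case I386_PT_SETFPREGS:  return AMD64_PT_SETFPREGS;
    case I386_PT_GETDBREGS:  return AMD64_PT_GETDBREGS;
    case I386_PT_SETDBREGS:  return AMD64_PT_SETDBREGS;
    case I386_PT_SETSTEP:    return AMD64_PT_SETSTEP;
    case I386_PT_CLEARSTEP:  return AMD64_PT_CLEARSTEP;
    default:                 return -1; /* e.g. the XMM requests */
    }
}
```

Requests up to PT_SETFPREGS map to themselves in this numbering, which matches the observation that the low requests happened to work without any translation.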
 1.13 07-Feb-2019  kamil Define PTRACE_ILLEGAL_ASM for NetBSD/amd64 in ptrace.h

Use ud2 instruction that is guaranteed to raise an invalid instruction
exception (through SIGILL).

On NetBSD and FreeBSD this instruction raises ILL_PRVOPC, on Linux
ILL_ILLOPN. It's not clear which option is better, "Privileged opcode" or
"Illegal operand", because ud2 doesn't seem to be a privileged operation
and it doesn't take any operand.

Assume in future changes that this opcode will raise ILL_PRVOPC and keep
it purely for testing purposes of the SIGILL crash type.
 1.12 12-Apr-2017  kamil branches: 1.12.12;
Add new macro PTRACE_BREAKPOINT_ASM in <sys/ptrace.h> MD part

This macro ships with a MD-specific assembly instruction triggering
a software breakpoint.

Missing instruction for powerpc targets.

This code is used in ATF tests (lib/libc/sys/t_ptrace_wait).

Original patch by Nick Hudson, thanks!
 1.11 08-Apr-2017  kamil Add new ptrace(2) API: PT_SETSTEP & PT_CLEARSTEP

These operations allow marking a thread as a single-stepping one.

Among other things, this allows one to:
- single step and emit a signal (PT_SETSTEP & PT_CONTINUE)
- single step and trace syscall entry and exit (PT_SETSTEP & PT_SYSCALL)

The former is useful for debuggers like GDB or LLDB. The latter can be used
to singlestep a usermode kernel. These examples don't limit use-cases of
this interface.

Define PT_*STEP only for platforms defining PT_STEP.

Add new ATF tests setstep[1234].

These ptrace(2) operations first appeared in FreeBSD.

Sponsored by <The NetBSD Foundation>
 1.10 23-Feb-2017  kamil Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so all traces of it are removed without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible for setting global watchpoints/breakpoints with the
debug registers; the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring
function is designed to be used for this
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- a debugger is responsible for retrieving the debug register state to
distinguish the exact debug register trap (DR6 is the Status Register on x86)
- the kernel must not remove debug register traps after triggering a trap event;
a debugger is responsible for detaching this trap with an appropriate
PT_SETDBREGS call (DR7 is the Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently another interface,
PT_GET_SIGINFO/PT_SET_SIGINFO, is required in the netbsd32 compat code in
order to reliably validate PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
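
Since the commit above leaves the DR7 bit definitions to each tracer, the
following is a minimal sketch of what a tracer might define for itself. The
field positions follow the x86 architecture (enable bits at 2n and 2n+1, R/W
field at 16+4n, LEN field at 18+4n, hit bits B0-B3 in the low bits of DR6).
The names dr7_enable_local, dr6_first_hit and the enum constants are
hypothetical and are not taken from NetBSD or FreeBSD headers.

```c
#include <assert.h>
#include <stdint.h>

/* Access types encoded in the 2-bit DR7 R/W field (x86 architecture). */
enum dr7_rw {
	DR7_RW_EXEC  = 0x0,	/* break on instruction execution */
	DR7_RW_WRITE = 0x1,	/* break on data writes */
	DR7_RW_IO    = 0x2,	/* break on I/O access (needs CR4.DE) */
	DR7_RW_RDWR  = 0x3	/* break on data reads or writes */
};

/* Watched lengths encoded in the 2-bit DR7 LEN field. */
enum dr7_len {
	DR7_LEN_1 = 0x0,
	DR7_LEN_2 = 0x1,
	DR7_LEN_8 = 0x2,
	DR7_LEN_4 = 0x3
};

/*
 * Hypothetical helper: return dr7 with watchpoint idx (0..3) locally
 * enabled for the given access type and length.
 */
static uint64_t
dr7_enable_local(uint64_t dr7, unsigned idx, enum dr7_rw rw, enum dr7_len len)
{
	assert(idx < 4);

	/* Clear the L/G enable bits and the R/W + LEN fields of this slot. */
	dr7 &= ~((uint64_t)0x3 << (idx * 2));
	dr7 &= ~((uint64_t)0xf << (16 + idx * 4));

	dr7 |= (uint64_t)1 << (idx * 2);		/* Ln: local enable */
	dr7 |= (uint64_t)rw << (16 + idx * 4);		/* R/Wn */
	dr7 |= (uint64_t)len << (18 + idx * 4);		/* LENn */
	return dr7;
}

/*
 * Hypothetical helper: return the lowest-numbered debug register whose
 * hit bit (B0..B3) is set in DR6, or -1 if none is set.
 */
static int
dr6_first_hit(uint64_t dr6)
{
	for (int i = 0; i < 4; i++) {
		if (dr6 & ((uint64_t)1 << i))
			return i;
	}
	return -1;
}
```

Under these assumptions a tracer would place the watched address in DR0-DR3,
store the value returned by dr7_enable_local() in the DR7 slot of the
register set passed to PT_SETDBREGS, and call dr6_first_hit() on the DR6
value read back with PT_GETDBREGS after a TRAP_DBREG SIGTRAP.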
 1.9 16-Jan-2017  kamil Refactor ptrace_watchpoint structure to allow extensions

Add new field pw_type in the ptrace_watchpoint structure.

amd64 and i386 offer the current set of watchpoints as
PTRACE_PW_TYPE_DBREGS.

On architectures other than x86, different types of hardware-assisted
watchpoints are readily available, such as code-only or data-only registers on
ARM. In the future there is also the option of implementing MMU-based
watchpoints and further per-port or per-CPU extensions.

The next step is to alter this interface on x86 to generate SIGTRAP with
si_code TRAP_HWWTRAP and additional information on the event that occurred:
- which watchpoint fired,
- additional watchpoint-type specific information, as on amd64 with
PTRACE_PW_TYPE_DBREGS:
* only the watchpoint fired
* the watchpoint fired and a single step occurred

Adjust ATF tests for the pw_type change.

Sponsored by <The NetBSD Foundation>
 1.8 15-Dec-2016  kamil branches: 1.8.2;
Add support for hardware assisted watchpoints/breakpoints API in ptrace(2)

Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints

The ptrace_watchpoint structure contains MI and MD parts:

typedef struct ptrace_watchpoint {
	int		pw_index;	/* HW Watchpoint ID (count from 0) */
	lwpid_t		pw_lwpid;	/* LWP described */
	struct mdpw	pw_md;		/* MD fields */
} ptrace_watchpoint_t;

For example amd64 defines MD as follows:
struct mdpw {
	void	*md_address;
	int	md_condition;
	int	md_length;
};

These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.

Tested on amd64, initial support added for i386 and XEN.

Sponsored by <The NetBSD Foundation>
 1.7 19-Oct-2016  skrll PR kern/51514: ptrace(2) fails for 32-bit process on 64-bit kernel

Updated from the original patch in the PR by me.
 1.6 25-Sep-2015  christos branches: 1.6.2;
For processors that have memory breakpoints, add macros for them to help
libproc
 1.5 17-Sep-2015  christos fix 32 bit build.
 1.4 15-Sep-2015  christos Provide access to pc/sp/syscall-return registers like we have for mcontext
 1.3 16-Apr-2007  njoly branches: 1.3.80; 1.3.100;
Add PT_MACHDEP_STRINGS, for kdump output.
 1.2 12-Mar-2006  cube branches: 1.2.16; 1.2.20; 1.2.22;
Support the generation of coredumps for 32-bits binaries under
COMPAT_NETBSD32. They haven't worked for 5 years.

Silently agreed by the tech-kern readers.

XXX sparc64 MD glue still lacking.
XXX The FPU registers on i386 are not dumped correctly, according to my
XXX tests. It shouldn't be much work for someone who has the slightest
XXX idea of how that stuff is supposed to be laid out on i386.
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.32; 1.1.34; 1.1.36; 1.1.38;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.38.1 19-Apr-2006  elad sync with head - hopefully this will work
 1.1.36.1 13-Mar-2006  yamt sync with head.
 1.1.34.1 22-Apr-2006  simonb Sync with head.
 1.1.32.1 09-Sep-2006  rpaulo sync with head
 1.1.18.2 03-Sep-2007  yamt sync with head.
 1.1.18.1 21-Jun-2006  yamt sync with head.
 1.2.22.1 11-Jul-2007  mjf Sync with head.
 1.2.20.1 27-May-2007  ad Sync with head.
 1.2.16.1 07-May-2007  yamt sync with head.
 1.3.100.5 28-Aug-2017  skrll Sync with HEAD
 1.3.100.4 05-Feb-2017  skrll Sync with HEAD
 1.3.100.3 05-Dec-2016  skrll Sync with HEAD
 1.3.100.2 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.3.100.1 22-Sep-2015  skrll Sync with HEAD
 1.3.80.1 03-Dec-2017  jdolecek update from HEAD
 1.6.2.4 26-Apr-2017  pgoyette Sync with HEAD
 1.6.2.3 20-Mar-2017  pgoyette Sync with HEAD
 1.6.2.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.6.2.1 04-Nov-2016  pgoyette Sync with HEAD
 1.8.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.12.12.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.12.12.1 10-Jun-2019  christos Sync with HEAD
 1.20.2.1 17-Jan-2020  ad Sync with head.
 1.2 15-Dec-2009  snj Move to 2-clause license. Approved by HAYAKAWA Koichi (copyright holder).
 1.1 21-Dec-2005  rjs branches: 1.1.18; 1.1.80; 1.1.94;
Add boilerplate for cardbus support.
 1.1.94.1 24-Oct-2010  jym Sync with HEAD
 1.1.80.1 11-Mar-2010  yamt sync with head
 1.1.18.2 21-Jun-2006  yamt sync with head.
 1.1.18.1 21-Dec-2005  yamt file rbus_machdep.h was added on branch yamt-lazymbuf on 2006-06-21 14:48:25 +0000
 1.11 22-May-2022  andvar fix various small typos, mainly in comments.
 1.10 23-Feb-2017  kamil Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so all traces of it are removed without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible for setting global watchpoints/breakpoints with
the debug registers; the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring
function is designed to be used for this
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- the debugger is responsible for retrieving the debug register state to
distinguish the exact debug register trap (DR6 is the Status Register on x86)
- the kernel must not remove debug register traps after triggering a trap
event; a debugger is responsible for detaching this trap with an appropriate
PT_SETDBREGS call (DR7 is the Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently another interface,
PT_GET_SIGINFO/PT_SET_SIGINFO, is required in the netbsd32 compat code in
order to reliably validate PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.9 11-Feb-2014  dsl branches: 1.9.6; 1.9.10; 1.9.14;
Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.8 07-Feb-2014  dsl Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code is simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.7 26-Oct-2008  mrg branches: 1.7.28; 1.7.38; 1.7.44;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.6 05-Jan-2008  dsl branches: 1.6.6; 1.6.10; 1.6.16;
Reorder the amd64 trapframe (swap rcx/r10 and add 4 spare slots after r9).
This allows the syscall code to pass the syscall args directly from the
trapframe instead of copying them to a separate structure.
It is still possible that some lurking code still assumes that
'struct trapframe', 'struct mcontext' and 'struct reg' all have the
registers in the same order, but I've fixed enough of them to get gdb working.
 1.5 10-Jul-2006  fvdl branches: 1.5.34; 1.5.40; 1.5.48;
kern/33961: add kgdb support and remove some redundant (and incorrect) register
offset definitions from reg.h
 1.4 11-Dec-2005  christos branches: 1.4.4; 1.4.8; 1.4.16;
merge ktrace-lwp.
 1.3 18-Jun-2004  jmc branches: 1.3.12;
Pull in machine/fpu.h to pick up fxsave64
 1.2 07-Aug-2003  agc branches: 1.2.4;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.4.1 29-Aug-2005  tron Pull up following revision(s) (requested by riz in ticket #5621):
sys/arch/amd64/include/reg.h: revision 1.3
Pull in machine/fpu.h to pick up fxsave64
 1.3.12.2 21-Jan-2008  yamt sync with head
 1.3.12.1 30-Dec-2006  yamt sync with head.
 1.4.16.1 13-Jul-2006  gdamore Merge from HEAD.
 1.4.8.1 11-Aug-2006  yamt sync with head
 1.4.4.1 09-Sep-2006  rpaulo sync with head
 1.5.48.1 08-Jan-2008  bouyer Sync with HEAD
 1.5.40.1 18-Feb-2008  mjf Sync with HEAD.
 1.5.34.1 09-Jan-2008  matt sync with HEAD
 1.6.16.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.10.1 04-May-2009  yamt sync with head.
 1.6.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.7.44.1 18-May-2014  rmind sync with head
 1.7.38.2 03-Dec-2017  jdolecek update from HEAD
 1.7.38.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.7.28.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.9.14.1 21-Apr-2017  bouyer Sync with HEAD
 1.9.10.1 20-Mar-2017  pgoyette Sync with HEAD
 1.9.6.1 28-Aug-2017  skrll Sync with HEAD
 1.2 09-Feb-2007  ad branches: 1.2.4; 1.2.144;
Merge newlock2 to head.
 1.1 10-Sep-2006  ad branches: 1.1.2;
file rwlock.h was initially added on branch newlock2.
 1.1.2.3 29-Dec-2006  ad Checkpoint work in progress.
 1.1.2.2 24-Oct-2006  ad Compile fixes
 1.1.2.1 10-Sep-2006  ad Add updated locking primitives.
 1.2.144.2 22-Jan-2020  ad Back out previous.
 1.2.144.1 19-Jan-2020  ad empty these; remove later.
 1.2.4.2 26-Feb-2007  yamt sync with head.
 1.2.4.1 09-Feb-2007  yamt file rwlock.h was added on branch yamt-lazymbuf on 2007-02-26 09:05:44 +0000
 1.38 17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.37 14-Jul-2020  yamaguchi Introduce per-cpu IDTs

This is realized by the following modifications:
- Add IDT pages and their allocation maps for each cpu in "struct cpu_info"
- Load per-cpu IDTs at cpu_init_idt(struct cpu_info*)
- Copy the IDT entries for cpu0 to other CPUs at attach
- These are, for example, exceptions, db, system calls, etc.

Also add a kernel option named PCPU_IDT to enable the feature.
 1.36 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.35 23-Sep-2018  cherry Make XEN use the same api as native, for idt vector allocation
and registration.

lidt() placed in xenfunc() on maxv@ suggestion.

There should be no functional change due to this commit.

Tested on amd64 native and XEN.
 1.34 31-Dec-2017  maxv branches: 1.34.2; 1.34.4;
Fix a huge privilege separation vulnerability in Xen-amd64.

On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.

It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.

Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
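
The effect of the IOPL_KPL change described above can be modelled with a
small sketch. It assumes a simplified rule: a port access is allowed when the
effective privilege level of the caller is numerically no greater than IOPL,
and Xen treats the PV guest kernel as level 1 and guest userland as level 3.
The rule as applied by Xen, the level numbers, and the names ex_rflags_iopl
and ex_io_allowed are illustrative assumptions, not code from the tree. The
position of the IOPL field (bits 12-13 of RFLAGS) is architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IOPL occupies bits 12-13 of RFLAGS on x86. */
#define EX_IOPL_SHIFT	12
#define EX_IOPL_MASK	(UINT64_C(3) << EX_IOPL_SHIFT)

/* Assumed effective levels under Xen PV (illustration only). */
#define EX_LEVEL_GUEST_KERNEL	1
#define EX_LEVEL_GUEST_USER	3

/* Extract the IOPL field from an RFLAGS value. */
static unsigned
ex_rflags_iopl(uint64_t rflags)
{
	return (unsigned)((rflags & EX_IOPL_MASK) >> EX_IOPL_SHIFT);
}

/* Simplified model: port I/O is permitted when level <= iopl. */
static bool
ex_io_allowed(unsigned level, unsigned iopl)
{
	assert(level <= 3 && iopl <= 3);
	return level <= iopl;
}
```

In this model, an IOPL of 3 (the value of SEL_UPL) lets both the guest kernel
and userland through, which is the vulnerability described; an IOPL of 1 lets
only the guest kernel through.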
 1.33 04-Nov-2017  cherry In XEN PV, the idt vector table is not required to be altered at
runtime, since only entries for exceptions/traps are registered with
the hypervisor and interrupts are managed via a completely different
mechanism.

This change uses the idt_vec_reserve() mechanism nevertheless,
modifying it slightly to only do namespace management in XEN, while on
native it will continue to do idt entry init as before.

Rationale: Consistent API usage and potential future merging of
XEN/non-XEN code.

There are no functional changes in this commit.
 1.32 01-Nov-2017  maxv Remove unused macros and LDT entries.
 1.31 15-Oct-2017  maxv Use two separate functions: cpu_segregs32_zero and cpu_segregs64_zero. The
way segment registers work on amd64 will diverge between 32bit and 64bit
LWPs.
 1.30 17-Sep-2017  maxv Remove the second argument from USERMODE and KERNELMODE, it is unused
now that we don't have vm86 anymore.
 1.29 05-Feb-2017  maxv branches: 1.29.6;
Remove misleading comment; these macros should not be used if a user LDT
is active.
 1.28 02-Sep-2016  maxv branches: 1.28.2;
Give the structure sizes.
 1.27 27-Aug-2016  maxv Remove idt_init.
 1.26 27-Aug-2016  maxv Rename this value, and use it.
 1.25 21-Aug-2016  maxv KNF, and typo.
 1.24 07-Jan-2013  chs branches: 1.24.12; 1.24.14; 1.24.16; 1.24.18; 1.24.22;
rearrange the LDT entries so that (32-bit) COMPAT_10 binaries work again.
in long mode, call gates use two slots, so the first entry (a call gate)
would overlap the second one (the 32-bit user code descriptor).
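
The overlap described above can be illustrated with a short sketch. In long
mode a system descriptor such as a call gate is 16 bytes, that is, two 8-byte
descriptor slots, while a code or data descriptor still takes one slot. The
names ex_desc_slots and ex_desc_overlap are hypothetical and do not appear in
segments.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_DESC_SLOT_SIZE	8	/* bytes per descriptor table slot */

enum ex_desc_kind {
	EX_DESC_MEM,	/* code/data descriptor: one slot */
	EX_DESC_SYS	/* long-mode system descriptor (e.g. call gate): two slots */
};

/* Number of consecutive slots a descriptor of the given kind occupies. */
static size_t
ex_desc_slots(enum ex_desc_kind kind)
{
	assert(kind == EX_DESC_MEM || kind == EX_DESC_SYS);
	return kind == EX_DESC_SYS ? 2 : 1;
}

/* Do the descriptors starting at slots a and b share any slot? */
static bool
ex_desc_overlap(size_t a, enum ex_desc_kind ka, size_t b, enum ex_desc_kind kb)
{
	return a < b + ex_desc_slots(kb) && b < a + ex_desc_slots(ka);
}
```

With a call gate at slot 0 and a user code descriptor at slot 1 the two
collide; moving the code descriptor to slot 2, or placing the gate elsewhere,
removes the collision.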
 1.23 16-Jun-2012  dsl branches: 1.23.2;
memseg_baseaddr() is only called from valid_user_selector() and
both only locally.
Make static, remove one of the functions, and remove the never-set args.
Code is still very dubious.
 1.22 07-Feb-2011  chs branches: 1.22.4; 1.22.10; 1.22.14; 1.22.16;
move macros for validating fs/gs to segments.h and use them
in the linux32 code as well.
 1.21 05-Sep-2010  chs branches: 1.21.2; 1.21.4;
in check_mcontext32(), accept the LDT selector for 32-bit user code
as well as the GDT selector. fixes PR 43835.
 1.20 07-Jul-2010  chs add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.19 26-Oct-2008  mrg branches: 1.19.4; 1.19.8; 1.19.10; 1.19.12; 1.19.14; 1.19.16;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.18 19-Apr-2008  cegger branches: 1.18.2; 1.18.8;
idt_* are not implemented for Xen. So don't provide the prototypes for Xen.
 1.17 16-Apr-2008  cegger branches: 1.17.2;
use POSIX integer types
 1.16 26-Dec-2007  yamt branches: 1.16.6;
- share idt entry allocation code among x86.
- introduce a function to reserve an idt entry and use it instead of
manipulating idt_allocmap directly.
- rename idt to xen_idt for amd64 xen. add missing #ifdef XEN.
 1.15 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.14 23-Nov-2007  bouyer branches: 1.14.2; 1.14.6;
Include opt_xen.h #ifdef _KERNEL_OPT instead of custom logic.
Thanks to Izumi Tsutsui for pointing me at _KERNEL_OPT
 1.13 22-Nov-2007  bouyer Fix bouyer-xenamd64 merge fallout:
we can #include "opt_xen.h" when
#if defined(_KERNEL) && !defined(_RUMPKERNEL) && !defined(_LKM),
#ifdef _KERNEL isn't enough.
 1.12 22-Nov-2007  bouyer only include opt_xen.h #ifdef _KERNEL
 1.11 22-Nov-2007  bouyer Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.10 18-Oct-2007  yamt branches: 1.10.2;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.9 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.8 26-Sep-2007  mrg branches: 1.8.2;
in VALID_USER_DSEL3() only check the low 16 bits.

this fixes 32bit gmake from occasionally reporting "Error 255" after
a command has successfully run.

lots of help from ad@ and joerg@.
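
A minimal sketch of why masking to the low 16 bits matters: a segment
selector is a 16-bit value (RPL in bits 0-1, table indicator in bit 2, index
in bits 3-15), so when it is carried in a wider field the upper bits may hold
unrelated data. The actual checks performed by the macro are not shown in
this log; the helper names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep only the 16 bits that form the selector. */
static uint16_t
ex_sel_low16(uint64_t value)
{
	return (uint16_t)(value & 0xffff);
}

/* Requested privilege level: bits 0-1 of the selector. */
static unsigned
ex_sel_rpl(uint64_t value)
{
	return ex_sel_low16(value) & 0x3;
}

/* Descriptor table index: bits 3-15 of the selector. */
static unsigned
ex_sel_index(uint64_t value)
{
	return ex_sel_low16(value) >> 3;
}

/* Compare against an expected selector, ignoring the upper bits. */
static bool
ex_sel_equal(uint64_t value, uint16_t expected)
{
	return ex_sel_low16(value) == expected;
}
```

A comparison of the full-width value against the expected selector would
reject a value such as 0xdead0023 even though its selector part, 0x0023, is
what the check is meant to look at.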
 1.7 19-Aug-2006  dsl branches: 1.7.12; 1.7.20; 1.7.30; 1.7.32; 1.7.34;
de __P()
 1.6 11-Dec-2005  christos branches: 1.6.4; 1.6.8;
merge ktrace-lwp.
 1.5 15-May-2005  fvdl branches: 1.5.2;
Optionally include saving and restoring the 64bit %gs and %fs base register
values in the PCB. Do this in pmap_activate for now (XXX not a good place
for it, but a convenient one).
 1.4 13-Feb-2004  wiz Uppercase CPU, plural is CPUs.
 1.3 13-Oct-2003  fvdl Define a few macros to validate userspace selectors.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.2.4 21-Jan-2008  yamt sync with head
 1.5.2.3 07-Dec-2007  yamt sync with head
 1.5.2.2 27-Oct-2007  yamt sync with head.
 1.5.2.1 30-Dec-2006  yamt sync with head.
 1.6.8.1 03-Sep-2006  yamt sync with head.
 1.6.4.1 09-Sep-2006  rpaulo sync with head
 1.7.34.2 18-Oct-2007  yamt remove unused GDT_SYS_OFFSET.
 1.7.34.1 06-Oct-2007  yamt sync with head.
 1.7.32.2 09-Jan-2008  matt sync with HEAD
 1.7.32.1 06-Nov-2007  matt sync with HEAD
 1.7.30.3 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.7.30.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.7.30.1 02-Oct-2007  joerg Sync with HEAD.
 1.7.20.1 03-Oct-2007  garbled Sync with HEAD
 1.7.12.3 03-Dec-2007  ad Sync with HEAD.
 1.7.12.2 23-Oct-2007  ad Sync with head.
 1.7.12.1 09-Oct-2007  ad Sync with head.
 1.8.2.2 25-Oct-2007  bouyer Sync with HEAD.
 1.8.2.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue !)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow to merge xenamd64 back to xen or amd64.
 1.10.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.10.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.14.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.14.2.1 26-Dec-2007  ad Sync with head.
 1.16.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.16.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.17.2.1 18-May-2008  yamt sync with head.
 1.18.8.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.18.2.3 09-Oct-2010  yamt sync with head
 1.18.2.2 11-Aug-2010  yamt sync with head.
 1.18.2.1 04-May-2009  yamt sync with head.
 1.19.16.1 05-Mar-2011  rmind sync with head
 1.19.14.2 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.19.14.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.19.12.1 20-May-2011  matt bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
 1.19.10.1 07-Sep-2010  bouyer Pull up following revision(s) (requested by chs in ticket #1449):
sys/arch/amd64/amd64/netbsd32_machdep.c: revisions 1.66, 1.67
sys/arch/amd64/include/segments.h: revision 1.21
in check_mcontext32(), accept the LDT selector for 32-bit user code
as well as the GDT selector. fixes PR 43835.
accept the LDT selector in check_sigcontext32() too.
 1.19.8.2 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.19.8.1 24-Oct-2010  jym Sync with HEAD
 1.19.4.1 07-Sep-2010  bouyer Pull up following revision(s) (requested by chs in ticket #1449):
sys/arch/amd64/amd64/netbsd32_machdep.c: revisions 1.66, 1.67
sys/arch/amd64/include/segments.h: revision 1.21
in check_mcontext32(), accept the LDT selector for 32-bit user code
as well as the GDT selector. fixes PR 43835.
accept the LDT selector in check_sigcontext32() too.
 1.21.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.21.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.22.16.1 19-Feb-2018  snj Pull up following revision(s) (requested by maxv in ticket #1517):
sys/arch/amd64/amd64/machdep.c: 1.280 via patch
sys/arch/amd64/include/segments.h: 1.34 via patch
sys/arch/i386/i386/machdep.c: 1.800
sys/arch/i386/include/segments.h: 1.64
sys/arch/x86/x86/vm_machdep.c: 1.30
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.22.14.1 19-Feb-2018  snj Pull up following revision(s) (requested by maxv in ticket #1517):
sys/arch/amd64/amd64/machdep.c: 1.280 via patch
sys/arch/amd64/include/segments.h: 1.34 via patch
sys/arch/i386/i386/machdep.c: 1.800
sys/arch/i386/include/segments.h: 1.64
sys/arch/x86/x86/vm_machdep.c: 1.30
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.22.10.1 19-Feb-2018  snj Pull up following revision(s) (requested by maxv in ticket #1517):
sys/arch/amd64/amd64/machdep.c: 1.280 via patch
sys/arch/amd64/include/segments.h: 1.34 via patch
sys/arch/i386/i386/machdep.c: 1.800
sys/arch/i386/include/segments.h: 1.64
sys/arch/x86/x86/vm_machdep.c: 1.30
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.22.4.2 23-Jan-2013  yamt sync with head
 1.22.4.1 30-Oct-2012  yamt sync with head
 1.23.2.2 03-Dec-2017  jdolecek update from HEAD
 1.23.2.1 25-Feb-2013  tls resync with head
 1.24.22.1 22-Jan-2018  snj Pull up following revision(s) (requested by maxv in ticket #1550):
sys/arch/amd64/amd64/machdep.c: revision 1.280 via patch
sys/arch/amd64/include/segments.h: revision 1.34 via patch
sys/arch/i386/i386/machdep.c: revision 1.800 via patch
sys/arch/i386/include/segments.h: revision 1.64 via patch
sys/arch/x86/x86/vm_machdep.c: revision 1.30 via patch
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.24.18.1 20-Mar-2017  pgoyette Sync with HEAD
 1.24.16.1 22-Jan-2018  snj Pull up following revision(s) (requested by maxv in ticket #1550):
sys/arch/amd64/amd64/machdep.c: revision 1.280 via patch
sys/arch/amd64/include/segments.h: revision 1.34 via patch
sys/arch/i386/i386/machdep.c: revision 1.800 via patch
sys/arch/i386/include/segments.h: revision 1.64 via patch
sys/arch/x86/x86/vm_machdep.c: revision 1.30 via patch
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.24.14.2 28-Aug-2017  skrll Sync with HEAD
 1.24.14.1 05-Oct-2016  skrll Sync with HEAD
 1.24.12.1 22-Jan-2018  snj Pull up following revision(s) (requested by maxv in ticket #1550):
sys/arch/amd64/amd64/machdep.c: revision 1.280 via patch
sys/arch/amd64/include/segments.h: revision 1.34 via patch
sys/arch/i386/i386/machdep.c: revision 1.800 via patch
sys/arch/i386/include/segments.h: revision 1.64 via patch
sys/arch/x86/x86/vm_machdep.c: revision 1.30 via patch
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
 1.28.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.29.6.1 01-Jan-2018  snj Pull up following revision(s) (requested by maxv in ticket #477):
sys/arch/amd64/amd64/machdep.c: revision 1.280
sys/arch/amd64/include/segments.h: revision 1.34
sys/arch/i386/i386/machdep.c: revision 1.800
sys/arch/i386/include/segments.h: revision 1.64 via patch
sys/arch/x86/x86/vm_machdep.c: revision 1.30
Fix a huge privilege separation vulnerability in Xen-amd64.
On amd64 the kernel runs in ring3, like userland, and therefore SEL_KPL
equals SEL_UPL. While Xen can make a distinction between usermode and
kernelmode in %cs, it can't when it comes to iopl. Since we set SEL_KPL
in iopl, Xen sees SEL_UPL, and allows (unprivileged) userland processes
to read and write to the CPU ports.
It is easy, then, to completely escalate privileges; by reprogramming the
PIC, by reading the ATA disks, by intercepting the keyboard interrupts
(keylogger), etc.
Declare IOPL_KPL, set to 1 on Xen-amd64, which allows the kernel to use
the ports but not userland. I didn't test this change on i386, but it
seems fine enough.
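
The privilege comparison described in this entry can be modelled in a few lines. This is a minimal sketch, not the kernel code: SEL_UPL, SEL_KPL and IOPL_KPL take the values stated in the commit message for Xen-amd64, while MODEL_XEN_KERNEL_RING and model_port_access_allowed() are hypothetical names, and the assumption that Xen attributes ring 1 to the guest kernel for iopl purposes is an illustration of the message, not a quote from it.

```c
#include <stdbool.h>

/* Values as described in the commit message for Xen-amd64. */
#define SEL_UPL   3   /* user privilege level */
#define SEL_KPL   3   /* the kernel also runs in ring 3 on Xen-amd64 */
#define IOPL_KPL  1   /* value introduced by the fix */

/* Assumption: the ring Xen attributes to the guest kernel for iopl. */
#define MODEL_XEN_KERNEL_RING 1

/* x86 rule: port access is permitted when the caller's ring <= iopl. */
static bool
model_port_access_allowed(int caller_ring, int iopl)
{
	return caller_ring <= iopl;
}
```

With iopl set to SEL_KPL (3), a ring-3 userland process passes the check; with IOPL_KPL (1), only the caller that Xen treats as the kernel does.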
 1.34.4.1 10-Jun-2019  christos Sync with HEAD
 1.34.2.1 30-Sep-2018  pgoyette Sync with HEAD
 1.2 26-Oct-2008  mrg put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.1 26-Apr-2003  fvdl branches: 1.1.104; 1.1.108; 1.1.114;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.114.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.108.1 04-May-2009  yamt sync with head.
 1.1.104.1 17-Jan-2009  mjf Sync with HEAD.
 1.13 27-Oct-2021  thorpej - In sendsig() and sigaction1(), don't hard-code signal trampoline
versions. Instead, use the version constants from <sys/signal.h>
and automatically (and correctly) handle cases where multiple versions
of a particular trampoline flavor exist. Conditionalize support
for sigcontext trampolines on __HAVE_STRUCT_SIGCONTEXT.
- aarch64 and amd64 don't use sigcontext natively, but do need to
support it for 32-bit compatibility; define __HAVE_STRUCT_SIGCONTEXT
conditionally on _KERNEL.
 1.12 02-Jan-2013  dsl This is included into user-programs by signal.h, it shouldn't be
pulling in machine/fpu.h - which doesn't describe anything userspace
(directly) needs.
 1.11 19-Nov-2008  ad branches: 1.11.16; 1.11.26;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.10 26-Oct-2008  mrg branches: 1.10.2;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.9 18-Feb-2007  pavel branches: 1.9.42; 1.9.46; 1.9.52;
Accept the old ABI versions of signal trampolines for 32-bit
compatibility. Unbreaks the i386 cvsup binary on amd64.

Problem reported by Blair Sadewitz and Viktor Holmlund, fix tested by
Viktor Holmlund.
 1.8 11-Dec-2005  christos branches: 1.8.24; 1.8.26;
merge ktrace-lwp.
 1.7 10-May-2004  drochner branches: 1.7.10; 1.7.12;
SIGTRAMP_VALID() should not pollute the user namespace
 1.6 25-Mar-2004  drochner only accept signal trampoline version 2, and remove "struct sigcontext"
 1.5 25-Nov-2003  christos bye, bye _MCONTEXT_TO_SIGCONTEXT and vice versa.
 1.4 18-Oct-2003  briggs Add SIGTRAMP_VALID().
 1.3 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 28-Apr-2003  bjh21 branches: 1.2.2;
Add a new feature-test macro, _NETBSD_SOURCE. If this is defined
by the application, all NetBSD interfaces are made visible, even
if some other feature-test macro (like _POSIX_C_SOURCE) is defined.
<sys/featuretest.h> defines _NETBSD_SOURCE if none of _ANSI_SOURCE,
_POSIX_C_SOURCE and _XOPEN_SOURCE is defined, so as to preserve
existing behaviour.

This has two major advantages:
+ Programs that require non-POSIX facilities but define _POSIX_C_SOURCE
can trivially be overruled by putting -D_NETBSD_SOURCE in their CFLAGS.
+ It makes most of the #ifs simpler, in that they're all now ORs of the
various macros, rather than having checks for (!defined(_ANSI_SOURCE) ||
!defined(_POSIX_C_SOURCE) || !defined(_XOPEN_SOURCE)) all over the place.

I've tried not to change the semantics of the headers in any case where
_NETBSD_SOURCE wasn't defined, but there were some places where the
current semantics were clearly mad, and retaining them was harder than
correcting them. In particular, I've mostly normalised things so that
_ANSI_SOURCE gets you the smallest set of stuff, then _POSIX_C_SOURCE,
_XOPEN_SOURCE and _NETBSD_SOURCE in that order.

Tested by building for vax, encouraged by thorpej, and uncontested in
tech-userlevel for a week.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.7.12.1 26-Feb-2007  yamt sync with head.
 1.7.10.1 19-Feb-2007  tron Pull up following revision(s) (requested by pavel in ticket #1669):
sys/arch/amd64/include/signal.h: revision 1.9
Accept the old ABI versions of signal trampolines for 32-bit
compatibility. Unbreaks the i386 cvsup binary on amd64.
Problem reported by Blair Sadewitz and Viktor Holmlund, fix tested by
Viktor Holmlund.
 1.8.26.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.8.24.1 19-Feb-2007  riz Pull up following revision(s) (requested by pavel in ticket #455):
sys/arch/amd64/include/signal.h: revision 1.9
Accept the old ABI versions of signal trampolines for 32-bit
compatibility. Unbreaks the i386 cvsup binary on amd64.
Problem reported by Blair Sadewitz and Viktor Holmlund, fix tested by
Viktor Holmlund.
 1.9.52.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.9.46.1 04-May-2009  yamt sync with head.
 1.9.42.1 17-Jan-2009  mjf Sync with HEAD.
 1.10.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.11.26.1 25-Feb-2013  tls resync with head
 1.11.16.1 23-Jan-2013  yamt sync with head
 1.2 10-Jun-2015  alnsn Include <i386/sljit_machdep.h> for i386 compat build.
 1.1 23-Jul-2014  alnsn branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8;
Rename sljitarch.h to sljit_machdep.h.
 1.1.8.1 22-Sep-2015  skrll Sync with HEAD
 1.1.6.3 03-Dec-2017  jdolecek update from HEAD
 1.1.6.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.6.1 23-Jul-2014  tls file sljit_machdep.h was added on branch tls-maxphys on 2014-08-20 00:02:42 +0000
 1.1.4.1 08-Jul-2015  snj Pull up following revision(s) (requested by alnsn in ticket #872):
sys/arch/amd64/include/sljit_machdep.h: revision 1.2
Include <i386/sljit_machdep.h> for i386 compat build.
 1.1.2.2 10-Aug-2014  tls Rebase.
 1.1.2.1 23-Jul-2014  tls file sljit_machdep.h was added on branch tls-earlyentropy on 2014-08-10 06:53:49 +0000
 1.3 23-Jul-2014  alnsn Rename sljitarch.h to sljit_machdep.h.
 1.2 17-Nov-2013  alnsn branches: 1.2.2;
Always define SLJIT_CACHE_FLUSH(), start include guards with '_' and use _LP64 guard.
 1.1 13-Oct-2012  alnsn branches: 1.1.2; 1.1.4; 1.1.6;
Enable sljit in amd64 kernel and modules.
 1.1.6.1 18-May-2014  rmind sync with head
 1.1.4.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.2 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.1.4.1 13-Oct-2012  tls file sljitarch.h was added on branch tls-maxphys on 2012-11-20 03:00:56 +0000
 1.1.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.1.2.2 30-Oct-2012  yamt sync with head
 1.1.2.1 13-Oct-2012  yamt file sljitarch.h was added on branch yamt-pagecache on 2012-10-30 17:18:45 +0000
 1.2.2.1 10-Aug-2014  tls Rebase.
 1.5 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 06-Jun-2007  njoly branches: 1.4.10;
Remove duplicated AMD K8 MSR definitions, already in x86 specialreg.h
file.

ok by xtraeme.
 1.3 11-Dec-2005  christos branches: 1.3.30; 1.3.32; 1.3.38;
merge ktrace-lwp.
 1.2 19-Feb-2004  drochner branches: 1.2.16;
use no-execute page permissions if supported
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.16.1 03-Sep-2007  yamt sync with head.
 1.3.38.1 26-Jun-2007  garbled Sync with HEAD.
 1.3.32.1 11-Jul-2007  mjf Sync with head.
 1.3.30.1 09-Jun-2007  ad Sync with head.
 1.4.10.1 06-Nov-2007  matt sync with HEAD
 1.8 17-Jul-2011  joerg Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
 1.7 20-Jun-2011  mrg various build fixes for gcc 4.5. from chuq. XXX i'm not sure all of
these work properly wrt pointer aliasing, but there are no casts at
least...

the lib/libpuffs/puffs_priv.h is definitely a real bug fix.

from chuq.
 1.6 26-Oct-2008  mrg branches: 1.6.8; 1.6.26;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.5 11-Dec-2005  christos branches: 1.5.74; 1.5.78; 1.5.84;
merge ktrace-lwp.
 1.4 30-Dec-2004  christos change the definition of va_start for lint.
 1.3 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 28-Apr-2003  bjh21 branches: 1.2.2;
Add a new feature-test macro, _NETBSD_SOURCE. If this is defined
by the application, all NetBSD interfaces are made visible, even
if some other feature-test macro (like _POSIX_C_SOURCE) is defined.
<sys/featuretest.h> defines _NETBSD_SOURCE if none of _ANSI_SOURCE,
_POSIX_C_SOURCE and _XOPEN_SOURCE is defined, so as to preserve
existing behaviour.

This has two major advantages:
+ Programs that require non-POSIX facilities but define _POSIX_C_SOURCE
can trivially be overruled by putting -D_NETBSD_SOURCE in their CFLAGS.
+ It makes most of the #ifs simpler, in that they're all now ORs of the
various macros, rather than having checks for (!defined(_ANSI_SOURCE) ||
!defined(_POSIX_C_SOURCE) || !defined(_XOPEN_SOURCE)) all over the place.

I've tried not to change the semantics of the headers in any case where
_NETBSD_SOURCE wasn't defined, but there were some places where the
current semantics were clearly mad, and retaining them was harder than
correcting them. In particular, I've mostly normalised things so that
_ANSI_SOURCE gets you the smallest set of stuff, then _POSIX_C_SOURCE,
_XOPEN_SOURCE and _NETBSD_SOURCE in that order.

Tested by building for vax, encouraged by thorpej, and uncontested in
tech-userlevel for a week.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2.2.4 17-Jan-2005  skrll Sync with HEAD.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.84.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.5.78.1 04-May-2009  yamt sync with head.
 1.5.74.1 17-Jan-2009  mjf Sync with HEAD.
 1.6.26.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.6.8.1 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.6 16-Apr-2007  ad Share the sysarch stuff between the x86 ports. PR kern/36046.
 1.5 07-Jan-2006  dsl branches: 1.5.24; 1.5.28; 1.5.30;
De __P
Add some 'struct xxx' definitions in the else part of a '#if 0' so that
the function prototypes later down the file don't define the structure
within the argument list.
 1.4 11-Dec-2005  christos branches: 1.4.2;
merge ktrace-lwp.
 1.3 15-May-2005  fvdl branches: 1.3.2;
New definitions for LDT system call arguments, amd64 version. Compatible
with the Linux interface. As yet unused.
 1.2 11-Sep-2003  kleink __{BEGIN,END}_DECLS-wrap prototypes.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.2.2 03-Sep-2007  yamt sync with head.
 1.3.2.1 21-Jun-2006  yamt sync with head.
 1.4.2.1 15-Jan-2006  yamt sync with head.
 1.5.30.1 11-Jul-2007  mjf Sync with head.
 1.5.28.1 27-May-2007  ad Sync with head.
 1.5.24.1 07-May-2007  yamt sync with head.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.8 07-Jul-2018  kamil Correct unportable signed integer left shift in i386/amd64 tss code

Change the type of IOMAP_INVALOFF to unsigned int.

sys/arch/amd64/amd64/machdep.c:518:42, left shift of 65535 by 16 places cannot be represented in type 'int'

Detected with Kernel Undefined Behavior Sanitizer.
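
The arithmetic behind this report: 65535 << 16 is 0xffff0000 (4294901760), which exceeds INT_MAX on a 32-bit int, so the shift is undefined when the constant has type int. A minimal sketch of the unsigned form follows; MODEL_IOMAP_INVALOFF and model_iomap_high_half() are hypothetical stand-ins for the real definition in the tss headers.

```c
#include <stdint.h>

/* Hypothetical stand-in: the constant carries an unsigned type. */
#define MODEL_IOMAP_INVALOFF 0xffffU

static uint32_t
model_iomap_high_half(void)
{
	/*
	 * Well defined: with an unsigned int operand the result is
	 * reduced modulo 2^32 and no signed overflow can occur.
	 * With a plain int constant (0xffff) the same shift would be
	 * undefined behaviour, which is what the sanitizer flagged.
	 */
	return MODEL_IOMAP_INVALOFF << 16;
}
```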
 1.7 04-Jan-2018  maxv branches: 1.7.2; 1.7.4;
Declare IOMAP_VALIDOFF, not to use ci_tss pointers.
 1.6 12-Jul-2017  maxv rsp2, not 3
 1.5 26-Oct-2008  mrg branches: 1.5.38; 1.5.58;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.4 16-Apr-2008  cegger branches: 1.4.4; 1.4.10;
use POSIX integer types
 1.3 05-Jan-2008  yamt branches: 1.3.6;
- make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
 1.2 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.1 26-Apr-2003  fvdl branches: 1.1.18; 1.1.80; 1.1.86; 1.1.90; 1.1.94;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.94.2 08-Jan-2008  bouyer Sync with HEAD
 1.1.94.1 02-Jan-2008  bouyer Sync with HEAD
 1.1.90.1 26-Dec-2007  ad Sync with head.
 1.1.86.1 18-Feb-2008  mjf Sync with HEAD.
 1.1.80.1 09-Jan-2008  matt sync with HEAD
 1.1.18.1 21-Jan-2008  yamt sync with head
 1.3.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.3.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.4.10.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.4.4.1 04-May-2009  yamt sync with head.
 1.5.58.1 28-Aug-2017  skrll Sync with HEAD
 1.5.38.1 03-Dec-2017  jdolecek update from HEAD
 1.7.4.1 10-Jun-2019  christos Sync with HEAD
 1.7.2.1 28-Jul-2018  pgoyette Sync with HEAD
 1.73 08-May-2025  imil Rename BOOTCYCLETIME kernel option and subsequent files to BOOT_DURATION
 1.72 06-May-2025  imil Add BOOTCYCLETIME option to print kernel boot time

Introduce a new kernel option, BOOTCYCLETIME, which will print
the time taken for the kernel to boot on (for now) amd64 and i386
architectures.
 1.71 01-Apr-2021  simonb branches: 1.71.22;
Whitespace: #define<tab>
 1.70 23-Jan-2021  christos branches: 1.70.2;
Document via __HAVE_BUS_SPACE_8 platforms that implement bus_space_*_8
 1.69 01-Aug-2020  jdolecek branches: 1.69.2;
move __HAVE_PCI_MSI_MSIX to <x86/pci_machdep_common.h>
 1.68 04-May-2020  jdolecek add support for using MSI for XenPV Dom0

use PHYSDEVOP_map_pirq to get the pirq/gsi for MSI/MSI-X, switch also INTx
to use it instead of PHYSDEVOP_alloc_irq_vector

MSI confirmed working with single-vector MSI for wm(4), ahcisata(4), bge(4)

XXX added some provision for MSI-X, but it doesn't actually work (no interrupts
delivered), needs some further investigation; disable MSI-X for XENPV
via flag in x86/pci/pci_machdep.c
 1.67 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.66 13-Apr-2020  maxv Add KASAN-DMA support on aarch64, same as amd64. Discussed with skrll@.
 1.65 17-Mar-2020  maxv branches: 1.65.2;
Add a redzone between the pcb and the stack. Sent to port-amd64@.
 1.64 14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
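
The idea of packing a memory-type code and a compressed pointer into one uint32_t orig cell can be sketched as follows. The bit split (4 bits of type, 28 bits of offset from a base address), the type codes and every model_* name are assumptions made for illustration; the commit message does not give the actual layout.

```c
#include <stdint.h>

/* Hypothetical layout: 4 bits of type code, 28 bits of offset. */
#define MODEL_PTR_BITS  28
#define MODEL_PTR_MASK  ((UINT32_C(1) << MODEL_PTR_BITS) - 1)

/* Hypothetical type codes for the kind of memory being tracked. */
enum model_orig_kind {
	MODEL_ORIG_STACK = 1,
	MODEL_ORIG_POOL  = 2
};

/* Compress ptr as an offset from base and tag it with the type code. */
static uint32_t
model_orig_encode(enum model_orig_kind kind, uint64_t ptr, uint64_t base)
{
	uint32_t off = (uint32_t)((ptr - base) & MODEL_PTR_MASK);

	return ((uint32_t)kind << MODEL_PTR_BITS) | off;
}

/* Recover the type code from an orig cell. */
static enum model_orig_kind
model_orig_get_kind(uint32_t orig)
{
	return (enum model_orig_kind)(orig >> MODEL_PTR_BITS);
}

/* Recover the full pointer, given the same base used for encoding. */
static uint64_t
model_orig_get_ptr(uint32_t orig, uint64_t base)
{
	return base + (orig & MODEL_PTR_MASK);
}
```

Because the cell holds an offset and not a full 64-bit address, no side table is needed to associate information with each cell, which is the property the message highlights.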
 1.63 04-Oct-2019  maxv Add DMA instrumentation in KASAN. We note the original buffer and length in
the map, and check the buffer on each bus_dmamap_sync. This allows us to
find DMA buffer overflows and UAFs, which couldn't be found before because
the device accesses to memory are outside of KASAN's control.
 1.62 23-Sep-2019  kamil Disable __NO_STRICT_ALIGNMENT on amd64/i386 for UBSan builds

This change allows the kernel to pick the code paths that are tuned for
alignment-sensitive code (strict in the C sense). In particular, the IPv6
code uses __NO_STRICT_ALIGNMENT heavily and, whenever possible, skips the
process of aligning networking data.

With this modification all ATF tests are executed on amd64 without
triggering any UBSan reports in dmesg.

In theory __NO_STRICT_ALIGNMENT could be tuned for vax and m68k, however
these machines are still unsupported in LLVM sanitizers and syzkaller.

sys/netinet6/scope6.c:404:6, member access within misaligned address 0xfffffaea81276086 for type 'struct in6_addr' which requires 4 byte alignment
Reported-by: syzbot+a86f58d17685317b3df9@syzkaller.appspotmail.com

sys/net/rtsock_shared.c:629:41, member access within misaligned address 0xffffddb5db3ff04c for type 'struct rt_msghdr50' which requires 8 byte alignment
Reported-by: syzbot+0a3a022bc9d2b8880c16@syzkaller.appspotmail.com
 1.61 22-Sep-2019  maxv Fix KASAN on aarch64: the bus_space_* functions are macros, so we can't
redefine them. Introduce __HAVE_KASAN_INSTR_BUS, which indicates whether
to instrument the bus functions. Defined on amd64 only.
 1.60 06-Apr-2019  thorpej Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
 1.59 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.58 15-Nov-2018  riastradh Make the direct-map API always available, but fail if KASAN or rump.

(Only for architectures that support it at all; on others,
__HAVE_MM_MD_DIRECT_MAPPED_PHYS/IO are still undefined and the
functions unimplemented.)

This gives modules like zfs an opportunity to use it.

While here, fix the one caller of mm_md_direct_mapped_phys that
ignored the return value (and make sure to call pmap_kremove/update
before uvm_km_free).
 1.57 20-Aug-2018  maxv Add support for kASan on amd64. Written by me, with some parts inspired
from Siddharth Muralee's initial work. This feature can detect several
kinds of memory bugs, and it's an excellent feature.

It can be enabled by uncommenting these three lines in GENERIC:

#makeoptions KASAN=1 # Kernel Address Sanitizer
#options KASAN
#no options SVS

The kernel is compiled without SVS, without DMAP and without PCPU area.
A shadow area is created at boot time, and it can cover the upper 128TB
of the address space. This area is populated gradually as we allocate
memory. With this design the memory consumption is kept at its lowest
level.

The compiler calls the __asan_* functions each time a memory access is
done. We verify whether this access is legal by looking at the shadow
area.

We declare our own special memcpy/memset/etc functions, because the
compiler's builtins don't add the __asan_* instrumentation.

Initially all the mappings are marked as valid. During dynamic
allocations, we add a redzone, which we mark as invalid. Any access on
it will trigger a kASan error message. Additionally, the compiler adds
a redzone on global variables, and we mark these redzones as invalid too.
The illegal-access detection works with a 1-byte granularity.

For now, we cover three areas:

- global variables
- kmem_alloc-ated areas
- malloc-ated areas

More will come, but that's a good start.
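
The shadow lookup and redzone marking described in this entry can be illustrated with a toy model. It uses the conventional ASan encoding (one shadow byte per 8 bytes; 0 means fully valid, 1..7 means only the first k bytes are valid, negative means poisoned); whether the amd64 implementation uses exactly this encoding is an assumption, and every model_* name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ARENA_SIZE    256
#define MODEL_SHADOW_SCALE  8              /* one shadow byte per 8 bytes */
#define MODEL_POISON        ((int8_t)-1)   /* hypothetical redzone code */

/* Zero-initialized: every byte of the arena starts out valid. */
static int8_t model_shadow[MODEL_ARENA_SIZE / MODEL_SHADOW_SCALE];

/*
 * Mark [off, off + size) valid and the following redzone invalid.
 * off and redzone are assumed to be multiples of MODEL_SHADOW_SCALE.
 */
static void
model_mark_alloc(size_t off, size_t size, size_t redzone)
{
	size_t cell = off / MODEL_SHADOW_SCALE;
	size_t full = size / MODEL_SHADOW_SCALE;
	size_t part = size % MODEL_SHADOW_SCALE;

	memset(&model_shadow[cell], 0, full);        /* fully valid cells */
	cell += full;
	if (part != 0)
		model_shadow[cell++] = (int8_t)part; /* first part bytes valid */
	memset(&model_shadow[cell], MODEL_POISON,
	    redzone / MODEL_SHADOW_SCALE);           /* redzone cells */
}

/* Check a 1-byte access at offset off against the shadow. */
static bool
model_access_ok(size_t off)
{
	int8_t s = model_shadow[off / MODEL_SHADOW_SCALE];

	if (s == 0)
		return true;
	if (s < 0)
		return false;
	return (off % MODEL_SHADOW_SCALE) < (size_t)s;
}
```

For a 13-byte allocation at offset 0 with a 16-byte redzone, offsets 0..12 pass the check and offsets 13..31 fail, which gives the 1-byte granularity mentioned above.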
 1.56 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.55 16-Mar-2018  maxv branches: 1.55.2;
Remove the __HAVE_CPU_UAREA_ROUTINES code from x86.

It was available only in amd64, and I disabled it a few months ago in
order to support SVS. Regardless of SVS this option was questionable,
since it made stack overflows more difficult to detect.
 1.54 11-Jan-2018  maxv branches: 1.54.2;
Declare new SVS_* variants: SVS_ENTER_NOSTACK and SVS_LEAVE_NOSTACK. Use
SVS_ENTER_NOSTACK in the syscall entry point, and put it before the code
that touches curlwp. (curlwp is located in the direct map.)

Then, disable __HAVE_CPU_UAREA_ROUTINES (to be removed later). This moves
the kernel stack into pmap_kernel(), and not the direct map. That's a
change I've always wanted to make: because of the direct map we can't add
a redzone on the stack, and basically, a stack overflow can go very far
in memory without being detected (as far as erasing all of the system's
memory).

Finally, unmap the direct map from userland.
 1.53 05-Jan-2018  maxv Add a __HAVE_PCPU_AREA option, enabled by default on native amd64 but not
Xen.

With this option, the CPU structures that must always be present in the
CPU's page tables are moved on L4 slot 384, which means address
0xffffc00000000000.

A new pcpu_area structure is defined. It contains shared structures (IDT,
LDT), and then an array of pcpu_entry structures, indexed by cpu_index(ci).
Theoretically the LDT should be in the array, but this will be done later.

During the boot procedure, cpu0 calls pmap_init_pcpu, which creates a
page tree that is able to map the pcpu_area structure entirely. cpu0 then
immediately maps the shared structures. Later, every CPU goes through
cpu_pcpuarea_init, which allocates physical pages and kenters the relevant
pcpu_entry to them. Finally, each pointer is replaced to point to pcpuarea.

The point of this change is to make sure that the structures that must
always be present in the page tables have their own L4 slot. Until now
their L4 slot was that of pmap_kernel, and making a distinction between
what must be mapped and what does not need to be was complicated.

Even in the non-speculative-bug case this change makes some sense: there
are several x86 instructions that leak the addresses of the CPU structures,
and putting these structures inside pmap_kernel actually offered a way to
compute the address of the kernel heap - which would have made ASLR on it
plainly useless, had we implemented that.

Note that, for now, pcpuarea does not contain rsp0.

Unfortunately this change adds many #ifdefs, and makes the code harder to
understand. There is also some duplication, but that will be solved later.
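
The relation between L4 slot 384 and address 0xffffc00000000000 stated in this entry follows from 4-level paging, where each L4 slot covers 2^39 bytes and bit 47 is sign-extended to form a canonical address. A short worked sketch, with hypothetical names (MODEL_L4_SHIFT, model_l4_slot_to_va):

```c
#include <stdint.h>

#define MODEL_L4_SHIFT 39   /* each L4 slot maps 2^39 bytes (512GB) */

/* Return the canonical virtual address at which an L4 slot starts. */
static uint64_t
model_l4_slot_to_va(unsigned int slot)
{
	uint64_t va = (uint64_t)slot << MODEL_L4_SHIFT;

	/* Canonical form: replicate bit 47 into bits 48..63. */
	if (va & (UINT64_C(1) << 47))
		va |= UINT64_C(0xffff000000000000);
	return va;
}
```

Slot 384 is 2^8 + 2^7, so the shift yields 2^47 + 2^46 = 0x0000c00000000000, and sign extension gives 0xffffc00000000000.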
 1.52 26-Jan-2017  christos branches: 1.52.6;
provide __HAVE_COMPAT_NETBSD32 and fix multiple include protection consistently.
 1.51 27-Feb-2016  tls branches: 1.51.2; 1.51.4;
Add cpu_rng, a framework for simple on-CPU random number generators.
 1.50 23-Jan-2016  christos expose the kernel types for standalone code.
 1.49 23-Jan-2016  christos Hide {p,v}{addr,size}_t and register_t (and a couple more types that
are machine-specific) from userland unless _KERNEL/_KMEMUSER and a
new _KERNTYPES variable is defined. The _KERNTYPES should be fixed
for many subsystems that should not be using it (rump)...
 1.48 27-Aug-2015  pooka Fix PTHREAD_FOO_INITIALIZER for C++ by not using volatile in the relevant
pthread types in C++ builds, attempt 2.

The problem with attempt 1 was making assumptions of what the MD
__cpu_simple_lock_t (declared volatile) looks like. To get a same type
except non-volatile, we change the MD type to __cpu_simple_lock_nv_t
and typedef __cpu_simple_lock_t as a volatile __cpu_simple_lock_nv_t.
IMO, __cpu_simple_lock_t should not be volatile at all, but changing it
now is too risky.

Fixes at least Rumprun w/ gcc 5.1/5.2. Furthermore, the mpd application
(and possibly others) will no longer require NetBSD-specific patches.

Tested: build.sh for i386, Rumprun for x86_64 w/ gcc 5.2.

Based on the patch from Christos in lib/49989.
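
The typedef arrangement described in this entry can be sketched as below. All model_* names, the unsigned char underlying type and the magic value are assumptions for illustration; the real types are __cpu_simple_lock_nv_t and __cpu_simple_lock_t, and each port chooses its own underlying type.

```c
/* Non-volatile MD type (hypothetical underlying type). */
typedef unsigned char model_simple_lock_nv_t;

/* The volatile variant that existing kernel code keeps using. */
typedef volatile model_simple_lock_nv_t model_simple_lock_t;

/*
 * The pthread-facing type drops volatile only in C++ builds, so that
 * a brace initializer can be used for objects of this type there.
 */
#ifdef __cplusplus
typedef model_simple_lock_nv_t model_pthread_spin_t;
#else
typedef model_simple_lock_t model_pthread_spin_t;
#endif

struct model_pthread_mutex {
	unsigned int         magic;       /* hypothetical field */
	model_pthread_spin_t interlock;
};

/* Hypothetical initializer in the style of PTHREAD_FOO_INITIALIZER. */
#define MODEL_MUTEX_INITIALIZER { 0x33330003u, 0 }

static struct model_pthread_mutex model_mutex = MODEL_MUTEX_INITIALIZER;
```

Both typedefs have the same size and representation, so the layout of the pthread structure is unchanged between C and C++ builds; only the qualifier differs.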
 1.47 21-Aug-2015  pooka Make it possible to explicitly disable MSI/MSIX with NO_PCI_MSI_MSIX.

Some platforms, e.g. linux uio-pci-generic, do not support MSI at all.

XXX: does MSI being defined intentionally depend on _KERNEL_OPT on amd64
but not i386?
 1.46 27-Apr-2015  knakahara add x86 MD MSI/MSI-X support code.
 1.45 03-Apr-2014  christos branches: 1.45.6;
we have cpu_bootconf()
 1.44 20-Mar-2014  christos make pmc compile with amd64
 1.43 01-Dec-2013  christos revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.42 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.41 21-Jan-2012  chs branches: 1.41.6; 1.41.10;
allocate uareas contiguously and access them via the direct map.
 1.40 04-Dec-2011  chs map all of physical memory using large pages.
ported from openbsd years ago by Murray Armfield,
updated for changes since then by me.
 1.39 06-Jul-2011  dyoung branches: 1.39.2; 1.39.6;
Implement bus_space_tag_create() and _destroy().

Factor bus_space_reserve(), bus_space_release(), et cetera out of
bus_space_alloc(), bus_space_map(), bus_space_free(), bus_space_unmap(),
et cetera.

For i386 and amd64, activate the use of <machine/bus_defs.h> and
<machine/bus_funcs.h> by #defining __HAVE_NEW_STYLE_BUS_H in
their respective types.h. While I'm here, remove unnecessary
__HAVE_DEVICE_REGISTER #defines.
 1.38 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.37 12-Mar-2011  joerg branches: 1.37.2;
Add TLS support for AMD64, i386 and SH3.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
 1.36 24-Feb-2011  joerg Allow storing and receiving the LWP private pointer via ucontext_t
on all platforms except VAX and IA64. Add fast access via register for
AMD64, i386 and SH3 ports. Use this fast access in libpthread to replace
the stack based pthread_self(). Implement skeleton support for Alpha,
HPPA, PowerPC, SPARC and SPARC64, but leave it disabled.

Ports that support this feature provide __HAVE____LWP_GETPRIVATE_FAST in
machine/types.h and a corresponding __lwp_getprivate_fast in
machine/mcontext.h.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
 1.35 22-Dec-2010  njoly branches: 1.35.2; 1.35.4;
__HAVE_CPU_INFO_FIRST -> __HAVE_CPU_DATA_FIRST.
 1.34 22-Dec-2010  christos Make __HAVE_CPU_DATA_FIRST true
 1.33 11-Dec-2009  matt branches: 1.33.4;
Add PRIx{P,V}{ADDR,SIZE}, PRIu{P,V}SIZE, and PRIxREGISTER{,32,64} for all
(except where they will be added via merge). These should be used to print
{p,v}{addr,size}_t and register*_t as appropriate.
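
The macros follow the <inttypes.h> pattern: the format macro is pasted into the format string next to the matching type. A minimal sketch, assuming a stand-in type and macro (demo_paddr_t, DEMO_PRIxPADDR and demo_format_paddr are hypothetical names, not the definitions this commit added):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for paddr_t; the real type is machine-dependent. */
typedef uint64_t demo_paddr_t;

/* Hypothetical analogue of PRIxPADDR, tied to the stand-in type above. */
#define DEMO_PRIxPADDR PRIx64

/* Format a physical address into buf; returns snprintf's result. */
int
demo_format_paddr(char *buf, size_t len, demo_paddr_t pa)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "pa=0x%" DEMO_PRIxPADDR, pa);
}
```

Keeping the macro next to the typedef means a port that changes the width of the type only has to change the two definitions, not every printf call.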
 1.32 19-Apr-2009  ad cpuctl:

- Add interrupt shielding (direct hardware interrupts away from the
specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
a bug where delivery is not accepted by an LAPIC after redistribution. It
also needs re-balancing to make things fair after interrupts are turned
back on for a CPU.
 1.31 05-Apr-2009  tsutsui Use #define<tab> consistently.
 1.30 05-Apr-2009  tsutsui Remove __HAVE_UFS2_BOOT since it belongs to sysinst for now.
"Feel free to change it" by ad@.
 1.29 04-Apr-2009  ad +__HAVE_UFS2_BOOT
 1.28 29-Mar-2009  ad _lwp_setprivate: provide the value to MD code if a hook is present.

This will be used to support TLS. The MD method must match the ELF TLS spec
for that CPU architecture (if there is a spec).

At this time it is only implemented for i386, where it means setting the
per-thread base address for %gs. Please implement this for your platform!
 1.27 26-Oct-2008  mrg branches: 1.27.2; 1.27.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.26 21-Feb-2008  ad branches: 1.26.2; 1.26.6; 1.26.12;
#define __HAVE_ATOMIC_AS_MEMBAR, indicating that:

- atomic_cas_ni() does an implicit membar_exit()
- all other atomic operations do an implicit membar_sync()

While this might seem kind of arbitrary it's the basis for some important
optimizations.
 1.25 20-Jan-2008  joerg branches: 1.25.2;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.
 1.24 15-Jan-2008  joerg Introduce optional cpu_offline_md to execute MD actions at the end of
cpu_offline. Use this on amd64/i386 to force a FPU save. As this was
triggered by npxsave_cpu/fpusave_cpu not working for a different CPU,
remove the cpu_info argument and adjust npxsave_*/fpusave_* to use bool
for the save.

OK ad@
 1.23 08-Jan-2008  joerg Switch Xen to generic TODR. Tested by Manuel Bouyer.
 1.22 05-Jan-2008  yamt remove no longer necessary cpu_maxproc.
 1.21 29-Nov-2007  ad branches: 1.21.6;
__HAVE_ATOMIC64_OPS
 1.20 23-Nov-2007  bouyer Include opt_xen.h #ifdef _KERNEL_OPT instead of custom logic.
Thanks to Izumi Tsutsui for pointing me at _KERNEL_OPT
 1.19 22-Nov-2007  bouyer Fix bouyer-xenamd64 merge fallout:
we can #include "opt_xen.h" when
#if defined(_KERNEL) && !defined(_RUMPKERNEL) && !defined(_LKM),
#ifdef _KERNEL isn't enough.
 1.18 22-Nov-2007  bouyer Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.17 17-Oct-2007  garbled branches: 1.17.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.16 14-Jul-2007  ad branches: 1.16.8; 1.16.10; 1.16.14;
Generic soft interrupts are mandatory.
 1.15 09-Feb-2007  ad branches: 1.15.6; 1.15.12; 1.15.14;
Merge newlock2 to head.
 1.14 03-Sep-2006  perry branches: 1.14.2;
temporarily turn on "__HAVE_GENERIC_TODR"
 1.13 03-Sep-2006  bjh21 Nothing in the kernel now tests __HAVE_NWSCONS, so stop defining it everywhere.
 1.12 07-Jun-2006  kardel convert to timecounters (from branch simonb-timecounters)
 1.11 24-Dec-2005  perry branches: 1.11.4; 1.11.6; 1.11.8; 1.11.14;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.10 11-Dec-2005  christos merge ktrace-lwp.
 1.9 26-Mar-2004  drochner branches: 1.9.16;
nothing cares about __HAVE_SIGINFO anymore, so nuke it
 1.8 21-Jan-2004  mrg back out previous; it was only required for a dead function.
 1.7 20-Jan-2004  jdolecek add register64_t which appears to be necessary for COMPAT_NETBSD32 nowadays
 1.6 18-Jan-2004  martin Do not export __HAVE_RAS to userland. Applications are supposed to try
rasctl() and detect failure with EOPNOTSUPP.
 1.5 06-Oct-2003  fvdl SIGINFO support.
Todo: 32bit compat support (COMPAT_NETBSD32 will not compile right now,
as it won't on other platforms).
 1.4 26-Sep-2003  nathanw Move __cpu_simple_lock_t and __SIMPLELOCK_{UN,}LOCKED to machine/types.h
so that they can be used in a namespace-friendly way.
 1.3 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 28-Apr-2003  bjh21 branches: 1.2.2;
Add a new feature-test macro, _NETBSD_SOURCE. If this is defined
by the application, all NetBSD interfaces are made visible, even
if some other feature-test macro (like _POSIX_C_SOURCE) is defined.
<sys/featuretest.h> defined _NETBSD_SOURCE if none of _ANSI_SOURCE,
_POSIX_C_SOURCE and _XOPEN_SOURCE is defined, so as to preserve
existing behaviour.

This has two major advantages:
+ Programs that require non-POSIX facilities but define _POSIX_C_SOURCE
can trivially be overruled by putting -D_NETBSD_SOURCE in their CFLAGS.
+ It makes most of the #ifs simpler, in that they're all now ORs of the
various macros, rather than having checks for (!defined(_ANSI_SOURCE) ||
!defined(_POSIX_C_SOURCE) || !defined(_XOPEN_SOURCE)) all over the place.

I've tried not to change the semantics of the headers in any case where
_NETBSD_SOURCE wasn't defined, but there were some places where the
current semantics were clearly mad, and retaining them was harder than
correcting them. In particular, I've mostly normalised things so that
_ANSI_SOURCE gets you the smallest set of stuff, then _POSIX_C_SOURCE,
_XOPEN_SOURCE and _NETBSD_SOURCE in that order.
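
The ordering described above can be modelled at run time. This is only a sketch of the visibility rule, not the header code itself: the DEMO_F_* bits stand for "this macro is defined", and demo_apply_default and demo_visible are hypothetical helpers.

```c
#include <assert.h>

/* Visibility levels, smallest set first, in the order the message gives. */
enum demo_level { DEMO_ANSI, DEMO_POSIX, DEMO_XOPEN, DEMO_NETBSD };

/* One bit per feature-test macro the application has defined. */
#define DEMO_F_ANSI	(1u << DEMO_ANSI)
#define DEMO_F_POSIX	(1u << DEMO_POSIX)
#define DEMO_F_XOPEN	(1u << DEMO_XOPEN)
#define DEMO_F_NETBSD	(1u << DEMO_NETBSD)

/* Models the default: no ANSI/POSIX/XOPEN macro implies _NETBSD_SOURCE. */
unsigned
demo_apply_default(unsigned defined)
{
	if ((defined & (DEMO_F_ANSI | DEMO_F_POSIX | DEMO_F_XOPEN)) == 0)
		defined |= DEMO_F_NETBSD;
	return defined;
}

/*
 * An interface that needs level `needed` is visible if the macro for that
 * level or any larger level is defined: a plain OR of the macros.
 */
int
demo_visible(unsigned defined, enum demo_level needed)
{
	unsigned mask;

	assert((unsigned)needed <= DEMO_NETBSD);
	mask = 0xfu & ~((1u << needed) - 1u);
	return (demo_apply_default(defined) & mask) != 0;
}
```

For instance, demo_visible(DEMO_F_POSIX | DEMO_F_NETBSD, DEMO_NETBSD) models a program that defines _POSIX_C_SOURCE but is built with -D_NETBSD_SOURCE in its CFLAGS.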

Tested by building for vax, encouraged by thorpej, and uncontested in
tech-userlevel for a week.
 1.1 26-Apr-2003  fvdl Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.9.16.9 27-Feb-2008  yamt sync with head.
 1.9.16.8 21-Jan-2008  yamt remove __HAVE_LAZY_MBUF for now as it's incompatible with in_cksum.S.
 1.9.16.7 21-Jan-2008  yamt sync with head
 1.9.16.6 07-Dec-2007  yamt sync with head
 1.9.16.5 03-Sep-2007  yamt sync with head.
 1.9.16.4 26-Feb-2007  yamt sync with head.
 1.9.16.3 30-Dec-2006  yamt sync with head.
 1.9.16.2 21-Jun-2006  yamt sync with head.
 1.9.16.1 07-Jul-2005  yamt define __HAVE_LAZY_MBUF for i386 and amd64.
 1.11.14.1 19-Jun-2006  chap Sync with head.
 1.11.8.2 14-Sep-2006  yamt sync with head.
 1.11.8.1 26-Jun-2006  yamt sync with head.
 1.11.6.1 30-Apr-2006  kardel define __HAVE_TIMECOUNTER -> switch to timecounter for amd64
 1.11.4.1 09-Sep-2006  rpaulo sync with head
 1.14.2.1 29-Dec-2006  ad Checkpoint work in progress.
 1.15.14.1 03-Oct-2007  garbled Sync with HEAD
 1.15.12.1 17-Apr-2007  thorpej amd64 has 64-bit atomic ops
 1.15.6.2 03-Dec-2007  ad Sync with HEAD.
 1.15.6.1 15-Jul-2007  ad Sync with head.
 1.16.14.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue!)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow to merge xenamd64 back to xen or amd64.
 1.16.10.3 23-Mar-2008  matt sync with HEAD
 1.16.10.2 09-Jan-2008  matt sync with HEAD
 1.16.10.1 06-Nov-2007  matt sync with HEAD
 1.16.8.2 03-Dec-2007  joerg Sync with HEAD.
 1.16.8.1 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.17.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.17.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.21.6.4 23-Jan-2008  bouyer Sync with HEAD.
 1.21.6.3 19-Jan-2008  bouyer Sync with HEAD
 1.21.6.2 10-Jan-2008  bouyer Sync with HEAD
 1.21.6.1 08-Jan-2008  bouyer Sync with HEAD
 1.25.2.1 24-Mar-2008  keiichi sync with head.
 1.26.12.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.26.6.2 11-Mar-2010  yamt sync with head
 1.26.6.1 04-May-2009  yamt sync with head.
 1.26.2.2 17-Jan-2009  mjf Sync with HEAD.
 1.26.2.1 21-Feb-2008  mjf file types.h was added on branch mjf-devfs2 on 2009-01-17 13:27:49 +0000
 1.27.8.6 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.27.8.5 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.27.8.4 10-Jan-2011  jym Sync with HEAD
 1.27.8.3 24-Oct-2010  jym Sync with HEAD
 1.27.8.2 01-Nov-2009  jym Sync with HEAD.
 1.27.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.27.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.33.4.3 21-Apr-2011  rmind sync with head
 1.33.4.2 05-Mar-2011  rmind sync with head
 1.33.4.1 18-Mar-2010  rmind Unify /dev/{mem,kmem,zero,null} implementations in MI code. Based on patch
from Joerg Sonnenberger, proposed on tech-kern@, in February 2008.

Work and depression still in progress.
 1.35.4.1 05-Mar-2011  bouyer Sync with HEAD
 1.35.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.37.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.39.6.1 18-Feb-2012  mrg merge to -current.
 1.39.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.39.2.1 17-Apr-2012  yamt sync with head
 1.41.10.1 18-May-2014  rmind sync with head
 1.41.6.2 03-Dec-2017  jdolecek update from HEAD
 1.41.6.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.45.6.4 05-Feb-2017  skrll Sync with HEAD
 1.45.6.3 19-Mar-2016  skrll Sync with HEAD
 1.45.6.2 22-Sep-2015  skrll Sync with HEAD
 1.45.6.1 06-Jun-2015  skrll Sync with HEAD
 1.51.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.51.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.52.6.2 17-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #637:

sys/arch/amd64/amd64/process_machdep.c 1.33,1.34,1.35 (patch)
sys/arch/amd64/include/types.h 1.55 (patch)
sys/arch/x86/x86/vm_machdep.c 1.33 (patch)

- Reduce the number of places where segment register faults can
occur.
- Remove __HAVE_CPU_UAREA_ROUTINES.
 1.52.6.1 16-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in #635:

sys/arch/amd64/amd64/gdt.c 1.39-1.45 (patch)
sys/arch/amd64/amd64/amd64/machdep.c 1.284,1.287,1.288 (patch)
sys/arch/amd64/amd64/include/param.h 1.23 (patch)
sys/arch/amd64/include/types.h 1.53 (patch)
sys/arch/x86/include/cpu.h 1.87 (patch)
sys/arch/x86/include/pmap.h 1.73,1.74 (patch)
sys/arch/x86/x86/cpu.c 1.142 (patch)
sys/arch/x86/x86/intr.c 1.117 (partial),1.120 (patch)
sys/arch/x86/x86/pmap.c 1.276 (patch)

Initialize ist0 in cpu_init_tss.
Backport __HAVE_PCPU_AREA.
 1.54.2.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.54.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.54.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.54.2.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.55.2.4 21-Apr-2020  martin Sync with HEAD
 1.55.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.55.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.55.2.1 10-Jun-2019  christos Sync with HEAD
 1.65.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.69.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.70.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.71.22.1 02-Aug-2025  perseant Sync with HEAD
 1.13 26-Jul-2018  maxv Rework dbregs, to switch the registers during context switches, and not on
each user->kernel transition via userret. Reloads of DR6/DR7 are expensive
on both native and xen.
 1.12 23-Feb-2017  kamil branches: 1.12.12; 1.12.14;
Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so all traces of it are removed without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the
debug registers, to achieve this PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event
monitoring function is designed to be used
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- debugger is responsible to retrieve debug register state to distinguish
the exact debug register trap (DR6 is Status Register on x86)
- kernel must not remove debug register traps after triggering a trap event;
a debugger is responsible for detaching this trap with an appropriate
PT_SETDBREGS call (DR7 is Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel
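
The debugger's side of these rules can be sketched as two helpers: one builds a DR7 value that arms a watchpoint, the other reads DR6 to find which register fired. The bit positions follow the x86 debug register layout (Ln at bit 2n, R/Wn at bit 16+4n, LENn at bit 18+4n, B0-B3 in the low bits of DR6). The names demo_dr7_enable and demo_dr6_hit are hypothetical, and this is not the kernel's or GDB's code.

```c
#include <assert.h>
#include <stdint.h>

/* R/W field encodings in DR7. */
#define DEMO_RW_EXEC	0u	/* break on instruction execution */
#define DEMO_RW_WRITE	1u	/* break on data writes */
#define DEMO_RW_RDWR	3u	/* break on data reads or writes */

/* LEN field encodings in DR7. */
#define DEMO_LEN_1	0u
#define DEMO_LEN_2	1u
#define DEMO_LEN_4	3u

/* Return dr7 with debug register idx (0..3) locally enabled. */
uint64_t
demo_dr7_enable(uint64_t dr7, int idx, unsigned rw, unsigned len)
{
	unsigned shift;

	assert(idx >= 0 && idx < 4);
	assert(rw < 4 && len < 4);
	shift = 16u + 4u * (unsigned)idx;
	dr7 &= ~((uint64_t)0xf << shift);		/* clear R/Wn, LENn */
	dr7 |= (uint64_t)(rw | (len << 2)) << shift;	/* set R/Wn, LENn */
	dr7 |= (uint64_t)1 << (2 * idx);		/* set Ln */
	return dr7;
}

/* Return the lowest debug register index flagged in DR6, or -1. */
int
demo_dr6_hit(uint64_t dr6)
{
	int i;

	for (i = 0; i < 4; i++)
		if (dr6 & ((uint64_t)1 << i))
			return i;
	return -1;
}
```

A tracer would pass the resulting value to PT_SETDBREGS and, after a SIGTRAP with si_code TRAP_DBREG, fetch DR6 with PT_GETDBREGS and hand it to the second helper.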

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently other interface
PT_GET_SIGINFO/PT_SET_SIGINFO is required in netbsd32 compat code in order to
validate reliably PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.11 16-Jan-2017  kamil Allow to mix single-step with hardware assisted watchpoints on amd64

This case needs new handling in trap recognition.

Sponsored by <The NetBSD Foundation>
 1.10 15-Dec-2016  kamil branches: 1.10.2;
Add support for hardware assisted watchpoints/breakpoints API in ptrace(2)

Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints

The ptrace_watchpoint structure contains MI and MD parts:

typedef struct ptrace_watchpoint {
	int		pw_index;	/* HW Watchpoint ID (count from 0) */
	lwpid_t		pw_lwpid;	/* LWP described */
	struct mdpw	pw_md;		/* MD fields */
} ptrace_watchpoint_t;

For example amd64 defines MD as follows:
struct mdpw {
	void	*md_address;
	int	 md_condition;
	int	 md_length;
};

These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.

Tested on amd64, initial support added for i386 and XEN.

Sponsored by <The NetBSD Foundation>
 1.9 28-Apr-2008  martin branches: 1.9.44; 1.9.64; 1.9.68;
Remove clause 3 and 4 from TNF licenses
 1.8 09-Feb-2007  ad branches: 1.8.44; 1.8.46; 1.8.48;
Merge newlock2 to head.
 1.7 15-Apr-2006  simonb branches: 1.7.8;
The comment says this is the same as the i386 counterpart, so catch
up on de-__P and ANSIfication that i386 has.
 1.6 16-Feb-2006  perry branches: 1.6.2; 1.6.4; 1.6.6;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.5 24-Dec-2005  perry branches: 1.5.2; 1.5.4; 1.5.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.4 11-Dec-2005  christos merge ktrace-lwp.
 1.3 31-Oct-2003  cl branches: 1.3.16;
Reduce code duplication by adding mi_userret() in sys/userret.h
containing signal posting, kernel-exit handling and sa_upcall processing.

XXX the pc532, sparc, sparc64 and vax ports should have their
XXX userret() code rearranged to use this.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.16.2 26-Feb-2007  yamt sync with head.
 1.3.16.1 21-Jun-2006  yamt sync with head.
 1.5.6.1 22-Apr-2006  simonb Sync with head.
 1.5.4.1 09-Sep-2006  rpaulo sync with head
 1.5.2.1 18-Feb-2006  yamt sync with head.
 1.6.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.6.4.1 19-Apr-2006  elad sync with head - hopefully this will work
 1.6.2.1 24-May-2006  yamt sync with head.
 1.7.8.1 20-Oct-2006  ad Do the priority adjustment in mi_userret().
 1.8.48.1 16-May-2008  yamt sync with head.
 1.8.46.1 18-May-2008  yamt sync with head.
 1.8.44.1 02-Jun-2008  mjf Sync with HEAD.
 1.9.68.2 20-Mar-2017  pgoyette Sync with HEAD
 1.9.68.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.9.64.2 28-Aug-2017  skrll Sync with HEAD
 1.9.64.1 05-Feb-2017  skrll Sync with HEAD
 1.9.44.1 03-Dec-2017  jdolecek update from HEAD
 1.10.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.12.14.1 10-Jun-2019  christos Sync with HEAD
 1.12.12.1 28-Jul-2018  pgoyette Sync with HEAD
 1.5 17-Jul-2011  joerg Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
 1.4 26-Oct-2008  mrg branches: 1.4.8;
put the contents of these header files around #ifdef __x86_64__, and
#include the <i386/foo.h> in the #else clause, making these files
largely bit-size independent.
 1.3 11-Dec-2005  christos branches: 1.3.74; 1.3.78; 1.3.84;
merge ktrace-lwp.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.84.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.3.78.1 04-May-2009  yamt sync with head.
 1.3.74.1 17-Jan-2009  mjf Sync with HEAD.
 1.4.8.1 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.55 20-Aug-2022  riastradh x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9)
API, which loads of things in the kernel rely on, so changing x86
pmap internals no longer requires recompiling the entire kernel every
time.

Callers needing these internals must now use machine/pmap_private.h.
Note: This is not x86/pmap_private.h because it contains three parts:

1. CPU-specific (different for i386/amd64) definitions used by...

2. common definitions, including Xenisms like xpmap_ptetomach,
further used by...

3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h
for 2, and then defines 3. Maybe we should split that out into a new
pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must
include machine/pmap_private.h when previously uvm/uvm_pmap.h
polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h --
specifically the parts that are needed for several constants defined
in vmparam.h:

VM_MAXUSER_ADDRESS
VM_MAX_ADDRESS
VM_MAX_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64
too, just to keep things parallel.
 1.54 26-Nov-2020  christos make the max text size the same as the max data size
 1.53 06-Oct-2020  christos branches: 1.53.2;
GC unused MAXTSIZ32
 1.52 22-Jan-2020  ad Move the UBC defaults into vmparam.h
 1.51 11-Feb-2019  cherry branches: 1.51.6;
We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.50 17-Jan-2019  maxv Increase VM_PHYSSEG_MAX from 32 to 64. Saw an example on tech-kern@ of a
heavily fragmented memory map.
 1.49 29-Oct-2018  maya Make VM_MIN_KERNEL_ADDRESS and others available in the _KMEMUSER case
as well. This affects ddb. Tested by htodd.
 1.48 28-Oct-2018  maxv Add #ifdef _KERNEL, vaddr_t does not exist in userland, and we don't want
externs anyway.
 1.47 12-Aug-2018  maxv Randomize the main memory on Xen, same as native. Tested on amd64-dom0.
 1.46 12-Aug-2018  maxv More ASLR: randomize the kernel main memory. VM_MIN_KERNEL_ADDRESS becomes
variable, and its location is chosen at boot time. There is room for
improvement, since for now we ask for an alignment of NBPD_L4.

This is enabled by default in GENERIC, but not in Xen. Tested extensively
on GENERIC and GENERIC_KASLR, XEN3_DOM0 still boots fine.
 1.45 13-Nov-2017  wiz branches: 1.45.2; 1.45.4;
Remove superfluous word in comment. Noted by Geoff Wing.
 1.44 11-Nov-2017  mrg bump PAGER_MAP_DEFAULT_SIZE to 512MB. this should allow more
concurrent IOs to be possible, and i'm unable to see pager_map
contention any more.

other larger platforms should probably do this too.

ok chs@.
 1.43 24-Jun-2017  joerg Update VM_DEFAULT_ADDRESS32_TOPDOWN to include guard area.
 1.42 23-Jun-2017  joerg Recommit exec_subr.c revision 1.79:
Always include a 1MB guard area beyond the end of stack. While ASLR will
normally create a guard area as well, this provides a deterministic area
for all binaries.

Mitigates the rest of CVE-2017-1000374 and CVE-2017-1000375 from
Qualys.

Additionally, change VM_DEFAULT_ADDRESS_TOPDOWN to include
user_stack_guard_size in the size reservation.
 1.41 17-Jun-2017  maxv Increase the kernel heap size from 512GB to 32TB, in such a way that it
is able to map the maximum amount of ram supported twice (16TB x 2).
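
The sizes in this entry are easiest to check in units of L4 page-table entries. A small sketch of the arithmetic, assuming 4-level paging with 4KB pages, where one L4 entry covers 2^39 bytes (512GB); demo_l4_slots and the DEMO_* constants are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NBPD_L4	((uint64_t)1 << 39)	/* bytes per L4 entry: 512GB */
#define DEMO_TB		((uint64_t)1 << 40)

/* Number of L4 entries needed to cover `bytes` of virtual address space. */
uint64_t
demo_l4_slots(uint64_t bytes)
{
	assert(bytes > 0);
	return (bytes + DEMO_NBPD_L4 - 1) / DEMO_NBPD_L4;
}
```

Under that assumption a 32TB heap spans 64 L4 entries, whereas the previous 512GB heap fit in one.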
 1.40 15-Jun-2017  maxv Correct these values. They must be consistent with NKL4_MAX_ENTRIES,
otherwise the kernel thinks it has ~126TB of va while pmap knows it
has only 512GB.
 1.39 11-Feb-2017  maxv branches: 1.39.6;
Remove VM_MAX_KERNEL_BUF (unused). Looks like several other ports could
do the same.
 1.38 19-Nov-2016  maxv branches: 1.38.2;
Put a one-page redzone between userland and the PTE space on amd64 and
i386.

The PTE space is a critical region that maps the page tree, and bugs have
been found in both amd64 and i386 where the kernel would wrongly overflow
userland data on this area. This kind of bug is terrible, since it allows
userland to overwrite some entries of the page tree, which makes it easy
to patch the kernel text and get ring0 privileges.
 1.37 07-Aug-2016  dholland Remove unused <sys/tree.h>.
 1.36 24-Jul-2014  riastradh branches: 1.36.4; 1.36.8;
Add a FIRST1G page freelist to x86, for old graphics devices.
 1.35 12-Jun-2014  riastradh Tweak x86 page freelists and add x86_select_freelist.

- Add 4G freelist to i386 -- there may be higher addresses if PAE.
- Add 64G and 1T freelists to amd64.
- Simplify freelist setup code and condense it into a table.
- Add x86_select_freelist to get a freelist guaranteed to yield
addresses no greater than a prescribed maximum address.

x86_select_freelist takes a uint64_t, not a paddr_t or bus_addr_t, so
that you can pass in, e.g., a 36-bit maximum address without needing
to write conditionals for i386/PAE.
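
The idea of a table-driven lookup keyed on a maximum address can be sketched as follows. The table contents, the identifiers and demo_select_freelist are all hypothetical, and the real x86_select_freelist may order its table and handle the no-match case differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_freelist {
	int		id;	/* hypothetical freelist identifier */
	uint64_t	limit;	/* pages on this list lie below this address */
};

/* Hypothetical table, sorted by ascending limit. */
static const struct demo_freelist demo_freelists[] = {
	{ 1, (uint64_t)16 << 20 },	/* 16MB */
	{ 2, (uint64_t)1 << 30 },	/* 1GB */
	{ 3, (uint64_t)4 << 30 },	/* 4GB */
	{ 4, (uint64_t)64 << 30 },	/* 64GB */
	{ 5, (uint64_t)1 << 40 },	/* 1TB */
};

/*
 * Return the id of the largest freelist whose pages all have addresses
 * no greater than maxaddr, or -1 if no list qualifies.
 */
int
demo_select_freelist(uint64_t maxaddr)
{
	size_t i, n = sizeof(demo_freelists) / sizeof(demo_freelists[0]);
	int best = -1;

	for (i = 0; i < n; i++) {
		assert(i == 0 ||
		    demo_freelists[i - 1].limit < demo_freelists[i].limit);
		if (demo_freelists[i].limit - 1 <= maxaddr)
			best = demo_freelists[i].id;
	}
	return best;
}
```

Because the argument is a plain uint64_t, a caller with a 36-bit device can pass ((uint64_t)1 << 36) - 1 on either port without any conditionals.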

No objections on port-x86:

https://mail-index.netbsd.org/port-i386/2014/05/21/msg003277.html
https://mail-index.netbsd.org/port-amd64/2014/05/21/msg002062.html
 1.34 25-Jan-2014  christos branches: 1.34.2;
delete VM_DEFAULT_ADDRESS; some of those should be GC'ed because they match
the default definition.
 1.33 25-Jan-2014  christos provide proper address defaults for topdown and bottomup allocation
 1.32 13-Nov-2012  chs branches: 1.32.2;
bump VM_PHYSSEG_MAX to 32, we've seen a system where 16 wasn't enough.
 1.31 15-Aug-2012  sborrill branches: 1.31.2;
Bump VM_PHYSSEG_MAX to 16 from 10. Modern IBM hardware requires
VM_PHYSSEG_MAX to be turned up to 11 to avoid an early panic.
 1.30 07-May-2012  joerg Raise per-image text size limit to 256MB. 64MB has been seen already, so
provide some margin for growth.
 1.29 10-Jan-2012  chs branches: 1.29.2;
reduce VM_MAX_KERNEL_ADDRESS so that it does not include
the direct-map or APTE regions.
 1.28 24-Nov-2011  christos branches: 1.28.2;
Bump text size to 128MB to make sure that gcc46 fits. It exceeded 64MB by
a tiny bit.
 1.27 04-Mar-2011  christos branches: 1.27.4;
Revert max stack size change. This is not used anymore for 32 bit binaries.
 1.26 04-Mar-2011  joerg Reduce MAXSSIZ to 64MB, otherwise netbsd32 binaries crash in ld.elf_so,
including the trivial main(){}. Add a warning to not modify this without
testing compatibility mode.
 1.25 17-Feb-2011  drochner make stack size limit (both initial and maximum) for native code
the double of that in 32-bit emul mode, so that code which works
in emulation (or on the i386 port) will likely not overflow the
stack if built as native 64-bit program
This is still very conservative.
(before, the max stack size was natively even less than for 32bit emul)
 1.24 14-Nov-2010  uebayasi branches: 1.24.2; 1.24.4;
Move struct vm_page_md definition from vmparam.h to pmap.h, because
it's used only by pmap. vmparam.h has definitions for wider
audience.

All GENERIC kernels build tested, except ia64.

powerpc/include/booke/vmparam.h has one too, but it has no pmap.h,
so it's left as is.
 1.23 06-Nov-2010  uebayasi Remove incomplete, never worked dynamic run-time memory registration
(uvm_page_physload(9)). This functionality will be re-added later.
 1.22 22-Nov-2009  bouyer branches: 1.22.2; 1.22.4;
For amd64, introduce a third free list distinct from the default free list
for memory between 16M and 4G. On large memory machine, this avoids
the 32bit-accessible memory being eaten by various kernel early allocation,
causing 32bit bus_dma(9) memory allocation to fail at boot time.
Tested on a system with 48GB RAM; based on netbsd-5 patch proposed on
port-amd64 3 days ago.
 1.21 06-Mar-2009  joerg Remove SHMMAXPGS from all kernel configs. Dynamically compute the
initial limit as 1/4 of the physical memory. Ensure the limit is at
least 1024 pages, the old default on most platforms.
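
The rule stated in this entry amounts to a quarter of physical memory with a floor of 1024 pages. A minimal sketch of that computation (demo_shm_initial_pages and DEMO_SHM_MIN_PAGES are hypothetical names, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SHM_MIN_PAGES	1024u	/* the old default on most platforms */

/* Initial shared-memory page limit for a machine with physpages pages. */
uint64_t
demo_shm_initial_pages(uint64_t physpages)
{
	uint64_t limit = physpages / 4;

	if (limit < DEMO_SHM_MIN_PAGES)
		limit = DEMO_SHM_MIN_PAGES;
	assert(limit >= DEMO_SHM_MIN_PAGES);
	return limit;
}
```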
 1.20 13-Dec-2008  pooka branches: 1.20.2;
_VMPARAM_H_ -> _$MACHINE_VMPARAM_H_
 1.19 13-Dec-2008  pooka wrap in #ifdef __x86_64__
 1.18 20-Jan-2008  yamt branches: 1.18.6; 1.18.10; 1.18.18; 1.18.20; 1.18.26;
- rewrite P->V tracking.
- use a hash rather than SPLAY trees.
SPLAY tree is a wrong algorithm to use here.
will be revisited if it slows down anything other than
micro-benchmarks.
- optimize the single mapping case (it's a common case) by
embedding an entry into mdpage.
- don't keep a pmap pointer as it can be obtained from ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte.
but it needs some more thoughts as it can increase computational
costs.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.
 1.17 06-Jan-2008  ad #include <sys/mutex.h>
 1.16 22-Nov-2007  bouyer branches: 1.16.6;
Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.15 18-Oct-2007  yamt branches: 1.15.2;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.14 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.13 29-Aug-2007  ad branches: 1.13.2; 1.13.4;
Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.12 27-Sep-2006  cube branches: 1.12.8; 1.12.16; 1.12.22; 1.12.26; 1.12.28;
This is again that time of the millennium where we have to crank up a few
static limits to meet modern bloat requirements.

VM_PHYSSEG_MAX needs it to run on Intel's D946GZIS motherboard, as reported
by rix on #NetBSD-code on freenode. This has a consequence on the initial
number of possible extent allocations for iomem_ex, so increase that value
too.

While there, clarify the action to be taken when VM_PHYSSEG_MAX is maxed
out.

Do that on both amd64 and i386 because the causes, the effects and the code
are mostly the same.
 1.11 11-Jan-2006  cube branches: 1.11.18; 1.11.20;
Add support for VM_TOPDOWN, and use it unconditionally (just like i386).

For COMPAT_NETBSD32 binaries, use VM_TOPDOWN layout too, and sync some
parameters with their i386 counterpart.

OK'd by fvdl@.
 1.10 11-Dec-2005  christos branches: 1.10.2;
merge ktrace-lwp.
 1.9 30-Jul-2005  wiz Fix typo reported in PR 30872.
 1.8 26-Mar-2005  fvdl branches: 1.8.2;
Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.
 1.7 11-Feb-2005  ws branches: 1.7.4;
Prevent integer overflow.
Fixes PR29332.
 1.6 10-Feb-2005  ws Increase max data size, now that the Xserver can grok it.
(It was the only program that couldn't.)
 1.5 04-Jun-2004  sekiya branches: 1.5.4; 1.5.6;
Use the SPLAY_* macros. Copied from the i386 pmap, okay'ed by fvdl@
 1.4 23-Mar-2004  drochner bump default data size to 256M, enough to build a "-g" kernel
 1.3 17-Oct-2003  fvdl Correct VM_MAXUSER_ADDRESS definitions, it was wasting a few pages.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Apr-2003  fvdl branches: 1.1.2;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.5 01-Apr-2005  skrll Sync with HEAD.
 1.1.2.4 15-Feb-2005  skrll Sync with HEAD.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.6.2 26-Mar-2005  yamt sync with head.
 1.5.6.1 12-Feb-2005  yamt sync with head.
 1.5.4.1 29-Apr-2005  kent sync with -current
 1.7.4.1 18-Sep-2005  tron Pull up following revision(s) (requested by fvdl in ticket #798):
sys/compat/sunos/sunos_exec.c: revision 1.47
sys/compat/pecoff/pecoff_emul.c: revision 1.11
sys/arch/sparc64/sparc64/netbsd32_machdep.c: revision 1.45
sys/arch/amd64/amd64/netbsd32_machdep.c: revision 1.12
sys/sys/proc.h: revision 1.198
sys/compat/mach/mach_exec.c: revision 1.56
sys/compat/freebsd/freebsd_exec.c: revision 1.27
sys/arch/sparc64/include/vmparam.h: revision 1.27
sys/kern/kern_resource.c: revision 1.91
sys/compat/netbsd32/netbsd32_netbsd.c: revision 1.88
sys/compat/osf1/osf1_exec.c: revision 1.39
sys/compat/svr4_32/svr4_32_resource.c: revision 1.5
sys/compat/ultrix/ultrix_misc.c: revision 1.99
sys/compat/svr4_32/svr4_32_exec.h: revision 1.9
sys/kern/exec_elf32.c: revision 1.103
sys/compat/aoutm68k/aoutm68k_exec.c: revision 1.19
sys/compat/sunos32/sunos32_exec.c: revision 1.20
sys/compat/hpux/hpux_exec.c: revision 1.46
sys/compat/darwin/darwin_exec.c: revision 1.40
sys/kern/sysv_shm.c: revision 1.83
sys/uvm/uvm_extern.h: revision 1.99
sys/uvm/uvm_mmap.c: revision 1.89
sys/kern/kern_exec.c: revision 1.195
sys/compat/netbsd32/netbsd32.h: revision 1.31
sys/arch/sparc64/sparc64/svr4_32_machdep.c: revision 1.20
sys/compat/svr4/svr4_exec.c: revision 1.56
sys/compat/irix/irix_exec.c: revision 1.41
sys/compat/ibcs2/ibcs2_exec.c: revision 1.63
sys/compat/svr4_32/svr4_32_exec.c: revision 1.16
sys/arch/amd64/include/vmparam.h: revision 1.8
sys/compat/linux/common/linux_exec.c: revision 1.73
Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.
* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2
Tested on amd64, compile-tested on sparc64.
 1.8.2.6 21-Jan-2008  yamt sync with head
 1.8.2.5 07-Dec-2007  yamt sync with head
 1.8.2.4 27-Oct-2007  yamt sync with head.
 1.8.2.3 03-Sep-2007  yamt sync with head.
 1.8.2.2 30-Dec-2006  yamt sync with head.
 1.8.2.1 21-Jun-2006  yamt sync with head.
 1.10.2.1 15-Jan-2006  yamt sync with head.
 1.11.20.1 22-Oct-2006  yamt sync with head
 1.11.18.1 18-Nov-2006  ad Sync with head.
 1.12.28.3 23-Mar-2008  matt sync with HEAD
 1.12.28.2 09-Jan-2008  matt sync with HEAD
 1.12.28.1 06-Nov-2007  matt sync with HEAD
 1.12.26.3 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.12.26.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.12.26.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.12.22.1 03-Sep-2007  skrll Sync with HEAD.
 1.12.16.1 03-Oct-2007  garbled Sync with HEAD
 1.12.8.3 03-Dec-2007  ad Sync with HEAD.
 1.12.8.2 23-Oct-2007  ad Sync with head.
 1.12.8.1 21-Aug-2007  ad amd64 changes, as yet untested:

- Adapt to vmlocking branch.
- Apply TLB shootdown and pv allocation changes to the pmap.
- Make it build.
 1.13.4.2 25-Oct-2007  bouyer Finish sync with HEAD. Especially use the new x86 pmap for xenamd64.
For this:
- rename pmap_pte_set() to pmap_pte_testset()
- make pmap_pte_set() a function or macro for non-atomic PTE write
- define and use pmap_pa2pte()/pmap_pte2pa() to read/write PTE entries
- define pmap_pte_flush() which is a nop in x86 case, and flush the
MMUops queue in the Xen case
 1.13.4.1 17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue!)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow merging xenamd64 back into xen or amd64.
 1.13.2.5 07-Oct-2007  yamt bump VM_MAX_KERNEL_ADDRESS from 0xffff800100000000 to 0xffffff8000000000.
 1.13.2.4 30-Sep-2007  yamt implement deferred pmap switching for amd64, and make amd64 use
x86 shared pmap code. it makes several i386 pmap improvements available
to amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
 1.13.2.3 29-Sep-2007  yamt fix more space/tab damages.
 1.13.2.2 29-Sep-2007  yamt fix some space/tab damages.
if you want to copy-and-paste code, please do so in a way which
preserves space/tab.
 1.13.2.1 29-Sep-2007  yamt sync a comment with i386
 1.15.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.15.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.16.6.2 20-Jan-2008  bouyer Sync with HEAD
 1.16.6.1 08-Jan-2008  bouyer Sync with HEAD
 1.18.26.1 21-Apr-2010  matt sync to netbsd-5
 1.18.20.1 01-Dec-2009  snj Apply patch (requested by bouyer in ticket 1158):
On amd64, add a third free list distinct from the default free list, holding
RAM between 16MB and 4GB. This helps prevent bus_dma(9) memory
allocation failures for 32bit DMA on large-memory machines.
 1.18.18.2 28-Apr-2009  skrll Sync with HEAD.
 1.18.18.1 19-Jan-2009  skrll Sync with HEAD.
 1.18.10.2 11-Mar-2010  yamt sync with head
 1.18.10.1 04-May-2009  yamt sync with head.
 1.18.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.20.2.5 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.20.2.4 10-Jan-2011  jym Sync with HEAD
 1.20.2.3 24-Oct-2010  jym Sync with HEAD
 1.20.2.2 01-Nov-2009  jym Sync with HEAD.
 1.20.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.22.4.1 05-Mar-2011  rmind sync with head
 1.22.2.3 15-Nov-2010  uebayasi Sync with HEAD.
 1.22.2.2 26-Apr-2010  uebayasi Remove the unfinished code to add a memory segment after uvm_page_init().
It doesn't even compile.

(In the future, we should allocate struct vm_page [] on the added memory
segment for NUMA's sake.)
 1.22.2.1 23-Feb-2010  uebayasi Convert all VM_MDPAGE_INIT()'s to take struct vm_page_md * and paddr_t.
 1.24.4.1 05-Mar-2011  bouyer Sync with HEAD
 1.24.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.27.4.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.27.4.4 16-Jan-2013  yamt sync with (a bit old) head
 1.27.4.3 30-Oct-2012  yamt sync with head
 1.27.4.2 23-May-2012  yamt sync with head.
 1.27.4.1 17-Apr-2012  yamt sync with head
 1.28.2.2 02-Jun-2012  mrg sync to latest -current.
 1.28.2.1 18-Feb-2012  mrg merge to -current.
 1.29.2.1 15-Aug-2012  riz Pull up following revision(s) (requested by sborrill in ticket #501):
sys/arch/amd64/include/vmparam.h: revision 1.31
sys/arch/i386/include/vmparam.h: revision 1.75
Bump VM_PHYSSEG_MAX to 16 from 10. Modern IBM hardware requires
VM_PHYSSEG_MAX to be turned up to 11 to avoid an early panic.
 1.31.2.3 03-Dec-2017  jdolecek update from HEAD
 1.31.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.31.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.32.2.1 18-May-2014  rmind sync with head
 1.34.2.1 10-Aug-2014  tls Rebase.
 1.36.8.2 20-Mar-2017  pgoyette Sync with HEAD
 1.36.8.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.36.4.3 28-Aug-2017  skrll Sync with HEAD
 1.36.4.2 05-Dec-2016  skrll Sync with HEAD
 1.36.4.1 05-Oct-2016  skrll Sync with HEAD
 1.38.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.39.6.3 27-Jan-2019  martin Pull up following revision(s) (requested by maxv in ticket #1174):

sys/arch/amd64/include/vmparam.h: revision 1.50

Increase VM_PHYSSEG_MAX from 32 to 64. Saw an example on tech-kern@ of a
heavily fragmented memory map.
 1.39.6.2 11-Apr-2018  martin Pull up following revision(s) (requested by mrg in ticket #733):

sys/arch/amd64/include/vmparam.h: revision 1.44
sys/arch/amd64/include/vmparam.h: revision 1.45
sys/arch/sparc64/include/vmparam.h: revision 1.38

bump PAGER_MAP_DEFAULT_SIZE to 512MB. this should allow more
concurrent IOs to be possible, and i'm unable to see pager_map
contention any more.

other larger platforms should probably do this too.
ok chs@.

Remove superfluous word in comment. Noted by Geoff Wing.

Bump PAGER_MAP_DEFAULT_SIZE to 512 MB (like amd64 recently did).
 1.39.6.1 31-Aug-2017  bouyer Pull up following revision(s) (requested by joerg in ticket #234):
sys/arch/amd64/include/vmparam.h: revision 1.43
sys/kern/exec_subr.c: revision 1.79
lib/libpthread/pthread_int.h: revision 1.94
sys/arch/mips/include/vmparam.h: revision 1.58
sys/arch/mips/include/vmparam.h: revision 1.59
lib/libpthread/TODO: revision 1.19
sys/arch/powerpc/include/vmparam.h: revision 1.20
sys/arch/riscv/include/vmparam.h: revision 1.2
sys/arch/riscv/include/vmparam.h: revision 1.3
sys/arch/i386/include/vmparam.h: revision 1.85
tests/lib/libpthread/t_join.c: revision 1.9
sys/uvm/uvm_meter.c: revision 1.66
sys/uvm/uvm_param.h: revision 1.36
sys/kern/exec_subr.c: revision 1.80
sys/uvm/uvm_param.h: revision 1.37
sys/kern/exec_subr.c: revision 1.81
sys/kern/exec_subr.c: revision 1.82
lib/libpthread/pthread_attr_getguardsize.3: revision 1.4
lib/libpthread/pthread.c: revision 1.148
lib/libpthread/pthread_attr.c: revision 1.17
sys/arch/amd64/include/vmparam.h: revision 1.42
Always include a 1MB guard area beyond the end of stack. While ASLR will
normally create a guard area as well, this provides a deterministic area
for all binaries.
Mitigates the rest of CVE-2017-1000374 and CVE-2017-1000375 from
Qualys.
Revert for the moment, creates problems on i386.
Recommit exec_subr.c revision 1.79:
Always include a 1MB guard area beyond the end of stack. While ASLR will
normally create a guard area as well, this provides a deterministic area
for all binaries.
Mitigates the rest of CVE-2017-1000374 and CVE-2017-1000375 from
Qualys.
Additionally, change VM_DEFAULT_ADDRESS_TOPDOWN to include
user_stack_guard_size in the size reservation.
Update VM_DEFAULT_ADDRESS32_TOPDOWN to include guard area.
Export the guard size of the main thread via vm.guard_size. Add a
complementary writable sysctl for the initial guard size of threads
created via pthread_create. Let the existing attribute accessors do the
right thing. Raise the default guard size for threads to 64KB.
 1.45.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.45.4.1 10-Jun-2019  christos Sync with HEAD
 1.45.2.3 18-Jan-2019  pgoyette Synch with HEAD
 1.45.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.45.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.51.6.1 25-Jan-2020  ad Sync with head.
 1.53.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.3 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.2 11-Dec-2005  christos branches: 1.2.74; 1.2.76; 1.2.78;
merge ktrace-lwp.
 1.1 08-May-2004  kleink branches: 1.1.2;
Factor out W{CHAR,INT}_{MAX,MIN} into their own header file.
 1.1.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.2 03-Aug-2004  skrll Sync with HEAD
 1.1.2.1 08-May-2004  skrll file wchar_limits.h was added on branch ktrace-lwp on 2004-08-03 10:31:36 +0000
 1.2.78.1 16-May-2008  yamt sync with head.
 1.2.76.1 18-May-2008  yamt sync with head.
 1.2.74.1 02-Jun-2008  mjf Sync with HEAD.
 1.2 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.1 16-Apr-2020  bouyer branches: 1.1.2;
file hypercalls.h was initially added on branch bouyer-xenpvh.
 1.1.2.2 25-Apr-2020  bouyer Include changes in sys/arch/xen/include/ between bouyer-xenpvh-base1 and
bouyer-xenpvh-base2.
 1.1.2.1 16-Apr-2020  bouyer Reorganise sources to make it possible to include Xen PVHVM support in
native kernels. Among others:
- move xen/include/amd64/hypercall.h to amd64/include/xen and
xen/include/i386/hypercall.h to i386/include/xen
- exclude some native files from the build for xenpv
- add xen to "machine" config statement for amd64 and i386
- split arch/xen/conf/files.xen to arch/xen/conf/files.xen (for pv drivers)
and arch/xen/conf/files.xen.pv (for full pv support)
- add GENERIC_XENHVM kernel config which includes GENERIC and add Xen PV
drivers.
