History log of /src/sys/arch/x86/include
Revision  Date  Author  Comments
 1.26 30-Nov-2024  christos Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.25 30-Apr-2021  christos branches: 1.25.20;
Merge the x86 gdt function and constant definitions
 1.24 11-May-2019  christos branches: 1.24.14;
Undo previous, fixed in userland.
 1.23 11-May-2019  christos expose the {rd,wr}msr functions to userland and install the header for
the benefit of cpuctl (fix the build).
 1.22 17-Feb-2018  kamil branches: 1.22.4;
Stop installing dbregs.h

This is now a kernel-only header. The behavior is well specified by the CPU
documents and we don't introduce changes to it.

Noted by <wiz>
 1.21 15-Dec-2016  kamil branches: 1.21.8;
Add support for hardware assisted watchpoints/breakpoints API in ptrace(2)

Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints

The ptrace_watchpoint structure contains MI and MD parts:

typedef struct ptrace_watchpoint {
	int pw_index;		/* HW Watchpoint ID (count from 0) */
	lwpid_t pw_lwpid;	/* LWP described */
	struct mdpw pw_md;	/* MD fields */
} ptrace_watchpoint_t;

For example amd64 defines MD as follows:
struct mdpw {
	void *md_address;
	int md_condition;
	int md_length;
};

These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.
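
The layout quoted above can be exercised in isolation. The following is a minimal sketch that copies the two structures from this log entry into a standalone file and fills one in; it makes no ptrace(2) call. The `lwpid_t` typedef, the helper name `make_watchpoint`, and any condition or length values passed to it are assumptions for illustration only; the real encodings are machine-dependent and are not given in this entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: lwpid_t is a 32-bit signed integer. It is defined locally
 * so that the sketch compiles outside a NetBSD source tree. */
typedef int32_t lwpid_t;

/* MD part, as quoted for amd64 in the log entry. */
struct mdpw {
	void *md_address;	/* address to watch */
	int md_condition;	/* MD-encoded trigger condition */
	int md_length;		/* MD-encoded length of the watched range */
};

/* MI part, as quoted in the log entry. */
typedef struct ptrace_watchpoint {
	int pw_index;		/* HW Watchpoint ID (count from 0) */
	lwpid_t pw_lwpid;	/* LWP described */
	struct mdpw pw_md;	/* MD fields */
} ptrace_watchpoint_t;

/* Hypothetical helper (not part of the commit): build the structure a
 * debugger would hand to a PT_WRITE_WATCHPOINT-style request. */
static ptrace_watchpoint_t
make_watchpoint(int index, lwpid_t lwp, void *addr, int cond, int len)
{
	ptrace_watchpoint_t pw;

	assert(index >= 0);	/* watchpoint IDs count from 0 */
	pw.pw_index = index;
	pw.pw_lwpid = lwp;
	pw.pw_md.md_address = addr;
	pw.pw_md.md_condition = cond;
	pw.pw_md.md_length = len;
	return pw;
}
```

Keeping the MD fields in their own `struct mdpw` is what lets the MI part stay identical across ports while each port defines its own hardware-specific members.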

Tested on amd64, initial support added for i386 and XEN.

Sponsored by <The NetBSD Foundation>
 1.20 27-Feb-2016  tls branches: 1.20.2;
Add cpu_rng, a framework for simple on-CPU random number generators.
 1.19 11-Feb-2014  dsl branches: 1.19.6;
Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.18 07-Feb-2014  dsl Userspace (especially libkvm) builds better if cpu_extended_state.h
is exported.
 1.17 29-Aug-2012  drochner branches: 1.17.2; 1.17.4;
Extend the CPU microcode update framework to support Intel x86 CPUs.
Contrary to the AMD implementation, it doesn't use xcalls to distribute
the update to all CPUs but relies on cpuctl(8) to bind itself to the
right CPU -- to keep it simple and avoid possible problems with
hyperthreading.
Also, it doesn't parse the vendor supplied file to pick the right
part for the present CPU model but relies on userland to prepare
files with specific filenames. I'll commit a pkg for this in a minute
(pkgsrc/sysutils/intel-microcode).
The ioctl interface changed; compatibility is provided (should be
limited to COMPAT_NETBSD6 as soon as this is available).
 1.16 17-Jul-2011  dyoung branches: 1.16.2;
Good-bye bus.h. Don't install <machine/bus.h>.
 1.15 20-Dec-2010  christos To use x86/cpu.h struct cpu_info from userland, we need via_padlock.h installed.
 1.14 07-Jul-2010  njoly Install x86/pte.h
 1.13 11-May-2008  ad branches: 1.13.12; 1.13.18; 1.13.20;
Share cpu.h between the x86 ports.
 1.12 20-Jan-2008  yamt branches: 1.12.6; 1.12.8; 1.12.10; 1.12.12;
- rewrite P->V tracking.
- use a hash rather than SPLAY trees.
SPLAY tree is a wrong algorithm to use here.
will be revisited if it slows down anything other than
micro-benchmarks.
- optimize the single mapping case (it's a common case) by
embedding an entry into mdpage.
- don't keep a pmap pointer as it can be obtained from ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte.
but it needs some more thoughts as it can increase computational
costs.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.
 1.11 18-Oct-2007  yamt branches: 1.11.2; 1.11.8;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.10 16-Apr-2007  ad branches: 1.10.10; 1.10.12; 1.10.14; 1.10.16;
+ x86/sysarch.h
 1.9 09-Feb-2007  ad branches: 1.9.2; 1.9.6; 1.9.8;
Merge newlock2 to head.
 1.8 01-Jan-2007  ad Report on and where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.7 04-Feb-2006  jmmv branches: 1.7.14;
Revert yesterday's change that attempted to fix the detection of the
boot device when using a Multiboot boot loader. It couldn't work because
these boot loaders do not pass a checksum of the disk so matchbiosdisk()
cannot really find any matches. I should have gone to sleep before
committing...

Found by xtraeme@.
 1.6 03-Feb-2006  jmmv branches: 1.6.2;
When booting an i386 kernel with Multiboot, properly detect the boot device
by looking it up in the x86_alldisks table (instead of trying to match it
to 'wd*' manually).

In order to do this, move the cpu_rootconf function from x86 common code
to amd64 and i386 specific one. This way, i386 can do an extra step (call
the appropriate Multiboot code) in the appropriate place (after
x86_matchbiosdisks and before findroot()).
 1.5 22-Oct-2003  kleink branches: 1.5.16; 1.5.30;
Use a common <machine/math.h> for amd64 and i386.
 1.4 26-Apr-2003  fvdl branches: 1.4.2;
Install cacheinfo.h
 1.3 03-Mar-2003  fvdl Install cpuvar.h
 1.2 27-Feb-2003  fvdl Move a few more files to x86/include. Trim the list of files to install
in /usr/include a bit.
 1.1 26-Feb-2003  fvdl Install header files.
 1.4.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.4.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.4.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.30.1 09-Sep-2006  rpaulo sync with head
 1.5.16.4 21-Jan-2008  yamt sync with head
 1.5.16.3 27-Oct-2007  yamt sync with head.
 1.5.16.2 03-Sep-2007  yamt sync with head.
 1.5.16.1 26-Feb-2007  yamt sync with head.
 1.6.2.1 22-Apr-2006  simonb Sync with head.
 1.7.14.2 12-Jan-2007  ad Sync with head.
 1.7.14.1 24-Oct-2006  ad Compile fixes
 1.9.8.1 11-Jul-2007  mjf Sync with head.
 1.9.6.2 23-Oct-2007  ad Sync with head.
 1.9.6.1 27-May-2007  ad Sync with head.
 1.9.2.1 07-May-2007  yamt sync with head.
 1.10.16.1 25-Oct-2007  bouyer Sync with HEAD.
 1.10.14.1 08-Oct-2007  yamt merge some parts of x86 pmap.h.
 1.10.12.2 23-Mar-2008  matt sync with HEAD
 1.10.12.1 06-Nov-2007  matt sync with HEAD
 1.10.10.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.11.8.1 20-Jan-2008  bouyer Sync with HEAD
 1.11.2.1 18-Feb-2008  mjf Sync with HEAD.
 1.12.12.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.12.10.2 11-Aug-2010  yamt sync with head.
 1.12.10.1 16-May-2008  yamt sync with head.
 1.12.8.1 18-May-2008  yamt sync with head.
 1.12.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.13.20.1 05-Mar-2011  rmind sync with head
 1.13.18.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.13.12.3 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.13.12.2 10-Jan-2011  jym Sync with HEAD
 1.13.12.1 24-Oct-2010  jym Sync with HEAD
 1.16.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.16.2.1 30-Oct-2012  yamt sync with head
 1.17.4.1 18-May-2014  rmind sync with head
 1.17.2.2 03-Dec-2017  jdolecek update from HEAD
 1.17.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.6.2 05-Feb-2017  skrll Sync with HEAD
 1.19.6.1 19-Mar-2016  skrll Sync with HEAD
 1.20.2.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.21.8.1 01-Mar-2018  martin Pull up following revision(s) (requested by kamil in ticket #599):
sys/arch/x86/include/Makefile: revision 1.22
Stop installing dbregs.h
This is now a kernel-only header. The behavior is well specified by the CPU
documents and we don't introduce changes to it.
Noted by <wiz>
 1.22.4.1 10-Jun-2019  christos Sync with HEAD
 1.24.14.1 13-May-2021  thorpej Sync with HEAD.
 1.25.20.1 02-Aug-2025  perseant Sync with HEAD
 1.14 22-Dec-2019  thorpej Add acpi_intr_mask() and acpi_intr_unmask() which, following the pre-existing
ACPI software layering model, are wrappers around acpi_md_intr_mask() and
acpi_md_intr_unmask(), which in turn are wrappers around intr_mask() and
intr_unmask().

XXX ARM and IA64 implementations of acpi_md_intr_mask() and
acpi_md_intr_unmask() are just stubs for now.
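
The layering described in this entry can be sketched as three thin levels, each forwarding to the one below. This is only an illustration: the function names come from the log message, but the `struct intr_handle` type, the argument types, and the function bodies are assumptions and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a stand-in for an established interrupt handle. */
struct intr_handle {
	bool masked;
};

/* Lowest layer: the interrupt code itself. */
static void
intr_mask(struct intr_handle *ih)
{
	assert(ih != NULL);
	ih->masked = true;
}

static void
intr_unmask(struct intr_handle *ih)
{
	assert(ih != NULL);
	ih->masked = false;
}

/* Machine-dependent ACPI layer: one wrapper per port. On a port
 * without support this would be an empty stub. */
static void
acpi_md_intr_mask(void *cookie)
{
	intr_mask(cookie);
}

static void
acpi_md_intr_unmask(void *cookie)
{
	intr_unmask(cookie);
}

/* Machine-independent ACPI layer: what drivers would call. */
static void
acpi_intr_mask(void *cookie)
{
	acpi_md_intr_mask(cookie);
}

static void
acpi_intr_unmask(void *cookie)
{
	acpi_md_intr_unmask(cookie);
}
```

The extra MD level is what allows a port to supply a stub without touching the MI callers.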
 1.13 16-Nov-2018  jmcneill Add MD functions for establishing and disestablishing interrupt handlers.
 1.12 20-Mar-2018  bouyer branches: 1.12.2;
Allow registering ACPI interrupt handlers with a xname.
AcpiOsInstallInterruptHandler(), part of ACPICA API, doesn't allow passing
the xname. I extend the API with AcpiOsInstallInterruptHandler_xname()
for this purpose, and change acpi_md_OsInstallInterruptHandler() to
accept and use the xname (ia64 doesn't use it).
The xname was hardcoded to "acpi SCI" in the
x86 acpi_md_OsInstallInterruptHandler(), so I make
AcpiOsInstallInterruptHandler() call
AcpiOsInstallInterruptHandler_xname with xname = "acpi SCI".

Now 'vmstat -i' shows the device's name instead of "acpi SCI" for i2c HID
interrupts.

Proposed on tech-kern@ on Dec 29.
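
The pattern described here, keeping the original entry point unchanged while adding a variant that takes a name, can be sketched as follows. All identifiers in the sketch (`install_handler`, `install_handler_xname`, `struct registered`, `sample_handler`) are hypothetical stand-ins; the real ACPICA and NetBSD signatures are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a simplified handler type. */
typedef unsigned int (*handler_fn)(void *);

/* Records the most recent registration so the effect is observable. */
struct registered {
	unsigned int irq;
	handler_fn fn;
	void *ctx;
	const char *xname;
};
static struct registered last_registered;

/* Extended entry point: the caller supplies the name that the
 * interrupt statistics will show. */
static int
install_handler_xname(unsigned int irq, handler_fn fn, void *ctx,
    const char *xname)
{
	assert(xname != NULL);
	if (fn == NULL)
		return -1;
	last_registered = (struct registered){ irq, fn, ctx, xname };
	return 0;
}

/* Original entry point: same signature as before, now a wrapper that
 * passes the name that used to be hardcoded. */
static int
install_handler(unsigned int irq, handler_fn fn, void *ctx)
{
	return install_handler_xname(irq, fn, ctx, "acpi SCI");
}

/* Trivial handler used only to exercise the sketch. */
static unsigned int
sample_handler(void *ctx)
{
	(void)ctx;
	return 0;
}
```

Existing callers keep working and still show up as "acpi SCI", while a driver that knows its own name can use the extended variant.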
 1.11 23-Sep-2012  chs branches: 1.11.36;
locate PCI buses and determine their bus numbers using the info
previously extracted from ACPICA rather than trying to figure it out again.
allow PCI buses that don't have a _PRT method.
 1.10 12-Jun-2011  jruoho branches: 1.10.2; 1.10.8; 1.10.12;
Follow IA-64 with the x86-specific ACPI MD functions and move these where
they belong to. Remove an unused function. Minor KNF. No functional change.
 1.9 12-Jun-2011  jruoho Move the evaluation of the _PDC control method out from the acpicpu(4)
driver to the main acpi(4) stack. Follow Linux and evaluate it early.
Should fix PR port-amd64/42895, possibly also PR kern/42583, and many
other comparable bugs.

A common sense explanation is that Intel supplies additional CPU tables to
OEMs. BIOS writers do not bother to modify their DSDTs, but instead load
these extra tables dynamically as secondary SSDT tables. The actual Load()
happens when the _PDC method is invoked, and thus namespace errors occur
when the CPU-specific ACPI methods are not yet present but referenced in the
AML by various drivers, including, but not limited to, acpitz(4).
 1.8 13-Jan-2011  jruoho branches: 1.8.6;
Move the function that counts the CPUs from acpicpu(4) to the MD layer.
 1.7 24-Jul-2010  jruoho Revert the previous partially for the time being.
 1.6 24-Jul-2010  jruoho Move ACPI_FLUSH_CPU_CACHE() (a.k.a. WBINVD on x86) to MD headers where it
belongs to. Let IA-64 define its own function/instruction instead of
requiring a dummy wbinvd() to satisfy the definition in a MI header.
 1.5 14-Mar-2009  jmcneill branches: 1.5.2; 1.5.4;
Add acpi_md_OsEnableInterrupt, to go with acpi_md_OsDisableInterrupt
 1.4 15-Dec-2007  joerg branches: 1.4.10; 1.4.18; 1.4.24;
Move mapping of the real mode location for the ACPI wakeup code into a
separate function called from acpi_md_callback.
 1.3 09-Dec-2007  jmcneill branches: 1.3.2;
Merge jmcneill-pm branch.
 1.2 02-May-2005  kochi branches: 1.2.2; 1.2.56; 1.2.58; 1.2.68; 1.2.70;
Merge changes for ACPI-CA 20050408.
 1.1 11-May-2003  fvdl branches: 1.1.2;
Moved here from sys/arch/i386/include.
 1.1.2.1 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.70.1 11-Dec-2007  yamt sync with head.
 1.2.68.1 26-Dec-2007  ad Sync with head.
 1.2.58.1 09-Jan-2008  matt sync with HEAD
 1.2.56.4 02-Oct-2007  jmcneill Update to ACPI-CA 20070320
 1.2.56.3 08-Sep-2007  joerg Now that the real mode pages are statically allocated, make
acpi_wakeup_paddr static and local to acpi_wakeup.c.
 1.2.56.2 08-Sep-2007  joerg Start to revamp the ACPI wake code (i386 only, amd64 gets minimal fixes
to keep being compilable):

- In init386 and the amd64 equivalent, just reserve the low-level code.
Do not map and don't copy the wakecode yet. This avoids the conflicts
with the MP tramp code as well. The wakecode is expected to be less
than one page long, which is way too much space.
acpi_md_get_npages_of_wakecode and acpi_md_install_wakecode are
dropped, acpi_wakeup_paddr is set instead of the reserved address.
- Split the wakecode into the essential low-level part to setup
protected mode with paging and valid CS and DS (which stays as
wakecode) and the rest. Inline beepon and beepoff as they are used
exactly once.
- Split the acpi_restorecpu and acpi_savecpu assembly from acpi_wakeup.c
and merge acpi_restorecpu with the second half dropped from wakecode.
Most registers are not exported, just those needed to be patched into
wakecode. Don't bother to save or restore %eax, it is overridden
anyway.
- Don't bother to save and restore eflags in acpi_md_sleep, they are
handled correctly by the assembly. Don't play games with cr3 either,
we modify the pmap of the running processes. Copy the wakecode
directly before patching it, after the identity mapping has been
setup.
- Drop clear_reg and acpi_printcpu.
- Add a commented-out broadcast IPI to halt the other CPUs explicitly.
 1.2.56.1 23-Aug-2007  joerg From FreeBSD: explicitly load regions first to allow acpi_md_callback
to actually query the routing tables.

Drop the argument to acpi_md_callback, passing around singletons is not
that helpful.
 1.2.2.1 21-Jan-2008  yamt sync with head
 1.3.2.1 02-Jan-2008  bouyer Sync with HEAD
 1.4.24.5 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.4.24.4 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.4.24.3 24-Oct-2010  jym Sync with HEAD
 1.4.24.2 01-Nov-2009  jym Sync with HEAD.
 1.4.24.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.4.18.1 28-Apr-2009  skrll Sync with HEAD.
 1.4.10.2 11-Aug-2010  yamt sync with head.
 1.4.10.1 04-May-2009  yamt sync with head.
 1.5.4.1 05-Mar-2011  rmind sync with head
 1.5.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.8.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.10.12.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.10.8.1 22-Nov-2012  riz Pull up following revision(s) (requested by chs in ticket #683):
sys/arch/ia64/include/acpi_machdep.h: revision 1.6
sys/arch/x86/include/acpi_machdep.h: revision 1.11
sys/dev/acpi/acpi.c: revision 1.255
sys/arch/x86/acpi/acpi_machdep.c: revision 1.4
sys/arch/x86/x86/mpacpi.c: revision 1.95
sys/arch/x86/x86/mpacpi.c: revision 1.96
sys/arch/ia64/acpi/acpi_machdep.c: revision 1.6
locate PCI buses and determine their bus numbers using the info
previously extracted from ACPICA rather than trying to figure it out again.
allow PCI buses that don't have a _PRT method.
as a workaround for PR 47016, call ioapic_reenable() at the end of
ACPI interrupt routing to fix the settings for the SCI interrupt.
the problem is that after my recent changes, the SCI handler is
installed before the MADT info is parsed, so we don't know what
polarity it should have. the real fix for this will be to rearrange
the ACPI initialization so that everything is done in a more sensible
order, but that will take some more time.
 1.10.2.1 30-Oct-2012  yamt sync with head
 1.11.36.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.11.36.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.12.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.12.2.1 10-Jun-2019  christos Sync with HEAD
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.11 02-May-2025  imil Add support for CPUID leaf 0x40000010 to detect TSC and LAPIC frequency on
hypervisors implementing the VMware-defined interface

This change enables virtual machines to obtain TSC and LAPIC frequency
information directly from the hypervisor via CPUID leaf 0x40000010, avoiding
the need for runtime calibration, thus reducing boot time in supported
environments.

Tested on GENERIC and MICROVM kernels, QEMU/KVM and QEMU/NVMM (current and
10.1), Intel and AMD CPUs, NetBSD/amd64 and i386.
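
As an illustration of what the kernel gains from this leaf, the sketch below decodes the two register values into frequencies. It assumes the commonly documented layout of the VMware-defined timing leaf (EAX holds the TSC frequency in kHz, EBX the LAPIC timer frequency in kHz) and that the maximum hypervisor leaf is reported in EAX of leaf 0x40000000. The names `hv_decode_freqs` and `struct hv_freqs` are hypothetical, and the register values are passed in as parameters rather than read with the CPUID instruction so that the sketch stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_TIMING_LEAF	0x40000010u	/* leaf number named in the log */

struct hv_freqs {
	uint64_t tsc_hz;	/* TSC frequency in Hz */
	uint64_t lapic_hz;	/* LAPIC timer frequency in Hz */
};

/*
 * Decode the timing leaf.
 * max_hv_leaf: highest hypervisor leaf the host reports.
 * eax, ebx:    register values returned for HV_TIMING_LEAF.
 * Returns false when the leaf is absent or reports zero, in which
 * case the caller would fall back to runtime calibration.
 */
static bool
hv_decode_freqs(uint32_t max_hv_leaf, uint32_t eax, uint32_t ebx,
    struct hv_freqs *out)
{
	assert(out != NULL);
	if (max_hv_leaf < HV_TIMING_LEAF || eax == 0 || ebx == 0)
		return false;
	out->tsc_hz = (uint64_t)eax * 1000;	/* kHz -> Hz */
	out->lapic_hz = (uint64_t)ebx * 1000;	/* kHz -> Hz */
	return true;
}
```

Treating a zero in either register as "not available" is a design choice of this sketch; the committed code may handle the two values independently.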
 1.10 06-Mar-2025  imil Revert VMware-compatible TSC and LAPIC frequency detection.
 1.9 06-Mar-2025  imil Add support for CPUID leaf 0x40000010, which enables VMware-compatible TSC
and LAPIC frequency detection for virtual machines.
 1.8 25-Apr-2020  bouyer branches: 1.8.26;
Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.7 21-Apr-2020  msaitoh Get TSC frequency from CPUID 0x15 and/or 0x16 for newer Intel processors.

- If the max CPUID leaf is >= 0x15, take TSC value from CPUID. Some processors
can take TSC/core crystal clock ratio but core crystal clock frequency
can't be taken. The Intel SDM gives us the values for some processors.
- It is also required to change lapic_per_second to make the LAPIC timer work correctly.
- Add new file x86/x86/identcpu_subr.c to share common subroutines between
kernel and userland. Some code in x86/x86/identcpu.c and cpuctl/arch/i386.c
will be moved to this file in future.
- Add comment to clarify.
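
The arithmetic behind the first item can be sketched as follows, assuming the register layout documented for CPUID leaf 0x15 (EAX is the denominator and EBX the numerator of the TSC to core crystal clock ratio, ECX is the crystal frequency in Hz and may be zero). The function names and the `fallback_crystal_hz` parameter are hypothetical; the fallback stands in for the per-model values the entry says are taken from the Intel SDM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * TSC frequency in Hz from CPUID leaf 0x15 register values.
 * Returns 0 when the leaf does not provide enough information.
 */
static uint64_t
tsc_hz_from_cpuid15(uint32_t eax_den, uint32_t ebx_num,
    uint32_t ecx_crystal_hz, uint32_t fallback_crystal_hz)
{
	/* Use the enumerated crystal clock if present, else the
	 * caller-supplied per-model value. */
	uint64_t crystal = ecx_crystal_hz != 0 ?
	    ecx_crystal_hz : fallback_crystal_hz;

	if (eax_den == 0 || ebx_num == 0 || crystal == 0)
		return 0;
	/* tsc = crystal * numerator / denominator */
	return crystal * ebx_num / eax_den;
}

/* Small usage example: a 24 MHz crystal with ratio 176/2. */
static void
tsc_sketch_selfcheck(void)
{
	assert(tsc_hz_from_cpuid15(2, 176, 24000000, 0) == 2112000000ULL);
	/* Crystal not enumerated: the fallback value is used instead. */
	assert(tsc_hz_from_cpuid15(2, 176, 0, 24000000) == 2112000000ULL);
}
```

Returning 0 lets the caller fall through to another source of the frequency, such as leaf 0x16 or runtime calibration.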
 1.6 14-Jun-2019  msaitoh branches: 1.6.2; 1.6.8;
- Dump LAPIC and I/O APIC correctly.
- Don't print redirect target on LAPIC.
- Fix DEST_MASK:
- DEST_MASK is not 1 bit but 2 bit.
- Add missing "\0"s to print decoded name correctly.
- Support both LAPIC and I/O APIC correctly in apic_format_redir().
- Improve output of some bits using snprintb()'s "F\B\1" and ":\V".
 1.5 28-Apr-2008  martin branches: 1.5.80; 1.5.88;
Remove clause 3 and 4 from TNF licenses
 1.4 05-Mar-2007  drochner branches: 1.4.40; 1.4.42; 1.4.44;
clean up how cpus and ioapics are attached at the mainbus:
Separate "cpubus" and "ioapicbus" -- while they share a common "address
space" (the apic id), the kernel doesn't use this fact. There are different
data passed to cpus and apics, which caused some ugly polymorphism. This
also saves the special "submatch" functions needed to distinguish cpus
and ioapics for autoconf. (And it makes that "apid" locators wired
in the kernel configuration are honored now; this allows one to dumb down
an mp box to singleprocessor by userconfig.)
Print "apid" locators in the buses "print" function "as everyone does",
so the per-port cpu drivers don't need to do it.
Being here, constify "struct cpu_functions" and g/c the unused MP_PICMODE
flag.
 1.3 29-May-2005  christos branches: 1.3.2; 1.3.34;
Sprinkle const.
 1.2 27-Oct-2003  junyoung Nuke __P().
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.34.1 12-Mar-2007  rmind Sync with HEAD.
 1.3.2.1 03-Sep-2007  yamt sync with head.
 1.4.44.1 16-May-2008  yamt sync with head.
 1.4.42.1 18-May-2008  yamt sync with head.
 1.4.40.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.88.2 21-Apr-2020  martin Sync with HEAD
 1.5.88.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.5.80.1 05-Aug-2020  martin Pull up the following revisions, requested by msaitoh in ticket #1593:

sys/arch/x86/conf/files.x86 1.108
sys/arch/x86/include/apicvar.h 1.7 via patch
sys/arch/x86/include/cpu.h 1.121
sys/arch/x86/x86/cpu.c 1.185 via patch
sys/arch/x86/x86/hyperv.c 1.7
sys/arch/x86/x86/tsc.c 1.41
sys/arch/xen/conf/files.xen 1.181

Get TSC frequency from CPUID 0x15 and/or 0x16 if it's available.
This change fixes a problem that newer Intel processors' timer
counts very slowly.
 1.6.8.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.6.2.1 15-Jul-2020  martin Pull up the following, requested by msaitoh in ticket #1015

sys/arch/x86/conf/files.x86 1.108 (via patch)
sys/arch/x86/include/apicvar.h 1.7 (via patch)
sys/arch/x86/include/cpu.h 1.121 (via patch)
sys/arch/x86/x86/cpu.c 1.185 (via patch)
sys/arch/x86/x86/hyperv.c 1.7 (via patch)
sys/arch/x86/x86/tsc.c 1.41 (via patch)
sys/arch/xen/conf/files.xen 1.181 (via patch)

Get TSC frequency from CPUID 0x15 and/or 0x16 if it's available.
This change fixes a problem that newer Intel processors' timer
counts very slowly.
 1.8.26.1 02-Aug-2025  perseant Sync with HEAD
 1.6 24-May-2019  nonaka Added drivers for Hyper-V Synthetic Keyboard and Video device.
 1.5 22-Dec-2018  cherry This change modifies the mainbus(4) entry point for all x86 sub-archs
in the following way:

i) It provides a unified entry point in
x86/x86/mainbus.c:mainbus_attach()
ii) It carves out the preliminary bus attachment sequence that is
common to all sub-archs into
x86/x86/mainbus.c: x86_cpubus_attach()
iii) It consolidates the remaining pathways as internal callee
functions so that these may be called piecemeal if required. A
special usecase of this is XEN PVHVM which may need to call the
native configure path, the xen configure path, or both.
iv) It moves the driver private data structures from
i386/i386_mainbus.c to an x86/ level one. This allows for other
sub-archs to do similar, if needed. (They do not at the moment).
v) For dom0 kernels, it enables 'acpi0 at mainbus?' and
'acpi0 at hypervisorbus'. This serves two purposes:
a) To demonstrate the possibility of dynamic configuration tree
traversal ordering changes.
b) To allow for the common acpi_check(self, "acpibus") call in
x86/mainbus.c to not barf when it is called from the dom0 attach
path. We allow for the acpi0 device to be a child of mainbus with
the changes to amd64/conf/XEN3_DOM0 and i386/conf/XEN3PAE_DOM0
without actually probing further in the code. This path will later
be pursued in a PVHVM boot codepath.

There should be no operative changes with this change. If there are,
please complain loudly.
 1.4 21-Sep-2016  jmcneill branches: 1.4.8; 1.4.14; 1.4.16;
Set hw.acpi.sleep.vbios when a non-HW accelerated VGA driver attaches.
If the VGA_POST option is present in the kernel the default value is 2,
otherwise 1. PR kern/50781

Reviewed by: agc, mrg
 1.3 18-Oct-2011  dyoung branches: 1.3.12; 1.3.30; 1.3.34;
Define some optional routines that will help device_register() to
register ISA & PCI devices. Add stub implementations of the routines.
 1.2 04-Feb-2006  jmmv Revert yesterday's change that attempted to fix the detection of the
boot device when using a Multiboot boot loader. It couldn't work because
these boot loaders do not pass a checksum of the disk so matchbiosdisk()
cannot really find any matches. I should have gone to sleep before
committing...

Found by xtraeme@.
 1.1 03-Feb-2006  jmmv branches: 1.1.2;
When booting an i386 kernel with Multiboot, properly detect the boot device
by looking it up in the x86_alldisks table (instead of trying to match it
to 'wd*' manually).

In order to do this, move the cpu_rootconf function from x86 common code
to amd64 and i386 specific one. This way, i386 can do an extra step (call
the appropriate Multiboot code) in the appropriate place (after
x86_matchbiosdisks and before findroot()).
 1.1.2.1 22-Apr-2006  simonb Sync with head.
 1.3.34.1 04-Nov-2016  pgoyette Sync with HEAD
 1.3.30.1 05-Oct-2016  skrll Sync with HEAD
 1.3.12.1 03-Dec-2017  jdolecek update from HEAD
 1.4.16.1 10-Jun-2019  christos Sync with HEAD
 1.4.14.1 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.4.8.1 12-Jun-2019  martin Pull up following revision(s) (requested by nonaka in ticket #1280):

sys/arch/x86/x86/consinit.c: revision 1.29
sys/dev/hyperv/vmbusvar.h: revision 1.2
sys/dev/hyperv/genfb_vmbusvar.h: revision 1.1
sys/arch/x86/x86/x86_autoconf.c: revision 1.78
sys/arch/x86/x86/identcpu.c: revision 1.91
sys/arch/x86/x86/hyperv.c: revision 1.2
sys/arch/x86/x86/hyperv.c: revision 1.3
sys/arch/x86/x86/hyperv.c: revision 1.4
sys/arch/i386/conf/GENERIC: revision 1.1207
sys/dev/wscons/wsconsio.h: revision 1.123
sys/arch/x86/x86/hypervvar.h: revision 1.1
sys/arch/amd64/conf/GENERIC: revision 1.528
sys/dev/hyperv/files.hyperv: revision 1.2
sys/arch/x86/include/autoconf.h: revision 1.6
sys/dev/hyperv/hyperv_common.c: revision 1.2
sys/arch/xen/x86/autoconf.c: revision 1.23
sys/arch/x86/pci/pci_machdep.c: revision 1.86
sys/dev/hyperv/hvkbd.c: revision 1.1
sys/dev/hyperv/hypervvar.h: revision 1.2
sys/dev/acpi/vmbus_acpi.c: revision 1.2
sys/dev/hyperv/vmbus.c: revision 1.3
sys/dev/hyperv/hvkbdvar.h: revision 1.1
sys/dev/hyperv/genfb_vmbus.c: revision 1.1

Added drivers for Hyper-V Synthetic Keyboard and Video device.

Avoid undefined reference to `hyperv_guid_video' without vmbus(4).

Avoid undefined reference to `hyperv_is_gen1' without hyperv(4).

Use efi_probe().
 1.5 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.4 25-Dec-2007  perry branches: 1.4.6; 1.4.8; 1.4.10;
Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.3 04-Mar-2007  christos branches: 1.3.20; 1.3.26; 1.3.28; 1.3.32;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.2 27-Oct-2003  junyoung branches: 1.2.16; 1.2.54;
Nuke __P().
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.54.1 12-Mar-2007  rmind Sync with HEAD.
 1.2.16.2 21-Jan-2008  yamt sync with head
 1.2.16.1 03-Sep-2007  yamt sync with head.
 1.3.32.1 02-Jan-2008  bouyer Sync with HEAD
 1.3.28.1 26-Dec-2007  ad Sync with head.
 1.3.26.1 18-Feb-2008  mjf Sync with HEAD.
 1.3.20.1 09-Jan-2008  matt sync with HEAD
 1.4.10.1 16-May-2008  yamt sync with head.
 1.4.8.1 18-May-2008  yamt sync with head.
 1.4.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.32 30-Apr-2025  imil Introduce pvh_boot boolean to identify the real hypervisor when booting in PVH
mode.

As of now, sys/arch/x86/x86/identcpu.c / identify_hypervisor() returns in the
case of vm_guest being VM_GUEST_GENPVH, yet this VM type is not an actual
hypervisor but information recorded in locore.S to drive the boot method.
We need to investigate what type of hypervisor is really running the VM in
order to apply specifics, so instead of relying on vm_guest_is_pvh() which only
checks for VM_GUEST_XENPVH || VM_GUEST_GENPVH, pvh_boot informs on the boot
method while allowing to identify the real hypervisor.

Idea ok'd by bouyer@, tested on Xen domU, Xen dom0 with GENERIC PVH and
qemu GENERIC PVH boot.
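
The separation this entry describes, one variable for the boot method and another for the hypervisor actually running the VM, can be sketched as below. The enum values, `struct boot_state`, and the function bodies are simplified assumptions for illustration; only the names `pvh_boot`, `vm_guest_is_pvh` and `identify_hypervisor` come from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a reduced set of guest kinds. */
enum vm_guest_kind {
	GUEST_NONE,
	GUEST_XENPVH,
	GUEST_GENPVH,	/* "generic PVH": a boot method, not a hypervisor */
	GUEST_KVM
};

struct boot_state {
	enum vm_guest_kind vm_guest;
	bool pvh_boot;	/* set once when the kernel is entered via PVH */
};

/* Check that conflates boot method with hypervisor identity. */
static bool
vm_guest_is_pvh(const struct boot_state *s)
{
	assert(s != NULL);
	return s->vm_guest == GUEST_XENPVH || s->vm_guest == GUEST_GENPVH;
}

/* Early boot: record how the kernel was started. */
static void
note_pvh_entry(struct boot_state *s)
{
	assert(s != NULL);
	s->vm_guest = GUEST_GENPVH;
	s->pvh_boot = true;
}

/* Later: record the hypervisor that was really detected. The boot
 * method survives in pvh_boot even though vm_guest changes. */
static void
identify_hypervisor(struct boot_state *s, enum vm_guest_kind detected)
{
	assert(s != NULL);
	if (detected != GUEST_NONE)
		s->vm_guest = detected;
}
```

After identification, code that needs hypervisor-specific behavior can look at `vm_guest`, and code that only cares about the boot path can look at `pvh_boot`.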
 1.31 20-Aug-2022  riastradh branches: 1.31.10;
x86/bootinfo.h: Add include guard.
 1.30 21-Jun-2019  nonaka PR/54147: Increase BOOTINFO_MAXSIZE to 16KiB.

Some systems require a larger bootinfo size for memory descriptors.
 1.29 13-Apr-2018  nonaka branches: 1.29.2;
x86: Increase BOOTINFO_MAXSIZE to 8KiB.

Proposed on port-i386 and port-amd64 with no objections:
http://mail-index.netbsd.org/port-i386/2018/04/11/msg003692.html
http://mail-index.netbsd.org/port-amd64/2018/04/11/msg002697.html
 1.28 09-Nov-2017  christos branches: 1.28.2;
add "prekern" to the string list.
 1.27 07-Oct-2017  maxv Add a new option in libsa, to load dynamic binaries. A separate function
is used, and it does not break in any way the generic static loader. Then,
add a new "pkboot" command in the x86 bootloader, which boots a
GENERIC_KASLR kernel via the prekern. (See thread on tech-kern@.)
 1.26 14-Feb-2017  nonaka branches: 1.26.6;
x86: add e820 memory type.
 1.25 24-Jan-2017  nonaka Initial commit of native amd64 EFI boot loader.
 1.24 28-Jan-2016  christos branches: 1.24.2; 1.24.4;
Add support for grub to find the ACPI root table pointer via a bootinfo entry
from grub.
From: https://mail-index.netbsd.org/tech-kern/2014/05/22/msg017119.html
 1.23 30-Aug-2013  jmcneill branches: 1.23.6;
Add support for using a raw file-system image as memory disk root with
the x86 bootloader.
 1.22 16-May-2013  christos branches: 1.22.2;
Complete the dosparts -> mbrparts conversion. Only x86k now uses dosparts
because it also uses struct dos_partition.
 1.21 16-May-2013  christos Complete the dosparts -> mbrparts conversion. Only x86k now uses dosparts
because it also uses struct dos_partition.
 1.20 16-May-2013  christos Complete the dosparts -> mbrparts conversion. Only x86k now uses dosparts
because it also uses struct dos_partition.
 1.19 28-Nov-2011  tls branches: 1.19.8;

Add support for passing saved entropy (random seed file) to the kernel
from the bootloader. This can fix the problem of poor quality keys
for other kernel modules which call arc4random() early in kernel startup
(NFS startup, in particular, causes this).

We continue to rely on the etc/rc.d/random_seed script to save entropy
to the seed file at shutdown and erase the seed file at startup.

Boot loader support implemented only for i386 and amd64 ports for now but
it should be easy for other ports to do the same or similar.
 1.18 26-May-2011  uebayasi branches: 1.18.4;
Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html
 1.17 06-Feb-2011  jmcneill add BI_MODULE_IMAGE boot module type
 1.16 24-Aug-2009  jmcneill branches: 1.16.4; 1.16.6; 1.16.8;
Pass the VBE mode number from the bootloader to the kernel, and then
make the ACPI wakecode aware of it. Restore the desired VBE mode on resume
when acpi_vbios_reset=1, so suspend/resume with genfb console will work.
 1.15 16-Feb-2009  jmcneill Kernel-side modifications for framebuffer console support on i386 and amd64.

* New BTINFO_FRAMEBUFFER kernel parameter to pass screen configuration
* Early attach support for framebuffer console
* Pass BTINFO_FRAMEBUFFER parameters to genfb in device_register
* Provide hooks to genfb to set VGA DAC palette in 8bpp mode
 1.14 09-Sep-2008  tron branches: 1.14.2; 1.14.8;
Remove duplicate definition of "bootinfo" structure.
Patch provided by Juan RP in PR kern/39495.
 1.13 02-May-2008  ad branches: 1.13.2; 1.13.6;
- Give x86 BIOS boot the ability to load new style modules and pass them
into the kernel. Based on a patch by jmcneill@, with many fixes and
improvements by me.

- Put MEMORY_DISK_DYNAMIC and MODULAR into the GENERIC kernels, so that
you can load miniroot.kmod from the boot blocks and boot into the
installer!
 1.12 25-Dec-2007  perry branches: 1.12.6; 1.12.8; 1.12.10;
Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.11 03-Feb-2006  jmmv branches: 1.11.46; 1.11.52; 1.11.56; 1.11.60;
Implement support for 'The Multiboot Specification' so that i386 kernels
can be booted directly from Multiboot-compliant boot loaders (e.g. GRUB).
See the added multiboot(8) manual page for more information.

No objections in tech-kern@; only positive comments.
 1.10 30-Dec-2005  jmmv branches: 1.10.2; 1.10.4;
Add a 'struct bootinfo' to represent the bootinfo structure used in the
kernel by x86 platforms (instead of a simple char *). This way, the code
in, e.g., lookup_bootinfo, is a bit easier to understand.

While here, move the lookup_bootinfo function used in x86 platforms (amd64,
i386 and xen) to a common file (x86/x86_machdep.c), as it was exactly the
same in all of them.
 1.9 06-Jul-2005  junyoung branches: 1.9.2;
BIOSDISK_EXT13INFO_V3 -> BIOSDISK_EXTINFO_V3
u_intNN_t -> uintNN_t
u_int -> unsigned int
Remove trailing spaces
 1.8 12-Jun-2005  dyoung Make disklabel(8) and fdisk(8) into "host tools", last step: build
and install ${TOOLDIR}/bin/${MACHINE_GNU_PLATFORM}-disklabel,
${TOOLDIR}/bin/${MACHINE_GNU_PLATFORM}-fdisk by "reaching over" to
the sources in ${NETBSDSRCDIR}/sbin/{disklabel fdisk}/.

To avoid clashes with a build-host's header files, especially on
*BSD, the host-tools versions of fdisk and disklabel search for
#includes such as disklabel.h, disklabel_acorn.h, disklabel_gpt.h,
and bootinfo.h in a new #includes namespace, nbinclude/. That is,
they #include <nbinclude/sys/disklabel.h>, <nbinclude/machine/disklabel.h>,
<nbinclude/sparc64/disklabel.h>, instead of <sys/disklabel.h> and
such. I have also updated the system headers to #include from
nbinclude/-space when HAVE_NBTOOL_CONFIG_H is #defined.
 1.7 04-Feb-2005  fvdl The bootinfo_wedge structure must be packed, or the 32bit alignments
used by the bootloader don't match the amd64 kernel.
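The alignment mismatch described in revision 1.7 can be illustrated with a minimal sketch. The two structs below are hypothetical and are not the real bootinfo_wedge layout; they only show how a 64-bit member that follows a 32-bit member is padded on amd64 unless the structure is packed. The GCC/Clang attribute spelling is assumed here; the __packed macro from cdefs.h mentioned in a later revision serves the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, NOT the real bootinfo_wedge layout. */
struct demo_unpacked {
	uint32_t len;		/* 4 bytes */
	uint64_t startblk;	/* 8-byte aligned on amd64, 4-byte on i386 */
};

/* Same members, but the compiler may not insert padding. */
struct demo_packed {
	uint32_t len;
	uint64_t startblk;
} __attribute__((__packed__));

/* The packed layout is the same for a 32-bit loader and a 64-bit kernel. */
static_assert(sizeof(struct demo_packed) == 12, "packed layout is 4 + 8");

/* Offset of the 64-bit member when packed: always 4. */
size_t
demo_packed_offset(void)
{
	return offsetof(struct demo_packed, startblk);
}

/* Offset of the 64-bit member when not packed: depends on the ABI. */
size_t
demo_unpacked_offset(void)
{
	return offsetof(struct demo_unpacked, startblk);
}
```

On an ABI that aligns 64-bit integers to 8 bytes, demo_unpacked_offset() differs from demo_packed_offset(), which is the kind of disagreement between boot loader and kernel that packing avoids.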
 1.6 23-Oct-2004  thorpej branches: 1.6.4; 1.6.6;
Add support for passing booted wedge information to the kernel.
 1.5 24-Mar-2004  drochner remove license clauses 3 and 4 from my copyright notices
 1.4 27-Oct-2003  junyoung Nuke __P().
 1.3 08-Oct-2003  lukem Overhaul MBR handling (part 1):

<sys/bootblock.h>:
* Added definitions for the Master Boot Record (MBR) used by
a variety of systems (primarily i386), including the format
of the BIOS Parameter Block (BPB).
This information was cribbed from a variety of sources
including <sys/disklabel_mbr.h> which this is a superset of.

As part of this, some data structure elements and #defines
were renamed to be more "namespace friendly" and consistent
with other bootblocks and MBR documentation.
Update all uses of the old names to the new names.

<sys/disklabel_mbr.h>:
* Deprecated in favor of <sys/bootblock.h> (the latter is more
"host tool" friendly).

amd64 & i386:
* Renamed /usr/mdec/bootxx_dosfs to /usr/mdec/bootxx_msdos, to
be consistent with the naming convention of the msdosfs tools.

* Removed /usr/mdec/bootxx_ufs, as it's equivalent to bootxx_ffsv1
and it's confusing to have two functionally equivalent bootblocks,
especially given that "ufs" has multiple meanings (it could be
a synonym for "ffs", or the group of ffs/lfs/ext2fs file systems).

* Rework pbr.S (the first sector of bootxx_*):
+ Ensure that BPB (bytes 11..89) and the partition table
(bytes 446..509) do not contain code.
+ Add support for booting from FAT partitions if BOOT_FROM_FAT
is defined. (Only set for bootxx_msdos).
+ Remove "dummy" partition 3; if people want to installboot(8)
these to the start of the disk they can use fdisk(8) to
create a real MBR partition table...
+ Compile with TERSE_ERROR so it fits because of the above.
Whilst this is less user friendly, I feel it's important
to have a valid partition table and BPB in the MBR/PBR.

* Renamed /usr/mdec/biosboot to /usr/mdec/boot, to be consistent
with other platforms.

* Enable SUPPORT_DOSFS in /usr/mdec/boot (stage2), so that
we can boot off FAT partitions.

* Crank version of /usr/mdec/boot to 3.1, and fix some of the other
entries in the version file.

installboot(8) (i386):
* Read the existing MBR of the filesystem and retain the BIOS
Parameter Block (BPB) in bytes 11..89 and the MBR partition
table in bytes 446..509. (Previously installboot(8) would
trash those two sections of the MBR.)

mbrlabel(8):
* Use sys/lib/libkern/xlat_mbr_fstype.c instead of homegrown code
to map the MBR partition type to the NetBSD disklabel type.


Test built "make release" for i386, and new bootblocks verified to work
(even off FAT!).
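The installboot(8) behaviour described in revision 1.3 (keep the BPB in bytes 11..89 and the partition table in bytes 446..509) can be sketched as follows. This is an illustration under stated assumptions, not the installboot(8) source: the function name, the DEMO_ constants and the fixed 512-byte sector size are hypothetical; only the byte ranges come from the log message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SECTOR_SIZE 512
#define DEMO_BPB_START   11	/* BPB occupies bytes 11..89 */
#define DEMO_BPB_END     89
#define DEMO_PTBL_START  446	/* partition table occupies bytes 446..509 */
#define DEMO_PTBL_END    509

/*
 * Build the sector to write: start from the new boot code, then copy the
 * BPB and the partition table back from the sector currently on disk.
 */
void
demo_merge_boot_sector(const uint8_t *existing, const uint8_t *bootcode,
    uint8_t *out)
{
	assert(existing != NULL && bootcode != NULL && out != NULL);

	memcpy(out, bootcode, DEMO_SECTOR_SIZE);
	memcpy(out + DEMO_BPB_START, existing + DEMO_BPB_START,
	    DEMO_BPB_END - DEMO_BPB_START + 1);
	memcpy(out + DEMO_PTBL_START, existing + DEMO_PTBL_START,
	    DEMO_PTBL_END - DEMO_PTBL_START + 1);
}
```

Both ranges are inclusive, so the copy lengths are 79 and 64 bytes. Bytes 510..511 come from the new boot code.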
 1.2 16-Apr-2003  dsl branches: 1.2.2;
Add definitions (#defined out) to pass the result of the v3.x bios
extended disk information request to the kernel.
Binary compatible with the existing code, disabled because I don't
have a system with a bios that supports the request.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.2.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.2.5 06-Feb-2005  skrll Sync with HEAD.
 1.2.2.4 02-Nov-2004  skrll Sync with HEAD.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.6.1 12-Feb-2005  yamt sync with head.
 1.6.4.1 29-Apr-2005  kent sync with -current
 1.9.2.2 21-Jan-2008  yamt sync with head
 1.9.2.1 21-Jun-2006  yamt sync with head.
 1.10.4.1 09-Sep-2006  rpaulo sync with head
 1.10.2.1 18-Feb-2006  yamt sync with head.
 1.11.60.1 02-Jan-2008  bouyer Sync with HEAD
 1.11.56.1 26-Dec-2007  ad Sync with head.
 1.11.52.1 18-Feb-2008  mjf Sync with HEAD.
 1.11.46.1 09-Jan-2008  matt sync with HEAD
 1.12.10.3 16-Sep-2009  yamt sync with head
 1.12.10.2 04-May-2009  yamt sync with head.
 1.12.10.1 16-May-2008  yamt sync with head.
 1.12.8.1 18-May-2008  yamt sync with head.
 1.12.6.2 28-Sep-2008  mjf Sync with HEAD.
 1.12.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.13.6.1 19-Oct-2008  haad Sync with HEAD.
 1.13.2.1 24-Sep-2008  wrstuden Merge in changes between wrstuden-revivesa-base-2 and
wrstuden-revivesa-base-3.
 1.14.8.4 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.14.8.3 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.14.8.2 01-Nov-2009  jym Sync with HEAD.
 1.14.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.14.2.1 03-Mar-2009  skrll Sync with HEAD.
 1.16.8.1 08-Feb-2011  bouyer Sync with HEAD
 1.16.6.1 06-Jun-2011  jruoho Sync with HEAD.
 1.16.4.2 31-May-2011  rmind sync with head
 1.16.4.1 05-Mar-2011  rmind sync with head
 1.18.4.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.18.4.1 17-Apr-2012  yamt sync with head
 1.19.8.3 03-Dec-2017  jdolecek update from HEAD
 1.19.8.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.8.1 23-Jun-2013  tls resync from head
 1.22.2.1 18-May-2014  rmind sync with head
 1.23.6.3 28-Aug-2017  skrll Sync with HEAD
 1.23.6.2 05-Feb-2017  skrll Sync with HEAD
 1.23.6.1 19-Mar-2016  skrll Sync with HEAD
 1.24.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.24.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.26.6.2 27-Jun-2019  martin Pull up following revision(s) (requested by nonaka in ticket #1282):

sys/arch/x86/include/bootinfo.h: revision 1.30

PR/54147: Increase BOOTINFO_MAXSIZE to 16KiB.

Some systems require a larger bootinfo size for memory descriptors.
 1.26.6.1 14-Apr-2018  martin Pull up following revision(s) (requested by nonaka in ticket #753):

sys/arch/x86/include/bootinfo.h: revision 1.29

x86: Increase BOOTINFO_MAXSIZE to 8KiB.

Proposed on port-i386 and port-amd64 with no objections:
http://mail-index.netbsd.org/port-i386/2018/04/11/msg003692.html
http://mail-index.netbsd.org/port-amd64/2018/04/11/msg002697.html
 1.28.2.1 16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.29.2.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.31.10.1 02-Aug-2025  perseant Sync with HEAD
 1.1 20-Aug-2022  riastradh x86: Split bootspace out of x86/pmap.h into new x86/bootspace.h.
 1.21 17-Jul-2011  dyoung Good-bye bus.h. Don't install <machine/bus.h>.
 1.20 28-Apr-2010  dyoung On x86, change the bus_space_tag_t to a pointer to a struct
bus_space_tag. For now, bus_space_tag's only member is
bst_type, the type of space, which is either X86_BUS_SPACE_IO
or X86_BUS_SPACE_MEM. In the future, new bus_space_tag members
will refer to override-functions installed by a new function,
bus_space_tag_create(9).

Add pointers to constant struct bus_space_tag, x86_bus_space_io and
x86_bus_space_mem. Use them to replace most uses of X86_BUS_SPACE_IO
and X86_BUS_SPACE_MEM.

Add an x86-specific bus_space_is_equal(9) implementation that compares
the two tags' bst_type.
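The tag change in revision 1.20 can be pictured with a short sketch. All demo_ names and DEMO_ constants are hypothetical stand-ins for the identifiers named in the log message; only the member name bst_type and the idea of comparing it are taken from the entry. It is not the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for X86_BUS_SPACE_IO and X86_BUS_SPACE_MEM. */
enum { DEMO_BUS_SPACE_IO = 0, DEMO_BUS_SPACE_MEM = 1 };

struct demo_bus_space_tag {
	int bst_type;	/* which kind of space this tag describes */
};

/* The tag type is now a pointer to a structure rather than an integer. */
typedef const struct demo_bus_space_tag *demo_bus_space_tag_t;

static const struct demo_bus_space_tag demo_io_tag = { DEMO_BUS_SPACE_IO };
static const struct demo_bus_space_tag demo_mem_tag = { DEMO_BUS_SPACE_MEM };

/* Constant pointers that callers use instead of the bare type constants. */
const demo_bus_space_tag_t demo_bus_space_io = &demo_io_tag;
const demo_bus_space_tag_t demo_bus_space_mem = &demo_mem_tag;

/* Two tags are equal when they describe the same type of space. */
bool
demo_bus_space_is_equal(demo_bus_space_tag_t a, demo_bus_space_tag_t b)
{
	assert(a != NULL && b != NULL);
	return a->bst_type == b->bst_type;
}
```

Comparing the member rather than the pointer means two distinct tag objects of the same type still compare equal.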
 1.19 13-Feb-2009  bouyer branches: 1.19.2; 1.19.4;
Change bus_size_t from paddr_t to size_t. It doesn't make sense to have
a 64bit bus_size_t on i386 as the address space is 32bits anyway.
With a 64bit bus_size_t we need a different bus_space.S for PAE and non-PAE.
 1.18 08-Feb-2009  bouyer branches: 1.18.2;
Apply patch proposed on port-amd64/port-i386, allowing the use of a 64bit
bus_addr_t on i386PAE kernels:
change bus_addr_t to be a paddr_t (so its size follows paddr_t depending
on options PAE)
replace bus_addr_t with vaddr_t where the value is used as a virtual address.

Difference with the proposed patch: cast to uintmax_t and use %jx in
printf() as suggested by Joerg.
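The printf() convention mentioned in revision 1.18 (cast to uintmax_t and print with %jx) is a way to print a type whose width varies between builds. A minimal sketch, assuming a hypothetical demo_bus_addr_t that is 64 bits wide here; the helper name is also made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: 64 bits wide, as a PAE physical address would be. */
typedef uint64_t demo_bus_addr_t;

/*
 * Format an address without knowing its width: widen it to uintmax_t and
 * use the 'j' length modifier. Returns snprintf()'s result.
 */
int
demo_format_addr(char *buf, size_t len, demo_bus_addr_t addr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%jx", (uintmax_t)addr);
}
```

The same call stays correct if demo_bus_addr_t is later changed to a 32-bit type, because the cast always produces the type that %jx expects.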
 1.17 06-Nov-2008  dyoung Use NULL instead of (bus_dma_tag_t)0.
 1.16 28-Apr-2008  martin branches: 1.16.6; 1.16.8; 1.16.10; 1.16.14;
Remove clause 3 and 4 from TNF licenses
 1.15 17-Oct-2007  garbled branches: 1.15.16; 1.15.18; 1.15.20;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.14 26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.13 04-Mar-2007  christos branches: 1.13.2; 1.13.10; 1.13.18; 1.13.20; 1.13.22;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.12 21-Feb-2007  mrg add a pair of new bus_dma(9) functions:
int _bus_dmatag_subregion(bus_dma_tag_t tag,
bus_addr_t min_addr,
bus_addr_t max_addr,
bus_dma_tag_t *newtag,
int flags)
void _bus_dmatag_destroy(bus_dma_tag_t tag)

that allow a (normally broken/limited) device to restrict the bus address
range it can talk to. this is used by bce(4) to limit DMA addresses to
1GB range, the maximum the chip can address.

all this is from Yorick Hardy <yhardy@uj.ac.za> with input from several
people on tech-kern.

XXX: bus_dma(9) needs an update still.
 1.11 16-Feb-2006  perry branches: 1.11.20;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.10 24-Dec-2005  perry branches: 1.10.2; 1.10.4; 1.10.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.9 16-Apr-2005  yamt branches: 1.9.2;
tweak x86 bus_dma code so that it can be used by xen port.

- distinguish paddr_t and bus_addr_t.
for xen, use bus_addr_t in the sense of machine address.
- move _X86_BUS_DMA_PRIVATE part of bus.h into bus_private.h.
- remove special handling of xen_shm. we can always grab
machine address from pte.
 1.8 09-Mar-2005  matt branches: 1.8.2;
Add a dm_maxsegsz public member to bus_dmamap_t. This allows a user of the API
to select the maximum segment size for each bus_dmamap_load (up to the maxsegsz
supplied to bus_dmamap_create). dm_maxsegsz is reset to the value supplied to
bus_dmamap_create when the dmamap is unloaded.
 1.7 20-Jun-2004  thorpej branches: 1.7.4; 1.7.6;
Remove the "ID" component of the x86 bus_dma flags, since these are no
longer "ISA DMA" specific flags.
 1.6 05-Jun-2004  yamt unexport following x86 bus_dma internal functions.
_bus_dma_alloc_bouncebuf
_bus_dma_free_bouncebuf
_bus_dmamap_load_buffer
 1.5 14-Jan-2004  yamt issue memory read barrier for BUS_DMASYNC_POSTREAD operation.
PR/21665 from Stephan Uphoff.
 1.4 27-Oct-2003  junyoung Nuke __P().
 1.3 15-Jun-2003  fvdl branches: 1.3.2;
Handle 64bit DMA addresses on PCI for platforms that can (currently only
enabled on amd64). Add a dmat64 field to various PCI attach structures,
and pass it down where needed. Implement a simple new function called
pci_dma64_available(pa) to test if 64bit DMA addresses may be used.
This returns 1 iff _PCI_HAVE_DMA64 is defined in <machine/pci_machdep.h>,
and there is more than 4G of memory.
 1.2 07-May-2003  fvdl Generalize bounce buffers, and use them for 32 bit PCI if needed.
Make ALLOCNOW the default iff bouncing might be needed (this has
no effect on i386 because ISA DMA devices already had to use
ALLOCNOW, and PCI isn't bounced (yet), since we don't do > 4G
at this point for i386).
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.3.2.4 01-Apr-2005  skrll Sync with HEAD.
 1.3.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.3.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.3.2.1 03-Aug-2004  skrll Sync with HEAD
 1.7.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.7.4.1 29-Apr-2005  kent sync with -current
 1.8.2.1 21-Apr-2005  tron Pull up revision 1.9 (requested by yamt in ticket #175):
tweak x86 bus_dma code so that it can be used by xen port.
- distinguish paddr_t and bus_addr_t.
for xen, use bus_addr_t in the sense of machine address.
- move _X86_BUS_DMA_PRIVATE part of bus.h into bus_private.h.
- remove special handling of xen_shm. we can always grab
machine address from pte.
 1.9.2.4 27-Oct-2007  yamt sync with head.
 1.9.2.3 03-Sep-2007  yamt sync with head.
 1.9.2.2 26-Feb-2007  yamt sync with head.
 1.9.2.1 21-Jun-2006  yamt sync with head.
 1.10.6.1 22-Apr-2006  simonb Sync with head.
 1.10.4.1 09-Sep-2006  rpaulo sync with head
 1.10.2.1 18-Feb-2006  yamt sync with head.
 1.11.20.2 12-Mar-2007  rmind Sync with HEAD.
 1.11.20.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.13.22.1 06-Oct-2007  yamt sync with head.
 1.13.20.1 06-Nov-2007  matt sync with HEAD
 1.13.18.1 02-Oct-2007  joerg Sync with HEAD.
 1.13.10.1 03-Oct-2007  garbled Sync with HEAD
 1.13.2.1 09-Oct-2007  ad Sync with head.
 1.15.20.3 11-Aug-2010  yamt sync with head.
 1.15.20.2 04-May-2009  yamt sync with head.
 1.15.20.1 16-May-2008  yamt sync with head.
 1.15.18.1 18-May-2008  yamt sync with head.
 1.15.16.2 17-Jan-2009  mjf Sync with HEAD.
 1.15.16.1 02-Jun-2008  mjf Sync with HEAD.
 1.16.14.1 21-Apr-2010  matt sync to netbsd-5
 1.16.10.2 29-Sep-2009  snj Pull up following revision(s) (requested by bouyer in ticket #1040):
sys/arch/x86/include/bus.h: revision 1.19
Change bus_size_t from paddr_t to size_t. It doesn't make sense to have
a 64bit bus_size_t on i386 as the address space is 32bits anyway.
With a 64bit bus_size_t we need a different bus_space.S for PAE and non-PAE.
 1.16.10.1 29-Sep-2009  snj Pull up following revision(s) (requested by bouyer in ticket #1040):
sys/arch/x86/include/bus.h: revision 1.18
sys/arch/x86/include/isa_machdep.h: revision 1.7
sys/arch/x86/x86/bus_space.c: revision 1.21
Apply patch proposed on port-amd64/port-i386, allowing the use of a 64bit
bus_addr_t on i386PAE kernels:
change bus_addr_t to be a paddr_t (so its size follows paddr_t depending
on options PAE)
replace bus_addr_t with vaddr_t where the value is used as a virtual address.
Difference with the proposed patch: cast to uintmax_t and use %jx in
printf() as suggested by Joerg.
 1.16.8.2 03-Mar-2009  skrll Sync with HEAD.
 1.16.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.16.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.18.2.4 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.18.2.3 24-Oct-2010  jym Sync with HEAD
 1.18.2.2 01-Nov-2009  jym Sync with HEAD.
 1.18.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.19.4.1 30-May-2010  rmind sync with head
 1.19.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.6 21-Jan-2021  kre PRIxXXXX (etc) definitions should not include the %

Will fix anything this ends up breaking later.
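The convention behind revision 1.6 matches the standard <inttypes.h> macros, which expand to the conversion specifier without the leading '%'. A minimal sketch, assuming a hypothetical DEMO_PRIxADDR macro and demo_addr_t type (the real macros named elsewhere in this log are PRIxBUSADDR and friends, whose definitions are not reproduced here):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t demo_addr_t;		/* hypothetical address type */
#define DEMO_PRIxADDR	PRIx64		/* no leading '%', like <inttypes.h> */

/* Print an address as 0x followed by 16 zero-padded hex digits. */
int
demo_print_addr(char *buf, size_t len, demo_addr_t addr)
{
	assert(buf != NULL);
	/* The caller writes '%' and may add flags or a width before the macro. */
	return snprintf(buf, len, "%#018" DEMO_PRIxADDR, addr);
}
```

If the macro carried its own '%', the flags and field width in the format string above could not be placed between the '%' and the conversion.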
 1.5 14-Nov-2019  maxv branches: 1.5.8;
Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
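The "shad" idea from revision 1.5 can be pictured with a toy model. This is a sketch under heavy assumptions and not the kMSan implementation: it uses a fixed 64-byte buffer, hypothetical demo_ names, and plain function calls in place of the compiler-inserted __msan_* hooks. It only shows the 1:1 shadow in which a set bit means "uninitialized".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MEM_SIZE 64

static uint8_t demo_mem[DEMO_MEM_SIZE];		/* the "real" memory */
static uint8_t demo_shad[DEMO_MEM_SIZE];	/* 1:1 shadow, bit set = uninit */

/* Mark a range as uninitialized, as an allocator hook would. */
void
demo_mark_uninit(size_t off, size_t len)
{
	assert(off <= DEMO_MEM_SIZE && len <= DEMO_MEM_SIZE - off);
	memset(demo_shad + off, 0xff, len);
}

/* A store initializes the bytes it writes, so clear their shadow bits. */
void
demo_store(size_t off, const void *src, size_t len)
{
	assert(off <= DEMO_MEM_SIZE && len <= DEMO_MEM_SIZE - off);
	memcpy(demo_mem + off, src, len);
	memset(demo_shad + off, 0x00, len);
}

/* Check a buffer before it "leaves the system": any set bit fails it. */
bool
demo_is_initialized(size_t off, size_t len)
{
	assert(off <= DEMO_MEM_SIZE && len <= DEMO_MEM_SIZE - off);
	for (size_t i = 0; i < len; i++)
		if (demo_shad[off + i] != 0)
			return false;
	return true;
}
```

In this model, a 16-byte allocation that has only had its first 4 bytes stored passes the check for those 4 bytes and fails it for the full 16, which is the situation the copyout and DMA checks in the log message are meant to catch.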
 1.4 04-Oct-2019  maxv Add DMA instrumentation in KASAN. We note the original buffer and length in
the map, and check the buffer on each bus_dmamap_sync. This allows us to
find DMA buffer overflows and UAFs, which couldn't be found before because
the device accesses to memory are outside of KASAN's control.
 1.3 23-Sep-2019  skrll Provide PRIxBUSADDR, PRIxBUSSIZE, PRIuBUSSIZE, and PRIxBSH for all arches
to follow arm and (generic) mips.

Reviewed by christos.
 1.2 25-Aug-2011  dyoung branches: 1.2.2; 1.2.56;
Add to x86 bus_space_tag_t a member, bst_exists, that tells whether a
routine is overridden by this tag or by any ancestral tag.
 1.1 01-Jul-2011  dyoung Per discussion at
<http://mail-index.netbsd.org/tech-kern/2010/04/02/msg007941.html>,
divide each machine's bus.h into bus_defs.h (constants & data types)
and bus_funcs.h (macro implementations of bus_space(9) routines and MD
prototypes).

Note that some bus_space(9) routines' implementation will move to .c
files from inline subroutines or macros in .h files.

I've only made the split for machine architectures where there is PCI.
All of the non-PCI-having architectures will require a similar split.

These #include files are not referenced by any (committed) Makefiles or
header files, yet. Changes to Makefiles, to <sys/bus.h>, and to some
more machine-dependent files will dribble in before I throw the switch.
 1.2.56.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.2.2.2 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.2.2.1 25-Aug-2011  jym file bus_defs.h was added on branch jym-xensuspend on 2011-08-27 15:59:49 +0000
 1.5.8.1 03-Apr-2021  thorpej Sync with HEAD.
 1.1 01-Jul-2011  dyoung branches: 1.1.2;
Per discussion at
<http://mail-index.netbsd.org/tech-kern/2010/04/02/msg007941.html>,
divide each machine's bus.h into bus_defs.h (constants & data types)
and bus_funcs.h (macro implementations of bus_space(9) routines and MD
prototypes).

Note that some bus_space(9) routines' implementation will move to .c
files from inline subroutines or macros in .h files.

I've only made the split for machine architectures where there is PCI.
All of the non-PCI-having architectures will require a similar split.

These #include files are not referenced by any (committed) Makefiles or
header files, yet. Changes to Makefiles, to <sys/bus.h>, and to some
more machine-dependent files will dribble in before I throw the switch.
 1.1.2.2 27-Aug-2011  jym Add/remove files, like in HEAD.
 1.1.2.1 01-Jul-2011  jym file bus_funcs.h was added on branch jym-xensuspend on 2011-08-27 15:59:49 +0000
 1.16 22-Jan-2022  skrll Ensure bus_dmatag_subregion is called with an inclusive max_addr
everywhere.
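What "inclusive max_addr" means in revision 1.16 can be shown with a small sketch. The helpers and the demo_bus_addr_t type are hypothetical and are not part of bus_dma(9); they only show that the upper bound is the last usable address, not one past it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_bus_addr_t;	/* hypothetical stand-in */

/* Highest address of a range that starts at min_addr and spans size bytes. */
demo_bus_addr_t
demo_inclusive_max(demo_bus_addr_t min_addr, demo_bus_addr_t size)
{
	assert(size > 0);
	return min_addr + (size - 1);
}

/* Inclusive on both ends: max_addr itself is a usable address. */
bool
demo_addr_allowed(demo_bus_addr_t addr, demo_bus_addr_t min_addr,
    demo_bus_addr_t max_addr)
{
	return addr >= min_addr && addr <= max_addr;
}
```

For a 1GB window starting at 0, the inclusive bound is 0x3fffffff rather than 0x40000000. An inclusive bound also lets a caller express the very top of the address space without overflowing the type.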
 1.15 22-Feb-2020  chs remove some unnecessary includes of internal UVM headers.
 1.14 01-Sep-2011  christos branches: 1.14.54; 1.14.60;
Add bus_dma overrides. From dyoung
 1.13 31-Aug-2011  dyoung Add override members to x86_bus_dma_tag.
 1.12 12-Nov-2010  uebayasi Pull in uvm/uvm.h for VM_PAGE_TO_PHYS().
 1.11 28-Apr-2008  martin branches: 1.11.14; 1.11.22;
Remove clause 3 and 4 from TNF licenses
 1.10 17-Oct-2007  garbled branches: 1.10.16; 1.10.18; 1.10.20;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.9 26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.8 04-Mar-2007  christos branches: 1.8.2; 1.8.10; 1.8.18; 1.8.20; 1.8.22;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.7 21-Feb-2007  mrg add a pair of new bus_dma(9) functions:
int _bus_dmatag_subregion(bus_dma_tag_t tag,
bus_addr_t min_addr,
bus_addr_t max_addr,
bus_dma_tag_t *newtag,
int flags)
void _bus_dmatag_destroy(bus_dma_tag_t tag)

that allow a (normally broken/limited) device to restrict the bus address
range it can talk to. this is used by bce(4) to limit DMA addresses to
1GB range, the maximum the chip can address.

all this is from Yorick Hardy <yhardy@uj.ac.za> with input from several
people on tech-kern.

XXX: bus_dma(9) needs an update still.
 1.6 28-Aug-2006  bouyer branches: 1.6.8;
Some bus_dma(9) fixes for Xen:
- Attempt to gracefully recover from a failed decrease_reservation or
increase_reservation, by avoiding physical memory loss.
- always store a machine address in ds_addr; this avoids some mistakes
where machine address would in some case be freed at physical address, or
mapped as physical address.
 1.5 16-Feb-2006  perry branches: 1.5.2; 1.5.12;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.4 24-Dec-2005  perry branches: 1.4.2; 1.4.4; 1.4.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.3 22-Aug-2005  bouyer branches: 1.3.6;
Rename _PRIVATE_BUS_DMAMEM_ALLOC_RANGE to _BUS_DMAMEM_ALLOC_RANGE for
consistency with other macros defined in bus_private.h. Pointed out by
YAMAMOTO Takashi.
 1.2 20-Aug-2005  bouyer More adjustments to deal with Xen's physical <=> machine addresses mappings:
- Allow _bus_dmamem_alloc_range to be provided from external source:
Use a _PRIVATE_BUS_DMAMEM_ALLOC_RANGE macro, defined to
_bus_dmamem_alloc_range by default.
- avail_end is the end of the physical address range. Define a macro
_BUS_AVAIL_END (defined by default to avail_end) and use it instead.
 1.1 16-Apr-2005  yamt branches: 1.1.2; 1.1.4; 1.1.6;
add files which i forgot to add with arch/x86/x86/bus_dma.c rev.1.21.
 1.1.6.5 27-Oct-2007  yamt sync with head.
 1.1.6.4 03-Sep-2007  yamt sync with head.
 1.1.6.3 26-Feb-2007  yamt sync with head.
 1.1.6.2 30-Dec-2006  yamt sync with head.
 1.1.6.1 21-Jun-2006  yamt sync with head.
 1.1.4.2 29-Apr-2005  kent sync with -current
 1.1.4.1 16-Apr-2005  kent file bus_private.h was added on branch kent-audio2 on 2005-04-29 11:28:29 +0000
 1.1.2.5 16-Sep-2006  ghen Pull up following revision(s) (requested by bouyer in ticket #1510):
sys/arch/xen/x86/xen_bus_dma.c: revision 1.7
sys/arch/xen/x86/xen_bus_dma.c: revision 1.8
sys/arch/x86/include/bus_private.h: revision 1.6
sys/arch/x86/x86/bus_dma.c: revision 1.30
sys/arch/xen/include/bus_private.h: revision 1.7
Some bus_dma(9) fixes for Xen:
- Attempt to gracefully recover from a failed decrease_reservation or
increase_reservation, by avoiding physical memory loss.
- always store a machine address in ds_addr; this avoids some mistakes
where machine address would in some case be freed at physical address, or
mapped as physical address.
Wrap some printfs in #ifdef DEBUG, as we should not leak memory any more when
bus_dma memory allocation fails.
 1.1.2.4 25-Aug-2005  tron Pull up following revision(s) (requested by bouyer in ticket #697):
sys/arch/x86/x86/bus_dma.c: revision 1.23
sys/arch/x86/include/bus_private.h: revision 1.3
sys/arch/xen/include/bus_private.h: revision 1.3
Rename _PRIVATE_BUS_DMAMEM_ALLOC_RANGE to _BUS_DMAMEM_ALLOC_RANGE for
consistency with other macros defined in bus_private.h. Pointed out by
YAMAMOTO Takashi.
 1.1.2.3 25-Aug-2005  tron Pull up following revision(s) (requested by bouyer in ticket #695):
sys/arch/x86/x86/bus_dma.c: revision 1.22
sys/arch/x86/include/bus_private.h: revision 1.2
More adjustments to deal with Xen's physical <=> machine addresses mappings:
- Allow _bus_dmamem_alloc_range to be provided from external source:
Use a _PRIVATE_BUS_DMAMEM_ALLOC_RANGE macro, defined to
_bus_dmamem_alloc_range by default.
- avail_end is the end of the physical address range. Define a macro
_BUS_AVAIL_END (defined by default to avail_end) and use it instead.
 1.1.2.2 21-Apr-2005  tron Pull up revision 1.1 (requested by yamt in ticket #175):
add files which i forgot to add with arch/x86/x86/bus_dma.c rev.1.21.
 1.1.2.1 16-Apr-2005  tron file bus_private.h was added on branch netbsd-3 on 2005-04-21 18:43:01 +0000
 1.3.6.2 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.3.6.1 22-Aug-2005  skrll file bus_private.h was added on branch ktrace-lwp on 2005-11-10 14:00:20 +0000
 1.4.6.1 22-Apr-2006  simonb Sync with head.
 1.4.4.1 09-Sep-2006  rpaulo sync with head
 1.4.2.1 18-Feb-2006  yamt sync with head.
 1.5.12.1 14-Sep-2006  riz Pull up following revision(s) (requested by bouyer in ticket #150):
sys/arch/xen/x86/xen_bus_dma.c: revision 1.7
sys/arch/xen/x86/xen_bus_dma.c: revision 1.8
sys/arch/x86/include/bus_private.h: revision 1.6
sys/arch/x86/x86/bus_dma.c: revision 1.30
sys/arch/xen/include/bus_private.h: revision 1.7
Some bus_dma(9) fixes for Xen:
- Attempt to gracefully recover from a failed decrease_reservation or
increase_reservation, by avoiding physical memory loss.
- always store a machine address in ds_addr; this avoids some mistakes
where machine address would in some case be freed at physical address, or
mapped as physical address.
Wrap some printfs in #ifdef DEBUG, as we should not leak memory any more when
bus_dma memory allocation fails.
 1.5.2.1 03-Sep-2006  yamt sync with head.
 1.6.8.2 12-Mar-2007  rmind Sync with HEAD.
 1.6.8.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.8.22.1 06-Oct-2007  yamt sync with head.
 1.8.20.1 06-Nov-2007  matt sync with HEAD
 1.8.18.1 02-Oct-2007  joerg Sync with HEAD.
 1.8.10.1 03-Oct-2007  garbled Sync with HEAD
 1.8.2.1 09-Oct-2007  ad Sync with head.
 1.10.20.1 16-May-2008  yamt sync with head.
 1.10.18.1 18-May-2008  yamt sync with head.
 1.10.16.1 02-Jun-2008  mjf Sync with HEAD.
 1.11.22.1 05-Mar-2011  rmind sync with head
 1.11.14.1 10-Jan-2011  jym Sync with HEAD
 1.14.60.1 29-Feb-2020  ad Sync with head.
 1.14.54.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.1 26-Sep-2007  ad branches: 1.1.2; 1.1.4; 1.1.6; 1.1.10; 1.1.14; 1.1.28; 1.1.30; 1.1.32;
x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.1.32.1 16-May-2008  yamt sync with head.
 1.1.30.1 18-May-2008  yamt sync with head.
 1.1.28.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.14.2 06-Nov-2007  matt sync with HEAD
 1.1.14.1 26-Sep-2007  matt file busdefs.h was added on branch matt-armv6 on 2007-11-06 23:23:33 +0000
 1.1.10.2 27-Oct-2007  yamt sync with head.
 1.1.10.1 26-Sep-2007  yamt file busdefs.h was added on branch yamt-lazymbuf on 2007-10-27 11:28:54 +0000
 1.1.6.2 09-Oct-2007  ad Sync with head.
 1.1.6.1 26-Sep-2007  ad file busdefs.h was added on branch vmlocking on 2007-10-09 13:38:41 +0000
 1.1.4.2 06-Oct-2007  yamt sync with head.
 1.1.4.1 26-Sep-2007  yamt file busdefs.h was added on branch yamt-x86pmap on 2007-10-06 15:33:31 +0000
 1.1.2.2 02-Oct-2007  joerg Sync with HEAD.
 1.1.2.1 26-Sep-2007  joerg file busdefs.h was added on branch jmcneill-pm on 2007-10-02 18:27:49 +0000
 1.31 09-Dec-2021  msaitoh Print TLB message consistently to improve readability.

Example:
cpu0: L2 cache: 256KB 64B/line 4-way
cpu0: L3 cache: 4MB 64B/line 16-way
cpu0: 64B prefetching
-cpu0: ITLB: 64 4KB entries 8-way, 2M/4M: 8 entries
+cpu0: ITLB: 64 4KB entries 8-way, 8 2M/4M entries
cpu0: DTLB: 64 4KB entries 4-way, 4 1GB entries 4-way
cpu0: L2 STLB: 1536 4KB entries 6-way
cpu0: Initial APIC ID 0
 1.30 07-Oct-2021  msaitoh Move some common functions into x86/identcpu_subr.c. No functional change.
 1.29 27-Sep-2021  msaitoh Add Load Only TLB and Store Only TLB.
 1.28 26-Jul-2019  msaitoh branches: 1.28.2;
- AMD CPUID Fn8000_0001d Cache Topology Information leaf is almost the same as
Intel Deterministic Cache Parameter Leaf(0x04), so make new
cpu_dcp_cacheinfo() and share it.
- AMD's L2 and L3's cache descriptor's definition is the same, so use one
common definition.
- KNF.

XXX Split some common functions to new identcpu_subr.c or use #ifdef _KERNEL
... #endif in identcpu.c to share from both kernel and cpuctl?
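
As an illustration of why one decoder can serve both vendors, the following is a minimal sketch of decoding one subleaf of the deterministic cache parameters leaf (Intel 0x04, AMD Fn8000_001D). It is not the code of cpu_dcp_cacheinfo(); the struct and function names are hypothetical, and the bit layout is the commonly documented one, which should be checked against the vendor manuals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one subleaf; not the kernel's structure. */
struct dcp_cache {
	unsigned type;		/* 0 = end of list, 1 = data, 2 = insn, 3 = unified */
	unsigned level;		/* 1 = L1, 2 = L2, ... */
	unsigned line_size;	/* bytes per line */
	unsigned partitions;	/* physical line partitions */
	unsigned ways;		/* associativity */
	uint64_t sets;
	uint64_t total_size;	/* bytes */
};

/*
 * Decode the EAX/EBX/ECX values returned for one subleaf.
 * Returns 0 when the subleaf reports "no more caches", 1 otherwise.
 */
static int
dcp_decode(uint32_t eax, uint32_t ebx, uint32_t ecx, struct dcp_cache *c)
{
	assert(c != NULL);

	c->type = eax & 0x1f;			/* EAX[4:0] */
	if (c->type == 0)
		return 0;
	c->level = (eax >> 5) & 0x7;		/* EAX[7:5] */

	/* Each EBX/ECX field is stored as "value minus one". */
	c->line_size = (ebx & 0xfff) + 1;		/* EBX[11:0] */
	c->partitions = ((ebx >> 12) & 0x3ff) + 1;	/* EBX[21:12] */
	c->ways = ((ebx >> 22) & 0x3ff) + 1;		/* EBX[31:22] */
	c->sets = (uint64_t)ecx + 1;

	c->total_size = (uint64_t)c->ways * c->partitions *
	    c->line_size * c->sets;
	return 1;
}
```

For example, a 32KB 8-way L1 data cache with 64-byte lines would be reported as EAX=0x21, EBX=0x01c0003f, ECX=63 under this layout.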
 1.27 24-Jul-2019  msaitoh It seems that AMD zen2's CPUID 0x80000006 leaf's spec has changed.
The EDX register's associativity field has the value 9. In the latest available
document, it's a reserved value. I have no access to zen2's document, but many
websites say that the associativity is 16. Add it.
 1.26 12-Mar-2018  msaitoh branches: 1.26.2;
AMD's L3 cache associativity bitfield is not 8 bits wide but 4 bits, like the
other associativity bitfields.
 1.25 12-Mar-2018  msaitoh Add 3way and 6way of L2 cache or TLB on AMD CPU.
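
The associativity fixes in revisions 1.25 through 1.27 can be pictured with a small sketch of decoding the EDX register of AMD CPUID leaf 0x80000006. The macro and function names are hypothetical, not those in cacheinfo.h. The encoding table is an assumption based on AMD's published CPUID documentation and should be verified there; the entry for code 9 follows the 1.27 log message (16-way) rather than any manual.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction for CPUID 0x80000006 %edx (L3 information). */
#define AMD_L3_LINESIZE(edx)	((edx) & 0xff)			/* bits 7:0 */
#define AMD_L3_ASSOC(edx)	(((edx) >> 12) & 0xf)		/* bits 15:12 */
#define AMD_L3_SIZE_512K(edx)	(((edx) >> 18) & 0x3fff)	/* bits 31:18 */

#define ASSOC_FULL	0xffu	/* hypothetical marker: fully associative */

/*
 * Translate a 4-bit associativity code into a number of ways.
 * Returns 0 for "disabled" or for a code this sketch does not know.
 */
static unsigned
amd_assoc_ways(unsigned code)
{
	static const unsigned char ways[16] = {
		[0x1] = 1,  [0x2] = 2,  [0x3] = 3,  [0x4] = 4,
		[0x5] = 6,  [0x6] = 8,  [0x8] = 16,
		[0x9] = 16,	/* assumption taken from the 1.27 log entry */
		[0xa] = 32, [0xb] = 48, [0xc] = 64, [0xd] = 96,
		[0xe] = 128, [0xf] = ASSOC_FULL,
	};

	assert(code < 16);
	return ways[code];
}
```

Masking with 0xf rather than 0xff matters because bits 16 and above belong to other fields; with an 8-bit mask, the low bits of the size field at bit 18 and above would leak into the associativity code.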
 1.24 09-Mar-2018  msaitoh Add yet another Shared L2 TLB (2M/4M pages).

XXX need redesign.
 1.23 05-Mar-2018  msaitoh branches: 1.23.2;
Add Intel Deterministic Address Translation Parameter Leaf(0x18) definitions.
 1.22 27-Apr-2016  msaitoh branches: 1.22.10;
Add new desc 0x64 and 0xc4.
 1.21 08-Jan-2016  msaitoh Index 0x6c is not 126 entries but 128 entries. The old value was from
previous SDM.
 1.20 19-Oct-2015  msaitoh Add some TLB entries from the latest Intel SDM. This change might be incorrect
because the document itself is very strange.
 1.19 09-Sep-2014  msaitoh branches: 1.19.2;
Add new cache descriptor (0xc3) from the latest Intel SDM.
 1.18 03-Jul-2014  msaitoh branches: 1.18.2;
Fix some entries:
- Desc 0x55 and 0xb1 are Instruction TLB but not fixed to 4K.
- Desc 0x5a and 0xc0 are Data TLB but not fixed to 4K.
- Desc 0x57 and 0x59 are 4K fixed DTLB.
- Fix string of desc 0xc2 and it's not fixed to 4K.
- Desc 0xca is 4K fixed L2 shared TLB.
- Add desc 0xa0.

BUG: A lot of CPUs have multiple CAI_DTLB and/or CAI_DTLB2 entries. Currently
TLB info is indexed in ci_cinfo[CAI_COUNT], so some info is overwritten.

Nowadays CPUs have very complex TLBs. It's hard to manage them with a CAI_* index.
We should think to separate TLB info structure from ci_cinfo[CAI_COUNT]
in struct cpu_info.
 1.17 28-Oct-2013  msaitoh branches: 1.17.2;
Support prefetch size.
 1.16 14-Sep-2013  msaitoh Add Shared L2 TLB and some cache and tlb entries from the latest document.
 1.15 17-Jul-2013  msaitoh Add some new TLB and cache entries from document (Table 3-22 Encoding of CPUID
Leaf 2 Descriptors, Intel 64 and IA-32 Architectures Software Developer's
Manual Vol. 2A.)
 1.14 17-Jul-2013  msaitoh Fix 0x0d's DCACHE entry and 0xeb's L3CACHE entry from the document
(Table 3-22 Encoding of CPUID Leaf 2 Descriptors, Intel 64 and IA-32
Architectures Software Developer's Manual Vol. 2A.)
 1.13 04-Dec-2011  chs branches: 1.13.2; 1.13.6; 1.13.10;
add info on L2 TLBs and 1GB pages.
 1.12 13-May-2009  pgoyette branches: 1.12.12; 1.12.16;
Fix toyp in previous. Pointed out by snj@
 1.11 13-May-2009  pgoyette 1. Extend CPU probe of Intel processors to handle extended-models. This
allows us to properly identify new Intel 45nm processors, Core i7,
Atom, and the 45nm Xeon MP.

2. Properly decode several new Intel cache descriptors, as listed in the
most recent (March 2009) edition of Intel's Application Note 485.

3. Convert decode of the various features masks to use the newly added
snprintb_m(3) routine.

Addresses my PR bin/41289
Addresses my PR bin/41290
 1.10 15-Apr-2009  lukem Constify a userland-only member.
 1.9 30-May-2008  christos branches: 1.9.6; 1.9.8; 1.9.12; 1.9.16;
don't undef __CI_TBL before we use it :-)
 1.8 30-May-2008  christos - fix an amd cache entry.
- merge tables
- support phenom
from Paul Goyette
 1.7 30-May-2008  christos PR/38722: Paul Goyette: Share cacheinfo information
 1.6 11-May-2008  cegger print L3 and TLB cache information for AMD Barcelona/Phenom
 1.5 11-May-2008  ad Simplify x86 identcpu code, and share between i386/amd64.
 1.4 16-Apr-2005  yamt branches: 1.4.82; 1.4.84; 1.4.86; 1.4.88;
make multi inclusion protection macros consistent.
 1.3 17-Aug-2004  briggs branches: 1.3.4; 1.3.10;
Get correct cache information for earlier VIA C3 models.
Mostly from PR kern/26689 submitted by Michael van Elst.
 1.2 08-Aug-2004  briggs VIA C3 cache info.
 1.1 25-Apr-2003  fvdl branches: 1.1.2; 1.1.4;
Share some common cache info cpuid code between i386 and x86_64.
 1.1.4.2 22-Aug-2004  tron Pull up revision 1.3 (requested by briggs in ticket #770):
Get correct cache information for earlier VIA C3 models.
Mostly from PR kern/26689 submitted by Michael van Elst.
 1.1.4.1 12-Aug-2004  jmc Pullup rev 1.2 (requested by briggs in ticket #742)

Enable VIA C3 CPU support
 1.1.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.2 25-Aug-2004  skrll Sync with HEAD.
 1.1.2.1 12-Aug-2004  skrll Sync with HEAD.
 1.3.10.1 21-Apr-2005  tron Pull up revision 1.4 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.3.4.1 29-Apr-2005  kent sync with -current
 1.4.88.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.4.86.3 16-May-2009  yamt sync with head
 1.4.86.2 04-May-2009  yamt sync with head.
 1.4.86.1 16-May-2008  yamt sync with head.
 1.4.84.2 04-Jun-2008  yamt sync with head
 1.4.84.1 18-May-2008  yamt sync with head.
 1.4.82.1 02-Jun-2008  mjf Sync with HEAD.
 1.9.16.1 21-Apr-2010  matt sync to netbsd-5
 1.9.12.3 01-Nov-2009  jym Sync with HEAD.
 1.9.12.2 31-May-2009  jym Sync with HEAD.
 1.9.12.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.9.8.1 18-May-2009  bouyer Pull up following revision(s) (requested by pgoyette in ticket #761):
sys/arch/x86/include/cacheinfo.h: revisions 1.11, 1.12
usr.sbin/cpuctl/arch/i386.c: revisions 1.18, 1.19 via patch
1. Extend CPU probe of Intel processors to handle extended-models. This
allows us to properly identify new Intel 45nm processors, Core i7,
Atom, and the 45nm Xeon MP.
2. Properly decode several new Intel cache descriptors, as listed in the
most recent (March 2009) edition of Intel's Application Note 485.
Addresses my PR bin/41289
Addresses my PR bin/41290
 1.9.6.1 28-Apr-2009  skrll Sync with HEAD.
 1.12.16.1 18-Feb-2012  mrg merge to -current.
 1.12.12.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.12.12.1 17-Apr-2012  yamt sync with head
 1.13.10.2 18-May-2014  rmind sync with head
 1.13.10.1 28-Aug-2013  rmind sync with head
 1.13.6.2 03-Dec-2017  jdolecek update from HEAD
 1.13.6.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.13.2.1 29-Dec-2014  martin Pullup the following revisions, requested by msaitoh in #1219:

sys/arch/x86/include/cacheinfo.h 1.14-1.19

Update Intel's cache and TLB descriptor table. This changes the number
of page coloring on some CPUs.
- Add Shared L2 TLB.
- Support prefetch size.
- Add some new TLB and cache entries from the document.
- Fix some entries:
- Fix 0x0d's DCACHE entry and 0xeb's L3CACHE entry.
- Desc 0x55 and 0xb1 are Instruction TLB but not fixed to 4K.
- Desc 0x5a and 0xc0 are Data TLB but not fixed to 4K.
- Desc 0x57 and 0x59 are 4K fixed DTLB.
- Fix string of desc 0xc2 and it's not fixed to 4K.
- Desc 0xca is 4K fixed L2 shared TLB.
 1.17.2.1 10-Aug-2014  tls Rebase.
 1.18.2.4 09-Oct-2018  snj Pull up following revision(s) (requested by msaitoh in ticket #1636):
sys/arch/x86/include/cacheinfo.h: 1.23-1.26
sys/arch/x86/include/cpu.h: 1.70
sys/arch/x86/include/specialreg.h: 1.91-1.93,1.98,1.100,1.102-1.124,1.126,1.130 via patch
sys/arch/x86/x86/cpu_topology.c: 1.10
sys/arch/x86/x86/identcpu.c: 1.56-1.57,1.70 via patch
usr.sbin/cpuctl/arch/i386.c: 1.71,1.75-1.79,1.81-1.85 via patch
Add some register definitions for x86:
- Add CLWB bit.
- Fix a few (unused) MSR values, and add some bit definitions of
MSR_EFER from Murray Armfield in PR#42861.
- CPUID_CFLUSH bit is not for CFLUSH insn but CLFLUSH insn, so modify
comments and snprintb() string.
- Define CPUID Fn00000001 %ebx bits and use them.
No functional change.
- Add Structured Extended Flags Enumeration Leaf's bit definitions:
AVX512_{IFMA,VBMI2,VNNI,BITALG,VPOPCNTDQ,4VNNIW,4FMAPS},GFNI&VAES.
- Add Turbo Boost Max Technology 3.0 bit.
- Add AMD SVM features definitions.
- Add Intel cpuid 7 %edx IBRS and STIBP bit definitions.
- Fix swapped comments for EFER LME and LMA
- Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add MSR_IA32_ARCH_CAPABILITIES definition.
- Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.
- Add Intel Deterministic Address Translation Parameter Leaf(0x18)
definitions.
- s/CLFUSH/CLFLUSH/
- Add AMD's Disable Indirect Branch Predictor bit definition.
- Add the MSR bits definitions for IBRS, STIBP and IBPB.
- Add Intel Fn0000_0006 %eax new bit 14-20 (HWP stuff).
- Intel Fn0000_0007 %ecx bit 22 is for both RDPID and IA32_TSC_AUX.
- Add AMD's CPUID Fn80000001 %edx MMX and FXSR bit definitions.
- Add RDCL_NO and IBRS_ALL.
- Add SSBD and RSBA bit definitions.
- Add AMD's SSB bit definitions for F15H, F16H and F17H.
- Add cpuid 7 edx L1D_FLUSH bit.
- Add IA32_ARCH_SKIP_L1DFL_VMENTRY bit.
- Add IA32_FLUSH_CMD MSR.
- Add yet another Shared L2 TLB (2M/4M pages).
- Add 3way and 6way of L2 cache or TLB on AMD CPU.
- AMD's L3 cache associativity bitfield is not 8 bits wide but 4 bits, like
the other associativity bitfields.
- Sort entries. No functional change.
- Modify comment, fix typo in comment and add comment.
cpuctl(8):
- Add detection for Quark X1000, Xeon E5 v4, E7 v4,
Core i7-69xx Extreme Edition, Xeon Scalable (Skylake),
Xeon Phi [357]200 (Knights Landing), Atom (Goldmont),
Atom (Denverton), Future Core (Cannon Lake), Atom (Goldmont Plus),
Xeon Phi 7215, 7285 and 7295 (Knights Mill) and
7th or 8th gen Core (Kaby Lake, Coffee Lake).
- Print Structured Extended Feature leaf Fn0000_0007 %ebx on AMD,too.
- Print Fn0000_0007 %ecx on Intel.
- Print Intel cpuid 7 %edx.
- Parse the TLB info from `cpuid leaf 18H' on Intel processor.
- Use aprint_error_dev() for error output.
 1.18.2.3 08-Dec-2016  snj Pull up following revision(s) (requested by msaitoh in ticket #1285):
sys/arch/x86/include/cacheinfo.h: revision 1.22
sys/arch/x86/include/specialreg.h: revisions 1.87 and 1.90
usr.sbin/cpuctl/arch/i386.c: revisions 1.72-1.74
Changes for x86's cpuctl(8):
- Add Quark X1000, Xeon E[57] v4, Core i7-69xx Extreme, 7th gen Core,
Denverton, Xeon Phi [357]200, Future Xeon and Future Xeon Phi.
- Add SGX, UMIP, RDPID, SGXLC, AVX512DQ, AVX512BW and AVX512VL bit.
- Fix the bit location of CLFLUSHOPT.
- Add new TLB descriptor 0x64 and 0xc4.
 1.18.2.2 06-Mar-2016  martin branches: 1.18.2.2.2;
Pull up the following changes, requested by msaitoh in #1117:

sys/arch/x86/include/cacheinfo.h 1.20-1.21
sys/arch/x86/include/specialreg.h 1.83-1.86
usr.sbin/cpuctl/arch/i386.c 1.67-1.70

Changes for x86's cpuctl(8):
- Add some TLB information (index 0x6a-0x6d).
- Add Hardware-Controlled Performance States (HWP) bits, FPU Data
Pointer Updated Only bit and CLFLUSHOPT bit.
- Add some AMD's bit definitions from "BIOS and Kernel Developer(BKDG)
for AMD Family 15h Models 60h-6Fh Processors".
- Add Xeon E5-4600 v3,
- Add Xeon E3-1200 v4 and v5.
- Add 6th gen Core, Xeon E3-1500 v5 and Xeon D-1500.
- Change CPU family 0x1c from "Atom Family" to "45nm Atom Family"
 1.18.2.1 12-Dec-2014  martin Pull up following revision(s) (requested by msaitoh in ticket #310):
sys/arch/x86/include/specialreg.h: revision 1.79-1.80
usr.sbin/cpuctl/arch/i386.c: revision 1.59
sys/arch/x86/include/cacheinfo.h: revision 1.19

Update some cpuid related values:
- Add XSAVECC, XGETBV, XSAVES, SMAP and PQE
- Change XINUSE to XGETBV
- Add new cache descriptor value (0xc3)
- Update signatures for the following CPUs:
- Core M-5xxx
- Core i7 Extreme
- Future Core (0x4e)
- Future Xeon (0x56)
 1.18.2.2.2.1 18-Jan-2017  skrll Sync with netbsd-5
 1.19.2.3 29-May-2016  skrll Sync with HEAD
 1.19.2.2 19-Mar-2016  skrll Sync with HEAD
 1.19.2.1 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.22.10.5 24-Dec-2021  martin Pull up the following (all via patch), requested by msaitoh in ticket #1721:

usr.sbin/cpuctl/arch/i386.c 1.118-1.119, 1.121-1.122
usr.sbin/cpuctl/arch/cpuctl_i386.h 1.6
sys/arch/x86/x86/identcpu_subr.c 1.8-1.9
sys/arch/x86/x86/identcpu.c 1.123
sys/arch/x86/include/cacheinfo.h 1.30
sys/arch/x86/include/cpu.h 1.132

- Fix a bug that some TLB related lines were not printed.
- Fix a bug that STLB is printed as DTLB.
- If a TLB is variable sized, print the max size instead of error message.
- Cosmetic changes to improve readability.
 1.22.10.4 16-Aug-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1338):

usr.sbin/cpuctl/arch/i386.c: revision 1.104
sys/arch/x86/x86/identcpu.c: revision 1.93
sys/arch/x86/include/cacheinfo.h: revision 1.28
sys/arch/x86/include/specialreg.h: revision 1.150

- AMD CPUID Fn8000_0001d Cache Topology Information leaf is almost the same as
Intel Deterministic Cache Parameter Leaf(0x04), so make new
cpu_dcp_cacheinfo() and share it.
- AMD's L2 and L3's cache descriptor's definition is the same, so use one
common definition.
- KNF.

XXX Split some common functions to new identcpu_subr.c or use #ifdef _KERNEL
... #endif in identcpu.c to share from both kernel and cpuctl?
 1.22.10.3 16-Aug-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1338):

sys/arch/x86/include/cacheinfo.h: revision 1.27
sys/arch/x86/x86/identcpu.c: revision 1.74

Handle more Vortex CPU's from Andrius V.
While here refactor the code to make it smaller.

-

It seems that AMD zen2's CPUID 0x80000006 leaf's spec has changed.
The EDX register's associativity field has the value 9. In the latest available
document, it's a reserved value. I have no access to zen2's document, but many
websites say that the associativity is 16. Add it.

-

- AMD CPUID Fn8000_0001d Cache Topology Information leaf is almost the same as
Intel Deterministic Cache Parameter Leaf(0x04), so make new
cpu_dcp_cacheinfo() and share it.
- AMD's L2 and L3's cache descriptor's definition is the same, so use one
common definition.
- KNF.

XXX Split some common functions to new identcpu_subr.c or use #ifdef _KERNEL
... #endif in identcpu.c to share from both kernel and cpuctl?
 1.22.10.2 09-Apr-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #715):

sys/arch/x86/include/cacheinfo.h: revision 1.24-1.26
usr.sbin/cpuctl/arch/i386.c: revision 1.81-1.84

- Parse the TLB info from `cpuid leaf 18H' on Intel processor. Currently,
this change doesn't decode perfectly. Tested with Gemini Lake. It has
two L2 Shared TLB. One is 4MB and another is 2MB/4MB but former isn't
printed yet:
cpu0: ITLB 1 4KB entries 48-way
cpu0: DTLB 1 4KB entries 32-way
cpu0: L2 STLB 8 4MB entries 4-way
Need some rework for struct x86_cache_info.
- Use aprint_error_dev() for error output.
Calculate way and number of entries correctly from CPUID leaf 18H.
Add yet another Shared L2 TLB (2M/4M pages).
XXX need redesign.

Add 3way and 6way of L2 cache or TLB on AMD CPU.
AMD's L3 cache associativity bitfield is not 8 bits wide but 4 bits, like the
other associativity bitfields.

From the latest Intel SDM:
- Add Xeon Phi 7215, 7285 and 7295
- Add Coffee Lake
 1.22.10.1 16-Mar-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #633):
sys/arch/x86/include/specialreg.h: revision 1.107
sys/arch/x86/include/specialreg.h: revision 1.108
sys/arch/x86/include/specialreg.h: revision 1.109
sys/arch/x86/include/cacheinfo.h: revision 1.23
sys/arch/x86/include/specialreg.h: revision 1.110
sys/arch/x86/include/specialreg.h: revision 1.111
sys/arch/x86/include/specialreg.h: revision 1.112
sys/arch/x86/include/specialreg.h: revision 1.113
sys/arch/x86/include/specialreg.h: revision 1.114
usr.sbin/cpuctl/arch/i386.c: revision 1.79
sys/arch/x86/x86/identcpu.c: revision 1.70
sys/arch/x86/include/specialreg.h: revision 1.106

Add comment.

Add Intel cpuid 7 %edx IBRS(IBPB Speculation Control) and
STIBP(STIBP Speculation Control) from OpenBSD.

Print Intel cpuid 7 %edx.

Example output of cpuctl -v identify 0:
+cpu0: 00000007: 00000000 000027ab 00000000 0c000000
(snip)
+cpu0: SEF edx 0xc000000<IBRS,STIBP>

fix swapped comments for EFER LME and LMA

- Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add comment.
Add MSR_IA32_ARCH_CAPABILITIES definition.

Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.

Add Intel Deterministic Address Translation Parameter Leaf(0x18) definitions.

Sort entries. No functional change.

s/CLFUSH/CLFLUSH/
No functional change.
 1.23.2.1 15-Mar-2018  pgoyette Synch with HEAD
 1.26.2.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.28.2.1 24-Dec-2021  martin Pull up the following (all via patch), requested by msaitoh in ticket #1396:

usr.sbin/cpuctl/arch/i386.c 1.118-1.119, 1.121-1.122
usr.sbin/cpuctl/arch/cpuctl_i386.h 1.6
sys/arch/x86/x86/identcpu_subr.c 1.8-1.9
sys/arch/x86/x86/identcpu.c 1.123
sys/arch/x86/include/cacheinfo.h 1.30
sys/arch/x86/include/cpu.h 1.132

- Fix a bug that some TLB related lines were not printed.
- Fix a bug that STLB is printed as DTLB.
- If a TLB is variable sized, print the max size instead of error message.
- Cosmetic changes to improve readability.
 1.140 24-Apr-2025  riastradh amd64: Allocate FPU save state outside pcb if it's too large.

We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
state.

We only do this for user threads, and only on machines where it's
necessary, to avoid incurring much overhead. There is still a tiny
bit of overhead when saving and restoring the FPU state by using a
pointer indirection instead of arithmetic indirection for access to
struct pcb::pcb_savefpu, but this is probably a drop in the bucket
compared to the memory traffic incurred by the FPU state save/restore
anyway.

For now, these paths are mostly disabled on i386. We could enable
them but it will require either rewriting cpu_uarea_alloc/free for
i386, or adopting a guard page like amd64 does, which might be costly
and so should be undertaken only with some thought and care. And
since Intel AMX instructions only work in 64-bit mode, it's not
likely to be useful on i386.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu

These changes, as a side effect, may fix:

PR kern/57258: kthread_fpu_enter/exit problem

by making sure to allocate an FPU save space that is large enough to
guarantee fpu_kern_enter/leave work safely, instead of just using a
union savefpu object on the stack (which, at 576 bytes, may be too
small on some machines, particularly with AVX512 requiring ~2.5K).
(But we'll have to do some extra work with kthread_fpu_enter/exit_md
-- if we try doing them again on x86 -- to actually allocate the
separate pcb on these machines!)
 1.139 22-Apr-2025  imil NVMM hypervisor identification, KVM and GenPVH identification fixes

arch/x86/include/cpu.h, arch/x86/x86/identcpu.c: Enable NVMM hypervisor
discovery
arch/x86/x86/identcpu.c: Fix vm_guest_t for KVM in vm_system_products
arch/x86/x86/x86_machdep.c: Add NVMM and GenPVH in vm_guest_name
 1.138 06-Dec-2024  bouyer Introduce vm_guest_is_pvh() and use it in place of
(vm_guest == VM_GUEST_XENPVH || vm_guest == VM_GUEST_GENPVH)
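
A minimal sketch of such a helper follows. Only the function name and the two VM_GUEST_* constants come from the log message; the trimmed enum, its ordering, and the use of a plain global are assumptions made to keep the example self-contained.

```c
#include <stdbool.h>

/* Trimmed, hypothetical stand-in for the real vm_guest_t in cpu.h. */
typedef enum {
	VM_GUEST_NO = 0,
	VM_GUEST_KVM,
	VM_GUEST_XENPVH,
	VM_GUEST_GENPVH
} vm_guest_t;

/* Which kind of guest we are running as (assumed to be set during boot). */
static vm_guest_t vm_guest = VM_GUEST_NO;

/* True for both the Xen and the generic (non-Xen) PVH boot paths. */
static inline bool
vm_guest_is_pvh(void)
{
	return vm_guest == VM_GUEST_XENPVH || vm_guest == VM_GUEST_GENPVH;
}
```

Callers can then write `if (vm_guest_is_pvh())` instead of repeating the two-way comparison, so a further PVH flavour would only need a change in one place.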
 1.137 02-Dec-2024  bouyer Add support for non-Xen PVH guests to amd64. Patch from
Emile 'iMil' Heitor in PR kern/57813, with some cosmetic tweaks by me.
Tested on bare metal, Xen PV and Xen PVH by me.
 1.136 01-Aug-2023  riastradh branches: 1.136.6;
xen: Report when hardclock jump exceeds timecounter(9) limit.
 1.135 13-Jul-2023  riastradh xen: Record event when local view of timecounter is behind global.
 1.134 13-Jul-2023  riastradh Break cycle by using `struct kmutex *' instead of `kmutex_t *'.

sys/sched.h included sys/mutex.h
which includes sys/intr.h
which includes machine/intr.h
which on cats includes arm/footbridge/footbridge_intr.h
which includes arm/cpu.h
which includes sys/cpu_data.h
which includes sys/sched.h

But there was never any real need for sys/mutex.h in sys/sched.h,
because it only uses pointers to the opaque struct kmutex. Cycle
broken by using `struct kmutex *' instead of pulling in sys/mutex.h
for the definition of kmutex_t.

Side effect: This revealed that sys/cpu_data.h needed sys/intr.h
(which was pulled in accidentally by sys/mutex.h via sys/sched.h) for
SOFTINT_COUNT. Also revealed some other machine/cpu.h header files
were missing includes of sys/mutex.h for kmutex_t.
 1.133 07-Sep-2022  knakahara branches: 1.133.4;
NetBSD/x86: Raise the number of interrupt sources per CPU from 32 to 56.

There has been no objection for three years.
https://mail-index.netbsd.org/port-amd64/2019/09/22/msg003012.html
Implemented by nonaka@n.o, updated by me.
 1.132 07-Oct-2021  msaitoh Move some common functions into x86/identcpu_subr.c. No functional change.
 1.131 14-Aug-2021  ryo Improve the performance of kernel profiling on MULTIPROCESSOR, and make it possible to get profiling data for each CPU.

In the current implementation, locks are acquired at the entrance of the mcount
internal function, so the higher the number of cores, the more lock contention
occurs, making profiling in a MULTIPROCESSOR environment unusably slow.
Profiling buffers are now reserved for each CPU, improving profiling
performance in MP by several to several dozen times.

- Eliminated cpu_simple_lock in mcount internal function, using per-CPU buffers.
- Add ci_gmon member to struct cpu_info of each MP arch.
- Add kern.profiling.percpu node in sysctl tree.
- Add new -c <cpuid> option to kgmon(8) to specify the cpuid, like OpenBSD.
For compatibility, if the -c option is not specified, the entire system can be
operated as before, and the -p option will get the total profiling data for
all CPUs.
 1.130 19-Feb-2021  christos Identify VirtualBox as a separate guest type.
 1.129 08-Aug-2020  christos branches: 1.129.2;
PR/55547: Dan Plassche: Fix BSD/OS binary emulation.
Centralize lcall sniffer and recognize the BSD/OS flavor.
 1.128 19-Jul-2020  maxv don't include opt_user_ldt.h when it is not needed
 1.127 14-Jul-2020  yamaguchi Introduce per-cpu IDTs

This is realized by following modifications:
- Add IDT pages and its allocation maps for each cpu in "struct cpu_info"
- Load per-cpu IDTs at cpu_init_idt(struct cpu_info*)
- Copy the IDT entries for cpu0 to other CPUs at attach
- These are, for example, exceptions, db, system calls, etc.

And, added a kernel option named PCPU_IDT to enable the feature.
 1.126 19-Jun-2020  maxv localify
 1.125 02-May-2020  bouyer Introduce Xen PVH support in GENERIC.
This is compiled in with
options XENPVHVM
x86 changes:
- add Xen section and xen pvh entry points to locore.S. Set vm_guest
to VM_GUEST_XENPVH in this entry point.
Most of the boot procedure (especially page table setup and switch to
paged mode) is shared with native.
- change some x86_delay() to delay_func(), which points to x86_delay() for
native/HVM, and xen_delay() for PVH

Xen changes:
- remove Xen bits from init_x86_64_ksyms() and init386_ksyms()
and move to xen_init_ksyms(), used for both PV and PVH
- set ISA no-legacy-devices property for PVH
- factor out code from Xen's cpu_bootconf() to xen_bootconf()
in xen_machdep.c
- set up a specific pvh_consinit() which starts with printk()
(which uses a simple hypercall that is available early) and switch to
xencons when we can use pmap_kenter_pa().
 1.124 30-Apr-2020  bouyer Don't #include xen/intrdefs.h if !XEN.
Should fix third-party module builds (e.g. virtualbox)
 1.123 27-Apr-2020  bouyer Move ci_vcpu under the #ifdef XEN section at the end of the struct cpu_info.
Hopefully will fix the nvmm module.
 1.122 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.121 21-Apr-2020  msaitoh Get TSC frequency from CPUID 0x15 and/or 0x16 for newer Intel processors.

- If the max CPUID leaf is >= 0x15, take the TSC frequency from CPUID. Some
processors report the TSC/core crystal clock ratio but not the core crystal
clock frequency. The Intel SDM gives us the values for some processors.
- It is also required to change lapic_per_second to make the LAPIC timer work
correctly.
- Add new file x86/x86/identcpu_subr.c to share common subroutines between
kernel and userland. Some code in x86/x86/identcpu.c and cpuctl/arch/i386.c
will be moved to this file in future.
- Add comment to clarify.
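
The calculation described above can be sketched as follows. This is not the code in identcpu_subr.c; the function name and the fallback parameter are hypothetical. It assumes the usual meaning of leaf 0x15: EAX is the ratio denominator, EBX the numerator, and ECX the core crystal clock in Hz, or 0 when the processor does not report it.

```c
#include <stdint.h>

/*
 * Compute the TSC frequency in Hz from the CPUID leaf 0x15 registers.
 * fallback_crystal_hz is used when ECX is 0; the caller is assumed to
 * look it up per CPU model (for example from values given in the SDM).
 * Returns 0 when the frequency cannot be determined.
 */
static uint64_t
tsc_freq_from_cpuid15(uint32_t eax, uint32_t ebx, uint32_t ecx,
    uint64_t fallback_crystal_hz)
{
	uint64_t crystal_hz = (ecx != 0) ? ecx : fallback_crystal_hz;

	/* A zero denominator or numerator means the ratio is not enumerated. */
	if (eax == 0 || ebx == 0 || crystal_hz == 0)
		return 0;

	/* TSC frequency = crystal frequency * numerator / denominator. */
	return crystal_hz * ebx / eax;
}
```

With a 24 MHz crystal and a ratio of 216/2, for instance, this yields 2592000000 Hz.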
 1.120 13-Apr-2020  bouyer By default, events are bound to CPU 0 (except for IPIs and VTIMERs which
are bound to a different CPU at creation time).
Recent MI changes caused the scheduler to choose a different CPU when
probing and attaching xennet devices (I guess it's the xenbus thread which
runs on a different CPU). This causes the callback to be called on a different
CPU than the one expected by the kernel, and the event is ignored.
It is handled when the clock causes the callback to be called on the right
CPU, which is why xennet still runs, but slowly.

Change event_set_handler() to do a EVTCHNOP_bind_vcpu if requested to,
and make sure we don't do it for IPIs and VIRQs (for these, the op fails).
 1.119 10-Apr-2020  bouyer Revert, wrong branch
 1.118 10-Apr-2020  bouyer Skip cx8_spllower patch if we're running on any form of Xen PV,
we can't handle PV interrupts with a single atomic op here.
Enable x86_patch() for Xen too.
 1.117 15-Jan-2020  ad branches: 1.117.4;
Push the INVLPG limit for shootdowns up to 16 (for UBC).
 1.116 30-Dec-2019  thorpej branches: 1.116.2;
Fix a problem with intr_unmask() that can cause a forever-loop:
- When handling the source-is-masked case in the interrupt vector, set the
interrupt bit in a new ci_imasked field and ensure the bit is cleared
from ci_ipending.
- In intr_unmask(), transfer the bit from ci_imasked to ci_ipending for
non-level-sensitive interrupts (the PIC does the work for us in the
level-sensitive case), and only force pending interrupts to be processed
in this case. (In all cases, make sure the now-unmasked bit is cleared
from ci_imasked.)

Before, the bit was left in ci_ipending so as not to use edge-triggered
interrupts while the source is masked, but Xspllower() relies on the
pending bits getting cleared.

Tested by forcing all wm(4) interrupts on my test system through an
intr_mask() / softint / intr_unmask() cycle and exercising the network
heavily.
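
The bookkeeping described in this entry can be sketched with two bitmasks. This is a simplified model, not the kernel code: the struct, the function names, and the 64-bit field width are hypothetical, and only the ci_ipending/ci_imasked roles are taken from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct cpu_info fields involved. */
struct cpu_pending {
	uint64_t ipending;	/* models ci_ipending */
	uint64_t imasked;	/* models ci_imasked */
};

/* Interrupt vector path: the source fired while it was masked. */
static void
vector_source_masked(struct cpu_pending *ci, unsigned slot)
{
	assert(slot < 64);
	uint64_t bit = UINT64_C(1) << slot;

	ci->imasked |= bit;	/* remember it for intr_unmask() */
	ci->ipending &= ~bit;	/* Xspllower() relies on this being clear */
}

/*
 * Unmask path. Returns true when pending interrupts must be processed
 * now, which only happens for non-level-sensitive sources that fired
 * while masked (for level-sensitive ones the PIC re-raises the line).
 */
static bool
unmask_source(struct cpu_pending *ci, unsigned slot, bool level_sensitive)
{
	assert(slot < 64);
	uint64_t bit = UINT64_C(1) << slot;
	bool force = false;

	if (!level_sensitive && (ci->imasked & bit) != 0) {
		ci->ipending |= bit;	/* transfer imasked -> ipending */
		force = true;
	}
	ci->imasked &= ~bit;		/* cleared in all cases */
	return force;
}
```

Keeping the masked state in its own word means the pending word never holds a bit that cannot be serviced, which is what avoids the forever-loop.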
 1.115 01-Dec-2019  ad Fix false sharing problems with cpu_info. Identified with tprof(8).
This was a very nice win in my tests on a 48 CPU box.

- Reorganise cpu_data slightly according to usage.
- Put cpu_onproc into struct cpu_info alongside ci_curlwp (now is ci_onproc).
- On x86, put some items in their own cache lines according to usage, like
the IPI bitmask and ci_want_resched.
 1.114 27-Nov-2019  maxv Add a small API for in-kernel FPU operations.

fpu_kern_enter();
/* do FPU stuff */
fpu_kern_leave();
 1.113 23-Nov-2019  ad cpu_need_resched():

- Remove all code that should be MI, leaving the bare minimum under arch/.
- Make the required actions very explicit.
- Pass in LWP pointer for convenience.
- When a trap is required on another CPU, have the IPI set it locally.
- Expunge cpu_did_resched().
 1.112 21-Nov-2019  ad x86 TLB shootdown IPI changes:

- Shave some time off processing.
- Reduce cacheline/bus traffic on systems with many CPUs.
- Reduce time spent at IPL_VM.
 1.111 21-Nov-2019  ad mi_userret(): take care of calling preempt(), set spc_curpriority directly,
and remove MD code that does the same.
 1.110 12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.109 03-Oct-2019  maxv Remove the LazyFPU code, as posted 5 months ago on port-amd64@.
 1.108 07-Aug-2019  maxv Add support for USER_LDT in SVS. This allows us to have both enabled at
the same time.

We allocate an LDT for each CPU in the GDT and map an area for it, in
addition to the default LDT already present. In context switches between
different processes, we choose between the default or the per-cpu LDT
selector: if the user set specific LDT entries, we memcpy them to the
per-cpu LDT and load the per-cpu selector.

Tested by Naveen Narayanan (with Wine on amd64).
 1.107 26-Jun-2019  mgorny branches: 1.107.2;
Fetch XSAVE area component offsets and sizes when initializing x86 CPU

Introduce two new arrays, x86_xsave_offsets and x86_xsave_sizes,
and initialize them with XSAVE area component offsets and sizes queried
via CPUID. This will be needed to implement getters and setters for
additional register types.

While at it, add XSAVE_* constants corresponding to specific XSAVE
components.
 1.106 27-May-2019  maxv Remove 'ci_svs_kpdirpa', unused. While here fix a few comments here and
there, reduces a future diff.
 1.105 15-Feb-2019  nonaka Added Microsoft Hyper-V support. It was ported from OpenBSD and FreeBSD.

The graphical console does not work on Gen.2 VMs yet. To use the serial console,
enter "consdev com,0x3f8,115200" on efiboot.
 1.104 14-Feb-2019  cherry Welcome XENPVHVM mode.

It is UP only, has xbd(4) and xennet(4) as PV drivers.

The console is com0 at isa and the native portion is very
rudimentary AT architecture, so is probably suboptimal to
run without PV support.
 1.103 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.102 02-Feb-2019  cherry Switch NetBSD/xen to use XEN api tag RELEASE-4.11.1

The headers for this api are in sys/external/mit/xen-include-public/dist/
 1.101 25-Dec-2018  cherry Excise XEN specific code out of x86/x86/intr.c into xen/x86/xen_intr.c

While at it, separate the source function tracking so that the interrupt
paths are truly independent.

Use weak symbol exporting to provision for future PVHVM co-existence
of both files, but with independent paths. Introduce assembler code
such that in a unified scenario, native interrupts get first priority
in spllower(), followed by XEN event callbacks. IPL management and
semantics are unchanged - native handlers and xen callbacks are
expected to maintain their ipl related semantics.

In summary, after this commit, native and XEN now have completely
unrelated interrupt handling mechanisms, including
intr_establish_xname() and assembler stubs and intr handler
management.

Happy Christmas!
 1.100 18-Nov-2018  cherry On Xen, copy just the bits we need from the trapframe for hardclock(9)
and statclock(9).

Currently, the macros that use the trapframe are:
CLKF_USERMODE()
CLKF_PC()
CLKF_INTR()

Of these, CLKF_INTR() already ignores the frame and uses the ci_idepth
variable to do its job.

Convert the two remaining ones to do this, but only for XEN.
 1.99 18-Nov-2018  cherry Save the interrupt trap/clockframe to a per-cpu copy.

We can use this copy to pass on the trapframe to hardclock(9) from
within the xen timer handler. This delinks the current dependency
between MD code and the handler, which is specially prototyped to take
the clockframe unlike any other handler.

This change has performance implications, as each interrupt entry will
copy the entire trapframe over to the per-cpu cached copy. This can be
mitigated by selectively copying just the parts of the clockframe that
are used by hardclock() et al.

Tested on amd64 XEN domU
 1.98 05-Oct-2018  maxv export x86_fpu_mxcsr_mask, fpu_area_save and fpu_area_restore
 1.97 22-Aug-2018  msaitoh - Cleanup for dynamic sysctl:
- Remove unused *_NAMES macros for sysctl.
- Remove unused *_MAXID for sysctls.
- Move CTL_MACHDEP sysctl definitions for m68k into m68k/include/cpu.h and
use them on all m68k machines.
 1.96 16-Jul-2018  pgoyette More rearrangement of struct cpu_info to keep all the un-conditional
members at fixed locations.

Should address my PR kern/52919

OK maxv@

XXX kernel version bump coming momentarily.
 1.95 15-Jul-2018  maxv Hum. Move the __HAVE_DIRECT_MAP block a little below, otherwise dynamically
loaded kernel modules use a wrong offset for some ci_* fields. Found when
modloading tprof_amd on an AMD 10h, the read of ci_signature was at a
wrong address, and the cpu family was not detected correctly.
 1.94 30-Jun-2018  riastradh Just use struct cpu_info members for the Xen clock state.

Silly to use percpu(9) for some things and struct cpu_info for
others.
 1.93 29-Jun-2018  riastradh Rewrite Xen timecounter and hardclock timer.

With this change, the Xen timecounter should now be globally
monotonic, as every timecounter is supposed to be. Should also fix a
litany of races in the timecounter logic.

Proposed last year; see mailing list for further details:
https://mail-index.netbsd.org/port-xen/2017/10/31/msg009112.html

ok cherry
 1.92 14-Jun-2018  maxv branches: 1.92.2;
Add some code to support eager fpu switch, INTEL-SA-00145. We restore the
FPU state of the lwp right away during context switches. This guarantees
that when the CPU executes in userland, the FPU doesn't contain secrets.

Maybe we also need to clear the FPU in setregs(), not sure about this one.

Can be enabled/disabled via:

machdep.fpu_eager = {0/1}

Not yet turned on automatically on affected CPUs (Intel Family 6).

More generally it would be good to turn it on automatically when XSAVEOPT
is supported, because in this case there is probably a non-negligible
performance gain; but we need to fix PR/52966.
 1.91 04-Apr-2018  maxv Enable the SpectreV2 mitigation by default at boot time.
 1.90 30-Mar-2018  maxv Retrieve cpuid.7:%edx.
 1.89 18-Jan-2018  maxv branches: 1.89.2;
Unmap the kernel heap from the user page tables (SVS).

This implementation is optimized and organized in such a way that we
don't need to copy the kernel stack to a safe place during user<->kernel
transitions. We create two VAs that point to the same physical page; one
will be mapped in userland and is offset in order to contain only the
trapframe, the other is mapped in the kernel and maps the entire stack.

Sent on tech-kern@ a week ago.
 1.88 07-Jan-2018  maxv Add a new option, SVS (for Separate Virtual Space), that unmaps kernel
pages when running in userland. For now, only the PTE area is unmapped.

Sent on tech-kern@.
 1.87 05-Jan-2018  maxv Add a __HAVE_PCPU_AREA option, enabled by default on native amd64 but not
Xen.

With this option, the CPU structures that must always be present in the
CPU's page tables are moved on L4 slot 384, which means address
0xffffc00000000000.
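
The slot-to-address arithmetic can be checked with a short sketch. It assumes 4-level paging with 48-bit canonical virtual addresses, where each L4 slot covers 2^39 bytes; l4_slot_to_va and the macros are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define L4_SHIFT	39	/* each L4 slot covers 2^39 bytes (512 GiB) */
#define L4_NSLOTS	512	/* entries in one L4 page */
#define VA_BITS		48	/* implemented virtual address bits (assumed) */

/* Return the canonical base virtual address covered by an L4 slot. */
static uint64_t
l4_slot_to_va(unsigned int slot)
{
	uint64_t va;

	assert(slot < L4_NSLOTS);
	va = (uint64_t)slot << L4_SHIFT;
	/* Sign-extend bit 47 into bits 48..63 to form a canonical address. */
	if (va & (UINT64_C(1) << (VA_BITS - 1)))
		va |= ~((UINT64_C(1) << VA_BITS) - 1);
	return va;
}
```

Slot 384 is 0x180, and 0x180 << 39 is 0xc00000000000; bit 47 is set, so sign extension gives 0xffffc00000000000.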

A new pcpu_area structure is defined. It contains shared structures (IDT,
LDT), and then an array of pcpu_entry structures, indexed by cpu_index(ci).
Theoretically the LDT should be in the array, but this will be done later.

During the boot procedure, cpu0 calls pmap_init_pcpu, which creates a
page tree that is able to map the pcpu_area structure entirely. cpu0 then
immediately maps the shared structures. Later, every CPU goes through
cpu_pcpuarea_init, which allocates physical pages and kenters the relevant
pcpu_entry to them. Finally, each pointer is replaced to point to pcpuarea.

The point of this change is to make sure that the structures that must
always be present in the page tables have their own L4 slot. Until now
their L4 slot was that of pmap_kernel, and making a distinction between
what must be mapped and what does not need to be was complicated.

Even in the non-speculative-bug case this change makes some sense: there
are several x86 instructions that leak the addresses of the CPU structures,
and putting these structures inside pmap_kernel actually offered a way to
compute the address of the kernel heap - which would have made ASLR on it
plainly useless, had we implemented that.

Note that, for now, pcpuarea does not contain rsp0.

Unfortunately this change adds many #ifdefs, and makes the code harder to
understand. There is also some duplication, but that will be solved later.
 1.86 04-Jan-2018  maxv Allocate the TSS area dynamically. This way cpu_info and cpu_tss can be
put in separate pages.
 1.85 04-Jan-2018  maxv Group the different TSSes into a cpu_tss structure. And pack this
structure to make sure there is no padding between 'tss' and 'iomap'.
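
As a rough illustration of why packing matters here, the sketch below uses hypothetical stand-in types (demo_tss, demo_cpu_tss), not the real kernel definitions. The inner structure is deliberately 6 bytes long so that an unpacked layout would insert padding before the following 32-bit array; the GCC/Clang packed attribute is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 6-byte stand-in for a TSS. */
struct demo_tss {
	uint32_t	dt_word;
	uint16_t	dt_iobase;
} __attribute__((packed));

/* Hypothetical grouping: the I/O map must directly follow the TSS. */
struct demo_cpu_tss {
	struct demo_tss	tss;
	uint32_t	iomap[2];
} __attribute__((packed));

/* Compile-time check that no padding sits between 'tss' and 'iomap'. */
static_assert(offsetof(struct demo_cpu_tss, iomap) == sizeof(struct demo_tss),
    "iomap must immediately follow tss");
```

Without the attribute on the outer structure, a typical ABI would place iomap at offset 8 rather than 6, and the static assertion would fail at build time.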
 1.84 28-Dec-2017  maxv typos
 1.83 02-Dec-2017  christos Add padding to make the 32/64 bit structs the same.
 1.82 27-Nov-2017  maxv Remove unused fields, there is no alignment we need to enforce.
 1.81 23-Nov-2017  kamil Restore removed sysctl(2) x86 entry: fpu_present

Hardcode it to 1 for now on i386 and amd64.

This unbreaks software that used it (e.g. LLDB).

Removal noted by <christos>

PR lib/52756 by myself
 1.80 09-Oct-2017  maya GC i386_fpu_present. no FPU x86 is not supported.

Also delete newly unused send_sigill
 1.79 16-Sep-2017  maxv Move xpq_idx into cpu_info, to prevent false sharing between CPUs. Saves
10s when doing a './build.sh -j 3 kernel=GENERIC' on xen-amd64-domU.
 1.78 27-Aug-2017  maxv style, and move some i386-specific code into i386/
 1.77 27-Aug-2017  maxv Localify. By the way, we should use a different stack for NMIs.
 1.76 12-Aug-2017  maxv Remove vm86.

Pass 3.
 1.75 22-Jul-2017  maxv Call _proc0_tss_ldt_init only once, and rename them.
 1.74 16-Jul-2017  cherry branches: 1.74.2;
Unify the xen and native x86/ interrupt setup functions and
spl traversal data structures.

This is towards PVHVM.
 1.73 16-Jun-2017  jdolecek dumpconf(void) long doesn't exist, remove the prototype

PR kern/39714 by Henning Petersen
 1.72 09-Jun-2017  chs if __HIDE_DELAY is defined, do not define delay() or DELAY().
needed by dtrace and ZFS.
 1.71 23-May-2017  nonaka branches: 1.71.2;
x86: hypervisor detection from FreeBSD for x2APIC support.
 1.70 15-May-2017  msaitoh CPUID_CFLUSH bit is not for CFLUSH insn but CLFLUSH insn, so modify comments
and snprintb() string.
 1.69 14-Apr-2017  kamil branches: 1.69.2;
x86: Export fpu_save, fpu_save_size, xsave_features to dedicated sysctl nodes

Add new defines:
- CPU_FPU_SAVE (15)
int: FPU Instructions layout
* to use this, CPU_OSFXSR must be true
* 0: FSAVE
* 1: FXSAVE
* 2: XSAVE
* 3: XSAVEOPT
- CPU_FPU_SAVE_SIZE (16)
int: FPU Instruction layout size
- CPU_XSAVE_FEATURES (17)
quad: FPU XSAVE features

Bump CPU_MAXID from 15 to 18.
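
A debugger consuming the CPU_FPU_SAVE value might decode it as sketched below. The numeric values come from the list above; the enum and the helper fpu_save_name are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values of the CPU_FPU_SAVE node as listed in the commit message. */
enum fpu_save_kind {
	FPU_SAVE_FSAVE = 0,
	FPU_SAVE_FXSAVE = 1,
	FPU_SAVE_XSAVE = 2,
	FPU_SAVE_XSAVEOPT = 3
};

/* Map a CPU_FPU_SAVE value to the save instruction it stands for. */
static const char *
fpu_save_name(int fpu_save)
{
	static const char *const names[] = {
		"FSAVE", "FXSAVE", "XSAVE", "XSAVEOPT"
	};

	static_assert(sizeof(names) / sizeof(names[0]) ==
	    FPU_SAVE_XSAVEOPT + 1, "one name per CPU_FPU_SAVE value");
	if (fpu_save < 0 ||
	    (size_t)fpu_save >= sizeof(names) / sizeof(names[0]))
		return "unknown";
	return names[fpu_save];
}
```

As the message notes, the value is only meaningful when CPU_OSFXSR is true, so a caller would check that first.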

These values were prepared originally to be exported without ASCIIZ name to
be used as handler. These values are useful to get FPU accessors in a
debugger easier to implement on x86 (PT_SETFPREG, PT_GETFPREG).

This interface handles all supported x86 targets. In the older (i386) and
less featured CPUs check first osfxsr (OS uses FXSAVE/FXRSTOR).

According to sys/arch/x86/include/cpu.h r.1.65 this was prepared to be
exported beyond simple CTL_CREATE node.

Sponsored by <The NetBSD Foundation>
 1.68 11-Feb-2017  maxv Instead of using a global array with per-cpu indexes, embed the tmp VAs
into cpu_info directly. This concerns only {i386, Xen-i386, Xen-amd64},
because amd64 already has a direct map that is way faster than that.

There are two major issues with the global array: maxcpus entries are
allocated while it is unlikely that common i386 machines have so many
cpus, and the base VA of these entries is not cache-line-aligned, which
mostly guarantees cache-line-thrashing each time the VAs are entered.

Now the number of tmp VAs allocated is proportionate to the number of CPUs
attached (which therefore reduces memory consumption), and the base is
properly aligned.

On my 3-core AMD, the number of DC_refills_L2 events triggered when
performing 5x10^6 calls to pmap_zero_page on two dedicated cores is on
average divided by two with this patch.

Discussed on tech-kern a little.
 1.67 13-Dec-2015  maxv branches: 1.67.2; 1.67.4;
Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.66 23-Feb-2014  dsl branches: 1.66.4; 1.66.6; 1.66.8;
Rename (the recently added) 'x86_xsave_size' to 'x86_fpu_save_size'
and default to 512 (the size of the fxsave structure).
 1.65 23-Feb-2014  dsl Determine whether the cpu supports xsave (and hence AVX).
The result is only written to sysctl nodes at the moment.
I see:
machdep.fpu_save = 3 (implies xsaveopt)
machdep.xsave_size = 832
machdep.xsave_features = 7
Completely common up the i386 and amd64 machdep sysctl creation.
 1.64 22-Feb-2014  dsl Re-use the unused ci_cpu_serial[3] to save the highest cpuid values
for the normal and extended leafs.
(The 'normal' one might be lurking in the global cpulevel.)
Read the 'extended feature' from cpuid.80000001.%ecx/edx into
ci_feat_val[3/2] just after saving cpuid.1.%ecx/edx in ci_feat_val[1/0]
instead of doing it separately for amd k678 and via c3 processors
in their probe functions and repeating it for all cpus a few instructions
later when x86_cpu_topology() is called.
x86_cpu_topology() is only called from cpu_probe() and really doesn't
deserve its own source file. Chasing the setup code is bad enough anyway.
 1.63 20-Feb-2014  dsl This needs stdint.h in userspace (for uint64_t)
 1.62 15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.61 12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.60 04-Feb-2014  dsl There is no need to check for recursive calls into fpudna().
Rename the associated ci_fpsaving field to 'unused'.
I'm not sure they could ever happen, you could get unwanted calls into
the fpu trap code while saving state when using INT13 - but these are
different.
The return value from the i386 fpudna() was always 1 - possibly a historic
relic of the kernel fp emulation. Remove and don't check in trap.S.
The amd64 and i386 fpudna() code is now almost identical.
 1.59 26-Jan-2014  dsl Remove support for 'external' floating point units and the MS-DOS
compatible method of handling floating point exceptions.
Make kernel support for the fpu non-optional (486SX should still work).
Only 386 cpus support external fpu, and i386 support was removed years ago.
This means that the npx code no longer uses port 0xf0 or interrupt 13.
All the "npx at isa" lines go from the configs, arch/i386/isa/npx.c
is now mandatory for all i386 kernels.
I've renamed npxinit() to fpuinit() and npxinit_cpu() to fpuinit_cpu()
to match the very similar amd64 functions.
The fpu of the boot cpu is now initialised by a direct call from
cpu_configure(), this enables FP emulation for a 486SX.
(for amd64 the cr0 values are set in locore.S and similar).
This fixes a long-standing bug in linux_setregs() - which did not
save the fpu registers if they were active.
I've test booted a single cpu i386 kernel (using anita).
amd64 builds - none of the changes should affect it.
The i386 XEN kernels build, but I'm not sure where they set cr0, and
it might have got lost!
 1.58 01-Dec-2013  christos revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.57 10-Nov-2013  christos use __unused instead of __USE and void cast to mark iterator variable unused
where needed (from phone)
 1.56 05-Nov-2013  christos initialize cii before using it.
 1.55 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.54 17-Oct-2013  christos __USE() unused variables
 1.53 27-Oct-2012  chs branches: 1.53.2;
split device_t/softc for all remaining drivers.
replace "struct device *" with "device_t".
use device_xname(), device_unit(), etc.
 1.52 15-Jul-2012  dsl branches: 1.52.2;
Rename MDP_IRET to MDL_IRET since it is an lwp flag, not a proc one.
Add an MDL_COMPAT32 flag to the lwp's md_flags, set it for 32bit lwps
and use it to force 'return to user' with iret (as is done when
MDL_IRET is set).
Split the iret/sysret code paths much later.
Remove all the replicated code for 32bit system calls - which was only
needed so that iret was always used.
frameasm.h for XEN contains '#define swapgs', while XEN probably never
needs swapgs, this is likely to be confusing.
Add a SWAPGS which is a nop on XEN and swapgs otherwise.
(I've not yet checked all the swapgs in files that include frameasm.h)
Simple x86 programs still work.
Hijack 6.99.9 kernel bump (needed for compat32 modules)
 1.51 16-Jun-2012  chs rename the global variable "cpu" to "cputype" to avoid conflicting with
dtrace, which wants to use "cpu" as a local variable.
 1.50 20-Apr-2012  rmind - Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
 1.49 02-Mar-2012  bouyer Follow locore.S and move FPU handling from x86_64_switch_context() to
x86_64_tls_switch(); raise IPL to IPL_HIGH in x86_64_switch_context()
and test ci_fpcurlwp to decide to disable FPU or not.
Change the Xen i386 context switch code to be like the amd64 one.
 1.48 17-Feb-2012  bouyer Apply patch proposed in PR port-xen/45975 (this does not solve the exact
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunately, all addresses disassemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.

2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.

To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.

To fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), with the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.

While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
 1.47 12-Feb-2012  jym branches: 1.47.2;
Xen clock management routines keep track of CPU (following MP merge).
Reflect this change in the suspend/resume routines so they can cope with
domU CPU suspend, instead of setting their cpu_info pointer to NULL.

Avoid copy/pasting by using the resume routines during attachment.

ok releng@.

No regression observed, and allows domU to suspend successfully again.
Restore is a different beast as PD/PT flags are marked "invalid" by Xen-4
hypervisor, and blocks resuming. Looking into it.
 1.46 28-Jan-2012  cherry stop using alternate pde mapping in xen pmap
 1.45 30-Dec-2011  cherry Move the per-cpu l3 page allocation code to a separate MD function. Avoids code duplication for xen PAE
 1.44 07-Dec-2011  cegger switch from xen3-public to xen-public.
 1.43 19-Nov-2011  cherry branches: 1.43.4;
[merging from cherry-xenmp] bring in bouyer@'s changes via:
http://mail-index.netbsd.org/source-changes/2011/10/22/msg028271.html
From the Log:
Log Message:
Various interrupt fixes, mainly:
keep a per-cpu mask of enabled events, and use it to get pending events.
A cpu-specific event (all of them at this time) should not be ever masked
by another CPU, because it may prevent the target CPU from seeing it
(the clock events all fires at once for example).
 1.42 10-Nov-2011  jym Turn the 'i386_use_pae' variable into simply 'use_pae'. Technically
speaking we are also running with PAE enabled in long mode under amd64,
so this variable will be used in various places across x86 machdep to
branch at runtime to functions that require extra handling for PAE mode.
 1.41 06-Nov-2011  cherry [merging from cherry-xenmp] make pmap_kernel() shadow PMD per-cpu and MP aware.
 1.40 01-Nov-2011  joerg branches: 1.40.2;
Reduce exposure of kernel internals for __KMEMUSER
 1.39 17-Oct-2011  jmcneill add a "vm" device class for cpufeaturebus
 1.38 20-Sep-2011  jym Merge jym-xensuspend branch in -current. ok bouyer@.

Goal: save/restore support in NetBSD domUs, for i386, i386 PAE and amd64.

Executive summary:
- split all Xen drivers (xenbus(4), grant tables, xbd(4), xennet(4))
in two parts: suspend and resume, and hook them to pmf(9).
- modify pmap so that Xen hypervisor does not cry out loud in case
it finds "unexpected" recursive memory mappings
- provide a sysctl(7), machdep.xen.suspend, to command suspend from
userland via powerd(8). Note: a suspend can only be handled correctly
when dom0 requested it, so provide a mechanism that will prevent
kernel to blindly validate user's commands

The code is still in experimental state, use at your own risk: restore
can corrupt backend communications rings; this can completely thrash
dom0 as it will loop at a high interrupt level trying to honor
all domU requests.

XXX PAE suspend does not work in amd64 currently, due to (yet again!)
page validation issues with hypervisor. Will fix.

XXX secondary CPUs are not suspended, I will write the handlers
in sync with cherry's Xen MP work.

Tested under i386 and amd64, bear in mind ring corruption though.

No build break expected, GENERICs and XEN* kernels should be fine.
./build.sh distribution still running. In any case: sorry if it does
break for you, contact me directly for reports.
 1.37 11-Aug-2011  cherry Hide the MD details of specific IPIs behind semantically pleasing functions. This cleans up a couple of #ifdef XEN/#endif pairs
 1.36 10-Aug-2011  cherry Add Xen specific members to struct cpu_info, Add proper per-cpu curcpu() functionality
 1.35 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.34 31-May-2011  dyoung branches: 1.34.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.
 1.33 26-May-2011  uebayasi Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html
 1.32 13-Apr-2011  mrg move the include sys/types.h xor stdbool.h to the top of the file,
so that "bool" will be present when used later in the file.
 1.31 24-Feb-2011  jruoho Fix autoconf(9) of cpufeaturebus.
 1.30 23-Feb-2011  jruoho Move ENHANCED_SPEEDSTEP, or henceforth est(4), to the cpufeaturebus.
 1.29 20-Feb-2011  jruoho Modularize coretemp(4). Ok jmcneill@.
 1.28 20-Feb-2011  jmcneill cpu.h no longer needs via_padlock.h
 1.27 19-Feb-2011  jmcneill modularize VIA PadLock support
- retire options VIA_PADLOCK, replace with 'padlock0 at cpu0'
- driver supports attach & detach
- support building as a module
 1.26 22-Dec-2010  christos branches: 1.26.2; 1.26.4;
Make __HAVE_CPU_DATA_FIRST true
 1.25 16-Aug-2010  jym Add machdep.pae sysctl(7) for i386. Thanks to Paul and Joerg for their
reviews.

In kernel, it matches the 'i386_use_pae' variable (0: kernel does not use
PAE, 1: kernel uses PAE). Will be used by i386 kvm(3) to know the functions
that should get called for VA => PA translations.
 1.24 04-Aug-2010  jruoho Store the MADT-derived CPU ID to <x86/cpu.h>. This is required to properly
match the ACPI processor object ID with the ID available in the APIC table.
 1.23 24-Jul-2010  jym Welcome PAE inside i386 current.

This patch is inspired by work previously done by Jeremy Morse, ported by me
to -current, merged with the work previously done for port-xen, together with
additional fixes and improvements.

PAE option is disabled by default in GENERIC (but will be enabled in ALL in
the next few days).

In short, PAE switches the CPU to a mode where physical addresses become
36 bits (64 GiB). Virtual address space remains at 32 bits (4 GiB). To cope
with the increased size of the physical address, they are manipulated as
64 bits variables by kernel and MMU.
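
The sizes quoted above follow directly from the address widths. A small sketch with hypothetical helper names makes the arithmetic explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes addressable with the given number of address bits. */
static uint64_t
addr_space_bytes(unsigned int bits)
{
	assert(bits < 64);
	return UINT64_C(1) << bits;
}

/* Convert a byte count to whole GiB (2^30 bytes each). */
static uint64_t
bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}
```

With 36 address bits the result is 2^36 bytes, i.e. 64 GiB, which no longer fits in a 32-bit variable; this is why physical addresses are handled as 64-bit quantities.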

When supported by the CPU, it also allows the use of the NX/XD bit that
provides no-execution right enforcement on a per physical page basis.

Notes:

- reworked locore.S

- introduce cpu_load_pmap(), used to switch pmap for the curcpu. Due to the
different handling of pmap mappings with PAE vs !PAE, Xen vs native, details
are hidden within this function. This helps calling it from assembly,
as some features, like BIOS calls, switch to pmap_kernel before mapping
trampoline code in low memory.

- some changes in bioscall and kvm86_call, to reflect the above.

- the L3 is "pinned" per-CPU, and is only manipulated by a
reduced set of functions within pmap. To track the L3, I added two
elements to struct cpu_info, namely ci_l3_pdirpa (PA of the L3), and
ci_l3_pdir (the L3 VA). Rest of the code considers that it runs "just
like" a normal i386, except that the L2 is 4 pages long (PTP_LEVELS is
still 2).

- similar to the ci_pae_l3_pdir{,pa} variables, amd64's xen_current_user_pgd
becomes an element of cpu_info (slowly paving the way for MP world).

- bootinfo_source struct declaration is modified, to cope with paddr_t size
change with PAE (it is not correct to assume that bs_addr is a paddr_t when
compiled with PAE - it should remain 32 bits). bs_addrs is now a
void * array (in bootloader's code under i386/stand/, the bs_addrs
is a physaddr_t, which is an unsigned long).

- fixes in multiboot code (same reason as bootinfo): paddr_t size
change. I used Elf32_* types, use RELOC() where necessary, and move the
memcpy() functions out of the if/else if (I do not expect sym and str tables
to overlap with ELF).

- 64 bits atomic functions for pmap

- all pmap_pdirpa access are now done through the pmap_pdirpa macro. It
hides the L3/L2 stuff from PAE, as well as the pm_pdirpa change in
struct pmap (it now becomes a PDP_SIZE array, with or without PAE).

- manipulation of recursive mappings ( PDIR_SLOT_{,A}PTEs ) is done via
loops on PDP_SIZE.

See also http://mail-index.netbsd.org/port-i386/2010/07/17/msg002062.html

No objection raised on port-i386@ and port-xen@ for about a week.

XXX kvm(3) will be fixed in another patch to properly handle both PAE and !PAE
kernel dumps (VA => PA macros are slightly different, and need proper 64 bits
PA support in kvm_i386).

XXX Mixing PAE and !PAE modules may lead to unwanted/unexpected results. This
cannot be solved easily, and needs lots of thinking before being declared
safe (paddr_t/bus_addr_t size handling, PD/PT macros abstractions).
 1.22 09-May-2010  rmind Drop x86 MD package/core/smt IDs and use MI.
 1.21 18-Apr-2010  jym This patch fixes the NX regression issue observed on amd64 kernels, where
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).

- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).

- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.

- replace checks against CPUID_TSC with the cpu_hascounter() function.

- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().

- use cpu_init_msrs() for i386. It will be eventually used later for NX
feature under i386 PAE kernels.

- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().

- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).

This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.

XXX Should kernel rev be bumped?

XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
 1.20 18-Jan-2010  rmind branches: 1.20.2; 1.20.4;
x86_cpu_topology, not toplogy.
 1.19 09-Jan-2010  cegger add x2apic support.
patch presented on current-users@, port-i386@ and port-amd64@ on 2009-12-22

No comments.
 1.18 21-Nov-2009  rmind Use lwp_getpcb() on x86 MD code, clean from struct user usage.
 1.17 30-Apr-2009  rmind Move x86 CPU topology detection code into the separate file (as it was originally).
OK by <yamt>.
 1.16 19-Apr-2009  ad cpuctl:

- Add interrupt shielding (direct hardware interrupts away from the
specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
a bug where delivery is not accepted by an LAPIC after redistribution. It
also needs re-balancing to make things fair after interrupts are turned
back on for a CPU.
 1.15 16-Apr-2009  rmind - Add macros to handle (some) trapframe registers for common x86 code.
- Merge i386 and amd64 syscall.c into x86. No functional changes intended.

Proposed on (port-i386 & port-amd64). Unfortunately, I cannot merge these
lists into the single port-x86. :(
 1.14 30-Mar-2009  tsutsui #include <sys/types.h>, not <stdbool.h> for userland
in defined(_STANDALONE) case too.
 1.13 28-Mar-2009  rmind kvtop: change return type to paddr_t.
 1.12 27-Mar-2009  dyoung If defined(_KERNEL), #include <sys/types.h>, otherwise #include
<stdbool.h>, for the bool definition that we need. cpu.h only got the
definition by chance, before.
 1.11 07-Mar-2009  ad Expose more stuff if _KMEMUSER is defined.
 1.10 29-Dec-2008  pooka branches: 1.10.2;
_LKM -> _MODULE
 1.9 25-Oct-2008  mrg branches: 1.9.2; 1.9.4; 1.9.8; 1.9.10;
this uses an evcnt so, include <sys/evcnt.h>
 1.8 13-Oct-2008  cegger print features4: cpuid fn80000001 %ecx on AMD CPUs.
 1.7 30-May-2008  ad branches: 1.7.2; 1.7.6; 1.7.8;
fillw is dead.
 1.6 28-May-2008  ad Remove X86_MAXPROCS. There is still a 32-cpu limit, but it's now using
the MI constants.
 1.5 22-May-2008  ad Mark x86_curlwp() with __attribute__ ((const)), so gcc can CSE it and know
that it does not clobber global data.
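
The effect of the attribute can be illustrated with a stand-alone sketch; demo_square and demo_sum are hypothetical and unrelated to x86_curlwp() itself. A function marked const promises that its result depends only on its arguments and that it has no side effects, so the compiler is free to merge repeated calls.

```c
#include <assert.h>

/* Result depends only on the argument; no global state is read or written. */
__attribute__((const)) static int
demo_square(int x)
{
	return x * x;
}

/*
 * Two calls with the same argument: with the const attribute the
 * compiler may evaluate demo_square() once and reuse the result (CSE).
 */
static int
demo_sum(int x)
{
	int a = demo_square(x);
	int b = demo_square(x);

	assert(a == b);	/* both calls must agree for the merge to be valid */
	return a + b;
}
```

The attribute is a promise made by the programmer, not something the compiler verifies, so it is only safe on functions that really behave this way.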
 1.4 12-May-2008  ad branches: 1.4.2; 1.4.4;
- Make cpu_number() return MI index, otherwise the pmap cannot work on
systems with lapic IDs > X86_MAXPROCS.
- Kill cpu_info[] array and use MI cpu_lookup_byindex().
 1.3 11-May-2008  ad Don't reload LDTR unless a new value, which only happens for USER_LDT.
 1.2 11-May-2008  ad Stop using APIC IDs to identify CPUs for software purposes. Allows for
APIC IDs beyond 31, which has been possible for some time now.
 1.1 11-May-2008  ad Share cpu.h between the x86 ports.
 1.4.4.3 04-Jun-2008  yamt sync with head
 1.4.4.2 18-May-2008  yamt sync with head.
 1.4.4.1 12-May-2008  yamt file cpu.h was added on branch yamt-pf42 on 2008-05-18 12:33:01 +0000
 1.4.2.6 09-Oct-2010  yamt sync with head
 1.4.2.5 11-Aug-2010  yamt sync with head.
 1.4.2.4 11-Mar-2010  yamt sync with head
 1.4.2.3 04-May-2009  yamt sync with head.
 1.4.2.2 16-May-2008  yamt sync with head.
 1.4.2.1 12-May-2008  yamt file cpu.h was added on branch yamt-nfs-mp on 2008-05-16 02:23:27 +0000
 1.7.8.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.7.8.1 19-Oct-2008  haad Sync with HEAD.
 1.7.6.2 23-Jun-2008  wrstuden Add files to branch that were added on -current.

After this, all that's left of update is to merge some changes
that had conflicts.
 1.7.6.1 30-May-2008  wrstuden file cpu.h was added on branch wrstuden-revivesa on 2008-06-23 05:02:12 +0000
 1.7.2.3 17-Jan-2009  mjf Sync with HEAD.
 1.7.2.2 02-Jun-2008  mjf Sync with HEAD.
 1.7.2.1 30-May-2008  mjf file cpu.h was added on branch mjf-devfs2 on 2008-06-02 13:22:50 +0000
 1.9.10.2 20-May-2011  matt bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
 1.9.10.1 21-Apr-2010  matt sync to netbsd-5
 1.9.8.1 23-Apr-2010  snj Apply patch (requested by jym in ticket #1380):
Fix the NX regression issue observed on amd64 kernels, where per-page
execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
 1.9.4.2 22-Apr-2010  snj Apply patch (requested by jym in ticket #1380):
Fix the NX regression issue observed on amd64 kernels, where per-page
execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
 1.9.4.1 16-Jun-2009  snj Pull up following revision(s) (requested by rmind in ticket #782):
sys/arch/x86/conf/files.x86: revision 1.52 via patch
sys/arch/x86/include/cpu.h: revision 1.17
sys/arch/x86/x86/cpu_topology.c: revision 1.1
sys/arch/x86/x86/identcpu.c: revision 1.16 via patch
Move x86 CPU topology detection code into the separate file (as it was
originally).
OK by <yamt>.
 1.9.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.9.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.10.2.9 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.10.2.8 02-May-2011  jym Sync with head.
 1.10.2.7 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.10.2.6 10-Jan-2011  jym Sync with HEAD
 1.10.2.5 24-Oct-2010  jym Sync with HEAD
 1.10.2.4 01-Nov-2009  jym - Upgrade suspend/resume code to comply with Xen2 removal.
- Add support for PAE domUs suspend/resume.
- Fix an issue regarding initialization of the xbd ring I/O that could end
badly during resume, with invalid block operations submitted to dom0 backend.

NetBSD supports PAE under x86_32 by considering the L2 page as being
4 pages long instead of 1.

Xen validates the page types during resume. Sadly, the hypervisor handles
alternative recursive mappings (== PG/PD entries pointing to pages other
than self) inadequately, which could lead to incorrect page pinning.

As a result, the important change with this patch is to clear these alternative
mappings during suspend, and reset them back to their former self upon
resume. For PAE, approx. all 4 PDIR_SLOT_PTEs could be considered as
alternative recursive mappings.

See comments in pmap.c for further details.

Now, let the testing and bug hunting begin.
 1.10.2.3 01-Nov-2009  jym Sync with HEAD.
 1.10.2.2 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.10.2.1 09-Feb-2009  jym Initial code for xen save/restore/migrate facilities.

- split the attach code of frontends into two halves: one that is only needed
during autoconf(9) attach/detach phases, and one used at each save/restore
of device state (between suspend and resume).

Applies to hypervisor, xencons, xenbus, xbd, and xennet.

- add a rwlock(9) ("ptom_lock") to protect the different parts in the kernel
that manipulate MFNs (which could change between a suspend and a resume,
without the kernel noticing it). Parts that require MFNs acquire a reader lock,
while suspend code will acquire a writer lock to ensure that no other parts
of the kernel still use MFNs.

- integrate the suspend code with sysmon.

- various things in pmap(9), and clock.

TODO:
- factorize code a bit more inside frontends drivers.
- remove all alternative recursive (APDP_PDE) mappings found in PD/PT during
suspend, as Xen does not support them.
- abstract the ptom_lock locking, it is only required when kernel preemption
is enabled, or on MP systems.

Current code works mostly. You may experience difficulties in some corner
cases (dom0 warnings about xennet interface errors, and Xen tools failing to
validate NetBSD's alternative pmaps).
 1.20.4.6 12-Jun-2011  rmind sync with head
 1.20.4.5 31-May-2011  rmind sync with head
 1.20.4.4 21-Apr-2011  rmind sync with head
 1.20.4.3 05-Mar-2011  rmind sync with head
 1.20.4.2 30-May-2010  rmind sync with head
 1.20.4.1 26-Apr-2010  rmind Apply renovated patch to significantly reduce TLB shootdowns in x86 pmap,
also provide TLBSTATS option to measure and track TLB shootdowns. Details:

http://mail-index.netbsd.org/port-i386/2009/01/11/msg001018.html

Patch from Andrew Doran, proposed on tech-x86 [sic], in January 2009.

XXX: amd64 and xen are not yet converted; work in progress.
 1.20.2.3 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.20.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.20.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.26.4.1 05-Mar-2011  bouyer Sync with HEAD
 1.26.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.34.2.7 22-Oct-2011  bouyer Various interrupt fixes, mainly:
keep a per-cpu mask of enabled events, and use it to get pending events.
A cpu-specific event (all of them at this time) should not be ever masked
by another CPU, because it may prevent the target CPU from seeing it
(the clock events all fire at once, for example).
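
The idea in 1.34.2.7 can be sketched roughly as follows. This is an assumption-laden illustration, not the Xen event-channel code: the 64-event limit, the two-CPU array, and every name ending in or starting with evt_ are hypothetical.

```c
#include <stdint.h>

#define NCPU_SKETCH 2	/* hypothetical CPU count */

/* Shared: one bit per raised event (hypothetical layout). */
static uint64_t evt_pending;
/* Per CPU: which events this CPU has enabled. */
static uint64_t evt_enabled[NCPU_SKETCH];

void
evt_enable(unsigned cpu, unsigned evt)
{
	evt_enabled[cpu] |= UINT64_C(1) << evt;
}

void
evt_disable(unsigned cpu, unsigned evt)
{
	/* Only touches this CPU's mask; other CPUs are unaffected. */
	evt_enabled[cpu] &= ~(UINT64_C(1) << evt);
}

void
evt_raise(unsigned evt)
{
	evt_pending |= UINT64_C(1) << evt;
}

/* Pending events as seen by one CPU: raised and enabled on that CPU. */
uint64_t
evt_pending_for(unsigned cpu)
{
	return evt_pending & evt_enabled[cpu];
}
```

Because each CPU filters the pending set through its own mask, one CPU disabling an event cannot hide that event from the CPU it is targeted at.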
 1.34.2.6 01-Sep-2011  cherry fix %cr3 init. from mhitch@, tested by riz@ & mhitch@
 1.34.2.5 20-Aug-2011  cherry PAE MP support (preliminary), amd64 per-cpu L4 model redesigned, i386 pmap_pa_start/end fixup
 1.34.2.4 17-Aug-2011  cherry Pullup relevant changes from -current
 1.34.2.3 16-Jul-2011  cherry Introduce a per-cpu "shadow" for pmap_kernel()'s L4 page
 1.34.2.2 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.34.2.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.40.2.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.40.2.4 30-Oct-2012  yamt sync with head
 1.40.2.3 23-May-2012  yamt sync with head.
 1.40.2.2 17-Apr-2012  yamt sync with head
 1.40.2.1 10-Nov-2011  yamt sync with head
 1.43.4.5 29-Apr-2012  mrg sync to latest -current.
 1.43.4.4 06-Mar-2012  mrg sync to -current
 1.43.4.3 06-Mar-2012  mrg sync to -current
 1.43.4.2 04-Mar-2012  mrg sync to latest -current.
 1.43.4.1 18-Feb-2012  mrg merge to -current.
 1.47.2.3 09-May-2012  riz Pull up following revision(s) (requested by rmind in ticket #202):
sys/arch/x86/include/cpuvar.h: revision 1.46
sys/arch/xen/include/xenpmap.h: revision 1.34
sys/arch/i386/include/param.h: revision 1.77
sys/arch/x86/x86/pmap_tlb.c: revision 1.5
sys/arch/x86/x86/pmap_tlb.c: revision 1.6
sys/arch/i386/i386/genassym.cf: revision 1.92
sys/arch/xen/x86/cpu.c: revision 1.91
sys/arch/x86/x86/pmap.c: revision 1.177
sys/arch/xen/x86/xen_pmap.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.31
sys/kern/subr_kcpuset.c: revision 1.5
sys/arch/amd64/include/param.h: revision 1.18
sys/sys/kcpuset.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.26
sys/arch/x86/x86/mtrr_i686.c: revision 1.27
sys/arch/xen/x86/x86_xpmap.c: revision 1.43
sys/arch/x86/x86/cpu.c: revision 1.98
sys/arch/amd64/amd64/mptramp.S: revision 1.14
sys/kern/sys_sched.c: revision 1.42
sys/arch/amd64/amd64/genassym.cf: revision 1.50
sys/arch/i386/i386/mptramp.S: revision 1.24
sys/arch/x86/include/pmap.h: revision 1.52
sys/arch/x86/include/cpu.h: revision 1.50
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
- pmap_tlb_shootdown: do not overwrite tp_cpumask with pm_cpus, but merge
like pm_kernel_cpus. Remove unnecessary intersection with kcpuset_running.
Do not reset tp_userpmap if pmap_kernel().
- Remove pmap_tlb_mailbox_t wrapping, which is pointless after recent changes.
- pmap_tlb_invalidate, pmap_tlb_intr: constify for packet structure.
i686_mtrr_init_first: handle the case when there are no variable-size MTRR
registers available (i686_mtrr_vcnt == 0).
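
To show why replacing hardcoded CPU bitmasks removes the CPU-count limit, here is a minimal userland sketch of a dynamically sized CPU set with a merge operation. It is not the kcpuset(9) API; struct cpuset_sketch and the cpuset_sketch_*() functions are hypothetical names chosen for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* A CPU set sized at run time instead of a single 32-bit mask. */
struct cpuset_sketch {
	size_t nwords;
	unsigned long bits[];	/* flexible array member */
};

struct cpuset_sketch *
cpuset_sketch_create(unsigned ncpu)
{
	size_t nwords = (ncpu + WORD_BITS - 1) / WORD_BITS;
	struct cpuset_sketch *cs;

	cs = calloc(1, sizeof(*cs) + nwords * sizeof(cs->bits[0]));
	if (cs != NULL)
		cs->nwords = nwords;
	return cs;
}

void
cpuset_sketch_set(struct cpuset_sketch *cs, unsigned cpu)
{
	cs->bits[cpu / WORD_BITS] |= 1UL << (cpu % WORD_BITS);
}

bool
cpuset_sketch_isset(const struct cpuset_sketch *cs, unsigned cpu)
{
	return (cs->bits[cpu / WORD_BITS] >> (cpu % WORD_BITS)) & 1UL;
}

/* Merge src into dst (dst |= src) rather than overwriting dst. */
void
cpuset_sketch_merge(struct cpuset_sketch *dst, const struct cpuset_sketch *src)
{
	size_t n = dst->nwords < src->nwords ? dst->nwords : src->nwords;

	for (size_t i = 0; i < n; i++)
		dst->bits[i] |= src->bits[i];
}
```

The merge function mirrors the wording of the pmap_tlb_shootdown item above (merge instead of overwrite); how the kernel actually implements it is not shown in this log.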
 1.47.2.2 05-Mar-2012  sborrill Pull up the following revisions(s) (requested by bouyer in ticket #80):
sys/arch/xen/x86/x86_xpmap.c: revision 1.42
sys/arch/x86/include/specialreg.h: revision 1.56
sys/arch/amd64/amd64/machdep.c: revision 1.179
sys/arch/i386/i386/locore.S: revision 1.97
sys/arch/i386/i386/machdep.c: revision 1.723 via patch
sys/arch/x86/include/cpu.h: revision 1.49

Fix possible FPU registers corruption on context switches.
Fix type of pointers passed to some hypercalls.
 1.47.2.1 22-Feb-2012  riz Pull up following revision(s) (requested by bouyer in ticket #29):
sys/arch/xen/x86/x86_xpmap.c: revision 1.39
sys/arch/xen/include/hypervisor.h: revision 1.37
sys/arch/xen/include/intr.h: revision 1.34
sys/arch/xen/x86/xen_ipi.c: revision 1.10
sys/arch/x86/x86/cpu.c: revision 1.97
sys/arch/x86/include/cpu.h: revision 1.48
sys/uvm/uvm_map.c: revision 1.315
sys/arch/x86/x86/pmap.c: revision 1.165
sys/arch/xen/x86/cpu.c: revision 1.81
sys/arch/x86/x86/pmap.c: revision 1.167
sys/arch/xen/x86/cpu.c: revision 1.82
sys/arch/x86/x86/pmap.c: revision 1.168
sys/arch/xen/x86/xen_pmap.c: revision 1.17
sys/uvm/uvm_km.c: revision 1.122
sys/uvm/uvm_kmguard.c: revision 1.10
sys/arch/x86/include/pmap.h: revision 1.50
Apply patch proposed in PR port-xen/45975 (this does not solve the exact
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunately, all addresses disassemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.
2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.
To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.
To fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), with the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.
While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
When using uvm_km_pgremove_intrsafe() make sure mappings are removed
before returning the pages to the free pool. Otherwise, under Xen,
a page which still has a writable mapping could be allocated for
a PDP by another CPU and the hypervisor would refuse it (this is
PR port-xen/45975).
For this, move the pmap_kremove() calls inside uvm_km_pgremove_intrsafe(),
and do pmap_kremove()/uvm_pagefree() in batch of (at most) 16 entries
(as suggested by Chuck Silvers on tech-kern@, see also
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012727.html and
followups).
Avoid early use of xen_kpm_sync(); locks are not available at this time.
Don't call cpu_init() twice.
Makes LOCKDEBUG kernels boot again
Revert pmap_pte_flush() -> xpq_flush_queue() in previous.
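
The ordering described above (remove mappings first, then free, in batches of at most 16) can be sketched as follows. This is a simplified model under stated assumptions: struct page_sketch, unmap_range(), free_page() and pgremove_batched() are hypothetical and do not reproduce uvm_km_pgremove_intrsafe().

```c
#include <stddef.h>

#define BATCH 16	/* batch size quoted in the log message */

/* Hypothetical page record used only to check the ordering. */
struct page_sketch {
	int mapped;
	int freed;
	int freed_while_mapped;
};

/* Stand-in for removing the kernel mappings of n pages. */
static void
unmap_range(struct page_sketch *pg, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pg[i].mapped = 0;
}

/* Stand-in for returning one page to the free pool. */
static void
free_page(struct page_sketch *pg)
{
	if (pg->mapped)
		pg->freed_while_mapped = 1;	/* the bug being avoided */
	pg->freed = 1;
}

/* Unmap each batch before freeing any page in it; returns batch count. */
size_t
pgremove_batched(struct page_sketch *pages, size_t npages)
{
	size_t batches = 0;

	for (size_t off = 0; off < npages; off += BATCH) {
		size_t n = npages - off < BATCH ? npages - off : BATCH;

		unmap_range(&pages[off], n);
		for (size_t i = 0; i < n; i++)
			free_page(&pages[off + i]);
		batches++;
	}
	return batches;
}
```

With this ordering no page reaches the free pool while it still has a mapping, which is the property the log message says Xen requires.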
 1.52.2.3 03-Dec-2017  jdolecek update from HEAD
 1.52.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.52.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.53.2.1 18-May-2014  rmind sync with head
 1.66.8.1 19-Mar-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #1118):
sys/arch/x86/include/cpuvar.h: revision 1.47
sys/arch/x86/x86/cpu.c: revision 1.117
sys/arch/x86/x86/identcpu.c: revision 1.49
sys/arch/x86/include/cpu.h: revision 1.67

Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.66.6.2 28-Aug-2017  skrll Sync with HEAD
 1.66.6.1 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.66.4.2 09-Oct-2018  snj Pull up following revision(s) (requested by msaitoh in ticket #1636):
sys/arch/x86/include/cacheinfo.h: 1.23-1.26
sys/arch/x86/include/cpu.h: 1.70
sys/arch/x86/include/specialreg.h: 1.91-1.93,1.98,1.100,1.102-1.124,1.126,1.130 via patch
sys/arch/x86/x86/cpu_topology.c: 1.10
sys/arch/x86/x86/identcpu.c: 1.56-1.57,1.70 via patch
usr.sbin/cpuctl/arch/i386.c: 1.71,1.75-1.79,1.81-1.85 via patch
Add some register definitions for x86:
- Add CLWB bit.
- Fix a few (unused) MSR values, and add some bit definitions of
MSR_EFER from Murray Armfield in PR#42861.
- CPUID_CFLUSH bit is not for CFLUSH insn but CLFLUSH insn, so modify
comments and snprintb() string.
- Define CPUID Fn00000001 %ebx bits and use them.
No functional change.
- Add Structured Extended Flags Enumeration Leaf's bit definitions:
AVX512_{IFMA,VBMI2,VNNI,BITALG,VPOPCNTDQ,4VNNIW,4FMAPS},GFNI&VAES.
- Add Turbo Boost Max Technology 3.0 bit.
- Add AMD SVM features definitions.
- Add Intel cpuid 7 %edx IBRS and STIBP bit definitions.
- Fix swapped comments for EFER LME and LMA
- Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add MSR_IA32_ARCH_CAPABILITIES definition.
- Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.
- Add Intel Deterministic Address Translation Parameter Leaf(0x18)
definitions.
- s/CLFUSH/CLFLUSH/
- Add AMD's Disable Indirect Branch Predictor bit definition.
- Add the MSR bits definitions for IBRS, STIBP and IBPB.
- Add Intel Fn0000_0006 %eax new bit 14-20 (HWP stuff).
- Intel Fn0000_0007 %ecx bit 22 is for both RDPID and IA32_TSC_AUX.
- Add AMD's CPUID Fn80000001 %edx MMX and FXSR bit definitions.
- Add RDCL_NO and IBRS_ALL.
- Add SSBD and RSBA bit definitions.
- Add AMD's SSB bit definitions for F15H, F16H and F17H.
- Add cpuid 7 edx L1D_FLUSH bit.
- Add IA32_ARCH_SKIP_L1DFL_VMENTRY bit.
- Add IA32_FLUSH_CMD MSR.
- Add yet another Shared L2 TLB (2M/4M pages).
- Add 3way and 6way of L2 cache or TLB on AMD CPU.
- AMD L3 cache association bitfield is not 8bit but 4bit like others
association bitfields.
- Sort entries. No functional change.
- Modify comment, fix typo in comment and add comment.
cpuctl(8):
- Add detection for Quark X1000, Xeon E5 v4, E7 v4,
Core i7-69xx Extreme Edition, Xeon Scalable (Skylake),
Xeon Phi [357]200 (Knights Landing), Atom (Goldmont),
Atom (Denverton), Future Core (Cannon Lake), Atom (Goldmont Plus),
Xeon Phi 7215, 7285 and 7295 (Knights Mill) and
7th or 8th gen Core (Kaby Lake, Coffee Lake).
- Print Structured Extended Feature leaf Fn0000_0007 %ebx on AMD,too.
- Print Fn0000_0007 %ecx on Intel.
- Print Intel cpuid 7 %edx.
- Parse the TLB info from `cpuid leaf 18H' on Intel processor.
- Use aprint_error_dev() for error output.
 1.66.4.1 06-Mar-2016  martin Pull up following revision(s) (requested by msaitoh in ticket #1118):
sys/arch/x86/include/cpuvar.h: revision 1.47
sys/arch/x86/x86/cpu.c: revision 1.117
sys/arch/x86/x86/identcpu.c: revision 1.49
sys/arch/x86/include/cpu.h: revision 1.67
Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.67.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.67.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.67.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.69.2.1 19-May-2017  pgoyette Resolve conflicts from previous merge (all resulting from $NetBSD
keyword expansion)
 1.71.2.10 24-Dec-2021  martin Pull up the following (all via patch), requested by msaitoh in ticket #1721:

usr.sbin/cpuctl/arch/i386.c 1.118-1.119, 1.121-1.122
usr.sbin/cpuctl/arch/cpuctl_i386.h 1.6
sys/arch/x86/x86/identcpu_subr.c 1.8-1.9
sys/arch/x86/x86/identcpu.c 1.123
sys/arch/x86/include/cacheinfo.h 1.30
sys/arch/x86/include/cpu.h 1.132

- Fix a bug that some TLB related lines were not printed.
- Fix a bug that STLB is printed as DTLB.
- If a TLB is variable sized, print the max size instead of error message.
- Cosmetic changes to improve readability.
 1.71.2.9 05-Aug-2020  martin Pull up the following revisions, requested by msaitoh in ticket #1593:

sys/arch/x86/conf/files.x86 1.108
sys/arch/x86/include/apicvar.h 1.7 via patch
sys/arch/x86/include/cpu.h 1.121
sys/arch/x86/x86/cpu.c 1.185 via patch
sys/arch/x86/x86/hyperv.c 1.7
sys/arch/x86/x86/tsc.c 1.41
sys/arch/xen/conf/files.xen 1.181

Get TSC frequency from CPUID 0x15 and/or 0x16 if it's available.
This change fixes a problem that newer Intel processors' timer
counts very slowly.
 1.71.2.8 09-Mar-2019  martin Pull up following revision(s) via patch (requested by nonaka in ticket #1210):

sys/dev/hyperv/vmbusvar.h: revision 1.1
sys/dev/hyperv/hvs.c: revision 1.1
sys/dev/hyperv/if_hvn.c: revision 1.1
sys/dev/hyperv/vmbusic.c: revision 1.1
sys/arch/x86/x86/lapic.c: revision 1.69
sys/arch/x86/isa/clock.c: revision 1.34
sys/arch/x86/include/intrdefs.h: revision 1.22
sys/arch/i386/conf/GENERIC: revision 1.1201
sys/arch/x86/x86/hyperv.c: revision 1.1
sys/arch/x86/include/cpu.h: revision 1.105
sys/arch/x86/x86/x86_machdep.c: revision 1.124
sys/arch/i386/conf/GENERIC: revision 1.1203
sys/arch/amd64/amd64/genassym.cf: revision 1.74
sys/arch/i386/conf/GENERIC: revision 1.1204
sys/arch/amd64/conf/GENERIC: revision 1.520
sys/arch/x86/x86/hypervreg.h: revision 1.1
sys/arch/amd64/amd64/vector.S: revision 1.69
sys/dev/hyperv/hvshutdown.c: revision 1.1
sys/dev/hyperv/hvshutdown.c: revision 1.2
sys/dev/usb/if_urndisreg.h: file removal
sys/arch/x86/x86/cpu.c: revision 1.167
sys/arch/x86/conf/files.x86: revision 1.107
sys/dev/usb/if_urndis.c: revision 1.20
sys/dev/hyperv/vmbusicreg.h: revision 1.1
sys/dev/hyperv/hvheartbeat.c: revision 1.1
sys/dev/hyperv/vmbusicreg.h: revision 1.2
sys/dev/hyperv/hvheartbeat.c: revision 1.2
sys/dev/hyperv/files.hyperv: revision 1.1
sys/dev/ic/rndisreg.h: revision 1.1
sys/arch/i386/i386/genassym.cf: revision 1.111
sys/dev/ic/rndisreg.h: revision 1.2
sys/dev/hyperv/hyperv_common.c: revision 1.1
sys/dev/hyperv/hvtimesync.c: revision 1.1
sys/dev/hyperv/hypervreg.h: revision 1.1
sys/dev/hyperv/hvtimesync.c: revision 1.2
sys/dev/hyperv/vmbusicvar.h: revision 1.1
sys/dev/hyperv/if_hvnreg.h: revision 1.1
sys/arch/x86/x86/lapic.c: revision 1.70
sys/arch/amd64/amd64/vector.S: revision 1.70
sys/dev/ic/ndisreg.h: revision 1.1
sys/arch/amd64/conf/GENERIC: revision 1.516
sys/dev/hyperv/hypervvar.h: revision 1.1
sys/arch/amd64/conf/GENERIC: revision 1.518
sys/arch/amd64/conf/GENERIC: revision 1.519
sys/arch/i386/conf/files.i386: revision 1.400
sys/dev/acpi/vmbus_acpi.c: revision 1.1
sys/dev/hyperv/vmbus.c: revision 1.1
sys/dev/hyperv/vmbus.c: revision 1.2
sys/arch/x86/x86/intr.c: revision 1.144
sys/arch/i386/i386/vector.S: revision 1.83
sys/arch/amd64/conf/files.amd64: revision 1.112

separate RNDIS definitions from urndis(4) for use with Hyper-V NetVSC.

-

Added Microsoft Hyper-V support, ported from OpenBSD and FreeBSD.
The graphical console does not work on Gen.2 VMs yet. To use the serial console,
enter "consdev com,0x3f8,115200" on efiboot.

-

Add __diagused.

-

PR/53984: Partial revert of modify lapic_calibrate_timer() in lapic.c r1.69.

-

Update Hyper-V related drivers description.

-

Remove unused definition.

-

Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.
NFCI intended.

-

commented out hvkvp entry.

-

fix typo. pointed out by pgoyette@n.o.

-

Use IDTVEC instead of NENTRY for handle_hyperv_hypercall.

-

Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.
 1.71.2.7 23-Jun-2018  martin Pull up the following, via patch, requested by maxv in ticket #897:

sys/arch/amd64/amd64/locore.S 1.166 (patch)
sys/arch/i386/i386/locore.S 1.157 (patch)
sys/arch/x86/include/cpu.h 1.92 (patch)
sys/arch/x86/include/fpu.h 1.9 (patch)
sys/arch/x86/x86/fpu.c 1.33-1.39 (patch)
sys/arch/x86/x86/identcpu.c 1.72 (patch)
sys/arch/x86/x86/vm_machdep.c 1.34 (patch)
sys/arch/x86/x86/x86_machdep.c 1.116,1.117 (patch)

Support eager fpu switch, to work around INTEL-SA-00145.
Provide a sysctl machdep.fpu_eager, which gets automatically
initialized to 1 on affected CPUs.
 1.71.2.6 09-Jun-2018  martin Pullup the following revisions, requested by maxv in ticket #865:

sys/arch/amd64/amd64/machdep.c 1.303 (patch)
sys/arch/amd64/conf/GENERIC 1.492 (patch)
sys/arch/amd64/conf/files.amd64 1.103 (patch)
sys/arch/i386/i386/machdep.c 1.806 (patch)
sys/arch/i386/conf/GENERIC 1.1179 (patch)
sys/arch/i386/conf/files.i386 1.393 (patch)
sys/arch/x86/include/cpu.h 1.91 (patch)
sys/arch/x86/include/specialreg.h upto 1.126 (patch)
sys/arch/x86/x86/x86_machdep.c upto 1.115 (patch, adapted)
sys/arch/x86/x86/spectre.c upto 1.19 (patch, adapted,
no IBRS,
SpectreV2 mitigations not
enabled by default)

Backport the hardware SpectreV2 and SpectreV4 mitigations.
 1.71.2.5 01-Apr-2018  martin Pull up following revision(s) (requested by maxv in ticket #681):
sys/arch/x86/include/cpu.h: revision 1.90
sys/arch/x86/x86/identcpu.c: revision 1.71
Retrieve cpuid.7:%edx.
 1.71.2.4 22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.71.2.3 16-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in #635:

sys/arch/amd64/amd64/gdt.c 1.39-1.45 (patch)
sys/arch/amd64/amd64/machdep.c 1.284,1.287,1.288 (patch)
sys/arch/amd64/include/param.h 1.23 (patch)
sys/arch/amd64/include/types.h 1.53 (patch)
sys/arch/x86/include/cpu.h 1.87 (patch)
sys/arch/x86/include/pmap.h 1.73,1.74 (patch)
sys/arch/x86/x86/cpu.c 1.142 (patch)
sys/arch/x86/x86/intr.c 1.117 (partial),1.120 (patch)
sys/arch/x86/x86/pmap.c 1.276 (patch)

Initialize ist0 in cpu_init_tss.
Backport __HAVE_PCPU_AREA.
 1.71.2.2 13-Mar-2018  martin Pullup the following revisions via patch, requested by maxv in ticket #629:

sys/arch/amd64/amd64/genassym.cf 1.63,1.64
sys/arch/amd64/amd64/locore.S 1.144
sys/arch/amd64/amd64/machdep.c 1.281-1.283
sys/arch/i386/i386/genassym.cf 1.105-1.106
sys/arch/i386/i386/locore.S 1.155
sys/arch/i386/i386/machdep.c 1.802 (adapted),1.803
sys/arch/x86/include/cpu.h 1.85
sys/arch/x86/x86/intr.c 1.115-1.116
sys/arch/x86/x86/pmap.c 1.275
sys/arch/x86/x86/sys_machdep.c 1.45
sys/arch/xen/x86/cpu.c 1.117

Stop sharing the double-fault stack.
Merge the TSS structures into one single cpu_tss structure, and
allocate it dynamically.
 1.71.2.1 08-Mar-2018  martin Pull up following revision(s) (requested by maxv in ticket #611):
sys/arch/x86/x86/cpu.c: revision 1.134 (patch)
sys/arch/x86/include/cpu.h: revision 1.78 (patch)
sys/arch/i386/i386/machdep.c: revision 1.792 (patch)

style, and move some i386-specific code into i386/
 1.74.2.2 16-Jul-2017  cherry 2302677
 1.74.2.1 16-Jul-2017  cherry file cpu.h was added on branch perseant-stdc-iso10646 on 2017-07-16 14:02:49 +0000
 1.89.2.7 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.89.2.6 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.89.2.5 20-Oct-2018  pgoyette Sync with head
 1.89.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.89.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.89.2.2 25-Jun-2018  pgoyette Sync with HEAD
 1.89.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.92.2.3 21-Apr-2020  martin Sync with HEAD
 1.92.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.92.2.1 10-Jun-2019  christos Sync with HEAD
 1.107.2.2 24-Dec-2021  martin Pull up the following (all via patch), requested by msaitoh in ticket #1396:

usr.sbin/cpuctl/arch/i386.c 1.118-1.119, 1.121-1.122
usr.sbin/cpuctl/arch/cpuctl_i386.h 1.6
sys/arch/x86/x86/identcpu_subr.c 1.8-1.9
sys/arch/x86/x86/identcpu.c 1.123
sys/arch/x86/include/cacheinfo.h 1.30
sys/arch/x86/include/cpu.h 1.132

- Fix a bug that some TLB related lines were not printed.
- Fix a bug that STLB is printed as DTLB.
- If a TLB is variable sized, print the max size instead of error message.
- Cosmetic changes to improve readability.
 1.107.2.1 15-Jul-2020  martin Pull up the following, requested by msaitoh in ticket #1015

sys/arch/x86/conf/files.x86 1.108 (via patch)
sys/arch/x86/include/apicvar.h 1.7 (via patch)
sys/arch/x86/include/cpu.h 1.121 (via patch)
sys/arch/x86/x86/cpu.c 1.185 (via patch)
sys/arch/x86/x86/hyperv.c 1.7 (via patch)
sys/arch/x86/x86/tsc.c 1.41 (via patch)
sys/arch/xen/conf/files.xen 1.181 (via patch)

Get TSC frequency from CPUID 0x15 and/or 0x16 if it's available.
This change fixes a problem that newer Intel processors' timer
counts very slowly.
 1.116.2.1 17-Jan-2020  ad Sync with head.
 1.117.4.7 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.117.4.6 18-Apr-2020  bouyer Add PVHVM multiprocessor support:
We need the hypervisor to be set up before CPUs attach.
Move hypervisor setup to a new function xen_hvm_init(), called at the
beginning of mainbus_attach(). This function searches the cfdata[] array
to see if the hypervisor device is enabled (so you can disable PV
support with
disable hypervisor
from userconf).
For HVM, ci_cpuid doesn't match the virtual CPU index needed by Xen.
Introduce ci_vcpuid to cpu_info. Introduce xen_hvm_init_cpu(), to be
called for each CPU in its context, which initializes ci_vcpuid and
ci_vcpu, and sets up the event callback.
Change Xen code to use ci_vcpuid.

Do not call lapic_calibrate_timer() for VM_GUEST_XENPVHVM, we will use
Xen timers.

Don't call lapic_initclocks() from cpu_hatch(); instead set
x86_cpu_initclock_func to lapic_initclocks() in lapic_calibrate_timer(),
and call *(x86_cpu_initclock_func)() from cpu_hatch().
Also call x86_cpu_initclock_func from cpu_attach() for the boot CPU.
As x86_cpu_initclock_func is called for all CPUs, x86_initclock_func can
be a NOP for lapic timer.

Reorganize Xen code for x86_initclock_func/x86_cpu_initclock_func.
Move x86_cpu_idle_xen() to hypervisor_machdep.c
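
The hook arrangement described in 1.117.4.6 can be sketched like this. It is a minimal model, assuming a no-op default: the *_sketch names are hypothetical, and the real x86_cpu_initclock_func declaration is not shown in this log.

```c
/* Counts how often the (stand-in) lapic clock init ran. */
static int lapic_clock_inits;

/* Default hook: do nothing (e.g. when another timer source is used). */
static void
initclock_nop(void)
{
}

/* Stand-in for lapic_initclocks(). */
static void
lapic_initclocks_sketch(void)
{
	lapic_clock_inits++;
}

/* Per-CPU clock init hook, called from every CPU's startup path. */
static void (*cpu_initclock_func_sketch)(void) = initclock_nop;

/* Stand-in for lapic_calibrate_timer(): installs the lapic hook. */
void
lapic_calibrate_timer_sketch(void)
{
	cpu_initclock_func_sketch = lapic_initclocks_sketch;
}

/* Stand-in for cpu_hatch(): calls whatever hook is installed. */
void
cpu_hatch_sketch(void)
{
	(*cpu_initclock_func_sketch)();
}

int
lapic_clock_init_count(void)
{
	return lapic_clock_inits;
}
```

If the calibration routine is never called, as the message describes for VM_GUEST_XENPVHVM, the hook keeps its default and the startup path needs no special case.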
 1.117.4.5 16-Apr-2020  bouyer Avoid overflow of ci_ipi_events[] in the PVHVM case (its size is
XEN_NIPIS but we use x86 IPIs): size XEN_NIPIS only for PV, and
CTASSERT that XEN_NIPIS <= X86_NIPI if we ever use Xen IPIs for
PVHVM.
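
A compile-time check of the kind mentioned in 1.117.4.5 can be sketched with C11 _Static_assert, which plays the role of the kernel's CTASSERT here. The constant values, the XENPV_SKETCH switch and struct cpu_info_sketch are assumptions made for illustration only.

```c
#include <stddef.h>

#define XEN_NIPIS_SKETCH 4	/* hypothetical value */
#define X86_NIPI_SKETCH  8	/* hypothetical value */

/* Fails the build if Xen IPIs could index past an x86-sized array. */
_Static_assert(XEN_NIPIS_SKETCH <= X86_NIPI_SKETCH,
    "Xen IPI count must fit in the x86 IPI event array");

/* Size the array with the Xen constant only in the PV configuration. */
#ifdef XENPV_SKETCH
#define NIPI_EVENTS_SKETCH XEN_NIPIS_SKETCH
#else
#define NIPI_EVENTS_SKETCH X86_NIPI_SKETCH
#endif

struct cpu_info_sketch {
	unsigned long ci_ipi_events[NIPI_EVENTS_SKETCH];
};

/* Number of slots in the per-CPU IPI event array. */
size_t
ipi_event_slots(void)
{
	return sizeof(((struct cpu_info_sketch *)0)->ci_ipi_events) /
	    sizeof(unsigned long);
}
```

The assertion costs nothing at run time; it turns a possible out-of-bounds write into a build failure.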
 1.117.4.4 12-Apr-2020  bouyer Get rid of xen-specific ci_x* interrupt handling:
- use the general SIR mechanism, reserving 3 more slots for IPL_VM, IPL_SCHED
and IPL_HIGH
- remove specific handling from C sources, or change to ipending
- convert IPL number to SIR number in various places
- Remove XUNMASK/XPENDING in assembly or change to IUNMASK/IPENDING
- remove Xen-specific ci_xsources, ci_xmask, ci_xunmask, ci_xpending from
struct cpu_info
- for now remove a KASSERT that there are no pending interrupts in
idle_block(). We can get there with some software interrupts pending
in autoconf XXX needs to be looked at.
 1.117.4.3 11-Apr-2020  bouyer Include ci_isources[] for XenPV too.
Adjust spllower() to XenPV needs, and switch XenPV to the native spllower().
Remove xen_spllower().
 1.117.4.2 10-Apr-2020  bouyer Skip cx8_spllower patch if we're running on any form of Xen PV,
we can't handle PV interrupts with a single atomic op here.
Enable x86_patch() for Xen too.
 1.117.4.1 08-Apr-2020  bouyer Remove VM_GUEST_XEN and define only Xen subtypes:
VM_GUEST_XENPV
VM_GUEST_XENPVH
VM_GUEST_XENHVM
VM_GUEST_XENPVHVM

Set vm_guest in the start routine, if it is hypervisor-specific (e.g Xen PV).
If vm_guest was not set early and we detect Xen in identify_hypervisor(),
assume it is VM_GUEST_XENHVM. Refine to VM_GUEST_XENPVHVM in
hypervisor_match().
 1.129.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.133.4.2 29-Mar-2025  martin Pull up following revision(s) (requested by imil in ticket #1074):

sys/arch/x86/x86/x86_machdep.c: revision 1.155
sys/arch/x86/include/cpu.h: revision 1.137
sys/arch/x86/x86/x86_machdep.c: revision 1.156
sys/arch/x86/include/cpu.h: revision 1.138
sys/arch/x86/x86/consinit.c: revision 1.40
sys/arch/x86/acpi/acpi_machdep.c: revision 1.37
sys/arch/x86/acpi/acpi_machdep.c: revision 1.38
sys/arch/amd64/amd64/machdep.c: revision 1.370
sys/arch/xen/xen/hypervisor.c: revision 1.97
sys/arch/xen/xen/hypervisor.c: revision 1.98
sys/arch/amd64/amd64/genassym.cf: revision 1.98
sys/arch/x86/x86/x86_autoconf.c: revision 1.88
sys/arch/x86/x86/x86_autoconf.c: revision 1.89
sys/arch/amd64/amd64/locore.S: revision 1.226
sys/arch/amd64/amd64/locore.S: revision 1.227
sys/arch/x86/x86/identcpu.c: revision 1.131

Add support for non-Xen PVH guests to amd64. Patch from
Emile 'iMil' Heitor in PR kern/57813, with some cosmetic tweaks by me.
Tested on bare metal, Xen PV and Xen PVH by me.

Get one more change from PR kern/57813, needed for non-Xen PVH.

Introduce vm_guest_is_pvh() and use it in place of
(vm_guest == VM_GUEST_XENPVH || vm_guest == VM_GUEST_GENPVH)
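
A helper of the kind described above might look like the following sketch. The enum ordering and the parameterized signature are assumptions; the log only states that vm_guest_is_pvh() replaces the two-way comparison, not how it is declared.

```c
#include <stdbool.h>

/* Hypothetical subset of guest types; values are illustrative only. */
enum vm_guest_sketch {
	VM_GUEST_NO_SKETCH,
	VM_GUEST_XENPV_SKETCH,
	VM_GUEST_XENPVH_SKETCH,
	VM_GUEST_XENHVM_SKETCH,
	VM_GUEST_XENPVHVM_SKETCH,
	VM_GUEST_GENPVH_SKETCH
};

/* True for both the Xen and the non-Xen (generic) PVH guest types. */
static inline bool
vm_guest_is_pvh_sketch(enum vm_guest_sketch g)
{
	return g == VM_GUEST_XENPVH_SKETCH || g == VM_GUEST_GENPVH_SKETCH;
}
```

Centralizing the test means a future PVH variant needs a change in one place rather than at every call site.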
 1.133.4.1 09-Aug-2023  martin Pull up following revision(s) (requested by maya in ticket #316):

sys/arch/m68k/include/mutex.h: revision 1.13
sys/arch/arm/include/cpu.h: revision 1.125
sys/arch/sun68k/include/intr.h: revision 1.21
sys/arch/arm/include/mutex.h: revision 1.28
sys/sys/rwlock.h: revision 1.18
sys/arch/powerpc/include/mutex.h: revision 1.7
sys/arch/arm/include/mutex.h: revision 1.29
sys/arch/powerpc/include/mutex.h: revision 1.8
sys/uvm/uvm_param.h: revision 1.42
sys/sys/ksem.h: revision 1.16
sys/arch/x86/include/mutex.h: revision 1.10
sys/sys/proc.h: revision 1.372
sys/sys/ksem.h: revision 1.17
sys/arch/ia64/include/mutex.h: revision 1.8
sys/arch/evbarm/include/intr.h: revision 1.29
sys/sys/lua.h: revision 1.9
sys/arch/next68k/include/intr.h: revision 1.23
sys/arch/ia64/include/mutex.h: revision 1.9
sys/arch/hp300/include/intr.h: revision 1.35
sys/arch/hp300/include/intr.h: revision 1.36
sys/arch/sparc/include/cpu.h: revision 1.111
sys/arch/hppa/include/mutex.h: revision 1.16
sys/arch/vax/include/intr.h: revision 1.31
sys/arch/hppa/include/mutex.h: revision 1.17
sys/arch/news68k/include/intr.h: revision 1.28
sys/arch/hppa/include/mutex.h: revision 1.18
sys/arch/hppa/include/intr.h: revision 1.3
sys/arch/hppa/include/mutex.h: revision 1.19
sys/arch/hppa/include/intr.h: revision 1.4
sys/sys/sched.h: revision 1.92
sys/opencrypto/cryptodev.h: revision 1.51
sys/arch/vax/include/mutex.h: revision 1.20
sys/arch/sparc64/include/mutex.h: revision 1.10
sys/arch/ia64/include/sapicvar.h: revision 1.2
sys/arch/riscv/include/mutex.h: revision 1.5
sys/arch/amiga/dev/grfabs_cc.c: revision 1.39
sys/external/bsd/drm2/include/linux/idr.h: revision 1.11
sys/arch/riscv/include/mutex.h: revision 1.6
sys/ddb/files.ddb: revision 1.16
sys/arch/mac68k/include/intr.h: revision 1.32
share/man/man4/ddb.4: revision 1.203
sys/ddb/db_command.c: revision 1.183
sys/arch/mips/include/mutex.h: revision 1.10
sys/ddb/db_command.c: revision 1.184
sys/arch/x68k/include/intr.h: revision 1.22
sys/arch/sparc/include/psl.h: revision 1.51
sys/arch/or1k/include/mutex.h: revision 1.4
sys/arch/mips/include/mutex.h: revision 1.11
sys/arch/arm/xscale/pxa2x0_intr.h: revision 1.16
sys/arch/sparc64/include/cpu.h: revision 1.134
sys/arch/sparc/include/psl.h: revision 1.52
sys/arch/or1k/include/mutex.h: revision 1.5
sys/arch/mvme68k/include/intr.h: revision 1.22
sys/arch/luna68k/include/intr.h: revision 1.16
external/cddl/osnet/sys/sys/kcondvar.h: revision 1.6
sys/arch/sparc/include/mutex.h: revision 1.12
sys/arch/sparc/include/mutex.h: revision 1.13
sys/arch/usermode/include/mutex.h: revision 1.5
sys/arch/usermode/include/mutex.h: revision 1.6
sys/kern/kern_core.c: revision 1.38
usr.sbin/crash/Makefile: revision 1.49
sys/arch/amiga/include/intr.h: revision 1.23
sys/arch/alpha/include/mutex.h: revision 1.12
sys/arch/alpha/include/mutex.h: revision 1.13
sys/arch/evbarm/lubbock/sacc_obio.c: revision 1.16
sys/ddb/ddb.h: revision 1.6
sys/arch/sparc64/include/mutex.h: revision 1.8
sys/arch/sh3/include/mutex.h: revision 1.12
sys/arch/evbarm/lubbock/sacc_obio.c: revision 1.17
sys/ddb/db_syncobj.c: revision 1.1
sys/arch/vax/include/mutex.h: revision 1.18
sys/arch/sparc64/include/psl.h: revision 1.63
sys/arch/sparc64/include/mutex.h: revision 1.9
sys/arch/sh3/include/mutex.h: revision 1.13
sys/arch/evbarm/lubbock/obio.c: revision 1.13
sys/arch/atari/include/intr.h: revision 1.23
sys/ddb/db_syncobj.c: revision 1.2
sys/arch/vax/include/mutex.h: revision 1.19
sys/arch/evbarm/g42xxeb/obio.c: revision 1.14
sys/arch/evbarm/g42xxeb/obio.c: revision 1.15
sys/arch/cesfic/include/intr.h: revision 1.14
sys/ddb/db_syncobj.h: revision 1.1
sys/arch/x86/include/cpu.h: revision 1.134
sys/arch/evbarm/g42xxeb/obio.c: revision 1.16
sys/arch/cesfic/include/intr.h: revision 1.15
sys/arch/arm/xscale/pxa2x0_intr.c: revision 1.26
sys/sys/cpu_data.h: revision 1.54
sys/arch/m68k/include/mutex.h: revision 1.12
sys/arch/ia64/acpi/madt.c: revision 1.6

sys/rwlock.h: Make this more self-contained for bool.

machine/mutex.h: Sprinkle includes so this can be used by crash(8).

ddb: New `show all tstiles' command.
Shows who's waiting for which locks and what the owner is up to.

Include psl.h for ipl_cookie_t if __MUTEX_PRIVATE

sys: Rip <sys/resourcevar.h> out of <uvm/uvm_param.h>.

And thus out of <sys/param.h>, which is exceedingly overused and
fragile and delenda est.

Should fix (some) issues with the recent inclusion of machine/lock.h
in various machine/mutex.h files.

arm/mutex.h: Need machine/intr.h, machine/lock.h.

For ipl_cookie_t and __cpu_simple_lock_t.
evbarm/intr.h: Define ipl_cookie_t before including ARM_INTR_IMPL.

Otherwise arm/mutex.h doesn't work, due to a cyclic dependency which
should really be fixed.
opencrypto/cryptodev.h: Fix includes.
- Move sys/condvar.h under #ifdef _KERNEL.
- Add some other necessary includes and forward declarations.
- Sort.

hp300/intr.h: Fix missing includes.
linux/idr.h: Need <sys/mutex.h> for kmutex_t.
amiga/intr.h: Don't define spl*() functions if !_KERNEL.

This is used by crash(8) now, and what's important is ipl_cookie_t.
cesfic/intr.h: Expose ipl_cookie_t to userland for crash(8).
cesfic/intr.h: Expose ipl_cookie_t to userland only with _KMEMUSER.

Probably not necessary but let's be a little more cautious about
this.

atari/intr.h: Expose ipl_cookie_t with _KMEMUSER for crash(8).

arm/cpu.h: Need sys/param.h for COHERENCY_UNIT.

Nix machine/param.h -- not meant to be used directly, pulled in by
sys/param.h.

Move the definition of ipl_cookie_t out of the kernel-only sections,
some _KMEMUSER applications need it.

ddb: Cast pointer to uintptr_t first before db_expr_t.

hppa/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

luna68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

mvme68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

news68k/intr.h: Fix includes. Put some definitions under _KERNEL.

next68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

sys/ksem.h: Hack around fstat(8) abuse of _KERNEL.

sun68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

vax/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

x68k/intr.h: Put functions under _KERNEL so crash(8) can use this.

Make ipl_cookie_t visible for _KMEMUSER userland applications.

fix editor mishap in previous

Explicitly include <sys/mutex.h> for kmutex_t.

Replace kmutex_t * (which may be undefined here) with struct kmutex *,
suggested by Taylor.

hp300/intr.h: Put most of this under #ifdef _KERNEL.
Only ipl_cookie_t really needs to be exposed now, for crash(8).

mac68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).
Make inclusion of sys/intr.h explicit for spl*.

fix hppa and vax builds.

machine/lock.h isn't necessary for __cpu_simple_lock_t, it's in
sys/types.h. avoids cpu_data.h vs sched.h include order issues.

move the hppa ipl_t typedef with the moved usage of it.
machine/mutex.h: Sprinkle sys/types.h, omit machine/lock.h.

Turns out machine/lock.h is not needed for __cpu_simple_lock_t, which
always comes from sys/types.h. And, really, sys/types.h (or at least
sys/stdint.h) is needed for uintN_t and uintptr_t.

ddb: Cast pointer to uintptr_t, then to db_expr_t.
Avoids warnings about conversion between pointer and integer of
different size on some architectures.

re-fix hppa builds.

this file uses __cpu_simple_lock(), not just the underlying type,
so it does need machine/lock.h.

Break cycle by using `struct kmutex *' instead of `kmutex_t *'.
sys/sched.h included sys/mutex.h
which includes sys/intr.h
which includes machine/intr.h
which on cats includes arm/footbridge/footbridge_intr.h
which includes arm/cpu.h
which includes sys/cpu_data.h
which includes sys/sched.h

But there was never any real need for sys/mutex.h in sys/sched.h,
because it only uses pointers to the opaque struct kmutex. Cycle
broken by using `struct kmutex *' instead of pulling in sys/mutex.h
for the definition of kmutex_t.
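
The cycle break described above relies on a plain C rule: a pointer to a struct can be declared and passed around before the struct itself is defined. A minimal sketch follows; `struct sketch_schedstate`, `sketch_set_lock` and the body given to `struct kmutex` are hypothetical stand-ins for illustration, not the real NetBSD definitions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration only. No header that defines the struct is
 * needed as long as we only store and pass pointers to it.
 */
struct kmutex;

/* Hypothetical stand-in for a structure declared in a sched.h-like header. */
struct sketch_schedstate {
	struct kmutex *ss_lock;	/* pointer to an incomplete type is legal */
	int ss_count;
};

static void
sketch_set_lock(struct sketch_schedstate *ss, struct kmutex *mtx)
{
	ss->ss_lock = mtx;	/* no knowledge of the struct layout required */
}

/*
 * Later, a mutex.h-like header completes the type and supplies the
 * typedef. The body here is an assumption made only for this sketch.
 */
struct kmutex {
	int dummy;
};
typedef struct kmutex kmutex_t;
```

Only code that dereferences the mutex or needs its size has to see the full definition, which is why a header that merely stores the pointer can drop the include.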

Side effect: This revealed that sys/cpu_data.h needed sys/intr.h
(which was pulled in accidentally by sys/mutex.h via sys/sched.h) for
SOFTINT_COUNT. Also revealed some other machine/cpu.h header files
were missing includes of sys/mutex.h for kmutex_t.

ia64: Need sys/types.h for u_int, vaddr_t; sys/mutex.h for kmutex_t.

explicitly include no longer implicitly included sys/mutex.h.

arm/xscale: Use sys/bitops.h fls32 - 1 instead of 31 - __builtin_clz.
Sidesteps namespace collision with `#define bits ...' in net/zlib.c.

complete the previous - there were two calls to find_first_bit() to fix.

arm/xscale: Missed a spot with previous find_first_bit commit.

evbarm/g42xxeb: Fix off-by-one in previous.

The original find_first_bit(x) was 31 - __builtin_clz((uint32_t)x),
which is equivalent to fls32(x) - 1, not to fls32(x).

Note that fls32 is 1-based and returns 0 for x=0.
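
The off-by-one noted above can be checked with a small sketch. `sketch_fls32` below is a portable stand-in written for this example with the semantics the log describes (1-based, 0 for x=0); it is not the `<sys/bitops.h>` implementation, and the two wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for fls32(): 1-based index of the most significant set
 * bit, and 0 when x == 0.
 */
static int
sketch_fls32(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* The original idiom. Undefined for x == 0, as __builtin_clz(0) is. */
static int
sketch_old_find_first_bit(uint32_t x)
{
	return 31 - __builtin_clz(x);
}

/* The corrected replacement: note the "- 1". */
static int
sketch_new_find_first_bit(uint32_t x)
{
	return sketch_fls32(x) - 1;
}
```

For any nonzero x the two wrappers agree, whereas `sketch_fls32(x)` alone would be one too large.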
 1.136.6.1 02-Aug-2025  perseant Sync with HEAD
 1.7 15-Jun-2020  msaitoh Serialize rdtsc with lfence, mfence or cpuid to read TSC more precisely.

x86/x86/tsc.c rev. 1.67 reduced the cache problem and gave a big improvement, but
there is still room for more. I measured the effect of lfence, mfence, cpuid and
rdtscp. The impact on TSC skew and/or drift is:

AMD: mfence > rdtscp > cpuid > lfence-serialize > lfence = nomodify
Intel: lfence > rdtscp > cpuid > nomodify

So, mfence is the best on AMD and lfence is the best on Intel. If the CPU has no
SSE2, we can use cpuid.

NOTE:
- An AMD document says the DE_CFG_LFENCE_SERIALIZE bit can be used for
serializing, but it's not so good.
- On Intel i386 (not amd64), the improvement seems to be very small.
- The rdtscp instruction can be used as a serializing instruction + rdtsc, but
it's not as good as [lm]fence. Both Intel's and AMD's documents say that
the latency of rdtscp is bigger than that of rdtsc, so I suspect the difference
in the results comes from it.
 1.6 08-May-2020  ad Fix the TSC timecounter (on the systems I have access to):

- Make the early i8254-based calculation of frequency a bit more accurate.

- Keep track of how far the HPET & TSC advance between HPET attach and
secondary CPU boot, and use to compute an accurate value before attaching
the timecounter. Initial idea from joerg@.

- When determining skew and drift between CPUs, make each measurement 1000
times and pick the lowest observed value. Increase the error threshold to
1000 clock cycles.

- Use the frequency computed on the boot CPU for secondary CPUs too.

- Remove cpu_counter_serializing().
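
The skew/drift bullet above (repeat the measurement 1000 times, keep the lowest observed value, compare against a 1000-cycle threshold) can be sketched as follows. This is an illustration under stated assumptions, not the tsc.c code: every name is hypothetical, and the measurement source is a fake that replays canned values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ROUNDS		1000	/* measurements, per the log */
#define SKETCH_THRESHOLD	1000	/* error threshold in cycles, per the log */

/* Hypothetical callback that performs one skew measurement. */
typedef int64_t (*sketch_measure_fn)(void *cookie);

/* Repeat the measurement and keep the lowest observed value. */
static int64_t
sketch_lowest_of(sketch_measure_fn measure, void *cookie, int rounds)
{
	int64_t best = INT64_MAX;

	for (int i = 0; i < rounds; i++) {
		int64_t v = measure(cookie);

		if (v < best)
			best = v;
	}
	return best;
}

static int
sketch_within_threshold(int64_t skew)
{
	return skew <= SKETCH_THRESHOLD;
}

/* Fake measurement source that cycles through canned samples. */
struct sketch_fake {
	const int64_t *vals;
	size_t n;
	size_t next;
};

static int64_t
sketch_fake_measure(void *cookie)
{
	struct sketch_fake *f = cookie;
	int64_t v = f->vals[f->next];

	f->next = (f->next + 1) % f->n;
	return v;
}
```

Keeping the minimum rather than the mean presumably discards samples inflated by interrupts or cache misses, since such disturbances can only add delay to a measurement.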
 1.5 02-Feb-2011  bouyer Some CPUs have a cpu counter (CPUID_TSC is there) but don't handle the
rdmsr instruction (CPUID_MSR is not there).
Introduce a cpu_counter_serializing() function to replace rdmsr(MSR_TSC)
calls, which does a rdmsr(MSR_TSC) if available and cpu_counter() otherwise.
This makes the cpu counter usable on vortex86 CPUs.
OK ad@
 1.4 10-May-2008  ad branches: 1.4.12; 1.4.20; 1.4.26; 1.4.28;
Improve x86 tsc handling:

- Ditch the cross-CPU calibration stuff. It didn't work properly, and it's
near impossible to synchronize the CPUs in a running system, because bus
traffic will interfere with any calibration attempt, messing up the
timings.

- Only enable the TSC on CPUs where we are sure it does not drift. If we are
on a known good CPU, give the TSC high timecounter quality, making it the
default.

- When booting CPUs, detect TSC skew and account for it. Most Intel MP
systems have synchronized counters, but that need not be true if the
system has a complicated bus structure. As far as I know, AMD systems
do not have synchronized TSCs and so we need to handle skew.

- While an AP is waiting to be set running, try and make the TSC drift by
entering a reduced power state. If we detect drift, ensure that the TSC
does not get a high timecounter quality. This should not happen and is
only for safety.

- Make cpu_counter() stuff LKM safe.
 1.3 10-May-2008  ad Merge cpu_counter.h.
 1.2 28-Apr-2008  martin branches: 1.2.2;
Remove clause 3 and 4 from TNF licenses
 1.1 07-Jul-2007  tsutsui branches: 1.1.2; 1.1.4; 1.1.16; 1.1.36; 1.1.38; 1.1.40;
Move x86 common cpu_counter functions into <x86/cpu_counter.h>.
 1.1.40.1 16-May-2008  yamt sync with head.
 1.1.38.1 18-May-2008  yamt sync with head.
 1.1.36.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.16.2 03-Sep-2007  yamt sync with head.
 1.1.16.1 07-Jul-2007  yamt file cpu_counter.h was added on branch yamt-lazymbuf on 2007-09-03 14:31:19 +0000
 1.1.4.2 15-Jul-2007  ad Sync with head.
 1.1.4.1 07-Jul-2007  ad file cpu_counter.h was added on branch vmlocking on 2007-07-15 13:21:05 +0000
 1.1.2.2 11-Jul-2007  mjf Sync with head.
 1.1.2.1 07-Jul-2007  mjf file cpu_counter.h was added on branch mjf-ufs-trans on 2007-07-11 20:03:13 +0000
 1.2.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.4.28.1 08-Feb-2011  bouyer Sync with HEAD
 1.4.26.1 06-Jun-2011  jruoho Sync with HEAD.
 1.4.20.1 05-Mar-2011  rmind sync with head
 1.4.12.1 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.19 24-Apr-2025  riastradh amd64: Allocate FPU save state outside pcb if it's too large.

We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
state.

We only do this for user threads, and only on machines where it's
necessary, to avoid incurring much overhead. There is still a tiny
bit of overhead when saving and restoring the FPU state by using a
pointer indirection instead of arithmetic indirection for access to
struct pcb::pcb_savefpu, but this is probably a drop in the bucket
compared to the memory traffic incurred by the FPU state save/restore
anyway.

For now, these paths are mostly disabled on i386. We could enable
them but it will require either rewriting cpu_uarea_alloc/free for
i386, or adopting a guard page like amd64 does, which might be costly
and so should be undertaken only with some thought and care. And
since Intel AMX instructions only work in 64-bit mode, it's not
likely to be useful on i386.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu

These changes, as a side effect, may fix:

PR kern/57258: kthread_fpu_enter/exit problem

by making sure to allocate an FPU save space that is large enough to
guarantee fpu_kern_enter/leave work safely, instead of just using a
union savefpu object on the stack (which, at 576 bytes, may be too
small on some machines, particularly with AVX512 requiring ~2.5K).
(But we'll have to do some extra work with kthread_fpu_enter/exit_md
-- if we try doing them again on x86 -- to actually allocate the
separate pcb on these machines!)
 1.18 25-Feb-2023  riastradh branches: 1.18.6;
x86: Mitigate MXCSR Configuration Dependent Timing in kernel FPU use.

In fpu_kern_enter, make sure all the MXCSR exception status bits are
set when we start using the FPU, so that instructions which exhibit
MCDT are unaffected by it.

While here, zero all the other FPU registers in fpu_kern_enter.

In principle we could skip this step on future CPUs that fix the MCDT
bug, but there's probably not much benefit -- workloads that do a lot
of crypto in the kernel are probably better off using
kthread_fpu_enter or WQ_FPU to skip the fpu_kern_enter/leave cycles
in the first place.

For details, see:
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/mxcsr-configuration-dependent-timing.html
 1.17 26-Jun-2019  mgorny branches: 1.17.28;
Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
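
The size negotiation described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: the two `sketch_xstate_*` layouts and `sketch_copy_to_caller` are hypothetical, and only `struct iovec` from `<sys/uio.h>` is a real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical old and extended layouts; not the real struct xstate. */
struct sketch_xstate_v1 {
	uint64_t xs_rfbm;
	uint64_t xs_a;
};

struct sketch_xstate_v2 {
	uint64_t xs_rfbm;
	uint64_t xs_a;
	uint64_t xs_b;		/* field added after v1 callers were built */
};

/*
 * Copy at most iov_len bytes, so a caller built against the older,
 * shorter structure receives a valid prefix instead of an overflow.
 */
static size_t
sketch_copy_to_caller(const struct sketch_xstate_v2 *kstate,
    const struct iovec *iov)
{
	size_t n = iov->iov_len < sizeof(*kstate) ?
	    iov->iov_len : sizeof(*kstate);

	memcpy(iov->iov_base, kstate, n);
	return n;
}
```

Because new fields are only ever appended, a caller compiled against the shorter layout still sees its fields at the same offsets.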
 1.16 23-May-2018  maxv branches: 1.16.2;
Clean up the FPU headers.
 1.15 08-Nov-2017  maxv branches: 1.15.2;
remove vestige
 1.14 31-Oct-2017  maxv Remove outdated comment.
 1.13 31-Oct-2017  maxv Don't embed our own values in the reserved fields of the XSAVE area, it
really is a bad idea. Move them into the PCB.
 1.12 31-Oct-2017  maxv Add xsh_xcomp_bv and fx_zero, and use uint8_t instead.
 1.11 10-Aug-2017  maxv Remove the svr4/ibcs2 fpu flags.
 1.10 18-Aug-2016  maxv KNF and simplify.
 1.9 25-Feb-2014  dsl branches: 1.9.4; 1.9.6; 1.9.10; 1.9.12;
Add support for saving the AVX-256 ymm registers during FPU context switches.
Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all cpus; I'm not brave enough to enable XSAVEOPT.
 1.8 18-Feb-2014  dsl It seems that firefox includes machine/fpu.h on amd64.
Add the file back so that the firefox source doesn't have to depend
on the version of netbsd it is being compiled for.
(The i386 version doesn't play the same games in its SIGFPE handler.)
 1.7 15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.6 13-Feb-2014  dsl Check the argument types for the fpu asm functions.
 1.5 12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.4 09-Feb-2014  dsl Add compatibility for some userspace code (eg firefox) that seems to look
inside the ucontext structure passed to signal handlers to modify the
xmm registers.
This should make the code compile - I'm not at all sure it works as expected,
the interactions between FP and signal handlers aren't at all clear.
AFAICT the FP state is saved on the user stack when the handler is called,
however the FP trap code may already have done odd things to the FPU....
 1.3 08-Feb-2014  dsl Add bit defs for more of the x87 status register.
 1.2 07-Feb-2014  dsl Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.1 07-Feb-2014  dsl Move all the hardware register layout for the x86 cpus into a header
that can also be used by amd64.
Add in skeleton definitions for XSAVE and AVX.
Update some comments to match reality.
 1.9.12.2 28-Aug-2017  skrll Sync with HEAD
 1.9.12.1 05-Oct-2016  skrll Sync with HEAD
 1.9.10.3 03-Dec-2017  jdolecek update from HEAD
 1.9.10.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.9.10.1 25-Feb-2014  tls file cpu_extended_state.h was added on branch tls-maxphys on 2014-08-20 00:03:29 +0000
 1.9.6.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.9.6.1 25-Feb-2014  yamt file cpu_extended_state.h was added on branch yamt-pagecache on 2014-05-22 11:40:13 +0000
 1.9.4.2 18-May-2014  rmind sync with head
 1.9.4.1 25-Feb-2014  rmind file cpu_extended_state.h was added on branch rmind-smpnet on 2014-05-18 17:45:30 +0000
 1.15.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.16.2.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.17.28.1 25-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #244):

sys/arch/x86/x86/fpu.c: revision 1.80
sys/arch/x86/include/cpu_extended_state.h: revision 1.18

x86: Mitigate MXCSR Configuration Dependent Timing in kernel FPU use.

In fpu_kern_enter, make sure all the MXCSR exception status bits are
set when we start using the FPU, so that instructions which exhibit
MCDT are unaffected by it.

While here, zero all the other FPU registers in fpu_kern_enter.
In principle we could skip this step on future CPUs that fix the MCDT
bug, but there's probably not much benefit -- workloads that do a lot
of crypto in the kernel are probably better off using
kthread_fpu_enter or WQ_FPU to skip the fpu_kern_enter/leave cycles
in the first place.

For details, see:
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/mxcsr-configuration-dependent-timing.html
 1.18.6.1 02-Aug-2025  perseant Sync with HEAD
 1.7 05-Oct-2009  rmind Remove X86_IPI_WRITE_MSR (and msr_ipifuncs.c), replace all uses in drivers
with xc_broadcast(). AMD K8 PowerNow driver tested by <jakllsch>, thanks!

Closes PR/37665.
 1.6 17-Oct-2007  garbled branches: 1.6.20; 1.6.34;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.5 06-Oct-2007  xtraeme Use a two clause license for all the code I contributed.

The envsys code will be changed later.
 1.4 25-Mar-2007  xtraeme branches: 1.4.2; 1.4.6; 1.4.8; 1.4.10; 1.4.18; 1.4.20; 1.4.22; 1.4.24;
Add another member to struct cpu_msr_broadcast, msr_read that will
enable the rdmsr call in msr_write_ipi(), so that when it's not
defined we don't read it before writing; disabled in powernow_k8
and enabled in the others.
 1.3 21-Mar-2007  xtraeme branches: 1.3.2;
Remove unneeded headers.
 1.2 21-Mar-2007  xtraeme Remove the MSR read IPI handler, there won't be any driver that will
use it, and we can see if the values are ok in the CPUs in the write
operation.

Suggested by YAMAMOTO Takashi.
 1.1 20-Mar-2007  xtraeme MSR read and write IPI handlers for x86. An MSR will be read or written
in all CPUs available in the system. This adds another member
to struct cpu_info, ci_msr_rvalue; it will contain the value of the MSR
in a previous operation.

Tested with clockmod in UP and SMP by me, tested with est in SMP
by Daniel Carosone and Michael Van Elst.

Ok'ed by Andrew Doran and Matthew R. Green.
 1.3.2.3 15-Apr-2007  yamt sync with head.
 1.3.2.2 24-Mar-2007  yamt sync with head.
 1.3.2.1 21-Mar-2007  yamt file cpu_msr.h was added on branch yamt-idlelwp on 2007-03-24 14:55:05 +0000
 1.4.24.1 14-Oct-2007  yamt sync with head.
 1.4.22.3 27-Oct-2007  yamt sync with head.
 1.4.22.2 03-Sep-2007  yamt sync with head.
 1.4.22.1 25-Mar-2007  yamt file cpu_msr.h was added on branch yamt-lazymbuf on 2007-09-03 14:31:19 +0000
 1.4.20.1 06-Nov-2007  matt sync with HEAD
 1.4.18.1 07-Oct-2007  joerg Sync with HEAD.
 1.4.10.2 11-Jul-2007  mjf Sync with head.
 1.4.10.1 25-Mar-2007  mjf file cpu_msr.h was added on branch mjf-ufs-trans on 2007-07-11 20:03:13 +0000
 1.4.8.1 16-Oct-2007  garbled Sync with HEAD
 1.4.6.2 20-Apr-2007  bouyer Pull up following revision(s) (requested by mlelstv in ticket #575):
sys/arch/i386/i386/est.c sync with 1.37
sys/arch/i386/i386/ipifuncs.c sync with 1.16
sys/arch/x86/include/cpu_msr.h sync with 1.4
sys/arch/x86/include/intrdefs.h sync with 1.8
sys/arch/x86/include/powernow.h sync with 1.9
sys/arch/x86/x86/powernow_k8.c sync with 1.20
sys/arch/x86/x86/msr_ipifuncs.c sync with 1.8
sys/arch/amd64/amd64/ipifuncs.c sync with 1.9
sys/arch/i386/i386/identcpu.c patch
sys/arch/i386/i386/machdep.c patch
sys/arch/i386/include/cpu.h patch
sys/arch/x86/conf/files.x86 patch
sys/arch/x86/x86/x86_machdep.c patch
sys/arch/amd64/amd64/machdep.c patch
Add MSR write IPI handler for x86. Use it and the RUN_ONCE framework
to make est and powernow drivers work properly with SMP.
 1.4.6.1 25-Mar-2007  bouyer file cpu_msr.h was added on branch netbsd-4 on 2007-04-20 20:31:27 +0000
 1.4.2.3 09-Oct-2007  ad Sync with head.
 1.4.2.2 10-Apr-2007  ad Sync with head.
 1.4.2.1 25-Mar-2007  ad file cpu_msr.h was added on branch vmlocking on 2007-04-10 13:22:45 +0000
 1.6.34.1 01-Nov-2009  jym Sync with HEAD.
 1.6.20.1 11-Mar-2010  yamt sync with head
 1.4 10-May-2020  maxv Reintroduce cpu_rng_early_sample(), but this time with embedded detection
for RDRAND/RDSEED, because TSC is not very strong.
 1.3 30-Apr-2020  riastradh Simplify Intel RDRAND/RDSEED and VIA C3 RNG API.

Push it all into MD x86 code to keep it simpler, until we have other
examples on other CPUs. Simplify RDSEED-to-RDRAND fallback.
Eliminate cpu_earlyrng in favour of just using entropy_extract, which
is available early now.
 1.2 21-Jul-2018  maxv More ASLR. Randomize the location of the direct map at boot time on amd64.
This doesn't need "options KASLR" and works on GENERIC. Will soon be
enabled by default.

The location of the areas is abstracted in a slotspace structure. Ideally
we should always use this structure when touching the L4 slots, instead of
the current cocktail of global variables and constants.

machdep initializes the structure with the default values, and we then
randomize its dmap entry. Ideally machdep should randomize everything at
once, but in the case of the direct map its size is determined a little
later in the boot procedure, so we're forced to randomize its location
later too.
 1.1 27-Feb-2016  tls branches: 1.1.2; 1.1.18; 1.1.20; 1.1.22;
Add cpu_rng, a framework for simple on-CPU random number generators.
 1.1.22.1 10-Jun-2019  christos Sync with HEAD
 1.1.20.1 28-Jul-2018  pgoyette Sync with HEAD
 1.1.18.2 03-Dec-2017  jdolecek update from HEAD
 1.1.18.1 27-Feb-2016  jdolecek file cpu_rng.h was added on branch tls-maxphys on 2017-12-03 11:36:50 +0000
 1.1.2.2 19-Mar-2016  skrll Sync with HEAD
 1.1.2.1 27-Feb-2016  skrll file cpu_rng.h was added on branch nick-nhusb on 2016-03-19 11:30:07 +0000
 1.5 15-Sep-2022  msaitoh Verify checksum of the extended signature table.
 1.4 17-Mar-2018  christos branches: 1.4.6;
tuck in all the compat microcode code in one place.
 1.3 17-Oct-2012  drochner branches: 1.3.30; 1.3.36;
put binary compatibility support for the old AMD-only CPU microcode
update API inside COMPAT_60
 1.2 29-Aug-2012  drochner branches: 1.2.2;
Extend the CPU microcode update framework to support Intel x86 CPUs.
Contrary to the AMD implementation, it doesn't use xcalls to distribute
the update to all CPUs but relies on cpuctl(8) to bind itself to the
right CPU -- to keep it simple and avoid possible problems with
hyperthreading.
Also, it doesn't parse the vendor supplied file to pick the right
part for the present CPU model but relies on userland to prepare
files with specific filenames. I'll commit a pkg for this in a minute
(pkgsrc/sysutils/intel-microcode).
The ioctl interface changed; compatibility is provided (should be
limited to COMPAT_NETBSD6 as soon as this is available).
 1.1 13-Jan-2012  cegger branches: 1.1.4; 1.1.6;
Support CPU microcode loading via cpuctl(8).
Implemented and enabled via CPU_UCODE kernel config option
for x86 and Xen Dom0.
Tested on different AMD machines with different
CPU families.

ok wiz@ for the manpages
ok releng@
ok core@ via releng@
 1.1.6.3 30-Oct-2012  yamt sync with head
 1.1.6.2 17-Apr-2012  yamt sync with head
 1.1.6.1 13-Jan-2012  yamt file cpu_ucode.h was added on branch yamt-pagecache on 2012-04-17 00:07:05 +0000
 1.1.4.2 18-Feb-2012  mrg merge to -current.
 1.1.4.1 13-Jan-2012  mrg file cpu_ucode.h was added on branch jmcneill-usbmp on 2012-02-18 07:33:34 +0000
 1.2.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.3.36.2 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.3.36.1 17-Mar-2018  pgoyette Import christos's changes for the compat_60 cpu_ucode stuff
 1.3.30.1 11-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1772):

sys/arch/x86/include/cpu_ucode.h: revision 1.5
sys/arch/x86/x86/cpu_ucode_intel.c: revision 1.19
sys/arch/x86/x86/cpu_ucode_intel.c: revision 1.20

Add missing newline in a message. KNF.
Verify checksum of the extended signature table.
 1.4.6.1 11-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1538):
sys/arch/x86/include/cpu_ucode.h: revision 1.5
sys/arch/x86/x86/cpu_ucode_intel.c: revision 1.19
sys/arch/x86/x86/cpu_ucode_intel.c: revision 1.20
Add missing newline in a message. KNF.
Verify checksum of the extended signature table.
 1.42 24-Oct-2020  mgorny Issue 64-bit versions of *XSAVE* for 64-bit amd64 programs

When calling FXSAVE, XSAVE, FXRSTOR, ... for 64-bit programs on amd64
use the 64-suffixed variant in order to include the complete FIP/FDP
registers in the x87 area.

The difference between the two variants is that the FXSAVE64 (new)
variant represents FIP/FDP as 64-bit fields (union fp_addr.fa_64),
while the legacy FXSAVE variant uses split fields: 32-bit offset,
16-bit segment and 16-bit reserved field (union fp_addr.fa_32).
The latter implies that the actual addresses are truncated to 32 bits
which is insufficient in modern programs.
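
The truncation described above can be illustrated with a sketch of the two views of the pointer field. The union below mirrors the `fa_64`/`fa_32` split named in the message, but the member names inside `fa_32` and both helper functions are assumptions made for this example, not the definitions in the header.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of an 8-byte FIP/FDP slot with both interpretations. */
union sketch_fp_addr {
	uint64_t fa_64;			/* FXSAVE64 view: full 64-bit address */
	struct {
		uint32_t fa_off;	/* legacy view: 32-bit offset */
		uint16_t fa_seg;	/* 16-bit segment selector */
		uint16_t fa_rsvd;	/* 16-bit reserved field */
	} fa_32;
};

/* Model what the 64-bit variant records. */
static union sketch_fp_addr
sketch_store_64(uint64_t addr)
{
	union sketch_fp_addr a;

	a.fa_64 = addr;
	return a;
}

/* Model what the legacy variant records: upper address bits are lost. */
static union sketch_fp_addr
sketch_store_legacy(uint64_t addr, uint16_t seg)
{
	union sketch_fp_addr a;

	a.fa_32.fa_off = (uint32_t)addr;
	a.fa_32.fa_seg = seg;
	a.fa_32.fa_rsvd = 0;
	return a;
}
```

With an address such as 0x00007f1234567890, the legacy form keeps only 0x34567890 as the offset, so the original pointer cannot be recovered from it.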

The change is applied only to 64-bit programs on amd64. Plain i386
and compat32 continue using plain FXSAVE. Similarly, NVMM is not
changed as I am not familiar with that code.

This is a potentially breaking change. However, I don't think it likely
to actually break anything because the data provided by the old variant
were not meaningful (because of the truncated pointer).
 1.41 15-Jun-2020  msaitoh Serialize rdtsc with lfence, mfence or cpuid to read TSC more precisely.

x86/x86/tsc.c rev. 1.67 reduced the cache problem and gave a big improvement, but
there is still room for more. I measured the effect of lfence, mfence, cpuid and
rdtscp. The impact on TSC skew and/or drift is:

AMD: mfence > rdtscp > cpuid > lfence-serialize > lfence = nomodify
Intel: lfence > rdtscp > cpuid > nomodify

So, mfence is the best on AMD and lfence is the best on Intel. If the CPU has no
SSE2, we can use cpuid.

NOTE:
- An AMD document says the DE_CFG_LFENCE_SERIALIZE bit can be used for
serializing, but it's not so good.
- On Intel i386 (not amd64), the improvement seems to be very small.
- The rdtscp instruction can be used as a serializing instruction + rdtsc, but
it's not as good as [lm]fence. Both Intel's and AMD's documents say that
the latency of rdtscp is bigger than that of rdtsc, so I suspect the difference
in the results comes from it.
 1.40 14-Jun-2020  riastradh Use static constant rather than stack memset buffer for zero fpregs.
 1.39 02-May-2020  maxv Modify the hotpatch mechanism, in order to make it much less ROP-friendly.

Currently x86_patch_window_open is a big problem, because it is a perfect
function to inject/modify executable code with ROP.

- Remove x86_patch_window_open(), along with its x86_patch_window_close()
counterpart.
- Introduce a read-only link-set of hotpatch descriptor structures,
which reference a maximum of two read-only hotpatch sources.
- Modify x86_hotpatch() to open a window and call the new
x86_hotpatch_apply() function in a hard-coded manner.
- Modify x86_hotpatch() to take a name and a selector, and have
x86_hotpatch_apply() resolve the descriptor from the name and the
source from the selector, before hotpatching.
- Move the error handling in a separate x86_hotpatch_cleanup() function,
that gets called after we closed the window.

The resulting implementation is a bit complex and non-obvious. But it
gains the following properties: the code executed in the hotpatch window
is strictly hard-coded (no callback and no possibility to execute your own
code in the window) and the pointers this code accesses are strictly
read-only (no possibility to forge pointers to hotpatch an area that was
not designated as hotpatchable at compile-time, and no possibility to
choose what bytes to write other than the maximum of two read-only
templates that were designated as valid for the given destination at
compile-time).
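
The descriptor idea can be sketched in userland C as follows. This is only an
illustration under stated assumptions: the structure layout, the field names
and hotpatch_apply_sketch() are hypothetical, the template bytes are arbitrary,
and the caller passes the destination buffer so that the sketch is runnable,
whereas in the mechanism described above the destinations are also fixed at
compile time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; names are illustrative only. */
struct patch_source {
	const uint8_t *bytes;	/* read-only template */
	size_t len;
};

struct patch_desc {
	uint8_t name;				/* identifies the patch site */
	uint8_t nsrc;				/* at most two sources */
	const struct patch_source *srcs[2];
};

static const uint8_t tmpl_a[] = { 0x90, 0x90, 0x90 };	/* arbitrary bytes */
static const uint8_t tmpl_b[] = { 0xaa, 0xbb, 0xcc };	/* arbitrary bytes */

static const struct patch_source src_a = { tmpl_a, sizeof(tmpl_a) };
static const struct patch_source src_b = { tmpl_b, sizeof(tmpl_b) };

/* Read-only table standing in for the link-set of descriptors. */
static const struct patch_desc desc_table[] = {
	{ .name = 1, .nsrc = 2, .srcs = { &src_a, &src_b } },
};

/*
 * Resolve the descriptor from the name and the source from the selector,
 * then copy the template. Returns 0 on success, -1 on a bad name/selector.
 */
static int
hotpatch_apply_sketch(uint8_t name, uint8_t sel, uint8_t *dst, size_t dstlen)
{
	size_t i;

	for (i = 0; i < sizeof(desc_table) / sizeof(desc_table[0]); i++) {
		const struct patch_desc *d = &desc_table[i];

		if (d->name != name)
			continue;
		if (sel >= d->nsrc || d->srcs[sel]->len > dstlen)
			return -1;
		memcpy(dst, d->srcs[sel]->bytes, d->srcs[sel]->len);
		return 0;
	}
	return -1;
}
```

The point of the design is that the caller supplies only two small integers,
so there is no pointer or byte string for an attacker to forge.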

With current CPUs this slightly improves a situation that is already
pretty bad by definition on x86. Assuming CET however, this change closes
a big hole and is kinda great.

The only ~problem there is, is that dtrace-fbt tries to hotpatch random
places with random bytes, and there is just no way to make it safe.
However dtrace is only in a module, that is rarely used and never compiled
into the kernel, so it's not a big problem; add a shitty & vulnerable
independent hotpatch window in it, and leave big XXXs. It looks like fbt
is going to collapse soon anyway.
 1.38 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.37 30-Oct-2019  maxv branches: 1.37.6;
More inlined ASM.
 1.36 07-Sep-2019  maxv Convert rdmsr_locked and wrmsr_locked to inlines.
 1.35 07-Sep-2019  maxv Add a memory barrier on wrmsr, because some MSRs control memory access
rights (we don't use them though). Also add barriers on fninit and clts
for safety.
 1.34 05-Jul-2019  maxv branches: 1.34.2;
More inlines, prerequisites for future changes. Also, remove fngetsw(),
which was a duplicate of fnstsw().
 1.33 03-Jul-2019  maxv Inline x86_cpuid2(), prerequisite for future changes. Also, add "memory"
on certain other inlines, to make sure GCC does not reorder.
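
For readers unfamiliar with the "memory" clobber, here is a minimal sketch of
what it does in GCC/Clang inline assembly. The names compiler_barrier(),
publish(), shared_data and shared_ready are hypothetical. Note that an empty
asm statement with a "memory" clobber only constrains the compiler; it emits
no instruction and does not order accesses at the CPU level.

```c
/*
 * Compiler-only barrier: emits no instruction, but the compiler may not
 * move memory accesses across it.
 */
static inline void
compiler_barrier(void)
{
	__asm volatile("" ::: "memory");
}

/* Hypothetical example: store the data, then set a flag. */
static int shared_data;
static int shared_ready;

static void
publish(int v)
{
	shared_data = v;
	compiler_barrier();	/* keep the store above ahead of the flag */
	shared_ready = 1;
}
```

Adding "memory" to an inline that executes a real instruction, such as wrmsr,
has the same effect on the compiler around that instruction.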
 1.32 30-May-2019  christos use __asm
 1.31 29-May-2019  maxv Add PCID support in SVS. This avoids TLB flushes during kernel<->user
transitions, which greatly reduces the performance penalty introduced by
SVS.

We use two ASIDs, 0 (kern) and 1 (user), and use invpcid to flush pages
in both ASIDs.
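
As a rough sketch of how the two ASIDs fit into a CR3 value when PCID is
enabled: the low 12 bits of CR3 carry the PCID, and bit 63 of the value
written asks the CPU not to flush the TLB entries of that PCID. The macro and
function names below are hypothetical and are not the ones used in the NetBSD
sources.

```c
#include <stdint.h>

#define PCID_KERN	0ULL		/* ASID for the kernel page tables */
#define PCID_USER	1ULL		/* ASID for the user page tables */
#define CR3_PCID_MASK	0xfffULL	/* CR3 bits 11:0 hold the PCID */
#define CR3_NO_FLUSH	(1ULL << 63)	/* keep TLB entries on CR3 load */

/* Build a CR3 value from a page-directory physical address and a PCID. */
static inline uint64_t
make_cr3(uint64_t pdir_pa, uint64_t pcid, int noflush)
{
	uint64_t cr3 = (pdir_pa & ~CR3_PCID_MASK) | (pcid & CR3_PCID_MASK);

	if (noflush)
		cr3 |= CR3_NO_FLUSH;
	return cr3;
}
```

With this encoding a kernel<->user transition can switch page tables without
discarding the TLB entries tagged with the other ASID.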

The read-only machdep.svs.pcid={0,1} sysctl is added, and indicates whether
SVS+PCID is in use.
 1.30 11-May-2019  christos Undo previous, fixed in userland.
 1.29 11-May-2019  christos expose the {rd,wr}msr functions to userland and install the header for
the benefit of cpuctl (fix the build).
 1.28 09-May-2019  bouyer sti/cli are not allowed on Xen, we have to clear/set a bit in the
shared page. Revert x86_disable_intr/x86_enable_intr to plain function
calls on XENPV.
While there, clean up unused functions and macros, and change cli()/sti()
macros to x86_disable_intr/x86_enable_intr.
Makes Xen domU boot again
(http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/)
 1.27 04-May-2019  maxv More inlined ASM. While here switch to proper types.
 1.26 01-May-2019  maxv Start converting the x86 CPU functions to inlined ASM. Matters for NVMM,
where some are invoked millions of times.
 1.25 01-May-2019  maxv Remove unused functions and reorder a little.
 1.24 22-Feb-2018  maxv branches: 1.24.4;
Improve the SVS initialization.

Declare x86_patch_window_open() and x86_patch_window_close(), and globalify
x86_hotpatch().

Introduce svs_enable() in x86/svs.c, that does the SVS hotpatching.

Change svs_init() to take a bool. This function gets called twice: early,
when the system has just booted (and nothing is initialized), and later, when
at least pmap_kernel has been initialized.
 1.23 15-Oct-2017  maxv Add setds and setes, will be useful in the future.
 1.22 13-Dec-2016  kamil branches: 1.22.8;
Tear down KSTACK_CHECK_DR0, an i386-only feature to detect stack overflow

This feature was intended to detect stack overflow with CPU Debug Registers
(x86). It was never ported to other ports, not even amd64, and would need to
be adapted for SMP...

Currently there might be better ways to detect stack overflows, like page
mapping protection. Since the number of Debug Registers is restricted
(4 on x86), tear it down completely.

This interface introduced helper functions for Debug Registers; they will
be replaced with the new <x86/dbregs.h> interface.

KSTACK_CHECK_DR0 was disabled by default and won't affect ordinary users.

Sponsored by <The NetBSD Foundation>
 1.21 13-Dec-2016  kamil Switch x86 CPU Debug Register types from vaddr_t to register_t

This is a more opaque and appropriate type, as vaddr_t is meant to be used
for virtual address values. Not all DRs on x86 are used to represent a virtual
address (DR6 and DR7 are definitely not).

No functional change intended.

Change suggested by <christos>

Sponsored by <The NetBSD Foundation>
 1.20 27-Nov-2016  kamil Add accessors for available x86 Debug Registers

There are 8 Debug Registers on i386 (available at least since 80386) and 16
on AMD64. Currently DR4 and DR5 are reserved on both cpu-families and
DR9-DR15 are still reserved on AMD64. Therefore add accessors for DR0-DR3,
DR6-DR7 for all ports.

Debug Registers x86:
* DR0-DR3 Debug Address Registers
* DR4-DR5 Reserved
* DR6 Debug Status Register
* DR7 Debug Control Register
* DR8-DR15 Reserved

Access to the registers is available only from the kernel (ring 0), as
privileged access is required. For this reason, special XEN functions are
needed to get and set the registers in the XEN3 kernels.

XEN specific functions as defined in NetBSD:
- HYPERVISOR_get_debugreg()
- HYPERVISOR_set_debugreg()

This code extends the existing rdr6() and ldr6() accessors with additional ones:
- rdr0() & ldr0()
- rdr1() & ldr1()
- rdr2() & ldr2()
- rdr3() & ldr3()
- rdr7() & ldr7()

Traditionally the accessors for DR6 took a vaddr_t argument. While that is
an appropriate type for DR0-DR3, DR6-DR7 should use u_long; however, it's
not a big deal. The resulting functionality should be equivalent, so stick
to this convention and use the vaddr_t type for all DR accessors.

There was already a function defined for rdr6() in XEN, but it had a nit on
AMD64 as it was casting HYPERVISOR_get_debugreg() to u_int (32-bit on
AMD64), truncating the result. It still works for DR6, but for the sake of
simplicity always return the full 64-bit value.
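
The truncation described above can be reproduced with a small userland
sketch. The function names are hypothetical, and fake_get_debugreg() is only
a stand-in for a call that returns a full-width register value; it is not the
Xen hypercall.

```c
#include <stdint.h>

/* Stand-in for a call returning a full-width register value. */
static uint64_t
fake_get_debugreg(uint64_t stored)
{
	return stored;
}

/* Old style: the cast to unsigned int drops the upper 32 bits. */
static uint64_t
read_dr_truncated(uint64_t stored)
{
	return (unsigned int)fake_get_debugreg(stored);
}

/* New style: return the value at full width. */
static uint64_t
read_dr_full(uint64_t stored)
{
	return fake_get_debugreg(stored);
}
```

A small status-style value survives the cast unchanged, which is why the old
code still worked for DR6, while a 64-bit kernel address stored in DR0-DR3
would lose its upper half.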

The new accessors duplicate functionality of the dr0() function available on
i386 within the KSTACK_CHECK_DR0 option. dr0() is a specialized layer with
logic to set appropriate types of interrupts, whereas the new accessors are
designed to pass verbatim values from user-land (with simple sanity checks in
the kernel). At the moment there are no plans to make it possible for
KSTACK_CHECK_DR0 to coexist with debug registers for user applications
(debuggers).

options KSTACK_CHECK_DR0
Detect kernel stack overflow using DR0 register. This option uses DR0
register exclusively so you can't use DR0 register for other purpose
(e.g., hardware breakpoint) if you turn this on.

The KSTACK_CHECK_DR0 functionality was designed for i386 and never ported
to amd64.

Code tested on i386 and amd64 with kernels: GENERIC, XEN3_DOMU, XEN3_DOM0.

Sponsored by <The NetBSD Foundation>
 1.19 05-Jan-2016  hannken branches: 1.19.2;
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.

As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.18 25-Feb-2014  dsl branches: 1.18.4; 1.18.6; 1.18.8;
Add support for saving the AVX-256 ymm registers during FPU context switches.
Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all cpus, I'm not brave enough to enable XSAVEOPT.
 1.17 13-Feb-2014  dsl Check the argument types for the fpu asm functions.
 1.16 12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.15 09-Feb-2014  dsl Add x86_stmxcsr for amd64.
 1.14 08-Dec-2013  dsl Add some definitions for cpu 'extended state'.
These are needed for support of the AVX SIMD instructions.
Nothing yet uses them.
 1.13 24-Sep-2011  jym branches: 1.13.2; 1.13.8; 1.13.12; 1.13.14; 1.13.16; 1.13.22;
Import rdmsr_safe(msr, *value) for x86 world. It allows reading MSRs
in a safe way by handling the fault that might trigger for certain
register <> CPU/arch combos.
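
The calling convention named above, rdmsr_safe(msr, *value), can be
illustrated with a userland simulation. Everything here is an assumption made
for the sake of the sketch: the table of fake MSRs, the name rdmsr_safe_sim()
and the choice of EFAULT as the error code for a read that would fault.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated MSR space; the entry is arbitrary demo data. */
struct fake_msr {
	uint32_t msr;
	uint64_t val;
};

static const struct fake_msr fake_msrs[] = {
	{ 0x10, 0x123456789abcULL },
};

/*
 * Store the value and return 0 if the register exists; otherwise return
 * an error instead of letting the access fault.
 */
static int
rdmsr_safe_sim(uint32_t msr, uint64_t *value)
{
	size_t i;

	for (i = 0; i < sizeof(fake_msrs) / sizeof(fake_msrs[0]); i++) {
		if (fake_msrs[i].msr == msr) {
			*value = fake_msrs[i].val;
			return 0;
		}
	}
	return EFAULT;	/* assumed error code for a faulting read */
}
```

Callers check the return value before trusting the output parameter, which
lets probing code try registers that may not exist on a given CPU.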

Requested by Jukka. Patch adapted from one found in DragonflyBSD.
 1.12 07-Jul-2010  chs add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.11 27-Jan-2009  christos branches: 1.11.2; 1.11.4; 1.11.6;
factor out common reset code.
 1.10 19-Dec-2008  cegger x86_patch() is not available on Xen.
Make Xen kernels link again.
 1.9 19-Dec-2008  ad PR kern/40213 my i386 machine can't boot because of tsc

- Patch in atomic_cas_64() twice. The first patch is early and makes the
MP-atomic version available if we have cmpxchg8b. The second patch
strips the lock prefix if ncpu==1.

- Fix the i486 atomic_cas_64() to not unconditionally enable interrupts.
 1.8 30-Apr-2008  cegger branches: 1.8.8; 1.8.10;
AMD's APM Volume 2 says 'All control registers are 64bit in long mode'.
Fix the CR0 prototype to match this (the asm implementation is correct though).
OK ad
 1.7 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.6 27-Apr-2008  ad branches: 1.6.2;
+lcr2
 1.5 16-Apr-2008  cegger branches: 1.5.2;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.4 01-Jan-2008  yamt branches: 1.4.6;
add x86_cpuid2, which can specify ecx register.
 1.3 15-Nov-2007  ad branches: 1.3.6;
Remove support for 80386 level CPUs. PR port-i386/36163.
 1.2 26-Sep-2007  ad branches: 1.2.2; 1.2.4; 1.2.6; 1.2.8; 1.2.10; 1.2.12; 1.2.14;
Update copyright.
 1.1 26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp proper functions, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.2.14.3 09-Jan-2008  matt sync with HEAD
 1.2.14.2 06-Nov-2007  matt sync with HEAD
 1.2.14.1 26-Sep-2007  matt file cpufunc.h was added on branch matt-armv6 on 2007-11-06 23:23:34 +0000
 1.2.12.2 18-Feb-2008  mjf Sync with HEAD.
 1.2.12.1 19-Nov-2007  mjf Sync with HEAD.
 1.2.10.4 21-Jan-2008  yamt sync with head
 1.2.10.3 07-Dec-2007  yamt sync with head
 1.2.10.2 27-Oct-2007  yamt sync with head.
 1.2.10.1 26-Sep-2007  yamt file cpufunc.h was added on branch yamt-lazymbuf on 2007-10-27 11:28:54 +0000
 1.2.8.1 18-Nov-2007  bouyer Sync with HEAD
 1.2.6.3 03-Dec-2007  ad Sync with HEAD.
 1.2.6.2 09-Oct-2007  ad Sync with head.
 1.2.6.1 26-Sep-2007  ad file cpufunc.h was added on branch vmlocking on 2007-10-09 13:38:41 +0000
 1.2.4.2 06-Oct-2007  yamt sync with head.
 1.2.4.1 26-Sep-2007  yamt file cpufunc.h was added on branch yamt-x86pmap on 2007-10-06 15:33:31 +0000
 1.2.2.3 21-Nov-2007  joerg Sync with HEAD.
 1.2.2.2 02-Oct-2007  joerg Sync with HEAD.
 1.2.2.1 26-Sep-2007  joerg file cpufunc.h was added on branch jmcneill-pm on 2007-10-02 18:27:49 +0000
 1.3.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.4.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.4.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.2.1 18-May-2008  yamt sync with head.
 1.6.2.3 11-Aug-2010  yamt sync with head.
 1.6.2.2 04-May-2009  yamt sync with head.
 1.6.2.1 16-May-2008  yamt sync with head.
 1.8.10.4 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1969):
sys/arch/x86/include/cpufunc.h: revision 1.13
sys/arch/amd64/amd64/cpufunc.S: revision 1.20-1.21 via patch
sys/arch/i386/i386/cpufunc.S: revision 1.16-1.17, 1.21 via patch

Backport rdmsr_safe() to access MSR safely.
 1.8.10.3 02-Feb-2009  snj branches: 1.8.10.3.6; 1.8.10.3.10;
Pull up following revision(s) (requested by ad in ticket #396):
sys/arch/amd64/amd64/machdep.c: revision 1.122
sys/arch/i386/i386/machdep.c: revision 1.657
sys/arch/x86/include/cpufunc.h: revision 1.11
sys/arch/x86/x86/x86_machdep.c: revision 1.28
factor out common reset code.
 1.8.10.2 02-Feb-2009  snj Pull up following revision(s) (requested by bouyer in ticket #343):
sys/arch/x86/x86/identcpu.c: revision 1.13
sys/arch/x86/include/cpufunc.h: revision 1.10
x86_patch() is not available on Xen.
Make Xen kernels link again.
 1.8.10.1 02-Feb-2009  snj Pull up following revision(s) (requested by ad in ticket #343):
common/lib/libc/arch/i386/atomic/atomic.S: revision 1.14
sys/arch/x86/include/cpufunc.h: revision 1.9
sys/arch/x86/x86/identcpu.c: revision 1.12
sys/arch/x86/x86/cpu.c: revision 1.60
sys/arch/x86/x86/patch.c: revision 1.15
PR kern/40213 my i386 machine can't boot because of tsc
- Patch in atomic_cas_64() twice. The first patch is early and makes the
MP-atomic version available if we have cmpxchg8b. The second patch
strips the lock prefix if ncpu==1.
- Fix the i486 atomic_cas_64() to not unconditionally enable interrupts.
 1.8.10.3.10.1 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1969):
sys/arch/x86/include/cpufunc.h: revision 1.13
sys/arch/amd64/amd64/cpufunc.S: revision 1.20-1.21 via patch
sys/arch/i386/i386/cpufunc.S: revision 1.16-1.17, 1.21 via patch

Backport rdmsr_safe() to access MSR safely.
 1.8.10.3.6.1 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1969):
sys/arch/x86/include/cpufunc.h: revision 1.13
sys/arch/amd64/amd64/cpufunc.S: revision 1.20-1.21 via patch
sys/arch/i386/i386/cpufunc.S: revision 1.16-1.17, 1.21 via patch

Backport rdmsr_safe() to access MSR safely.
 1.8.8.2 03-Mar-2009  skrll Sync with HEAD.
 1.8.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.11.6.1 05-Mar-2011  rmind sync with head
 1.11.4.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.11.2.1 24-Oct-2010  jym Sync with HEAD
 1.13.22.1 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.13.16.1 18-May-2014  rmind sync with head
 1.13.14.1 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.13.12.2 03-Dec-2017  jdolecek update from HEAD
 1.13.12.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.13.8.1 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.13.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.18.8.1 06-Feb-2016  snj Pull up following revision(s) (requested by hannken in ticket #1073):
sys/arch/x86/x86/errata.c: revision 1.23
sys/arch/x86/include/cpufunc.h: revision 1.19
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.18.6.3 05-Feb-2017  skrll Sync with HEAD
 1.18.6.2 05-Dec-2016  skrll Sync with HEAD
 1.18.6.1 19-Mar-2016  skrll Sync with HEAD
 1.18.4.1 26-Jan-2016  snj Pull up following revision(s) (requested by hannken in ticket #1073):
sys/arch/x86/x86/errata.c: revision 1.23
sys/arch/x86/include/cpufunc.h: revision 1.19
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.19.2.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.22.8.1 06-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #603:

amd64/conf/kern.ldscript 1.25 (patch)
amd64/conf/kern.ldscript.Xen 1.14 (patch)
i386/conf/kern.ldscript 1.21 (patch)
i386/conf/kern.ldscript.Xen 1.15 (patch)
x86/include/cpufunc.h 1.24 (patch)
x86/x86/patch.c 1.25 (partial) 1.26 (partial)

Backport x86_hotpatch.
 1.24.4.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.24.4.1 10-Jun-2019  christos Sync with HEAD
 1.34.2.1 16-Oct-2019  martin Pull up following revision(s) (requested by maxv in ticket #338):

sys/arch/x86/include/cpufunc.h: revision 1.35

Add a memory barrier on wrmsr, because some MSRs control memory access
rights (we don't use them though). Also add barriers on fninit and clts
for safety.
 1.37.6.1 15-Apr-2020  bouyer On amd64, always use the cmpxchg8b version of spllower. All x86_64 hosts should
have it and we already rely on it in lock stubs.
On i386, always use i686_mutex_spin_exit and cx8_spllower for Xen;
Xen doesn't run on CPUs lacking the required instructions anyway.
Skip x86_patch only for XENPV, and adjust for changes in assembly functions.
Tested on Xen PV and PVHVM, and on bare metal core i5.
 1.4 08-Dec-2013  dsl Remove the now-unused CPU_MAXMODEL and CPU_DEFMODEL
 1.3 27-Jan-2011  bouyer branches: 1.3.4; 1.3.14; 1.3.18;
Properly identify vortex86 CPUs.
 1.2 11-May-2008  ad branches: 1.2.12; 1.2.20; 1.2.26; 1.2.28;
Re-base the cpu types at 0 so they can be used as an array index.
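
A minimal sketch of why a zero-based enumeration is convenient here: the
values can index a table directly. The enumerators, the table contents and
cpu_type_name() are hypothetical and do not reproduce the contents of
cputypes.h.

```c
/* Hypothetical zero-based type codes. */
enum cpu_type_sketch {
	CPUTYPE_FIRST = 0,	/* starts at 0, so it indexes the table */
	CPUTYPE_SECOND,
	CPUTYPE_THIRD,
	CPUTYPE_COUNT		/* number of entries */
};

static const char *const cpu_type_names[CPUTYPE_COUNT] = {
	[CPUTYPE_FIRST]  = "first",
	[CPUTYPE_SECOND] = "second",
	[CPUTYPE_THIRD]  = "third",
};

/* Look up a name, guarding against out-of-range values. */
static const char *
cpu_type_name(unsigned int t)
{
	return t < CPUTYPE_COUNT ? cpu_type_names[t] : "unknown";
}
```

With a non-zero base every lookup would need an offset subtraction, or the
table would carry unused leading slots.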
 1.1 01-Jan-2007  ad branches: 1.1.2; 1.1.6; 1.1.20; 1.1.32; 1.1.52; 1.1.54; 1.1.56; 1.1.58;
Report on and where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.1.58.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.1.56.1 16-May-2008  yamt sync with head.
 1.1.54.1 18-May-2008  yamt sync with head.
 1.1.52.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.32.2 03-Sep-2007  wrstuden Sync w/ NetBSD-4-RC_1
 1.1.32.1 01-Jan-2007  wrstuden file cputypes.h was added on branch wrstuden-fixsa on 2007-09-03 07:04:12 +0000
 1.1.20.2 05-Jun-2007  bouyer Pull up following revision(s) (requested by xtraeme in ticket 702):
sys/arch/amd64/amd64/identcpu.c patch
sys/arch/amd64/include/cpu.h patch
sys/arch/x86/include/cputypes.h 1.1
Print all extended features for Intel EM64T CPUs on amd64.
 1.1.20.1 01-Jan-2007  bouyer file cputypes.h was added on branch netbsd-4 on 2007-06-05 20:28:11 +0000
 1.1.6.2 26-Feb-2007  yamt sync with head.
 1.1.6.1 01-Jan-2007  yamt file cputypes.h was added on branch yamt-lazymbuf on 2007-02-26 09:08:47 +0000
 1.1.2.2 12-Jan-2007  ad Sync with head.
 1.1.2.1 01-Jan-2007  ad file cputypes.h was added on branch newlock2 on 2007-01-12 01:49:08 +0000
 1.2.28.1 08-Feb-2011  bouyer Sync with HEAD
 1.2.26.1 06-Jun-2011  jruoho Sync with HEAD.
 1.2.20.1 05-Mar-2011  rmind sync with head
 1.2.12.1 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.3.18.1 18-May-2014  rmind sync with head
 1.3.14.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.3.4.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.55 01-May-2025  imil Introduce cpu_max_hypervisor_cpuid to cache hypervisor CPUID leaf

This variable stores the maximum supported hypervisor CPUID leaf so that
future checks can avoid repeated calls to x86_cpuid().
 1.54 11-Apr-2025  imil nvmm(4): implement CPUID leaf 0x40000010, VMware compatible TSC and LAPIC
frequency detection. Partially fixes PR kern/59170
 1.53 14-Jul-2020  yamaguchi branches: 1.53.26;
Introduce per-cpu IDTs

This is realized by following modifications:
- Add IDT pages and its allocation maps for each cpu in "struct cpu_info"
- Load per-cpu IDTs at cpu_init_idt(struct cpu_info*)
- Copy the IDT entries for cpu0 to other CPUs at attach
- These are, for example, exceptions, db, system calls, etc.

And, added a kernel option named PCPU_IDT to enable the feature.
 1.52 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.51 11-Feb-2019  cherry branches: 1.51.10;
We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.50 23-May-2017  nonaka branches: 1.50.10;
x86: Add preliminary x2APIC support.

x2APIC is used only when x2APIC is enabled in BIOS/UEFI.
LAPIC ID is not supported above 256.
 1.49 19-Apr-2017  nonaka remove prototypes of nonexistent function.
 1.48 13-Jan-2017  christos branches: 1.48.2;
Add missing forward decl.
 1.47 13-Dec-2015  maxv branches: 1.47.2;
Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.46 20-Apr-2012  rmind branches: 1.46.2; 1.46.14; 1.46.16; 1.46.18;
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
 1.45 13-Aug-2011  cherry branches: 1.45.2; 1.45.6; 1.45.8;
MP probing and startup code
 1.44 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.43 04-Mar-2011  jruoho branches: 1.43.2;
Move INTEL_ONDEMAND_CLOCKMOD -- or odcm(4) -- to the cpufeaturebus.
 1.42 24-Feb-2011  jruoho Fix autoconf(9) of cpufeaturebus.
 1.41 23-Feb-2011  jruoho Move ENHANCED_SPEEDSTEP, or henceforth est(4), to the cpufeaturebus.
 1.40 20-Feb-2011  jruoho Modularize coretemp(4). Ok jmcneill@.
 1.39 19-Feb-2011  jmcneill modularize VIA PadLock support
- retire options VIA_PADLOCK, replace with 'padlock0 at cpu0'
- driver supports attach & detach
- support building as a module
 1.38 20-Aug-2010  jruoho branches: 1.38.2; 1.38.4;
Revert all previous changes that were made naively believing that the
existing CPU power management implementations could peacefully coexist with
the acpicpu(4) driver. The following options can not be used with acpicpu(4):
ENHANCED_SPEEDSTEP, INTEL_ONDEMAND_CLOCKMOD, POWERNOW_K7, and POWERNOW_K8.
 1.37 09-Aug-2010  jruoho Revert the previous changes to EST. The used hack had an obvious flaw:
the acpicpu(4) driver should attach even if the existing frequency management
code fails to attach, mainly because ACPI is the only proper way to deal
with EST on new Intel systems.

Use a more drastic hack to deal with this: when acpicpu(4) attaches, it tears
down any existing sysctl(8) controls and installs identical ones in place.
Upon detachment, the initialization function of the existing EST is called.
Rename these so that the same pointers may be used in other parts.
 1.35 08-Aug-2010  jruoho Merge P-state support for acpicpu(4).

Remarks:

1. All processors (x86 or not) for which the vendor has implemented
ACPI I/O access routines are supported. Native instructions are
currently supported only for Intel's "Enhanced Speedstep". Code for
"PowerNow!" (AMD) will be merged later. Native support for VIA's
"PowerSaver" will be investigated.

2. Backwards compatibility with existing userland code is maintained.
Comparable to the case with cpu_idle(9), the ACPI CPU driver
installs alternative functions for the existing sysctl(8) controls.
The "native" behavior (if any) is restored upon detachment.

3. The dynamic nature of ACPI-provided P-states needs more investigation.
The maximum frequency induced (but not forced) by the firmware may
change dynamically. Currently, the sysctl(8) controls error out with
a value larger than the dynamic maximum. The code itself does not
however yet react to the notifications from the firmware by changing
the frequencies in-place. Presumably the system administrator should
be able to choose whether to use dynamic or static frequencies.
 1.34 04-Aug-2010  jruoho Store the MADT-derived CPU ID to <x86/cpu.h>. This is required to properly
match the ACPI processor object ID with the ID available in the APIC table.
 1.33 06-Jul-2010  cegger Turn PMAP_NOCACHE into MI flag.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.

hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.

x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.

Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html

No comments on this last version.
 1.32 18-Apr-2010  jym This patch fixes the NX regression issue observed on amd64 kernels, where
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).

- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).

- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.

- replace checks against CPUID_TSC with the cpu_hascounter() function.

- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().

- use cpu_init_msrs() for i386. It will be eventually used later for NX
feature under i386 PAE kernels.

- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().

- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).
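
The per-CPU consistency check mentioned in the first item above can be
sketched as a word-by-word comparison of feature arrays. The array width
NFEAT_WORDS and the function feat_mismatches() are hypothetical; the real
kernel code and its array size are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

#define NFEAT_WORDS	4	/* hypothetical number of feature words */

/*
 * Count the feature words in which a CPU differs from the boot
 * processor. A caller could emit one warning per CPU when the result
 * is non-zero.
 */
static size_t
feat_mismatches(const uint32_t bp[NFEAT_WORDS], const uint32_t cpu[NFEAT_WORDS])
{
	size_t i, n = 0;

	for (i = 0; i < NFEAT_WORDS; i++) {
		if (bp[i] != cpu[i])
			n++;
	}
	return n;
}
```

Keeping the features in an array rather than in separate variables is what
makes a generic loop like this possible.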

This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.

XXX Should kernel rev be bumped?

XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
 1.31 02-Oct-2009  jmcneill branches: 1.31.2; 1.31.4;
Add support for VIA C7 temperature sensors (options VIA_C7TEMP)
 1.30 02-Oct-2009  jmcneill Use the TSC and current multiplier to calculate bus clock on VIA C7 Esther.
Probably needed for all C7 and Nano processors, but to be safe only use
this alternate method on Esther for now.

Now est on my C7-M 1.6GHz properly reports frequencies from 1600 to 400,
instead of 2133 to 533.
 1.29 05-Aug-2009  jym Add Intel SpeedStep and AMD PowerNow! support in Xen dom0. MSR operations
are now compiled in by default.

Note that MSR support in Xen depends on its version. rdmsr() should always
succeed, but wrmsr() to certain registers can end in a NOOP. In that case,
the error will be logged (see xm dmesg).

Setting CPU frequency (SpeedStep) requires Xen 3.3 with the option
cpufreq="dom0-kernel" passed down to hypervisor during boot.

Compiled and tested for SpeedStep under i386 for XEN3_DOM0 and XEN3PAE_DOM0
by jym@. amd64 was tested by Joel Carnat.

See also http://mail-index.netbsd.org/port-xen/2009/08/02/msg005213.html .

Commit requested by bouyer@.
 1.28 11-Mar-2009  yamt wrap opt_* includes with _KERNEL_OPT.
(i forgot to commit this with the tprof modules yesterday.)
 1.27 13-May-2008  ad branches: 1.27.6; 1.27.8; 1.27.12; 1.27.14; 1.27.16;
Be more conservative during AP startup. Don't let the AP access the lapic
or do any setup until the boot processor has finished the init sequence,
and add a few more delays.
 1.26 11-May-2008  ad Stop using APIC IDs to identify CPUs for software purposes. Allows for
APIC IDs beyond 31, which has been possible for some time now.
 1.25 09-May-2008  joerg Make cpu_idle a macro calling a function pointer on x86.
Select the Xen idle routine for Xen, mwait if supported by the CPU and
it is not AMD and halt otherwise. As reported by Christoph Egger,
AMD Barcelona keeps the CPU in C0 state with MWAIT, contrary to HLT,
which uses C1 and therefore much less power.
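
The selection logic described above can be sketched with a function pointer
and a macro. All names here (idle_select(), cpu_idle_sketch(), the idle_*
routines and last_idle) are hypothetical; the stub routines only record which
one ran, where a kernel would execute the actual idle instruction.

```c
#include <stdbool.h>

enum { IDLE_XEN = 1, IDLE_MWAIT, IDLE_HALT };

static int last_idle;	/* records which routine ran, for demonstration */

static void idle_xen(void)   { last_idle = IDLE_XEN; }
static void idle_mwait(void) { last_idle = IDLE_MWAIT; }
static void idle_halt(void)  { last_idle = IDLE_HALT; }

/* The idle entry point is a macro that calls through a pointer. */
static void (*idle_fn)(void) = idle_halt;
#define cpu_idle_sketch()	((*idle_fn)())

/* Pick the routine once, following the rules in the commit message. */
static void
idle_select(bool is_xen, bool has_mwait, bool is_amd)
{
	if (is_xen)
		idle_fn = idle_xen;
	else if (has_mwait && !is_amd)
		idle_fn = idle_mwait;	/* mwait is skipped on AMD */
	else
		idle_fn = idle_halt;
}
```

Deciding once at startup keeps the idle path itself free of vendor and
feature checks.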
 1.24 28-Apr-2008  martin branches: 1.24.2;
Remove clause 3 and 4 from TNF licenses
 1.23 16-Apr-2008  cegger branches: 1.23.2; 1.23.4;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.22 04-Jan-2008  yamt branches: 1.22.6;
i386:
- make tss per-cpu. this considerably speeds up context switch for,
at least, pentium4, where ltr instruction seems very slow.
i386, xen:
- kill cpu_maxproc.
kvm86:
- adapt to per-cpu tss.
- cleanup and simplify.
- move kvm86_mp_lock to more meaningful place.
- disable preemption during a call.
 1.21 01-Jan-2008  yamt try to detect processor resource sharing topologies. ie. package/core/smt IDs.
 1.20 18-Dec-2007  joerg Add new IPI for saving CPU state explicitly, share high-level part of
ACPI wakeup code and teach it how to start the APs again. As a side
effect the CPU_START interface allows choosing between different
bootstrap codes more easily now.
 1.19 15-Nov-2007  ad branches: 1.19.2; 1.19.6;
Disable TLB shootdown IPIs while in the debugger. Crashdumps may try to
use them, and all but one CPU is paused. Reported and tested by martin@.
 1.18 13-Nov-2007  ad In cpu_hatch(), recompute ci_tsc_freq instead of using the boot CPU's value.
 1.17 12-Nov-2007  ad - cpu_vendor was both an int and char[] on amd64 - fix it.
- Run the errata check/patch on all CPUs, not just the boot processor.
 1.16 29-Oct-2007  xtraeme branches: 1.16.2;
Add coretemp(4). A new driver for Intel Core's on-die thermal sensor,
available on Intel Core or newer CPUs.

Ported from FreeBSD. Tested by rmind on i386 and joerg on amd64.

Enabled with "options INTEL_CORETEMP".
 1.15 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.14 01-Jul-2007  xtraeme branches: 1.14.8; 1.14.10; 1.14.14;
Add support for the VIA C7-M and Eden processors in the
Enhanced Speedstep driver.

Tested by Heron Gallegos <gallegos at csxxi dot net dot mx>
 1.13 03-Jun-2007  xtraeme Make the Enhanced Speedstep driver available for i386 and amd64.
It can be used on EM64T CPUs that support the EST CPUID feature. Note that
some CPUs still don't work with this driver, like Xeon or Pentium 4.

Move the p[34]_get_bus_clock functions into their own file,
intel_busclock.c and remove this code from i386/identcpu.c.

Tested on i386 by myself and amd64 by Tonerre.
 1.12 21-Mar-2007  xtraeme branches: 1.12.4;
Don't build msr_ipifuncs on Xen, fixes the build with XEN2_DOM0.
 1.11 20-Mar-2007  xtraeme Driver for Intel Thermal Monitor (feature TM) On-Demand Clock
Modulation.

This works by changing the duty cycle of the clock modulation in
software, which saves power and helps keep the temperature from
rising.

Adapted from OpenBSD/FreeBSD's p4tcc.

To enable it one must use "options INTEL_ONDEMAND_CLOCKMOD".

Tested by me in UP and SMP, ok'ed by Matthew R. Green.
 1.10 15-Mar-2007  xtraeme Ok... there were people really angry with this, backing it out.
 1.9 15-Mar-2007  xtraeme Add a driver for the Pentium 4 and later models with feature TM
(Thermal Monitor).

This driver will throttle the CPU clock modulation, saving some
power, also known as ODMC (On Demand Modulation Clock).

The processor can change from 12.5% to 100% (there are two errata,
so two levels might be skipped in the worst case).

If supported, you'll see the following sysctl sub-tree:

machdep.p4tcc.throttling.target: CPU Clock throttling state (0 = lowest, 7 = highest)
machdep.p4tcc.throttling.current: current CPU throttling state
machdep.p4tcc.throttling.available: list of CPU Clock throttling states

machdep.p4tcc.throttling.target = 2
machdep.p4tcc.throttling.current = 2
machdep.p4tcc.throttling.available = 7 6 5 4 3 2

Adapted from OpenBSD/FreeBSD.
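
As a rough illustration of the throttling states listed above: assuming the eight states are evenly spaced, which the 12.5% to 100% range suggests but the commit does not state, a state number could be converted to a duty cycle as sketched here. The helper name is hypothetical and is not part of the driver.

```c
/*
 * Assumption: states 0..7 map linearly onto duty cycles 12.5%..100%.
 * Returns the duty cycle in tenths of a percent, or -1 if the state
 * is out of range.
 */
static int
p4tcc_duty_tenths_sketch(int state)
{
	if (state < 0 || state > 7)
		return -1;
	return (state + 1) * 125;
}
```

Tenths of a percent are used so that 12.5% steps stay in integer arithmetic.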
 1.8 06-Mar-2007  yamt branches: 1.8.2; 1.8.4; 1.8.6;
multiple inclusion protection.
 1.7 05-Mar-2007  drochner clean up how cpus and ioapics are attached at the mainbus:
Separate "cpubus" and "ioapicbus" -- while they share a common "address
space" (the apic id), the kernel doesn't use this fact. There are different
data passed to cpus and apics, which caused some ugly polymorphism. This
also saves the special "submatch" functions needed to distinguish cpus
and ioapics for autoconf. (And it makes that "apid" locators wired
in the kernel configuration are honored now; this allows one to dumb down
an mp box to singleprocessor by userconfig.)
Print "apid" locators in the buses "print" function "as everyone does",
so the per-port cpu drivers don't need to do it.
Being here, constify "struct cpu_functions" and g/c the unused MP_PICMODE
flag.
 1.6 01-Jan-2007  ad branches: 1.6.2;
Report on and, where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.5 08-Aug-2006  cube branches: 1.5.2; 1.5.6; 1.5.8;
files.x86 isn't included by Xen kernels, so opt_powernow_k8.h never gets
created by config(1), and thus it's not safe to use it in cpuvar.h.

Simply declare the prototype for k8_powernow_init in powernow.h. No need
to #ifdef protect a prototype, after all, only its users.

Un-breaks build of Xen kernels.
 1.4 07-Aug-2006  xtraeme branches: 1.4.2;
* Do not change struct powernow_pst_s (I added another member in my
previous patch) and this MUST be of that size, otherwise the tables
won't be found.

* powernow_k8.c moved into x86/x86, it should work on both i386 and amd64.

* Added more DPRINTFs needed to find the first problem.

* Create "machdep.powernow.frequency" again, I can't remember why I
removed frequency... it should work with estd now.

* Do not try to call k[78]_powernow_init() if cpu is not AMD (thanks
to christos).

And more things I can't remember, but this time it will work in
Athlon 64 cpus and it won't crash in EM64T cpus.
 1.3 27-Oct-2003  junyoung branches: 1.3.16; 1.3.30;
Nuke __P().
 1.2 23-Jun-2003  martin branches: 1.2.2;
Make sure to include opt_foo.h if a defflag option FOO is used.
 1.1 01-Mar-2003  fvdl Moved here from i386/include
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.30.1 09-Sep-2006  rpaulo sync with head
 1.3.16.5 21-Jan-2008  yamt sync with head
 1.3.16.4 07-Dec-2007  yamt sync with head
 1.3.16.3 15-Nov-2007  yamt sync with head.
 1.3.16.2 03-Sep-2007  yamt sync with head.
 1.3.16.1 26-Feb-2007  yamt sync with head.
 1.4.2.1 08-Aug-2006  tron Pull up following revision(s) (requested by cube in ticket #7):
sys/arch/x86/include/cpuvar.h: revision 1.5
sys/arch/x86/include/powernow.h: revision 1.3
files.x86 isn't included by Xen kernels, so opt_powernow_k8.h never gets
created by config(1), and thus it's not safe to use it in cpuvar.h.
Simply declare the prototype for k8_powernow_init in powernow.h. No need
to #ifdef protect a prototype, after all, only its users.
Un-breaks build of Xen kernels.
 1.5.8.1 23-Sep-2007  wrstuden Sync with somewhat-recent netbsd-4.
 1.5.6.1 12-Sep-2007  msaitoh Pull up following patches (requested by xtraeme in ticket #809)

share/man/man4/options.4 patch
sys/arch/i386/conf/files.i386 patch
sys/arch/i386/i386/est.c delete
sys/arch/i386/i386/identcpu.c patch
sys/arch/i386/include/cpu.h patch
sys/arch/x86/conf/files.x86 patch
sys/arch/x86/include/cpuvar.h patch
sys/arch/x86/x86/est.c new file
sys/arch/x86/x86/intel_busclock.c new file
sys/arch/amd64/amd64/identcpu.c patch
sys/arch/amd64/conf/GENERIC patch

Add support for the VIA C7-M and Eden processors in the Enhanced
Speedstep driver.
amd64: The Enhanced Speedstep driver is now able to work on EM64T
CPUs running in 64bit mode.
 1.5.2.1 12-Jan-2007  ad Sync with head.
 1.6.2.2 24-Mar-2007  yamt sync with head.
 1.6.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.8.6.1 29-Mar-2007  reinoud Pullup to -current
 1.8.4.1 11-Jul-2007  mjf Sync with head.
 1.8.2.6 03-Dec-2007  ad Sync with HEAD.
 1.8.2.5 03-Dec-2007  ad Sync with HEAD.
 1.8.2.4 01-Nov-2007  ad - Fix interactivity problems under high load. Because soft interrupts
are being stacked on top of regular LWPs, more often than not aston()
was being called on a soft interrupt thread instead of a user thread,
meaning that preemption was not happening on EOI.

- Don't use bool in a couple of data structures. Sub-word writes are not
always atomic and may clobber other fields in the containing word.

- For SCHED_4BSD, make p_estcpu per thread (l_estcpu). Rework how the
dynamic priority level is calculated - it's much better behaved now.

- Kill the l_usrpri/l_priority split now that priorities are no longer
directly assigned by tsleep(). There are three fields describing LWP
priority:

l_priority: Dynamic priority calculated by the scheduler.
This does not change for kernel/realtime threads,
and always stays within the correct band. Eg for
timeshared LWPs it never moves out of the user
priority range. This is basically what l_usrpri
was before.

l_inheritedprio: Lent to the LWP due to priority inheritance
(turnstiles).

l_kpriority: A boolean value set true the first time an LWP
sleeps within the kernel. This indicates that the LWP
should get a priority boost as compensation for blocking.
lwp_eprio() now does the equivalent of sched_kpri() if
the flag is set. The flag is cleared in userret().

- Keep track of scheduling class (OTHER, FIFO, RR) in struct lwp, and use
this to make decisions in a few places where we previously tested for a
kernel thread.

- Partially fix itimers and usr/sys/intr time accounting in the presence
of software interrupts.

- Use kthread_create() to create idle LWPs. Move priority definitions
from the various modules into sys/param.h.

- newlwp -> lwp_create
 1.8.2.3 15-Jul-2007  ad Sync with head.
 1.8.2.2 09-Jun-2007  ad Sync with head.
 1.8.2.1 10-Apr-2007  ad Sync with head.
 1.12.4.2 03-Oct-2007  garbled Sync with HEAD
 1.12.4.1 26-Jun-2007  garbled Sync with HEAD.
 1.14.14.2 18-Nov-2007  bouyer Sync with HEAD
 1.14.14.1 13-Nov-2007  bouyer Sync with HEAD
 1.14.10.2 09-Jan-2008  matt sync with HEAD
 1.14.10.1 06-Nov-2007  matt sync with HEAD
 1.14.8.3 21-Nov-2007  joerg Sync with HEAD.
 1.14.8.2 14-Nov-2007  joerg Sync with HEAD.
 1.14.8.1 29-Oct-2007  joerg Sync with HEAD.
 1.16.2.3 18-Feb-2008  mjf Sync with HEAD.
 1.16.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.16.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.19.6.2 08-Jan-2008  bouyer Sync with HEAD
 1.19.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.19.2.1 26-Dec-2007  ad Sync with head.
 1.22.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.23.4.6 09-Oct-2010  yamt sync with head
 1.23.4.5 11-Aug-2010  yamt sync with head.
 1.23.4.4 11-Mar-2010  yamt sync with head
 1.23.4.3 19-Aug-2009  yamt sync with head.
 1.23.4.2 04-May-2009  yamt sync with head.
 1.23.4.1 16-May-2008  yamt sync with head.
 1.23.2.1 18-May-2008  yamt sync with head.
 1.24.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.27.16.2 20-May-2011  matt bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
 1.27.16.1 21-Apr-2010  matt sync to netbsd-5
 1.27.14.1 23-Apr-2010  snj Apply patch (requested by jym in ticket #1380):
Fix the NX regression issue observed on amd64 kernels, where per-page
execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
 1.27.12.5 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.27.12.4 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.27.12.3 24-Oct-2010  jym Sync with HEAD
 1.27.12.2 01-Nov-2009  jym Sync with HEAD.
 1.27.12.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.27.8.3 22-Apr-2010  snj Apply patch (requested by jym in ticket #1380):
Fix the NX regression issue observed on amd64 kernels, where per-page
execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
 1.27.8.2 05-Oct-2009  sborrill Pull up the following revisions(s) (requested by jmcneill in ticket #1061):
sys/arch/x86/conf/files.x86: revision 1.53
sys/arch/x86/include/cpuvar.h: revision 1.31
sys/arch/x86/x86/identcpu.c: revision 1.17
sys/arch/x86/x86/viac7temp.c: revision 1.1
sys/arch/i386/conf/ALL: revision 1.218
sys/arch/i386/conf/GENERIC: revision 1.949
Add support for VIA C7 temperature sensors (options VIA_C7TEMP) and enable
in i386 GENERIC kernel.
 1.27.8.1 05-Oct-2009  sborrill Pull up following revision(s) (requested by jmcneill in ticket #1059):
sys/arch/x86/include/cpuvar.h: 1.30
sys/arch/x86/x86/est.c: 1.12
sys/arch/x86/x86/intel_busclock.c: 1.8

Use the TSC and current multiplier to calculate bus clock on VIA C7 Esther.
Probably needed for all C7 and Nano processors, but to be safe only use this
alternate method on Esther for now.
 1.27.6.1 28-Apr-2009  skrll Sync with HEAD.
 1.31.4.3 05-Mar-2011  rmind sync with head
 1.31.4.2 31-May-2010  rmind - Split off Xen versions of pmap_map_ptes/pmap_unmap_ptes into Xen pmap,
also move pmap_apte_flush() with pmap_unmap_apdp() there.
- Make Xen buildable.
 1.31.4.1 30-May-2010  rmind sync with head
 1.31.2.3 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.31.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.31.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.38.4.1 05-Mar-2011  bouyer Sync with HEAD
 1.38.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.43.2.2 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.43.2.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.45.8.1 09-May-2012  riz Pull up following revision(s) (requested by rmind in ticket #202):
sys/arch/x86/include/cpuvar.h: revision 1.46
sys/arch/xen/include/xenpmap.h: revision 1.34
sys/arch/i386/include/param.h: revision 1.77
sys/arch/x86/x86/pmap_tlb.c: revision 1.5
sys/arch/x86/x86/pmap_tlb.c: revision 1.6
sys/arch/i386/i386/genassym.cf: revision 1.92
sys/arch/xen/x86/cpu.c: revision 1.91
sys/arch/x86/x86/pmap.c: revision 1.177
sys/arch/xen/x86/xen_pmap.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.31
sys/kern/subr_kcpuset.c: revision 1.5
sys/arch/amd64/include/param.h: revision 1.18
sys/sys/kcpuset.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.26
sys/arch/x86/x86/mtrr_i686.c: revision 1.27
sys/arch/xen/x86/x86_xpmap.c: revision 1.43
sys/arch/x86/x86/cpu.c: revision 1.98
sys/arch/amd64/amd64/mptramp.S: revision 1.14
sys/kern/sys_sched.c: revision 1.42
sys/arch/amd64/amd64/genassym.cf: revision 1.50
sys/arch/i386/i386/mptramp.S: revision 1.24
sys/arch/x86/include/pmap.h: revision 1.52
sys/arch/x86/include/cpu.h: revision 1.50
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
- pmap_tlb_shootdown: do not overwrite tp_cpumask with pm_cpus, but merge
like pm_kernel_cpus. Remove unnecessary intersection with kcpuset_running.
Do not reset tp_userpmap if pmap_kernel().
- Remove pmap_tlb_mailbox_t wrapping, which is pointless after recent changes.
- pmap_tlb_invalidate, pmap_tlb_intr: constify for packet structure.
i686_mtrr_init_first: handle the case when there are no variable-size MTRR
registers available (i686_mtrr_vcnt == 0).
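
To illustrate why moving from a hardcoded bitmask to a CPU set lifts the CPU limit, here is a minimal sketch of a multi-word bitset. It is not the kcpuset(9) API: the type and function names below are hypothetical, and the 256-CPU size is taken from the amd64 default mentioned above. A single 32-bit mask can only name CPUs 0 to 31, whereas an array of words indexes any CPU below the configured maximum.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAXCPUS	256	/* assumed limit, per the amd64 default */
#define SKETCH_WORDBITS	32

/* Hypothetical CPU set: one bit per CPU, spread over several words. */
typedef struct {
	uint32_t bits[SKETCH_MAXCPUS / SKETCH_WORDBITS];
} cpuset_sketch_t;

static void
cpuset_sketch_zero(cpuset_sketch_t *s)
{
	memset(s->bits, 0, sizeof(s->bits));
}

static void
cpuset_sketch_set(cpuset_sketch_t *s, unsigned cpu)
{
	s->bits[cpu / SKETCH_WORDBITS] |= UINT32_C(1) << (cpu % SKETCH_WORDBITS);
}

static bool
cpuset_sketch_isset(const cpuset_sketch_t *s, unsigned cpu)
{
	return (s->bits[cpu / SKETCH_WORDBITS] >>
	    (cpu % SKETCH_WORDBITS)) & 1;
}

/* Merge src into dst (bitwise OR) instead of overwriting dst. */
static void
cpuset_sketch_merge(cpuset_sketch_t *dst, const cpuset_sketch_t *src)
{
	for (size_t i = 0; i < SKETCH_MAXCPUS / SKETCH_WORDBITS; i++)
		dst->bits[i] |= src->bits[i];
}
```

The merge helper mirrors the idea in the pmap_tlb_shootdown item: targets are accumulated with OR rather than replaced.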
 1.45.6.1 29-Apr-2012  mrg sync to latest -current.
 1.45.2.1 23-May-2012  yamt sync with head.
 1.46.18.1 19-Mar-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #1118):
sys/arch/x86/include/cpuvar.h: revision 1.47
sys/arch/x86/x86/cpu.c: revision 1.117
sys/arch/x86/x86/identcpu.c: revision 1.49
sys/arch/x86/include/cpu.h: revision 1.67

Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.46.16.3 28-Aug-2017  skrll Sync with HEAD
 1.46.16.2 05-Feb-2017  skrll Sync with HEAD
 1.46.16.1 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.46.14.1 06-Mar-2016  martin Pull up following revision(s) (requested by msaitoh in ticket #1118):
sys/arch/x86/include/cpuvar.h: revision 1.47
sys/arch/x86/x86/cpu.c: revision 1.117
sys/arch/x86/x86/identcpu.c: revision 1.49
sys/arch/x86/include/cpu.h: revision 1.67
Retrieve cpuid7 (Structured Extended Features) into ci_feat_val.
 1.46.2.1 03-Dec-2017  jdolecek update from HEAD
 1.47.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.47.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.48.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.50.10.1 10-Jun-2019  christos Sync with HEAD
 1.51.10.1 18-Apr-2020  bouyer Add PVHVM multiprocessor support:
We need the hypervisor to be set up before the CPUs attach.
Move hypervisor setup to a new function xen_hvm_init(), called at the
beginning of mainbus_attach(). This function searches the cfdata[] array
to see if the hypervisor device is enabled (so you can disable PV
support with
disable hypervisor
from userconf).
For HVM, ci_cpuid doesn't match the virtual CPU index needed by Xen.
Introduce ci_vcpuid to cpu_info. Introduce xen_hvm_init_cpu(), to be
called for each CPU in its context, which initializes ci_vcpuid and
ci_vcpu, and sets up the event callback.
Change Xen code to use ci_vcpuid.

Do not call lapic_calibrate_timer() for VM_GUEST_XENPVHVM, we will use
Xen timers.

Don't call lapic_initclocks() from cpu_hatch(); instead set
x86_cpu_initclock_func to lapic_initclocks() in lapic_calibrate_timer(),
and call *(x86_cpu_initclock_func)() from cpu_hatch().
Also call x86_cpu_initclock_func from cpu_attach() for the boot CPU.
As x86_cpu_initclock_func is called for all CPUs, x86_initclock_func can
be a NOP for lapic timer.

Reorganize Xen code for x86_initclock_func/x86_cpu_initclock_func.
Move x86_cpu_idle_xen() to hypervisor_machdep.c
 1.53.26.1 02-Aug-2025  perseant Sync with HEAD
 1.4 11-Jan-2014  christos Add softint case (Richard Hansen)
 1.3 30-Apr-2011  christos branches: 1.3.2; 1.3.6; 1.3.8; 1.3.14; 1.3.18; 1.3.22;
add a define for pcb_sp
 1.2 10-Apr-2011  christos branches: 1.2.2;
something ate my /
 1.1 10-Apr-2011  christos Merge db_trace for x86. From: Vladimir Kirillov proger at wilab dot org dot ua
 1.2.2.3 31-May-2011  rmind sync with head
 1.2.2.2 21-Apr-2011  rmind sync with head
 1.2.2.1 10-Apr-2011  rmind file db_machdep.h was added on branch rmind-uvmplock on 2011-04-21 01:41:32 +0000
 1.3.22.1 18-May-2014  rmind sync with head
 1.3.18.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.3.14.1 07-Feb-2014  sborrill Pull up the following revisions(s) (requested by christos in ticket #1017):
sys/arch/x86/include/db_machdep.h: revision 1.4
sys/arch/i386/i386/db_machdep.c: revision 1.5

Fix ddb backtrace for softintr (i386).
 1.3.8.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.3.6.2 06-Jun-2011  jruoho Sync with HEAD.
 1.3.6.1 30-Apr-2011  jruoho file db_machdep.h was added on branch jruoho-x86intr on 2011-06-06 09:07:06 +0000
 1.3.2.2 02-May-2011  jym Sync with head.
 1.3.2.1 30-Apr-2011  jym file db_machdep.h was added on branch jym-xensuspend on 2011-05-02 22:49:57 +0000
 1.8 13-Jan-2019  maxv Error out if the higher 32 bits of DR6 and DR7 are set. MOV DR would
fault otherwise.
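
A check of this kind might look like the sketch below. It is an illustration only: the function name is hypothetical, and returning EINVAL is an assumption, since the commit message does not say which error is reported.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reject DR6/DR7 values with any of the upper 32 bits set, since
 * loading such a value with MOV DR would fault.
 * Assumption: EINVAL is used as the error code.
 */
static int
dbregs_validate_sketch(uint64_t dr6, uint64_t dr7)
{
	const uint64_t upper = UINT64_C(0xffffffff00000000);

	if ((dr6 & upper) != 0 || (dr7 & upper) != 0)
		return EINVAL;
	return 0;
}
```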
 1.7 27-Sep-2018  maxv Export x86_dbregs_{save/restore}, will be used outside. Reproduce some
internal dbregs logic in them.
 1.6 26-Jul-2018  maxv Rework dbregs, to switch the registers during context switches, and not on
each user->kernel transition via userret. Reloads of DR6/DR7 are expensive
on both native and xen.
 1.5 22-Jul-2018  maxv Clean up dbregs; remove useless comments, remove arguments from prototypes,
style, add KASSERT and move x86_dbregspl into dbregs.c. No real functional
change.
 1.4 23-Feb-2017  kamil branches: 1.4.12; 1.4.14; 1.4.16;
Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64

This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API. The previous one was introduced
only recently in NetBSD-current, so remove its traces without any
backward compatibility.

Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible for setting global watchpoints/breakpoints with
the debug registers; the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring
function is designed to be used to achieve this
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- a debugger is responsible for retrieving debug register state to distinguish
the exact debug register trap (DR6 is Status Register on x86)
- the kernel must not remove debug register traps after triggering a trap event;
a debugger is responsible for detaching this trap with an appropriate
PT_SETDBREGS call (DR7 is Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel
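
As a sketch of how a tracer might use the DR6 status register to tell which debug register fired: on x86, DR6 bits 0 to 3 (B0 to B3) flag the matching address register DR0 to DR3, and bit 14 (BS) flags a single-step trap. The macro and function names below are hypothetical and are not defined by NetBSD headers; the tracer is assumed to have already fetched DR6, for instance via PT_GETDBREGS.

```c
#include <stdint.h>

#define DR6_BS_SKETCH	(UINT64_C(1) << 14)	/* single-step flag */

/*
 * Return the index (0..3) of the lowest-numbered debug address
 * register whose condition bit is set in DR6, or -1 if none is set.
 */
static int
dr6_hit_index_sketch(uint64_t dr6)
{
	for (int i = 0; i < 4; i++) {
		if (dr6 & (UINT64_C(1) << i))
			return i;
	}
	return -1;
}

/* Return nonzero if DR6 reports a single-step trap. */
static int
dr6_single_step_sketch(uint64_t dr6)
{
	return (dr6 & DR6_BS_SKETCH) != 0;
}
```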

Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is
stored in a local copy (initdbregs), this value is used to initialize dbreg
context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
- restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently another interface,
PT_GET_SIGINFO/PT_SET_SIGINFO, is required in netbsd32 compat code in order to
reliably validate PT_GETDBREGS/PT_SETDBREGS.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

#ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
static unsigned long
amd64bsd_dr_get (ptid_t ptid, int regnum)
{


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
 1.3 18-Jan-2017  kamil Embed hardware trap and its type that fired (x86), information for tracers

Now x86 throws SIGTRAP on hardware exception with:
- si_code TRAP_HWWPT - dedicated for hw assisted watchpoint interface
- si_trap - unchanged (T_TRCTRAP)
- si_trap2 - watchpoint number that fired
- si_trap3 - watchpoint specific event description

x86 returns in si_trap3 one of the fields from <x86/dbregs.h>:
- X86_HW_WATCHPOINT_EVENT_FIRED - watchpoint fired
- X86_HW_WATCHPOINT_EVENT_FIRED_AND_SSTEP - watchpoint fired under PT_STEP

Other changes:
- restrict more code from <x86/dbregs.h> to _KERNEL

Sponsored by <The NetBSD Foundation>
 1.2 15-Dec-2016  kamil branches: 1.2.2; 1.2.4;
Add support for hardware assisted watchpoints/breakpoints API in ptrace(2)

Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints

The ptrace_watchpoint structure contains MI and MD parts:

typedef struct ptrace_watchpoint {
int pw_index; /* HW Watchpoint ID (count from 0) */
lwpid_t pw_lwpid; /* LWP described */
struct mdpw pw_md; /* MD fields */
} ptrace_watchpoint_t;

For example amd64 defines MD as follows:
struct mdpw {
void *md_address;
int md_condition;
int md_length;
};

These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.

Tested on amd64, initial support added for i386 and XEN.

Sponsored by <The NetBSD Foundation>
 1.1 27-Nov-2016  kamil branches: 1.1.2;
Add accessors for available x86 Debug Registers

There are 8 Debug Registers on i386 (available at least since 80386) and 16
on AMD64. Currently DR4 and DR5 are reserved on both cpu-families and
DR9-DR15 are still reserved on AMD64. Therefore add accessors for DR0-DR3,
DR6-DR7 for all ports.

Debug Registers x86:
* DR0-DR3 Debug Address Registers
* DR4-DR5 Reserved
* DR6 Debug Status Register
* DR7 Debug Control Register
* DR8-DR15 Reserved

Access to the registers is available only from the kernel (ring 0), as it
requires the most privileged access level. For this reason, special XEN
functions must be used to get and set the registers in XEN3 kernels.

XEN specific functions as defined in NetBSD:
- HYPERVISOR_get_debugreg()
- HYPERVISOR_set_debugreg()

This code extends the existing rdr6() and ldr6() accessors with additional ones:
- rdr0() & ldr0()
- rdr1() & ldr1()
- rdr2() & ldr2()
- rdr3() & ldr3()
- rdr7() & ldr7()

Traditionally, accessors for DR6 passed a vaddr_t argument. While that is the
appropriate type for DR0-DR3, DR6-DR7 should be using u_long; however, it's
not a big deal. The resulting functionality should be equivalent, so stick
to this convention and use the vaddr_t type for all DR accessors.

There was already a function defined for rdr6() in XEN, but it had a nit on
AMD64 as it was casting HYPERVISOR_get_debugreg() to u_int (32-bit on
AMD64), truncating the result. It still works for DR6, but for the sake of
simplicity always return the full 64-bit value.
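
The truncation can be shown with a small sketch. The function fake_get_debugreg() is a hypothetical stand-in that returns a fixed 64-bit value; it is not the real HYPERVISOR_get_debugreg() hypercall, and uint32_t is used here to model the 32-bit u_int on AMD64.

```c
#include <stdint.h>

/* Hypothetical stand-in returning a value with bit 32 set. */
static uint64_t
fake_get_debugreg(void)
{
	return UINT64_C(0x0000000100000ff0);
}

/* Old behaviour: the cast to a 32-bit type drops the upper half. */
static uint64_t
rdr_truncating_sketch(void)
{
	return (uint32_t)fake_get_debugreg();
}

/* New behaviour: the full 64-bit value is returned unchanged. */
static uint64_t
rdr_full_sketch(void)
{
	return fake_get_debugreg();
}
```

For DR6 the upper 32 bits are not used, which is why the old code still worked there; for an address register such as DR0 on AMD64 the lost bits would matter.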

The new accessors duplicate functionality of the dr0() function available on
i386 within the KSTACK_CHECK_DR0 option. dr0() is a specialized layer with
logic to set appropriate types of interrupts, whereas the new accessors are
designed to pass verbatim values from user-land (with simple sanity checks in
the kernel). At the moment there are no plans to make it possible for
KSTACK_CHECK_DR0 to coexist with debug registers for user applications
(debuggers).

options KSTACK_CHECK_DR0
Detect kernel stack overflow using DR0 register. This option uses DR0
register exclusively so you can't use DR0 register for other purpose
(e.g., hardware breakpoint) if you turn this on.

The KSTACK_CHECK_DR0 functionality was designed for i386 and never ported
to amd64.

Code tested on i386 and amd64 with kernels: GENERIC, XEN3_DOMU, XEN3_DOM0.

Sponsored by <The NetBSD Foundation>
 1.1.2.4 28-Aug-2017  skrll Sync with HEAD
 1.1.2.3 05-Feb-2017  skrll Sync with HEAD
 1.1.2.2 05-Dec-2016  skrll Sync with HEAD
 1.1.2.1 27-Nov-2016  skrll file dbregs.h was added on branch nick-nhusb on 2016-12-05 10:54:59 +0000
 1.2.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.2.2.3 20-Mar-2017  pgoyette Sync with HEAD
 1.2.2.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.2.2.1 15-Dec-2016  pgoyette file dbregs.h was added on branch pgoyette-localcount on 2017-01-07 08:56:28 +0000
 1.4.16.1 10-Jun-2019  christos Sync with HEAD
 1.4.14.3 18-Jan-2019  pgoyette Sync with HEAD
 1.4.14.2 30-Sep-2018  pgoyette Sync with HEAD
 1.4.14.1 28-Jul-2018  pgoyette Sync with HEAD
 1.4.12.2 03-Dec-2017  jdolecek update from HEAD
 1.4.12.1 23-Feb-2017  jdolecek file dbregs.h was added on branch tls-maxphys on 2017-12-03 11:36:50 +0000
 1.15 20-Aug-2022  riastradh machine/efi.h: Migrate common definitions to dev/efi/efi.h.
 1.14 20-Aug-2022  riastradh x86/efi.h: Assert size of struct efi_systbl.
 1.13 20-Aug-2022  riastradh arm/efi.h, x86/efi.h: Fix whitespace around RCS id.

No functional change intended.
 1.12 20-Aug-2022  riastradh x86/efi.h: Fix whitespace. No functional change intended.
 1.11 20-Aug-2022  riastradh machine/efi.h: Add more memory descriptor attributes.
 1.10 01-Apr-2022  skrll Trailing whitespace
 1.9 18-Oct-2019  manu Add UEFI boot services and I/O method prototypes
 1.8 22-Oct-2017  maya branches: 1.8.2; 1.8.6;
Move initialization code out of efi_probe into efi_init

and call it from cpu_configure
 1.7 11-Mar-2017  nonaka search SMBIOS from UEFI configuration table when boot with UEFI.
 1.6 23-Feb-2017  nonaka Avoid panic when amd64 kernel is booted from 32bit UEFI.
 1.5 14-Feb-2017  nonaka Handle persistent memory. Currently only debug output.
 1.4 14-Feb-2017  nonaka x86: make btinfo_memmap from btinfo_efimemmap to reduce mem_cluster_cnt.

should fix PR/51953.
 1.3 09-Feb-2017  nonaka efi_md::md_virt always uses uint64_t.
 1.2 24-Jan-2017  nonaka Initial commit of native amd64 EFI boot loader.
 1.1 28-Jan-2016  christos branches: 1.1.2; 1.1.4; 1.1.6;
Add support for grub to find the ACPI root table pointer via a bootinfo entry
from grub.
From: https://mail-index.netbsd.org/tech-kern/2014/05/22/msg017119.html
 1.1.6.1 21-Apr-2017  bouyer Sync with HEAD
 1.1.4.1 20-Mar-2017  pgoyette Sync with HEAD
 1.1.2.4 28-Aug-2017  skrll Sync with HEAD
 1.1.2.3 05-Feb-2017  skrll Sync with HEAD
 1.1.2.2 19-Mar-2016  skrll Sync with HEAD
 1.1.2.1 28-Jan-2016  skrll file efi.h was added on branch nick-nhusb on 2016-03-19 11:30:07 +0000
 1.8.6.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.8.2.2 03-Dec-2017  jdolecek update from HEAD
 1.8.2.1 22-Oct-2017  jdolecek file efi.h was added on branch tls-maxphys on 2017-12-03 11:36:50 +0000
 1.1 23-Feb-2011  jruoho branches: 1.1.2; 1.1.4; 1.1.6; 1.1.10;
Move ENHANCED_SPEEDSTEP, or henceforth est(4), to the cpufeaturebus.
 1.1.10.2 06-Jun-2011  jruoho Sync with HEAD.
 1.1.10.1 23-Feb-2011  jruoho file est.h was added on branch jruoho-x86intr on 2011-06-06 09:07:06 +0000
 1.1.6.2 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.1.6.1 23-Feb-2011  jym file est.h was added on branch jym-xensuspend on 2011-03-28 23:04:49 +0000
 1.1.4.2 05-Mar-2011  rmind sync with head
 1.1.4.1 23-Feb-2011  rmind file est.h was added on branch rmind-uvmplock on 2011-03-05 20:52:28 +0000
 1.1.2.2 05-Mar-2011  bouyer Sync with HEAD
 1.1.2.1 23-Feb-2011  bouyer file est.h was added on branch bouyer-quota2 on 2011-03-05 15:10:10 +0000
 1.9 04-Oct-2025  riastradh x86, m68k <machine/float.h>: `Significand', not `mantissa'.

The IEEE 754 standard uses `significand' and has since its first
edition in 1985; Kahan and Knuth both use `significand' and
explicitly reject `mantissa'; `significand' doesn't have a
conflicting definition in logarithms; and in actual usage in the
floating-point and numerical analysis literature, `significand'
dominates.

No functional change intended -- comment-only.
 1.8 15-Jun-2024  rillig {m68k,x86}/float.h: fix cross references
 1.7 31-Dec-2023  dholland {x86,m68k}/float.h: document LDBL_MIN behavior

It seems that even though both these platforms have 12-byte floats
that are pretty much the same representation and both allegedly
IEEE-compliant, they manifest the top bit of the mantissa and then
differ slightly in the behavior of the extra encodings this permits.

Thanks to riastradh@ for helping sort this out.
 1.6 27-Apr-2013  joerg Systematically include sys/featuretest.h when _NETBSD_SOURCE is used.
Some are redundant, but make verification with grep much easier.
 1.5 23-Oct-2003  kleink branches: 1.5.140; 1.5.150;
* Move the definitions for types other than single-precision and double-
precision back to machine-dependent headers. C99 has no strict
requirement which, if any, extended-precision type `long double' must
match, and even between 80-bit formats there are differences in
implementation (m68k vs. x86).
* On arm, consider __VFP_FP__.
 1.4 12-May-2003  kleink branches: 1.4.2;
Rename <sys/float_ieee.h> to <sys/float_ieee754.h>, following libc's
convention for these.
 1.3 21-Apr-2003  christos Override LDBL_MIN
 1.2 19-Apr-2003  christos PR/3012: Greg A. Woods: Write all float.h files [except the vax of course]
in terms of float_ieee.h
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.4.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.4.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.4.2.1 03-Aug-2004  skrll Sync with HEAD
 1.5.150.1 23-Jun-2013  tls resync from head
 1.5.140.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.23 24-Oct-2020  mgorny Issue 64-bit versions of *XSAVE* for 64-bit amd64 programs

When calling FXSAVE, XSAVE, FXRSTOR, ... for 64-bit programs on amd64
use the 64-suffixed variant in order to include the complete FIP/FDP
registers in the x87 area.

The difference between the two variants is that the FXSAVE64 (new)
variant represents FIP/FDP as 64-bit fields (union fp_addr.fa_64),
while the legacy FXSAVE variant uses split fields: 32-bit offset,
16-bit segment and 16-bit reserved field (union fp_addr.fa_32).
The latter implies that the actual addresses are truncated to 32 bits
which is insufficient in modern programs.

The change is applied only to 64-bit programs on amd64. Plain i386
and compat32 continue using plain FXSAVE. Similarly, NVMM is not
changed as I am not familiar with that code.

This is a potentially breaking change. However, I don't think it likely
to actually break anything because the data provided by the old variant
were not meaningful (because of the truncated pointer).
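
As a rough illustration of the two layouts described above, the following is a minimal sketch, not the kernel's actual definition. Only the member names fa_64 and fa_32 come from the commit message; the type name fp_addr_sketch, the inner field names, and the helper legacy_visible_address() are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for union fp_addr. Only fa_64 and fa_32 are
 * named in the log; the inner field names are assumptions.
 */
union fp_addr_sketch {
	uint64_t fa_64;			/* FXSAVE64: full 64-bit FIP/FDP */
	struct {
		uint32_t fa_off;	/* legacy FXSAVE: 32-bit offset */
		uint16_t fa_seg;	/* 16-bit segment */
		uint16_t fa_rsvd;	/* 16-bit reserved field */
	} fa_32;
};

/*
 * Hypothetical helper: the part of a 64-bit address that fits in the
 * 32-bit offset field of the legacy layout.
 */
static uint64_t
legacy_visible_address(uint64_t addr)
{
	return (uint64_t)(uint32_t)addr;
}
```

Both views occupy the same 8 bytes, so the legacy form has room for only the low 32 bits of an instruction or data pointer.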
 1.22 15-Oct-2020  mgorny Revert "Merge convert_xmm_s87.c into fpu.c"

I am going to add ATF tests for these two functions, and having them
in a separate file will make it more convenient to build and run them
in userspace.
 1.21 14-Jun-2020  riastradh Use static constant rather than stack memset buffer for zero fpregs.
 1.20 27-Nov-2019  maxv Add a small API for in-kernel FPU operations.

fpu_kern_enter();
/* do FPU stuff */
fpu_kern_leave();
 1.19 12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.18 04-Oct-2019  maxv Rename fpu_eagerswitch to fpu_switch, and add fpu_xstate_reload to
simplify.
 1.17 26-Jun-2019  mgorny Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
 1.16 19-May-2019  maxv Rename

fpu_save_area_clear -> fpu_clear
fpu_save_area_reset -> fpu_sigreset

Clearer, and reduces a future diff. No real functional change.
 1.15 19-May-2019  maxv Misc changes in the x86 FPU code. Reduces a future diff. No real functional
change.
 1.14 20-Jan-2019  maxv Improvements in NVMM

* Handle the FPU differently, limit the states via the given mask rather
than via XCR0. Align to 64 bytes. Provide an initial gXCR0, to be sure
that XCR0_X87 is set. Reset XSTATE_BV when the state is modified by
the virtualizer, to force a reload from memory.

* Hide RDTSCP.

* Zero-extend RBX/RCX/RDX when handling the NVMM CPUID signature.

* Take ECX and not RCX on MSR instructions.
 1.13 05-Oct-2018  maxv export x86_fpu_mxcsr_mask, fpu_area_save and fpu_area_restore
 1.12 22-Jun-2018  maxv branches: 1.12.2;
Revert jdolecek's changes related to FXSAVE. They just didn't make any
sense and were trying to hide a real bug, which is, that there is for some
reason a wrong stack alignment that causes FXSAVE to fault in
fpuinit_mxcsr_mask. As seen in current-users@ yesterday, rdi % 16 = 8. And
as seen several months ago, as well.

The rest of the changes in XSAVE are wrong too, but I'll let him fix these
ones.
 1.11 20-Jun-2018  jdolecek as a stop-gap, make fpuinit_mxcsr_mask() for native independent of
XSAVE as it should be, only xen case checks the flag now; need to
investigate further why exactly the fault happens for the xen
no-xsave case

pointed out by maxv
 1.10 19-Jun-2018  jdolecek fix FPU initialization on Xen to allow e.g. AVX when supported by hardware;
only use XSAVE when the CPUID OSXSAVE bit is set, as this seems to be
a reliable indication

tested with Xen 4.2.6 DOM0/DOMU on Intel CPU, without and with no-xsave flag,
so should work also on those AMD CPUs, which have XSAVE disabled by default;
also tested with Xen DOM0 4.8.3

fixes PR kern/50332 by Torbjorn Granlund; sorry it took three years to address

XXX pullup netbsd-8
 1.9 14-Jun-2018  maxv Add some code to support eager fpu switch, INTEL-SA-00145. We restore the
FPU state of the lwp right away during context switches. This guarantees
that when the CPU executes in userland, the FPU doesn't contain secrets.

Maybe we also need to clear the FPU in setregs(), not sure about this one.

Can be enabled/disabled via:

machdep.fpu_eager = {0/1}

Not yet turned on automatically on affected CPUs (Intel Family 6).

More generally it would be good to turn it on automatically when XSAVEOPT
is supported, because in this case there is probably a non-negligible
performance gain; but we need to fix PR/52966.
 1.8 23-May-2018  maxv Merge convert_xmm_s87.c into fpu.c. It contains only two functions, that
are used only in fpu.c.
 1.7 03-Nov-2017  maxv branches: 1.7.2;
Fix MXCSR_MASK, it needs to be detected dynamically, otherwise when masking
MXCSR we are losing some features (eg DAZ).
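
A minimal sketch of why a fixed mask can drop DAZ, assuming the fixed value was the architectural default 0xFFBF that applies when the FXSAVE image reports a zero MXCSR_MASK field. The macro and function names here are hypothetical and are not taken from fpu.c.

```c
#include <stdint.h>

#define SKETCH_MXCSR_DAZ	0x00000040u	/* bit 6: denormals-are-zero */
#define SKETCH_MXCSR_MASK_DFLT	0x0000ffbfu	/* default when the field reads 0 */

/* Hypothetical: choose the mask from the value found in an FXSAVE image. */
static uint32_t
sketch_mxcsr_mask(uint32_t fxsave_mask_field)
{
	return fxsave_mask_field != 0 ?
	    fxsave_mask_field : SKETCH_MXCSR_MASK_DFLT;
}

/* Hypothetical: clear the bits the CPU does not accept in MXCSR. */
static uint32_t
sketch_mxcsr_sanitize(uint32_t mxcsr, uint32_t mask)
{
	return mxcsr & mask;
}
```

With the default mask, a user value that has DAZ set comes back with bit 6 cleared; with a mask detected on a CPU that supports DAZ, the bit is preserved.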
 1.6 25-Feb-2014  dsl branches: 1.6.4; 1.6.6; 1.6.10; 1.6.28;
Add support for saving the AVX-256 ymm registers during FPU context switches.
Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all cpus, I'm not brave enough to enable XSAVEOPT.
 1.5 23-Feb-2014  dsl Add fpu_set_default_cw() and use it in the emulations to set the default
x87 control word.
This means that nothing outside fpu.c cares about the internals of the
fpu save area.
New kernel modules won't load with the old kernel - but that won't matter.
 1.4 15-Feb-2014  dsl Load and save the fpu registers (for copies to/from userspace) using
helper functions in arch/x86/x86/fpu.c
They (hopefully) ensure that we write to the entire buffer and don't load
values that might cause faults in kernel.
Also zero out the 'pad' field of the i386 mcontext fp area that I think
once contained the registers of any Weitek fpu.
Dunno why it wasn't part of the union.
Some of these copies could be removed if the code directly copied the save
area to/from userspace addresses.
 1.3 15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.2 12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.1 11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.6.28.1 23-Jun-2018  martin Pull up the following, via patch, requested by maxv in ticket #897:

sys/arch/amd64/amd64/locore.S 1.166 (patch)
sys/arch/i386/i386/locore.S 1.157 (patch)
sys/arch/x86/include/cpu.h 1.92 (patch)
sys/arch/x86/include/fpu.h 1.9 (patch)
sys/arch/x86/x86/fpu.c 1.33-1.39 (patch)
sys/arch/x86/x86/identcpu.c 1.72 (patch)
sys/arch/x86/x86/vm_machdep.c 1.34 (patch)
sys/arch/x86/x86/x86_machdep.c 1.116,1.117 (patch)

Support eager fpu switch, to work around INTEL-SA-00145.
Provide a sysctl machdep.fpu_eager, which gets automatically
initialized to 1 on affected CPUs.
 1.6.10.3 03-Dec-2017  jdolecek update from HEAD
 1.6.10.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.6.10.1 25-Feb-2014  tls file fpu.h was added on branch tls-maxphys on 2014-08-20 00:03:29 +0000
 1.6.6.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.6.6.1 25-Feb-2014  yamt file fpu.h was added on branch yamt-pagecache on 2014-05-22 11:40:13 +0000
 1.6.4.2 18-May-2014  rmind sync with head
 1.6.4.1 25-Feb-2014  rmind file fpu.h was added on branch rmind-smpnet on 2014-05-18 17:45:30 +0000
 1.7.2.3 26-Jan-2019  pgoyette Sync with HEAD
 1.7.2.2 20-Oct-2018  pgoyette Sync with head
 1.7.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.12.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.12.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.12.2.1 10-Jun-2019  christos Sync with HEAD
 1.1 30-Apr-2021  christos branches: 1.1.4;
merge the i386 and amd64 gdt.h files.
 1.1.4.2 13-May-2021  thorpej Sync with HEAD.
 1.1.4.1 30-Apr-2021  thorpej file gdt.h was added on branch thorpej-i2c-spi-conf on 2021-05-13 00:47:29 +0000
 1.7 17-Oct-2023  bouyer Support non-VGA framebuffers for Xen dom0. This is mandatory for graphic
console on EFI-only hardware.
Add a xen_genfb_getbtinfo() function which will return a btinfo_framebuffer
structure, filled in with parameters provided by Xen
when running as a Xen dom0, call xen_genfb_getbtinfo() instead of
lookup_bootinfo(BTINFO_FRAMEBUFFER) when adding properties to the
PCI graphic device (when genfb is attached) and in x86_genfb_init()
when genfb is used as console.
x86/x86/consinit.c: If running as a Xen dom0, use xen_genfb_getbtinfo()
to check if we have a genfb console
xen/x86/consinit.c: support genfb as possible console
xen/x86/consinit.c: use the hypervisor IO as console until a better one
is found. If the hypervisor is using a serial port for boot messages,
we'll get NetBSD's boot message on the serial port too until
the real console takes over.
xen/x86/autoconf.c: rework device_register() to be closer to the x86 version.
Especially make sure that device_pci_register() is called.
 1.6 16-Oct-2023  bouyer Declare
int acpi_md_vesa_modenum;
int acpi_md_vbios_reset;
struct vcons_screen x86_genfb_console_screen;

in genfb_machdep.h instead of locally as extern in various .c files.
 1.5 28-Jan-2021  jmcneill branches: 1.5.18;
Remove x86_genfb_mtrr_init. PATs have been available since the Pentium III
and this code has been #if notyet'd shortly after being introduced.
 1.4 30-Nov-2019  nonaka branches: 1.4.8;
Prevent panic when attaching genfb if using a serial console with Hyper-V Gen.2.
 1.3 09-Feb-2011  jmcneill branches: 1.3.48; 1.3.56; 1.3.60;
if genfb is attached, hook into db_trap_callback to switch in and out of
polling mode as necessary
 1.2 08-Feb-2011  ahoka Add missing prototype for x86_genfb_mtrr_init to fix build.

Hi Jared!
 1.1 17-Feb-2009  jmcneill branches: 1.1.2; 1.1.4; 1.1.6; 1.1.10; 1.1.12; 1.1.14;
PR# port-i386/37026: userconf(4) doesn't work with vesafb(4)

Add early console support for x86 genfb.
 1.1.14.2 17-Feb-2011  bouyer Sync with HEAD
 1.1.14.1 09-Feb-2011  bouyer Sync with HEAD
 1.1.12.1 06-Jun-2011  jruoho Sync with HEAD.
 1.1.10.1 05-Mar-2011  rmind sync with head
 1.1.6.4 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so current implementation force flush them on suspend. It's not
really needed.
 1.1.6.3 01-Nov-2009  jym Sync with HEAD.
 1.1.6.2 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.6.1 17-Feb-2009  jym file genfb_machdep.h was added on branch jym-xensuspend on 2009-05-13 17:18:44 +0000
 1.1.4.2 04-May-2009  yamt sync with head.
 1.1.4.1 17-Feb-2009  yamt file genfb_machdep.h was added on branch yamt-nfs-mp on 2009-05-04 08:12:09 +0000
 1.1.2.2 03-Mar-2009  skrll Sync with HEAD.
 1.1.2.1 17-Feb-2009  skrll file genfb_machdep.h was added on branch nick-hppapmap on 2009-03-03 18:29:36 +0000
 1.3.60.1 08-Dec-2019  martin Pull up following revision(s) (requested by nonaka in ticket #502):
sys/arch/x86/x86/hyperv.c: revision 1.5
sys/arch/x86/include/genfb_machdep.h: revision 1.4
sys/arch/x86/x86/genfb_machdep.c: revision 1.15
Prevent panic when attaching genfb if using a serial console with Hyper-V Gen.2.
 1.3.56.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.48.1 05-Dec-2019  bouyer Pull up following revision(s) (requested by nonaka in ticket #1466):
sys/arch/x86/x86/hyperv.c: revision 1.5
sys/arch/x86/include/genfb_machdep.h: revision 1.4
sys/arch/x86/x86/genfb_machdep.c: revision 1.15
Prevent panic when attaching genfb if using a serial console with Hyper-V Gen.2.
 1.4.8.1 03-Apr-2021  thorpej Sync with HEAD.
 1.5.18.2 18-Oct-2023  martin Pull up following revision(s) (requested by bouyer in ticket #428):

sys/arch/xen/xen/xen_machdep.c: revision 1.28
sys/arch/x86/pci/pci_machdep.c: revision 1.97
sys/arch/xen/xen/genfb_xen.c: revision 1.1
sys/arch/xen/xen/genfb_xen.c: revision 1.2
sys/arch/xen/include/hypervisor.h: revision 1.59
sys/arch/i386/conf/XEN3PAE_DOM0: revision 1.41 (patch)
sys/arch/x86/x86/genfb_machdep.c: revision 1.22
sys/arch/xen/x86/consinit.c: revision 1.18
sys/arch/xen/x86/autoconf.c: revision 1.26
sys/external/mit/xen-include-public/dist/xen/include/public/platform.h: revision 1.2
sys/arch/xen/conf/files.xen: revision 1.188
sys/arch/x86/x86/consinit.c: revision 1.37
sys/arch/xen/conf/files.xen: revision 1.189
sys/arch/x86/x86/consinit.c: revision 1.38
sys/external/mit/xen-include-public/dist/xen/include/public/xen.h: revision 1.2
sys/arch/x86/include/genfb_machdep.h: revision 1.7
sys/arch/xen/x86/pvh_consinit.c: revision 1.5
sys/arch/xen/x86/pvh_consinit.c: revision 1.6
sys/arch/amd64/conf/XEN3_DOM0: revision 1.201

Move the pvh_xencons to xen_machdep.c as early_xencons, so it can be
used in the future as early output for plain PV guests too.

Support non-VGA framebuffers for Xen dom0. This is mandatory for graphic
console on EFI-only hardware.

Add a xen_genfb_getbtinfo() function which will return a btinfo_framebuffer
structure, filled in with parameters provided by Xen

when running as a Xen dom0, call xen_genfb_getbtinfo() instead of
lookup_bootinfo(BTINFO_FRAMEBUFFER) when adding properties to the
PCI graphic device (when genfb is attached) and in x86_genfb_init()
when genfb is used as console.

x86/x86/consinit.c: If running as a Xen dom0, use xen_genfb_getbtinfo()
to check if we have a genfb console

xen/x86/consinit.c: support genfb as possible console

xen/x86/consinit.c: use the hypervisor IO as console until a better one
is found. If the hypervisor is using a serial port for boot messages,
we'll get NetBSD's boot message on the serial port too until
the real console takes over.

xen/x86/autoconf.c: rework device_register() to be closer to the x86 version.
Especially make sure that device_pci_register() is called.

Make sure to always fall back to xen_early_console, even for dom0

Enable genfb in DOM0 kernels

Add ext_lfb_base to dom0_vga_console_info, from recent Xen. We know if it's
present or not by checking dom0.info_size

Add XENPF_get_dom0_console, which gets a dom0_vga_console_info structure
from the hypervisor. To be used by PVH dom0 kernels.

XENPVH option is not used. Fix consinit.c to use XENPVHVM as intended
and XENPVH from defflag
for a dom0 PVH, the dom0_vga_console_info structure has to be retrieved
using a platform hypercall; do so in the XENPVHVM case.

Now genfb works in a PVH dom0 running on Xen 4.18 (Xen 4.15 doesn't support
this platform op, so no way to make it work here).
 1.5.18.1 18-Oct-2023  martin Pull up following revision(s) (requested by bouyer in ticket #425):

sys/arch/x86/pci/pci_machdep.c: revision 1.96
sys/arch/x86/acpi/acpi_machdep.c: revision 1.36
sys/arch/x86/x86/hyperv.c: revision 1.16
sys/arch/x86/x86/genfb_machdep.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.56
sys/arch/x86/include/genfb_machdep.h: revision 1.6

Declare
int acpi_md_vesa_modenum;
int acpi_md_vbios_reset;
struct vcons_screen x86_genfb_console_screen;

in genfb_machdep.h instead of locally as extern in various .c files.
 1.7 06-Oct-2022  msaitoh IOAPIC_ID_MASK is 8 bits these days. Fixes PR kern/54276.
 1.6 19-Jun-2019  msaitoh branches: 1.6.2;
Fix ioapic_dump_raw() to dump whole ioapic area.
 1.5 22-Apr-2017  nonaka branches: 1.5.4; 1.5.12;
Added I/O APIC EOI register definition.
 1.4 26-Jan-2013  dyoung branches: 1.4.14; 1.4.18;
Several registers and bitfields named IOAPIC_* actually belong to the
LAPIC, so rename them LAPIC_* and move to a more appropriate header
file.
 1.3 17-Aug-2011  dyoung branches: 1.3.2; 1.3.12;
Add definitions from [1] for the I/O APIC's MSI Message Address & Data
registers.

[1] Intel Corporation, Intel 64 and IA-32 Architectures Software
Developer's Manual, Volume 3A: System Programming Guide, Part 1,
http://www.intel.com/Assets/PDF/manual/253668.pdf, Chapter 10,
January, 2011.
 1.2 28-Apr-2008  martin branches: 1.2.14;
Remove clause 3 and 4 from TNF licenses
 1.1 26-Feb-2003  fvdl branches: 1.1.104; 1.1.106; 1.1.108;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.108.1 16-May-2008  yamt sync with head.
 1.1.106.1 18-May-2008  yamt sync with head.
 1.1.104.1 02-Jun-2008  mjf Sync with HEAD.
 1.2.14.1 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.3.12.2 03-Dec-2017  jdolecek update from HEAD
 1.3.12.1 25-Feb-2013  tls resync with head
 1.3.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.4.18.1 26-Apr-2017  pgoyette Sync with HEAD
 1.4.14.1 28-Aug-2017  skrll Sync with HEAD
 1.5.12.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.5.4.1 10-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1769):

sys/arch/x86/x86/ioapic.c: revision 1.66
sys/arch/x86/include/i82093reg.h: revision 1.7

Print detail about misconfigured APIC ID.

IOAPIC_ID_MASK is 8 bits these days. Fixes PR kern/54276.
 1.6.2.1 10-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1536):

sys/arch/x86/x86/ioapic.c: revision 1.66
sys/arch/x86/include/i82093reg.h: revision 1.7

Print detail about misconfigured APIC ID.

IOAPIC_ID_MASK is 8 bits these days. Fixes PR kern/54276.
 1.16 23-May-2017  nonaka whitespace
 1.15 23-May-2017  nonaka x86: No ioapic_softc.sc_apicid is used anymore. Use ioapic_softc.sc_pic.pic_apicid.
 1.14 27-Apr-2015  knakahara add x86 MD MSI/MSI-X support code.
 1.13 27-Apr-2015  knakahara add intr_handle_t and let pci_intr_handle_t use it.
 1.12 15-Jun-2012  yamt branches: 1.12.2; 1.12.16;
comment
 1.11 25-Mar-2009  dyoung branches: 1.11.12;
It is only by accident that these get the definitions they need from
<sys/device.h>, so explicitly #include <sys/device.h>.
 1.10 03-Jul-2008  drochner branches: 1.10.4; 1.10.10;
split device/softc for ioapic
 1.9 03-Jul-2008  drochner Remove "struct device" from "struct pic", where it was only real
for ioapics and faked up for others. Add it to "struct ioapic_softc"
for now, until device/softc get split.
This required all typecasts between "struct pic" and "struct ioapic_softc"
to be replaced, I hope I got them all.
functionally tested on i386, compile-tested on xen, untested on amd64
 1.8 07-May-2008  joerg branches: 1.8.2; 1.8.4;
Remove some prototypes that are not implemented. Make some functions
static that are only used in intr.c.
 1.7 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.6 18-Apr-2008  cegger branches: 1.6.2; 1.6.4;
g/c unused ioapic_bsp_id.
Per discussion with bouyer.
 1.5 16-Apr-2008  cegger - use aprint_*_dev and device_xname
- use POSIX integer types
 1.4 09-Dec-2007  jmcneill branches: 1.4.10;
Merge jmcneill-pm branch.
 1.3 29-May-2005  christos branches: 1.3.2; 1.3.56; 1.3.58; 1.3.68; 1.3.70;
Sprinkle const.
 1.2 27-Oct-2003  junyoung Nuke __P().
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.70.1 11-Dec-2007  yamt sync with head.
 1.3.68.1 26-Dec-2007  ad Sync with head.
 1.3.58.1 09-Jan-2008  matt sync with HEAD
 1.3.56.1 30-Sep-2007  joerg Add a second function ioapic_reenable that restores all vectors.
 1.3.2.1 21-Jan-2008  yamt sync with head
 1.4.10.2 28-Sep-2008  mjf Sync with HEAD.
 1.4.10.1 02-Jun-2008  mjf Sync with HEAD.
 1.6.4.2 04-May-2009  yamt sync with head.
 1.6.4.1 16-May-2008  yamt sync with head.
 1.6.2.1 18-May-2008  yamt sync with head.
 1.8.4.1 03-Jul-2008  simonb Sync with head.
 1.8.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.10.10.2 01-Nov-2009  jym Sync with HEAD.
 1.10.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.10.4.1 28-Apr-2009  skrll Sync with HEAD.
 1.11.12.1 30-Oct-2012  yamt sync with head
 1.12.16.2 28-Aug-2017  skrll Sync with HEAD
 1.12.16.1 06-Jun-2015  skrll Sync with HEAD
 1.12.2.1 03-Dec-2017  jdolecek update from HEAD
 1.19 14-Jun-2019  msaitoh No functional change:
- Rename macros:
- ICR, LVT and MSIDATA can share the bit definitions. Remove redundant
definitions and use the common macros.
- Consistently use LAPIC_LVT_ for all local vector table's macro names.
- Use __BITS().
- Add definition for TSC-deadline (LAPIC_LVT_TMM_TSCDLT).
 1.18 13-Jun-2019  msaitoh Indent consistently. No functional change.
 1.17 13-Jun-2019  msaitoh Modify LAPIC_LVT_CMCI's comment to be consistent with other LVT's.
No functional change.
 1.16 28-Apr-2017  nonaka branches: 1.16.10;
Added AMD extended APIC register space present definition.
 1.15 22-Apr-2017  nonaka branches: 1.15.2;
move LAPIC_MSR* to specialreg.h.
 1.14 22-Apr-2017  nonaka Add x2APIC register definitions.
 1.13 17-Jul-2015  msaitoh branches: 1.13.2;
Indent. No functional change.
 1.12 26-Jan-2013  dyoung branches: 1.12.14;
Several registers and bitfields named IOAPIC_* actually belong to the
LAPIC, so rename them LAPIC_* and move to a more appropriate header
file.
 1.11 20-Jan-2012  hannken branches: 1.11.6;
Revert revision 1.4 and change LAPIC_LEVEL_ASSERT / _MASK back to 0x4000.

According to "Intel 64 and IA-32 Architectures Software Developer's Manual"
Vol. 3, May 2011, Order Number: 325384-039US, Section 10.6.1:

LEVEL_ASSERT is bit #14, bit #13 is reserved.

With this change NetBSD now boots multiple processors under CentOS 6.2/kvm.
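
The bit arithmetic behind this revert can be sketched as follows. The SKETCH_* names are hypothetical and are not the header's macros; only the bit positions and the value 0x4000 come from the message above.

```c
#include <stdint.h>

/* Hypothetical helper macro: a 32-bit value with only bit n set. */
#define SKETCH_BIT(n)		(UINT32_C(1) << (n))

/* Bit 14 is LEVEL_ASSERT per the cited manual section, i.e. 0x4000. */
#define SKETCH_LEVEL_ASSERT	SKETCH_BIT(14)

/* Bit 13 is reserved; as a mask it would be 0x2000. */
#define SKETCH_RESERVED_13	SKETCH_BIT(13)
```

Writing the constant in terms of the bit number makes it easy to check against the manual's field layout.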
 1.10 15-Nov-2010  cegger branches: 1.10.8; 1.10.12;
add interrupt EAPIC register definitions
 1.9 09-Jan-2010  cegger branches: 1.9.4;
add LAPIC_MSR_ENABLE_x2 MSR. from murray@river-styx via port-amd64@
'...as documented in the Intel 64 and IA32 Architectures Software
Developers Manual 3A, chapter 10.5.1.'
 1.8 12-May-2008  ad branches: 1.8.8; 1.8.12;
Some defs to describe the IA32_APIC_BASE MSR.
 1.7 09-May-2008  cegger Buildfix: Remove duplicate #defines.
 1.6 09-May-2008  ad LAPIC_ID_MASK is 8 bits these days.
 1.5 28-Apr-2008  martin branches: 1.5.2;
Remove clause 3 and 4 from TNF licenses
 1.4 22-Jan-2008  joerg branches: 1.4.6; 1.4.8; 1.4.10;
Fix LAPIC_LEVEL_MASK and related defines.
 1.3 14-Nov-2007  joerg branches: 1.3.6;
Merge from jmcneill-pm:
Add some more defines from the spec. Remove some old ones not
existing in the current Intel Architecture Guide. Use some more
understandable names.

ANSIfy and use uintXX_t to hurt my eyes less.

Further improve readability by exploiting __HAVE_TIMECOUNTER as
invariance on x86 platforms.
 1.2 14-Nov-2007  ad +LAPIC_DLMODE_EXTINT
 1.1 26-Feb-2003  fvdl branches: 1.1.18; 1.1.60; 1.1.78; 1.1.80; 1.1.84; 1.1.86;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.86.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.86.1 19-Nov-2007  mjf Sync with HEAD.
 1.1.84.1 18-Nov-2007  bouyer Sync with HEAD
 1.1.80.2 23-Mar-2008  matt sync with HEAD
 1.1.80.1 09-Jan-2008  matt sync with HEAD
 1.1.78.2 14-Nov-2007  joerg Sync with HEAD.
 1.1.78.1 06-Sep-2007  joerg Add some more defines from the spec. Remove some old ones not
existing in the current Intel Architecture Guide. Use some more
understandable names.
 1.1.60.1 03-Dec-2007  ad Sync with HEAD.
 1.1.18.2 04-Feb-2008  yamt sync with head.
 1.1.18.1 15-Nov-2007  yamt sync with head.
 1.3.6.1 23-Jan-2008  bouyer Sync with HEAD.
 1.4.10.2 11-Mar-2010  yamt sync with head
 1.4.10.1 16-May-2008  yamt sync with head.
 1.4.8.1 18-May-2008  yamt sync with head.
 1.4.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.8.12.2 10-Jan-2011  jym Sync with HEAD
 1.8.12.1 24-Oct-2010  jym Sync with HEAD
 1.8.8.1 25-Jan-2012  riz Pull up following revision(s) (requested by hannken in ticket #1715):
- Be robust against an invalid timer period value.
sys/dev/ic/hpetreg.h Rev. 1.4
sys/dev/ic/hpet.c Rev. 1.8

- Fix wrong definition of LAPIC_LEVEL_ASSERT / _MASK
sys/arch/x86/include/i82489reg.h Rev. 1.11

- Add virtio driver - speed up disk and network access in virtual environments
sys/arch/i386/conf/GENERIC Rev. 1.1055
sys/arch/i386/conf/ALL Rev. 1.325
sys/arch/amd64/conf/GENERIC Rev. 1.338
sys/dev/pci/files.pci Rev. 1.350
sys/dev/pci/if_vioif.c Rev. 0-1.2
sys/dev/pci/ld_virtio.c Rev. 0-1.4
sys/dev/pci/viomb.c Rev. 0-1.1
sys/dev/pci/virtio.c Rev. 0-1.3
sys/dev/pci/virtioreg.h Rev. 0-1.1
sys/dev/pci/virtiovar.h Rev. 0-1.1
distrib/sets/lists/man/mi Rev. 1.1352 and 1.1358
share/man/man4/Makefile Rev. 1.573 and 1.575
share/man/man4/ld.4 Rev. 1.19
share/man/man4/virtio.4 Rev. 0-1.4
share/man/man4/vioif.4 Rev. 0-1.2
share/man/man4/viomb.4 Rev. 0-1.2

Allow NetBSD to run unmodified under Linux/kvm.
 1.9.4.1 05-Mar-2011  rmind sync with head
 1.10.12.1 18-Feb-2012  mrg merge to -current.
 1.10.8.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.10.8.1 17-Apr-2012  yamt sync with head
 1.11.6.2 03-Dec-2017  jdolecek update from HEAD
 1.11.6.1 25-Feb-2013  tls resync with head
 1.12.14.2 28-Aug-2017  skrll Sync with HEAD
 1.12.14.1 22-Sep-2015  skrll Sync with HEAD
 1.13.2.1 26-Apr-2017  pgoyette Sync with HEAD
 1.15.2.1 02-May-2017  pgoyette Sync with HEAD - tag prg-localcount2-base1
 1.16.10.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.21 21-May-2020  ad - Recalibrate the APIC timer using the TSC, once the TSC has in turn been
recalibrated using the HPET. This gets the clock interrupt firing more
closely to HZ.

- Undo change with recent Xen merge and go back to starting the clocks in
initclocks() on the boot CPU, and in cpu_hatch() on secondary CPUs.

- On reflection don't use HPET delay any more, it works very well but means
going over the bus. It's enough to use HPET to calibrate the TSC and
APIC.

Tested on amd64 native, xen and xen PVH.
 1.20 01-Dec-2019  maxv localify
 1.19 23-May-2017  nonaka branches: 1.19.10;
x86: Add preliminary x2APIC support.

x2APIC is used only when x2APIC is enabled in BIOS/UEFI.
LAPIC ID is not supported above 256.
 1.18 22-Apr-2017  nonaka use CR8 instead of LAPIC Task Priority register on x86-64.
 1.17 19-Apr-2017  nonaka remove prototypes of nonexistent function.
 1.16 25-Nov-2016  maxv branches: 1.16.2;
Move the virtual address of the LAPIC page out of the data segment on amd64
and i386. The old design was error-prone, and it didn't allow us to map the
data segment with large pages.

Now, the VA is allocated dynamically in the pmap bootstrap code, and entered
manually later. We go from using &local_apic to using *local_apic_va, and we
therefore need one more level of indirection in the asm code.

Discussed on tech-kern.
 1.15 16-Oct-2016  maxv Remove lapic_tpr on amd64 and i386, unused. Now, we have only one pointer
to the LAPIC page, and each register access is done with relative offsets.
 1.14 12-Jun-2011  rmind branches: 1.14.12; 1.14.30; 1.14.34;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.13 18-May-2011  drochner branches: 1.13.2;
remove stale declarations / empty function
 1.12 28-Apr-2008  martin branches: 1.12.14; 1.12.22; 1.12.28;
Remove clause 3 and 4 from TNF licenses
 1.11 14-Apr-2008  cegger branches: 1.11.2; 1.11.4;
- u_int32_t -> uint32_t
- ANSIfy
 1.10 09-Dec-2007  jmcneill branches: 1.10.10;
Merge jmcneill-pm branch.
 1.9 03-Dec-2007  joerg branches: 1.9.2;
Revert last commit which added externs that never get defined anywhere.
At least lapic_get_timecount conflicts with the newly added lapic TC.
 1.8 03-Dec-2007  ad branches: 1.8.2;
Interrupt handling changes, in discussion since February:

- Reduce available SPL levels for hardware devices to none, vm, sched, high.
- Acquire kernel_lock only for interrupts at IPL_VM.
- Implement threaded soft interrupts.
 1.7 17-Oct-2007  garbled branches: 1.7.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to an evbppc target.
 1.6 29-Aug-2007  ad Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.5 16-Feb-2006  perry branches: 1.5.24; 1.5.32; 1.5.38; 1.5.42; 1.5.44;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.4 24-Dec-2005  perry branches: 1.4.2; 1.4.4; 1.4.6;
__asm__ -> __asm
__const__ -> const
__inline__ -> inline
__volatile__ -> volatile
 1.3 27-Oct-2003  junyoung branches: 1.3.16;
Nuke __P().
 1.2 19-Jul-2003  lukem change multiple include protection #define to match filename
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.16.3 21-Jan-2008  yamt sync with head
 1.3.16.2 03-Sep-2007  yamt sync with head.
 1.3.16.1 21-Jun-2006  yamt sync with head.
 1.4.6.1 22-Apr-2006  simonb Sync with head.
 1.4.4.1 09-Sep-2006  rpaulo sync with head
 1.4.2.1 18-Feb-2006  yamt sync with head.
 1.5.44.2 09-Jan-2008  matt sync with HEAD
 1.5.44.1 06-Nov-2007  matt sync with HEAD
 1.5.42.3 09-Dec-2007  jmcneill Sync with HEAD.
 1.5.42.2 03-Sep-2007  jmcneill Sync with HEAD.
 1.5.42.1 03-Aug-2007  jmcneill Pull in power management changes from private branch.
 1.5.38.1 03-Sep-2007  skrll Sync with HEAD.
 1.5.32.1 03-Oct-2007  garbled Sync with HEAD
 1.5.24.2 03-Dec-2007  ad Sync with HEAD.
 1.5.24.1 29-Jul-2007  ad - When zeroing/copying pages, use SSE2 movnti to avoid polluting the cache.
- By default, align assembly routines on 32-byte starting boundaries.
- There are now 8 interrupt priority levels, half of which are softints.
Update intrdefs.h to match.
- Always clear/set spinlock words - removes lots of ifdefs.
- Remove the horrible ci_self150 hack that I introduced.
- Overhaul how TLB shootdown is performed. Inspired by a similar change in
OpenBSD but implemented quite differently. This should be a lot faster
but I have not benchmarked it yet.
 1.7.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.7.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.8.2.2 26-Dec-2007  ad Sync with head.
 1.8.2.1 08-Dec-2007  ad Sync with head.
 1.9.2.1 11-Dec-2007  yamt sync with head.
 1.10.10.1 02-Jun-2008  mjf Sync with HEAD.
 1.11.4.1 16-May-2008  yamt sync with head.
 1.11.2.1 18-May-2008  yamt sync with head.
 1.12.28.1 06-Jun-2011  jruoho Sync with HEAD.
 1.12.22.2 31-May-2011  rmind sync with head
 1.12.22.1 26-Apr-2010  rmind Apply renovated patch to significantly reduce TLB shootdowns in x86 pmap,
also provide TLBSTATS option to measure and track TLB shootdowns. Details:

http://mail-index.netbsd.org/port-i386/2009/01/11/msg001018.html

Patch from Andrew Doran, proposed on tech-x86 [sic], in January 2009.

XXX: amd64 and xen are not done yet; work in progress.
 1.12.14.1 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.13.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.14.34.3 26-Apr-2017  pgoyette Sync with HEAD
 1.14.34.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.14.34.1 04-Nov-2016  pgoyette Sync with HEAD
 1.14.30.2 28-Aug-2017  skrll Sync with HEAD
 1.14.30.1 05-Dec-2016  skrll Sync with HEAD
 1.14.12.1 03-Dec-2017  jdolecek update from HEAD
 1.16.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.19.10.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.4 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.3 04-May-2003  fvdl branches: 1.3.2;
Block level-triggered interrupts at the ioapic if they are deferred.
Avoids interrupt storms seen on some systems. Many thanks to
Stoned Elipot for testing.
 1.2 03-Mar-2003  fvdl use CVAROFF.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.3.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.3.2.1 03-Aug-2004  skrll Sync with HEAD
 1.13 02-Jan-2024  christos use sized types
 1.12 16-Sep-2023  christos protect against multiple inclusion
 1.11 15-Sep-2010  christos Commit SoC long double support from Stathis Kamperis
 1.10 02-Feb-2007  christos branches: 1.10.48; 1.10.62; 1.10.68; 1.10.70;
Merge the int bit with the high fraction bit. Add constants/macros
needed by gdtoa.
 1.9 15-Apr-2005  kleink branches: 1.9.2; 1.9.28; 1.9.32;
Push back the descriptions of NaN formats, and descriptions of the
distinction between signalling NaNs and quiet NaNs back into the
machine-dependent headers; treat the implementation of __nanf in the
same spirit.

IEEE 754 leaves the distinction between signalling NaNs and quiet NaNs
to the implementation, and unlike our headers used to suggest they're
not identical in the interpretation of the fraction's MSb; in due
course, make those of hppa, mips, sh3, and sh5 reflect reality.
 1.8 27-Oct-2003  kleink branches: 1.8.8; 1.8.14;
Err, rename some members added in previous to make them reflect their
semantics better.
 1.7 26-Oct-2003  kleink For convenient use in libc, add unions of the C floating types and their
corresponding structure definitions.
 1.6 26-Oct-2003  kleink Correct the position of the QUIETNAN bit.
 1.5 26-Oct-2003  kleink Use <sys/ieee754.h> where applicable.
 1.4 25-Oct-2003  kleink Reflect the explicit integer bit here as well.
 1.3 23-Oct-2003  kleink Make ieee_ext match reality, and add a note about its ABI-specific
tail padding.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.8.14.1 19-Apr-2005  tron Pull up revision 1.9 (requested by kleink in ticket #163):
Push back the descriptions of NaN formats, and descriptions of the
distinction between signalling NaNs and quiet NaNs back into the
machine-dependent headers; treat the implementation of __nanf in the
same spirit.
IEEE 754 leaves the distinction between signalling NaNs and quiet NaNs
to the implementation, and unlike our headers used to suggest they're
not identical in the interpretation of the fraction's MSb; in due
course, make those of hppa, mips, sh3, and sh5 reflect reality.
 1.8.8.1 29-Apr-2005  kent sync with -current
 1.9.32.1 07-May-2007  pavel Pull up following revision(s) (requested by manu in ticket #607):
lib/libc/arch/i386/gen/isnanl.c: revision 1.6
lib/libc/gdtoa/gdtoa.c: revision 1.2-1.3
lib/libc/arch/x86_64/gen/isnanl.c: revision 1.6
lib/libc/gdtoa/gdtoaimp.h: revision 1.6
sys/arch/m68k/include/ieee.h: revision 1.13
usr.bin/xlint/lint1/scan.l: revision 1.36-1.37
lib/libc/stdio/snprintf_ss.c: revision 1.4
lib/libc/arch/i386/gen/isfinitel.c: revision 1.2
lib/libc/stdio/vfscanf.c: revision 1.38
sys/arch/sparc/include/ieee.h: revision 1.11-1.12
lib/libc/gdtoa/dtoa.c: revision 1.4
lib/libc/stdio/Makefile.inc: revision 1.35
lib/libc/stdio/fvwrite.c: revision 1.17
lib/libc/arch/m68k/gen/fpclassifyl.c: revision 1.2
lib/libc/arch/i386/gen/isinfl.c: revision 1.6
lib/libc/arch/x86_64/gen/isinfl.c: revision 1.6
lib/libc/arch/x86_64/gen/isfinitel.c: revision 1.2
lib/libc/stdio/vfprintf.c: revision 1.55-1.57
lib/libc/stdio/vsnprintf_ss.c: revision 1.3
lib/libc/stdio/vfwprintf.c: revision 1.10
sys/arch/x86/include/ieee.h: revision 1.10
lib/libc/gdtoa/dmisc.c: revision 1.3
lib/libc/gdtoa/Makefile.inc: revision 1.5
sys/arch/hppa/include/ieee.h: revision 1.10
lib/libc/arch/x86_64/gen/fpclassifyl.c: revision 1.3
lib/libc/arch/i386/gen/fpclassifyl.c: revision 1.2
sys/sys/ieee754.h: revision 1.7
lib/libc/gdtoa/gdtoa.h: revision 1.7
include/stdio.h: revision 1.67-1.68
lib/libc/gdtoa/hdtoa.c: revision 1.1-1.4
lib/libc/gdtoa/ldtoa.c: revision 1.1-1.4
defined(_NETBSD_SOURCE) is equivalent to (!defined(_ANSI_SOURCE) &&
!defined(_POSIX_C_SOURCE) && !defined(_XOPEN_SOURCE)), so there's no
need to check both of them.
Fix for issue reported in PR lib/35401 as well as related overflow bugs.
deal with hex doubles.
Instead of abusing stdio to get a signal-safe version of sprintf, provide one.
remove __SAFE
add long double and hex double support from freebsd.
make this compile.
add new prototypes.
add the new files to the build. Note I am not bumping libc now, because
these are not used yet.
Merge the int bit with the high fraction bit. Add constants/macros
needed by gdtoa.
add constants used by gdtoa
since the int bit is merged, do the explicit math.
ext_int bit is no more.
ext_int bit is no more.
- merge change from freebsd
- add support for building as vfprintf.c
- XXX: we strdup to simplify the freeing logic. This should be fixed for
efficiency in the vfprintf case.
use vfwprintf.c
enable wide doubles.
some int -> size_t
deal with sparc64 that has 112 bits of mantissa.
make extended precision gdtoa friendly.
int/size_t changes
make this gdtoa friendly.
remove dup definition
use dtoa() instead of returning empty when we don't have extended precision
information.
Fix previous, add forgotten pointer dereference in the call to dtoa().
Add a cheesy workaround marked XXX for the situation where the
strtod() implementation available in the environment does not
handle hex floats.
Discussed with and suggested by christos
From Christos: gdtoa fixes for m68k. M68k ports should build now, but
printing extended precision is a little off.
vax does not have <machine/ieee.h> or long double
It would be nice if the compiler provided something like __IEEE_MATH__
bring in FreeBSD's vfscanf() to gain multi-byte/collation support.
Unfortunately it is too difficult to make vfwscanf and this share
the same code like I did with printf, because for string parsing
the code is too different.
 1.9.28.1 09-Feb-2007  ad Sync with HEAD.
 1.9.2.1 26-Feb-2007  yamt sync with head.
 1.10.70.1 05-Mar-2011  rmind sync with head
 1.10.68.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.10.62.1 24-Oct-2010  jym Sync with HEAD
 1.10.48.1 09-Oct-2010  yamt sync with head
 1.4 26-Mar-2011  christos add fp{g,s}etprec
 1.3 31-Jul-2010  joerg branches: 1.3.2;
Add support for fenv.h interface for i386 and amd64.

Submitted by Stathis Kamperis as part of GSoC 2010 and ported from
FreeBSD.
 1.2 05-Aug-2008  matt branches: 1.2.8; 1.2.14; 1.2.16;
Update <machine/ieeefp.h> to use the C99 FE_* definitions instead of the
NetBSD defined ones. Redefine the NetBSD ones in terms of the C99 ones.
Step 1 to having <fenv.h>
 1.1 26-Feb-2003  fvdl branches: 1.1.104; 1.1.108; 1.1.110; 1.1.114;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.114.1 19-Oct-2008  haad Sync with HEAD.
 1.1.110.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.1.108.2 11-Aug-2010  yamt sync with head.
 1.1.108.1 04-May-2009  yamt sync with head.
 1.1.104.1 28-Sep-2008  mjf Sync with HEAD.
 1.2.16.2 21-Apr-2011  rmind sync with head
 1.2.16.1 05-Mar-2011  rmind sync with head
 1.2.14.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.2.8.2 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.2.8.1 24-Oct-2010  jym Sync with HEAD
 1.3.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.66 07-Sep-2022  knakahara NetBSD/x86: Raise the number of interrupt sources per CPU from 32 to 56.

There has been no objection for three years.
https://mail-index.netbsd.org/port-amd64/2019/09/22/msg003012.html
Implemented by nonaka@n.o, updated by me.
 1.65 24-May-2022  bouyer Some devices (e.g. ixg in MSI-X mode) don't like to have their handlers called
when no interrupts are pending. So add an extra ih_pending field
to struct intrhand, which is incremented when the handler is not called because
of IPL level and reset to 0 when called. Check this in Xen's resume
assembly to call only handlers that are really pending.
 1.64 04-Apr-2022  andvar fix various typos, mainly in comments.
 1.63 12-Mar-2022  riastradh x86: Check for biglock leakage in interrupt handlers.
 1.62 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.61 22-Dec-2019  thorpej branches: 1.61.6;
Add intr_mask() and corresponding intr_unmask() calls that allow specific
interrupt lines / sources to be masked as needed (rather than making a
set of sources by IPL as with spl*()).
 1.60 14-Feb-2019  cherry Welcome XENPVHVM mode.

It is UP only, has xbd(4) and xennet(4) as PV drivers.

The console is com0 at isa and the native portion is very
rudimentary AT architecture, so is probably suboptimal to
run without PV support.
 1.59 13-Feb-2019  cherry Missed the crucial header file in previous commit.

struct intrstub; is now uniform across native and XEN

This should fix the XEN builds.
 1.58 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.57 13-Dec-2018  cherry Allow x86 builds to have the opportunity to not have pre-emption
enabled by default. This can be effected by having a:

"options NO_PREEMPTION"

line in the kernel configuration file.

While it was tempting to tie __HAVE_PREEMPTION to MULTIPROCESSOR,
as is currently assumed in sys/kern/kern_stub.c ,

having MULTIPROCESSOR without __HAVE_PREEMPTION
and not having either are valid configuration options which users
could have choice of. We thus err on the side of configurability.
 1.56 24-Jun-2018  jdolecek branches: 1.56.2;
add support for kern.intr.list aka intrctl(8) 'list' for xen

event_set_handler() and pirq_establish() now have extra intrname
parameter; shared intr_create_intrid() is used to provide the value

xen drivers were changed to pass the specific driver instance
name as the xname, e.g. 'vcpu0 clock' instead of just 'clock', or
'xencons0' instead of 'xencons'

associated evcnt is now changed to use intrname - this matches native x86
 1.55 04-Apr-2018  christos Rename Xpreempt{recurse,resume} -> X{recurse,resume}_preempt so that
they fit the pattern. Also the debugger trap sniffer matches them
without adding special entries...
XXX: pullup-8.
 1.54 17-Feb-2018  maxv branches: 1.54.2;
Rename i8259_stubs -> legacy_stubs. We will want the entries to have the
same name, eg:

legacy_stubs
-> Xintr_legacy0, Xrecurse_legacy0, Xresume_legacy0
-> Xintr_legacy1, Xrecurse_legacy1, Xresume_legacy1
...
 1.53 04-Jan-2018  knakahara fix "intrctl list" panic when ACPI is disabled.

reviewed by cherry@n.o and tested by msaitoh@n.o, thanks.
 1.52 04-Nov-2017  cherry Retire xen/x86/intr.c and use the new xen specific glue in x86/x86/intr.c

The purpose of this change is to expose the x86/include/intr.h API
to drivers. Specifically the following functions:

void *intr_establish_xname(...);
void *intr_establish(...);
void intr_disestablish(...);

while maintaining the old API from xen/include/evtchn.h, specifically
the following functions:

int event_set_handler(...);
int event_remove_handler(...);

This is so that if things break, we can keep using the old API until
everything stabilises. This is a stepping stone towards getting the
actual XEN event callback path rework code in place - which can be
done opaquely behind the intr.h API - NetBSD/XEN specific drivers that
have been ported to the intr.h API should then work without
significant further modifications.
 1.51 16-Jul-2017  cherry branches: 1.51.2;
Unify the xen and native x86/ interrupt setup functions and
spl traversal data structures.

This is towards PVHVM.
 1.50 23-May-2017  nonaka branches: 1.50.2;
x86: Add preliminary x2APIC support.

x2APIC is used only when x2APIC is enabled in BIOS/UEFI.
LAPIC ID is not supported above 256.
 1.49 07-Jul-2016  msaitoh KNF. Remove extra spaces. No functional change.
 1.48 17-Aug-2015  knakahara Add kernel code to support intrctl(8).
 1.47 27-Apr-2015  knakahara add intr_handle_t and let pci_intr_handle_t use it.
 1.46 27-Apr-2015  knakahara add pci_intr_distribute(9) for x86.
 1.45 20-Jul-2014  uebayasi branches: 1.45.4;
ipifunc[]: Comment IPI constant names for grep'ability. Constify.
 1.44 29-Mar-2014  christos branches: 1.44.2;
make pci_intr_string and eisa_intr_string take a buffer and a length
instead of relying on local static storage.
 1.43 01-Aug-2011  drochner branches: 1.43.2; 1.43.12; 1.43.16;
if checking whether an interrupt is shared, don't compare pin numbers
if it is "-1" -- this is a hack to allow MSIs which don't have a concept
of pin numbers, and are generally not shared
(This doesn't give us sensible event names for statistics display. The
whole abstraction has more exceptions than regular cases, it should
be redesigned imho.)
 1.42 03-Apr-2011  dyoung Clean up excessive #ifdef'age of NMI trap handling for amd64/i386/xen.
Handle NMI in all Xen kernels.
 1.41 02-May-2010  plunky branches: 1.41.2;
The spl inline functions refer to external symbols that are only
defined in the kernel.

Wrap kernel-specific declarations in #ifdef _KERNEL to avoid unresolved
references when including from userland.
 1.40 25-Apr-2010  ad Nothing uses x86_multicast_ipi() right now and it complicates
support for many CPUs, so remove it.
 1.39 19-Apr-2009  ad branches: 1.39.2; 1.39.4;
cpuctl:

- Add interrupt shielding (direct hardware interrupts away from the
specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
a bug where delivery is not accepted by an LAPIC after redistribution. It
also needs re-balancing to make things fair after interrupts are turned
back on for a CPU.
 1.38 27-Mar-2009  dyoung If defined(_KERNEL), #include <sys/types.h>, otherwise #include
<stdbool.h>, for the bool definition that we need. intr.h only got the
definition by chance, before.
 1.37 25-Mar-2009  dyoung It is only by accident that this gets the definitions it needs from
<sys/evcnt.h>, so explicitly #include <sys/evcnt.h>.
 1.36 24-Feb-2009  yamt - rewrite x86 nmi dispatcher so that establish and disestablish are safe
on a running system.
- adapt existing users of the api. (elan)
- adapt tprof_pmi driver to use the api.
 1.35 30-May-2008  ad branches: 1.35.6; 1.35.12;
Add a 'known_mpsafe' argument to intr_establish().
 1.34 07-May-2008  joerg branches: 1.34.2;
Remove some prototypes that are not implemented. Make some functions
static that are only used in intr.c.
 1.33 28-Apr-2008  ad Add support for kernel preemption to the i386 and amd64 ports. Notes:

- I have seen one isolated panic in the x86 pmap, but otherwise i386
seems stable with preemption enabled.

- amd64 is missing the FPU handling changes and it's not yet safe to
enable it there.

- The usual level for kern.sched.kpreempt_pri will be 128 once enabled
by default. For testing, setting it to 0 helps to shake out bugs.
 1.32 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.31 21-Jan-2008  dyoung branches: 1.31.6; 1.31.8; 1.31.10;
Add primitive routines to establish NMI handlers on i386.

TBD: synchronize (dis)establishment of handlers.
 1.30 26-Dec-2007  yamt - share idt entry allocation code among x86.
- introduce a function to reserve an idt entry and use it instead of
manipulating idt_allocmap directly.
- rename idt to xen_idt for amd64 xen. add missing #ifdef XEN.
 1.29 03-Dec-2007  ad branches: 1.29.2; 1.29.6; 1.29.8;
Interrupt handling changes, in discussion since February:

- Reduce available SPL levels for hardware devices to none, vm, sched, high.
- Acquire kernel_lock only for interrupts at IPL_VM.
- Implement threaded soft interrupts.
 1.28 17-Oct-2007  garbled branches: 1.28.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to an evbppc target.
 1.27 09-Jul-2007  ad branches: 1.27.8; 1.27.10;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.26 17-May-2007  yamt merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.25 16-Feb-2007  ad branches: 1.25.2; 1.25.6; 1.25.8; 1.25.14;
Remove spllowersoftclock() and CLKF_BASEPRI(), and always dispatch callouts
via a soft interrupt. In the near future, softclock will be run from process
context.
 1.24 09-Feb-2007  ad Merge newlock2 to head.
 1.23 26-Dec-2006  ad Define ipl_t as uint8_t so that it can be packed into a word with a lock
byte. Ok yamt@.
 1.22 21-Dec-2006  yamt merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.
 1.21 04-Jul-2006  christos branches: 1.21.4; 1.21.6;
Apply fvdl's acpi pci interrupt configuration code.
- MPACPI is no more.
- MPACPI_SCANPCI -> ACPI_SCANPCI
 1.20 16-Feb-2006  perry branches: 1.20.2; 1.20.10;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.19 24-Dec-2005  perry branches: 1.19.2; 1.19.4; 1.19.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.18 03-Nov-2005  yamt - use sys/spl.h.
- add some IPL_ definitions.
 1.17 29-Oct-2005  yamt add splraiseipl().
 1.16 28-Oct-2005  yamt remove duplicated spllpt().
 1.15 31-Oct-2004  yamt branches: 1.15.12; 1.15.14;
use __insn_barrier rather than homegrown equivalents.
 1.14 23-Oct-2004  yamt to determine if an interrupt needs to grab the kernel lock or not,
check interrupt's own ipl rather than cpu's current ipl.
 1.13 28-Jun-2004  fvdl Updating ci_ilevel and testing ci_ipending must be done with all interrupts
off, or priority inversion can occur, which can lead to IPI deadlocks.
Leaves interrupts off for a bit longer, sadly, but with no noticeable
effects on the systems I tested on.

From YAMAMOTO Takashi.
 1.12 04-Mar-2004  dbj fix comment about spllowersoftclock
 1.11 14-Jan-2004  yamt spllower: lower spl before checking pending interrupts.
otherwise, interrupts happened immediately after the check might be left
pending for a while. (until the next tick in the worst case.)
 1.10 30-Oct-2003  fvdl * keep track of PCI buses that aren't known by firmware, but are found
by NetBSD
* use this info in intr_find_mpmapping
* get rid of the last argument to intr_find_mpmapping, it was redundant
 1.9 27-Oct-2003  junyoung Nuke __P().
 1.8 16-Oct-2003  fvdl Add hooks and structures to allow the MP table intr mapping code a
better shot at finding a mapping. For PCI interrupts, if a bus
has no mappings, try its parent, with the swizzled pin, and the
bridge's device number.
 1.7 06-Sep-2003  fvdl Move the bulk of pci_intr_string into a separate intr_string function. Use
that new function to print the pciide compat interrupt in pciide_machdep.c.
Share pciide_machdep.c between amd64 and i386.
 1.6 20-Aug-2003  fvdl Pass pointers to frames from assembly, do not use the 'frame on stack
as argument passed by value' trick, as gcc 3.3.x makes (valid) assumptions
about the stack that will not be true. Costs 2 instructions per trap/syscall
on i386, 4 per interrupt for MP. One instruction per trap/syscall on amd64,
2 per interrupt for MP. I expect gcc 3.3.1 to make up for this by better
optimization (it'd better..)

While here, make amd64 compile again by using subr_mbr_disk.c
 1.5 23-Jun-2003  martin branches: 1.5.2;
#ifdef _KERNEL_OPT police
 1.4 23-Jun-2003  martin Make sure to include opt_foo.h if a defflag option FOO is used.
 1.3 16-Jun-2003  thorpej Rename IPL_IMP -> IPL_VM.
 1.2 04-May-2003  fvdl Block level-triggered interrupts at the ioapic if they are deferred.
Avoids interrupt storms seen on some systems. Many thanks to
Stoned Elipot for testing.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.5.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.5.2.4 02-Nov-2004  skrll Sync with HEAD.
 1.5.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.5.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.5.2.1 03-Aug-2004  skrll Sync with HEAD
 1.15.14.1 02-Nov-2005  yamt sync with head.
 1.15.12.6 21-Jan-2008  yamt sync with head
 1.15.12.5 07-Dec-2007  yamt sync with head
 1.15.12.4 03-Sep-2007  yamt sync with head.
 1.15.12.3 26-Feb-2007  yamt sync with head.
 1.15.12.2 30-Dec-2006  yamt sync with head.
 1.15.12.1 21-Jun-2006  yamt sync with head.
 1.19.6.1 22-Apr-2006  simonb Sync with head.
 1.19.4.1 09-Sep-2006  rpaulo sync with head
 1.19.2.1 18-Feb-2006  yamt sync with head.
 1.20.10.1 13-Jul-2006  gdamore Merge from HEAD.
 1.20.2.1 11-Aug-2006  yamt sync with head
 1.21.6.3 21-Sep-2006  yamt rename splraiseipl argument to match with the rest of ports.
 1.21.6.2 18-Sep-2006  yamt correct a header.
 1.21.6.1 18-Sep-2006  yamt implement new api for i386 and amd64.
 1.21.4.2 27-Jan-2007  ad If running on a PPro or later, at boot patch in versions of spllower() and
similar that use cmpxchg8b instead of cli/sti. Cuts the clock cycles for
splx() by a factor of ~6 on the P4, and ~3 on the PIII when bracketed by
serializing instructions (and hopefully more when not).
 1.21.4.1 12-Jan-2007  ad Sync with head.
 1.25.14.2 03-Oct-2007  garbled Sync with HEAD
 1.25.14.1 22-May-2007  matt Update to HEAD.
 1.25.8.1 11-Jul-2007  mjf Sync with head.
 1.25.6.4 03-Dec-2007  ad Sync with HEAD.
 1.25.6.3 03-Dec-2007  ad Sync with HEAD.
 1.25.6.2 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.25.6.1 27-May-2007  ad Sync with head.
 1.25.2.1 23-Mar-2007  ad - Decouple intr.h from cpu.h.
- Define splraise in spl.S. As a side effect it becomes "preemption safe".
- Make softintr_schedule a function in softintr.c.
- Make softintr a function in spl.S, and remove the unneeded lock prefix.
 1.27.10.3 23-Mar-2008  matt sync with HEAD
 1.27.10.2 09-Jan-2008  matt sync with HEAD
 1.27.10.1 06-Nov-2007  matt sync with HEAD
 1.27.8.1 09-Dec-2007  jmcneill Sync with HEAD.
 1.28.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.28.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.29.8.1 16-Dec-2007  cube Split off device-specific stuff out of subr_autconf.c, and split off
autoconf-specific stuff out of device.h.

The only functional change is the removal of the unused evcnt.h include in
device.h which (*sigh*) has side-effects in x86's intr.h, and probably some
other in the rest of the tree but I'm only compiling i386's QEMU for the
time being.
 1.29.6.2 23-Jan-2008  bouyer Sync with HEAD.
 1.29.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.29.2.1 26-Dec-2007  ad Sync with head.
 1.31.10.3 11-Aug-2010  yamt sync with head.
 1.31.10.2 04-May-2009  yamt sync with head.
 1.31.10.1 16-May-2008  yamt sync with head.
 1.31.8.2 04-Jun-2008  yamt sync with head
 1.31.8.1 18-May-2008  yamt sync with head.
 1.31.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.34.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.35.12.5 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.35.12.4 02-May-2011  jym Sync with head.
 1.35.12.3 24-Oct-2010  jym Sync with HEAD
 1.35.12.2 01-Nov-2009  jym Sync with HEAD.
 1.35.12.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.35.6.2 28-Apr-2009  skrll Sync with HEAD.
 1.35.6.1 03-Mar-2009  skrll Sync with HEAD.
 1.39.4.2 21-Apr-2011  rmind sync with head
 1.39.4.1 30-May-2010  rmind sync with head
 1.39.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.39.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.41.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.43.16.1 18-May-2014  rmind sync with head
 1.43.12.2 03-Dec-2017  jdolecek update from HEAD
 1.43.12.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.43.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.44.2.1 10-Aug-2014  tls Rebase.
 1.45.4.3 28-Aug-2017  skrll Sync with HEAD
 1.45.4.2 22-Sep-2015  skrll Sync with HEAD
 1.45.4.1 06-Jun-2015  skrll Sync with HEAD
 1.50.2.2 05-Apr-2018  martin Pull up following revision(s) (requested by christos in ticket #696):

sys/arch/amd64/amd64/vector.S: revision 1.62 (patch)
sys/arch/x86/include/intr.h: revision 1.55
sys/arch/i386/i386/vector.S: revision 1.77
sys/arch/i386/i386/db_interface.c: revision 1.82 (patch)
sys/arch/amd64/amd64/spl.S: revision 1.34 (patch)
sys/arch/amd64/amd64/db_interface.c: revision 1.33 (patch)
sys/arch/x86/x86/intr.c: revision 1.125
sys/arch/i386/i386/spl.S: revision 1.43 (patch)
sys/arch/i386/i386/machdep.c: revision 1.805 (patch)
sys/arch/x86/x86/lapic.c: revision 1.66 (patch)

Rename the DDB IPI IDT vectors for consistency. ok maxv@

Rename Xpreempt{recurse,resume} -> X{recurse,resume}_preempt so that
they fit the pattern. Also the debugger trap sniffer matches them
without adding special entries...

XXX: pullup-8.
 1.50.2.1 13-Jan-2018  snj Pull up following revision(s) (requested by knakahara in ticket #493):
sys/arch/x86/include/intr.h: revision 1.53
sys/arch/x86/pci/pci_intr_machdep.c: revision 1.42
sys/arch/x86/x86/intr.c: revision 1.114 via patch
fix "intrctl list" panic when ACPI is disabled.
reviewed by cherry@n.o and tested by msaitoh@n.o, thanks.
 1.51.2.2 16-Jul-2017  cherry 2302677
 1.51.2.1 16-Jul-2017  cherry file intr.h was added on branch perseant-stdc-iso10646 on 2017-07-16 14:02:49 +0000
 1.54.2.3 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.54.2.2 25-Jun-2018  pgoyette Sync with HEAD
 1.54.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.56.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.56.2.1 10-Jun-2019  christos Sync with HEAD
 1.61.6.5 19-Apr-2020  bouyer Add per-PIC callbacks for interrupt_get_devname(), interrupt_get_assigned()
and interrupt_get_count(). Implement Xen-specific callbacks for
PIC_XEN and use the x86 one for others.
In event_set_handler(), call intr_allocate_io_intrsource() so that
events appear in the interrupt list (intrctl list).
 1.61.6.4 19-Apr-2020  bouyer Add a struct pic * member to struct intrhand.
This will be used for interrupt_get_count()
For Xen replace pic_type with a pointer to the pic, and add a pointer
to intrhand, in struct pintrhand
Make event_set_handler return the pointer to struct intrhand.
Don't allocate a fake intrhand in xen_intr_establish_xname(), use the
one returned by event_set_handler().
 1.61.6.3 16-Apr-2020  bouyer Reorganise sources to make it possible to include Xen PVHVM support in
native kernels. Among others:
- move xen/include/amd64/hypercall.h to amd64/include/xen and
xen/include/i386/hypercall.h to i386/include/xen
- exclude some native files from the build for xenpv
- add xen to "machine" config statement for amd64 and i386
- split arch/xen/conf/files.xen to arch/xen/conf/files.xen (for pv drivers)
and arch/xen/conf/files.xen.pv (for full pv support)
- add GENERIC_XENHVM kernel config which includes GENERIC and add Xen PV
drivers.
 1.61.6.2 11-Apr-2020  bouyer Move softint and preemption-related functions out of x86/x86/intr.c to
their own file, x86/x86/x86_softintr.c
Add x86/x86/x86_softintr.c for native and XenPV
Make sure XenPV also checks ci_ipending, which is used for softints.
Switch XenPV to fast softints and allow kernel preemption.
kpreempt_disable() before calling pmap_changeprot_local()
run xen_wallclock_time() and xen_global_systime_ns() at splsched() to
avoid being interrupted.

XXX amd64 lock stubs are racy for XPENDING
 1.61.6.1 10-Apr-2020  bouyer spllower(): Also check Xen pending events
hypervisor_pvhvm_callback(): exit via Xdoreti, so that pending interrupts
are checked.
disable __HAVE_FAST_SOFTINTS only for XENPV, it now works for PVHVM.
We still have to disable PREEMPTION, until we support MULTIPROCESSOR
 1.2 17-Aug-2015  knakahara branches: 1.2.16;
Add kernel code to support intrctl(8).
 1.1 27-Apr-2015  knakahara branches: 1.1.2;
add pci_intr_distribute(9) for x86.
 1.1.2.3 22-Sep-2015  skrll Sync with HEAD
 1.1.2.2 06-Jun-2015  skrll Sync with HEAD
 1.1.2.1 27-Apr-2015  skrll file intr_distribute.h was added on branch nick-nhusb on 2015-06-06 14:40:04 +0000
 1.2.16.2 03-Dec-2017  jdolecek update from HEAD
 1.2.16.1 17-Aug-2015  jdolecek file intr_distribute.h was added on branch tls-maxphys on 2017-12-03 11:36:50 +0000
 1.1 25-Jan-2023  riastradh branches: 1.1.2;
x86/intr: Work around sleazy clockintr with a secret frame argument.

PR kern/57197
 1.1.2.2 01-Apr-2023  martin Pull up following revision(s) (requested by riastradh in ticket #136):

sys/arch/x86/x86/intr.c: revision 1.164
sys/arch/x86/isa/clock.c: revision 1.41
sys/arch/x86/include/intr_private.h: revision 1.1

x86/intr: Work around sleazy clockintr with a secret frame argument.
PR kern/57197
 1.1.2.1 25-Jan-2023  martin file intr_private.h was added on branch netbsd-10 on 2023-04-01 15:11:00 +0000
 1.26 07-Sep-2022  knakahara NetBSD/x86: Raise the number of interrupt sources per CPU from 32 to 56.

There has been no objection for three years.
https://mail-index.netbsd.org/port-amd64/2019/09/22/msg003012.html
Implemented by nonaka@n.o, updated by me.
 1.25 18-Mar-2021  nonaka LIR_HV priority should be lower than softint.
 1.24 25-Apr-2020  bouyer branches: 1.24.2;
Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.23 23-Nov-2019  ad branches: 1.23.6;
cpu_need_resched():

- Remove all code that should be MI, leaving the bare minimum under arch/.
- Make the required actions very explicit.
- Pass in LWP pointer for convenience.
- When a trap is required on another CPU, have the IPI set it locally.
- Expunge cpu_did_resched().
 1.22 15-Feb-2019  nonaka Added Microsoft Hyper-V support. It was ported from OpenBSD and FreeBSD.

The graphical console does not work on Gen.2 VMs yet. To use the serial console,
enter "consdev com,0x3f8,115200" on efiboot.
 1.21 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.20 19-May-2014  rmind branches: 1.20.20; 1.20.28;
Implement MI IPI interface with cross-call support.
 1.19 01-Dec-2013  christos branches: 1.19.2;
revert fpu/pcu changes until we figure out what's wrong; they cause random
freezes
 1.18 23-Oct-2013  drochner Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
 1.17 06-Nov-2011  cherry branches: 1.17.10;
[merging from cherry-xenmp] Make the xen MMU op queue locking api private. Implement per-cpu queues.
 1.16 22-Jun-2010  rmind branches: 1.16.6; 1.16.8;
Implement high priority (XC_HIGHPRI) xcall(9) mechanism - a facility
to execute functions from software interrupt context, at SOFTINT_CLOCK.
Functions must be lightweight. Will be used for passive serialization.

OK ad@.
 1.15 05-Oct-2009  rmind branches: 1.15.2; 1.15.4;
Remove X86_IPI_WRITE_MSR (and msr_ipifuncs.c), replace all uses in drivers
with xc_broadcast(). AMD K8 PowerNow driver tested by <jakllsch>, thanks!

Closes PR/37665.
 1.14 11-Nov-2008  ad branches: 1.14.4;
PR port-amd64/38293 panic: fp_save ipi didn't

Kill the FP flush IPI and always save. The synchronization here isn't strong
and we could easily pull the chain on an innocent LWP's FP state.

Another fix to follow.
 1.13 28-Apr-2008  ad branches: 1.13.6; 1.13.8; 1.13.10;
Add support for kernel preeemption to the i386 and amd64 ports. Notes:

- I have seen one isolated panic in the x86 pmap, but otherwise i386
seems stable with preemption enabled.

- amd64 is missing the FPU handling changes and it's not yet safe to
enable it there.

- The usual level for kern.sched.kpreempt_pri will be 128 once enabled
by default. For testing, setting it to 0 helps to shake out bugs.
 1.12 18-Dec-2007  joerg branches: 1.12.6; 1.12.8; 1.12.10;
Add new IPI for saving CPU state explicitly, share high-level part of
ACPI wakeup code and teach it how to start the APs again. As a side
effect the CPU_START interface allows choosing between different
bootstrap codes more easily now.
 1.11 03-Dec-2007  ad branches: 1.11.2; 1.11.6;
Interrupt handling changes, in discussion since February:

- Reduce available SPL levels for hardware devices to none, vm, sched, high.
- Acquire kernel_lock only for interrupts at IPL_VM.
- Implement threaded soft interrupts.
 1.10 17-Oct-2007  garbled branches: 1.10.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.9 29-Aug-2007  ad Merge most x86 changes from the vmlocking branch, except the threaded soft
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
 1.8 21-Mar-2007  xtraeme branches: 1.8.4; 1.8.8; 1.8.12; 1.8.14;
Remove the MSR read IPI handler from X86_IPI_NAMES and use the
correct number in X86_NIPI.
 1.7 21-Mar-2007  xtraeme Remove the MSR read IPI handler, there won't be any driver that will
use it, and we can see if the values are ok in the CPUs in the write
operation.

Suggested by YAMAMOTO Takashi.
 1.6 20-Mar-2007  xtraeme MSR read and write IPI handlers for x86. A MSR will be read or written
in all CPUs available in the system. This adds another member
to struct cpu_info, ci_msr_rvalue; it will contain the value of the MSR
in a previous operation.

Tested with clockmod in UP and SMP by me, tested with est in SMP
by Daniel Carosone and Michael Van Elst.

Ok'ed by Andrew Doran and Matthew R. Green.
 1.5 03-Nov-2005  yamt branches: 1.5.26; 1.5.28; 1.5.32; 1.5.34; 1.5.36;
- use sys/spl.h.
- add some IPL_ definitions.
 1.4 16-Apr-2005  yamt branches: 1.4.2;
make multi inclusion protection macros consistent.
 1.3 16-Jun-2003  thorpej branches: 1.3.2; 1.3.10; 1.3.16;
Rename IPL_IMP -> IPL_VM.
 1.2 04-May-2003  fvdl Block level-triggered interrupts at the ioapic if they are deferred.
Avoids interrupt storms seen on some systems. Many thanks to
Stoned Elipot for testing.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.16.1 21-Apr-2005  tron Pull up revision 1.4 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.3.10.1 29-Apr-2005  kent sync with -current
 1.3.2.1 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.4.2.4 21-Jan-2008  yamt sync with head
 1.4.2.3 07-Dec-2007  yamt sync with head
 1.4.2.2 03-Sep-2007  yamt sync with head.
 1.4.2.1 21-Jun-2006  yamt sync with head.
 1.5.36.1 29-Mar-2007  reinoud Pullup to -current
 1.5.34.1 11-Jul-2007  mjf Sync with head.
 1.5.32.6 23-Oct-2007  ad - Remove most of the hardware interrupt priority levels as proposed on
tech-kern, but leave the names of the remaining levels as none, vm, sched,
high: http://mail-index.netbsd.org/tech-kern/2007/05/05/0005.html

- Add aliases for the old levels to sys/intr.h.
 1.5.32.5 19-Oct-2007  ad Adjust previous.
 1.5.32.4 19-Oct-2007  ad Tidy up IPL defs, and remove a bogus comment block.
 1.5.32.3 29-Jul-2007  ad - When zeroing/copying pages, use SSE2 movnti to avoid polluting the cache.
- By default, align assembly routines on 32-byte starting boundaries.
- There are now 8 interrupt priority levels, half of which are softints.
Update intrdefs.h to match.
- Always clear/set spinlock words - removes lots of ifdefs.
- Remove the horrible ci_self150 hack that I introduced.
- Overhaul how TLB shootdown is performed. Inspired by a similar change in
OpenBSD but implemented quite differently. This should be a lot faster
but I have not benchmarked it yet.
 1.5.32.2 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.5.32.1 10-Apr-2007  ad Sync with head.
 1.5.28.1 24-Mar-2007  yamt sync with head.
 1.5.26.1 20-Apr-2007  bouyer Pull up following revision(s) (requested by mlelstv in ticket #575):
sys/arch/i386/i386/est.c sync with 1.37
sys/arch/i386/i386/ipifuncs.c sync with 1.16
sys/arch/x86/include/cpu_msr.h sync with 1.4
sys/arch/x86/include/intrdefs.h sync with 1.8
sys/arch/x86/include/powernow.h sync with 1.9
sys/arch/x86/x86/powernow_k8.c sync with 1.20
sys/arch/x86/x86/msr_ipifuncs.c sync with 1.8
sys/arch/amd64/amd64/ipifuncs.c sync with 1.9
sys/arch/i386/i386/identcpu.c patch
sys/arch/i386/i386/machdep.c patch
sys/arch/i386/include/cpu.h patch
sys/arch/x86/conf/files.x86 patch
sys/arch/x86/x86/x86_machdep.c patch
sys/arch/amd64/amd64/machdep.c patch
Add MSR write IPI handler for x86. Use it and the RUN_ONCE framework
to make est and powernow drivers work properly with SMP.
 1.8.14.2 09-Jan-2008  matt sync with HEAD
 1.8.14.1 06-Nov-2007  matt sync with HEAD
 1.8.12.2 09-Dec-2007  jmcneill Sync with HEAD.
 1.8.12.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.8.8.1 03-Sep-2007  skrll Sync with HEAD.
 1.8.4.1 03-Oct-2007  garbled Sync with HEAD
 1.10.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.10.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.11.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.11.2.1 26-Dec-2007  ad Sync with head.
 1.12.10.4 11-Aug-2010  yamt sync with head.
 1.12.10.3 11-Mar-2010  yamt sync with head
 1.12.10.2 04-May-2009  yamt sync with head.
 1.12.10.1 16-May-2008  yamt sync with head.
 1.12.8.1 18-May-2008  yamt sync with head.
 1.12.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.12.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.13.10.1 17-Nov-2008  snj Pull up following revision(s) (requested by ad in ticket #73):
sys/arch/amd64/amd64/fpu.c: revision 1.27
sys/arch/amd64/amd64/ipifuncs.c: revision 1.20
sys/arch/i386/i386/ipifuncs.c: revision 1.28
sys/arch/i386/isa/npx.c: revision 1.130
sys/arch/x86/include/intrdefs.h: revision 1.14
PR port-amd64/38293 panic: fp_save ipi didn't
Kill the FP flush IPI and always save. The synchronization here isn't
strong and we could easily pull the chain on an innocent LWP's FP state.
Another fix to follow.
 1.13.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.13.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.14.4.2 24-Oct-2010  jym Sync with HEAD
 1.14.4.1 01-Nov-2009  jym Sync with HEAD.
 1.15.4.1 03-Jul-2010  rmind sync with head
 1.15.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.16.8.1 10-Nov-2011  yamt sync with head
 1.16.6.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.17.10.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.2.1 10-Aug-2014  tls Rebase.
 1.20.28.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.20.28.1 10-Jun-2019  christos Sync with HEAD
 1.20.20.1 09-Mar-2019  martin Pull up following revision(s) via patch (requested by nonaka in ticket #1210):

sys/dev/hyperv/vmbusvar.h: revision 1.1
sys/dev/hyperv/hvs.c: revision 1.1
sys/dev/hyperv/if_hvn.c: revision 1.1
sys/dev/hyperv/vmbusic.c: revision 1.1
sys/arch/x86/x86/lapic.c: revision 1.69
sys/arch/x86/isa/clock.c: revision 1.34
sys/arch/x86/include/intrdefs.h: revision 1.22
sys/arch/i386/conf/GENERIC: revision 1.1201
sys/arch/x86/x86/hyperv.c: revision 1.1
sys/arch/x86/include/cpu.h: revision 1.105
sys/arch/x86/x86/x86_machdep.c: revision 1.124
sys/arch/i386/conf/GENERIC: revision 1.1203
sys/arch/amd64/amd64/genassym.cf: revision 1.74
sys/arch/i386/conf/GENERIC: revision 1.1204
sys/arch/amd64/conf/GENERIC: revision 1.520
sys/arch/x86/x86/hypervreg.h: revision 1.1
sys/arch/amd64/amd64/vector.S: revision 1.69
sys/dev/hyperv/hvshutdown.c: revision 1.1
sys/dev/hyperv/hvshutdown.c: revision 1.2
sys/dev/usb/if_urndisreg.h: file removal
sys/arch/x86/x86/cpu.c: revision 1.167
sys/arch/x86/conf/files.x86: revision 1.107
sys/dev/usb/if_urndis.c: revision 1.20
sys/dev/hyperv/vmbusicreg.h: revision 1.1
sys/dev/hyperv/hvheartbeat.c: revision 1.1
sys/dev/hyperv/vmbusicreg.h: revision 1.2
sys/dev/hyperv/hvheartbeat.c: revision 1.2
sys/dev/hyperv/files.hyperv: revision 1.1
sys/dev/ic/rndisreg.h: revision 1.1
sys/arch/i386/i386/genassym.cf: revision 1.111
sys/dev/ic/rndisreg.h: revision 1.2
sys/dev/hyperv/hyperv_common.c: revision 1.1
sys/dev/hyperv/hvtimesync.c: revision 1.1
sys/dev/hyperv/hypervreg.h: revision 1.1
sys/dev/hyperv/hvtimesync.c: revision 1.2
sys/dev/hyperv/vmbusicvar.h: revision 1.1
sys/dev/hyperv/if_hvnreg.h: revision 1.1
sys/arch/x86/x86/lapic.c: revision 1.70
sys/arch/amd64/amd64/vector.S: revision 1.70
sys/dev/ic/ndisreg.h: revision 1.1
sys/arch/amd64/conf/GENERIC: revision 1.516
sys/dev/hyperv/hypervvar.h: revision 1.1
sys/arch/amd64/conf/GENERIC: revision 1.518
sys/arch/amd64/conf/GENERIC: revision 1.519
sys/arch/i386/conf/files.i386: revision 1.400
sys/dev/acpi/vmbus_acpi.c: revision 1.1
sys/dev/hyperv/vmbus.c: revision 1.1
sys/dev/hyperv/vmbus.c: revision 1.2
sys/arch/x86/x86/intr.c: revision 1.144
sys/arch/i386/i386/vector.S: revision 1.83
sys/arch/amd64/conf/files.amd64: revision 1.112

separate RNDIS definitions from urndis(4) for use with Hyper-V NetVSC.

-

Added Microsoft Hyper-V support. It was ported from OpenBSD and FreeBSD.
The graphical console does not work on Gen.2 VMs yet. To use the serial console,
enter "consdev com,0x3f8,115200" on efiboot.

-

Add __diagused.

-

PR/53984: Partial revert of modify lapic_calibrate_timer() in lapic.c r1.69.

-

Update Hyper-V related drivers description.

-

Remove unused definition.

-

Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.
NFCI intended.

-

commented out hvkvp entry.

-

fix typo. pointed out by pgoyette@n.o.

-

Use IDTVEC instead of NENTRY for handle_hyperv_hypercall.

-

Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.
 1.23.6.1 12-Apr-2020  bouyer Get rid of xen-specific ci_x* interrupt handling:
- use the general SIR mechanism, reserving 3 more slots for IPL_VM, IPL_SCHED
and IPL_HIGH
- remove specific handling from C sources, or change to ipending
- convert IPL number to SIR number in various places
- Remove XUNMASK/XPENDING in assembly or change to IUNMASK/IPENDING
- remove Xen-specific ci_xsources, ci_xmask, ci_xunmask, ci_xpending from
struct cpu_info
- for now remove a KASSERT that there are no pending interrupts in
idle_block(). We can get there with some software interrupts pending
in autoconf XXX needs to be looked at.
 1.24.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.12 25-Dec-2018  mlelstv Make ipmi driver available to other platforms.
Add ACPI attachment.
 1.11 01-Aug-2010  mlelstv branches: 1.11.58; 1.11.60;
sc_cmd_mtx protects a command sequence, no longer abuse it for delays.

Initialize mutexes and condition variables in attach and not in the
asynchronously started kernel thread.

Increase BMC spin timeout from 5ms to 15ms, this is necessary to detect
the BMC in a HP ML110G4 reliably.

Implement non-linear sensors as defined in IPMIv2.0 with some crude
32.32 fixed point arithmetic. This adds some small errors as logarithm
and power functions are only approximated.

Fix sensor index mapping so that sensor limits are computed correctly.
 1.10 20-Jul-2009  dyoung branches: 1.10.2; 1.10.4;
Overhaul synchronization in ipmi(4): synchronize all access to
device registers with a mutex. Convert tsleep/wakeup calls to
cv_wait/cv_signal.

Do not repeatedly malloc/free tiny buffers for sending/receiving
commands, but reserve a command buffer in the softc.

Tickle the watchdog in the sensors-refreshing thread.

I am fairly certain that after the device is attached, every register
access happens in the sensors-refreshing thread. Moreover, no
software interrupt touches any register, now. So I may get rid of
the mutex that protects register accesses, sc_cmd_mtx.
 1.9 03-Nov-2008  cegger branches: 1.9.4;
The functions called from ipmi_match use the DEVNAME macro. But the softc is allocated on the stack and the accessed sc_dev member is not initialized.

Initialize the sc_dev.dv_xname in ipmi_match, which is enough to make DEVNAME work. Finally this also allows the device_t/softc split.
 1.8 23-Sep-2008  ad branches: 1.8.2; 1.8.4;
Speed up ipmi attach a bit, although boot times on my workstation still suck:

before 18s
after 14s
without ipmi 8s
 1.7 16-Apr-2008  cegger branches: 1.7.4; 1.7.6; 1.7.10;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.6 16-Nov-2007  xtraeme branches: 1.6.14;
Extend the envsys2 API (one more time, sorry) as defined in:

http://mail-index.netbsd.org/tech-kern/2007/11/09/0001.html

sysmon_envsys_create() and sysmon_envsys_destroy() were added to
create/destroy sysmon_envsys objects (and its TAILQ/LIST for sensors/events).

sysmon_envsys_sensor_attach() and sysmon_envsys_sensor_detach() were
added to attach/detach sensors to a specified sysmon_envsys device.

The events framework is now per device and configurable via the
ENVSYS_SETDICTIONARY ioctl or /etc/envsys.conf and envstat(8).

Update all users and documentation to reflect these changes.
 1.5 17-Oct-2007  garbled branches: 1.5.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 09-Jul-2007  ad branches: 1.4.8; 1.4.10; 1.4.14;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.3 01-Jul-2007  xtraeme Imported envsys 2, a brief description of the new features:
(Part 2: drivers)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).
 1.2 15-Feb-2007  ad branches: 1.2.6; 1.2.8; 1.2.14;
Replace some uses of lockmgr() / simplelocks.
 1.1 01-Oct-2006  bouyer branches: 1.1.2; 1.1.4; 1.1.8; 1.1.10;
Add ipmi(4) driver, from OpenBSD. This requires SMBios support, so add
SMBios detection and mapping to bios32.c, also from OpenBSD (for now this
is only compiled in if ipmi(4) is configured). The sensors and watchdog are
accessible through envsys(4).
Works on i386; some work is needed on amd64 to access the BIOS. It would
eventually work on Xen if the SMBios is accessible (to be tested).
 1.1.10.2 08-Jan-2007  ghen Pull up following revision(s) (requested by bouyer in ticket #1621):
sys/arch/i386/conf/GENERIC: revision 1.787 via patch
share/man/man4/Makefile: revision 1.407 via patch
distrib/sets/lists/man/mi: revision 1.936 via patch
share/man/man4/ipmi.4: revision 1.1 via patch
sys/arch/i386/i386/bios32.c: revision 1.11 via patch
sys/dev/DEVNAMES: revision 1.221 via patch
sys/arch/x86/x86/ipmi.c: revision 1.1 via patch
sys/arch/i386/i386/mainbus.c: revision 1.65 via patch
sys/arch/x86/include/smbiosvar.h: revision 1.1 via patch
sys/arch/x86/include/ipmivar.h: revision 1.1 via patch
sys/arch/x86/conf/files.x86: revision 1.20 via patch
sys/arch/i386/conf/files.i386: revision 1.293 via patch
Add ipmi(4) driver, from OpenBSD. This requires SMBios support, so add
SMBios detection and mapping to bios32.c, also from OpenBSD (for now this
is only compiled in if ipmi(4) is configured). The sensors and watchdog are
accessible through envsys(4).
Works on i386; some work is needed on amd64 to access the BIOS. It would
eventually work on Xen if the SMBios is accessible (to be tested).
Add manpage for new ipmi driver.
Claim ipmi.
 1.1.10.1 01-Oct-2006  ghen file ipmivar.h was added on branch netbsd-3 on 2007-01-08 16:36:20 +0000
 1.1.8.5 07-Dec-2007  yamt sync with head
 1.1.8.4 03-Sep-2007  yamt sync with head.
 1.1.8.3 26-Feb-2007  yamt sync with head.
 1.1.8.2 30-Dec-2006  yamt sync with head.
 1.1.8.1 01-Oct-2006  yamt file ipmivar.h was added on branch yamt-lazymbuf on 2006-12-30 20:47:22 +0000
 1.1.4.2 18-Nov-2006  ad Sync with head.
 1.1.4.1 01-Oct-2006  ad file ipmivar.h was added on branch newlock2 on 2006-11-18 21:29:38 +0000
 1.1.2.2 22-Oct-2006  yamt sync with head
 1.1.2.1 01-Oct-2006  yamt file ipmivar.h was added on branch yamt-splraiseipl on 2006-10-22 06:05:16 +0000
 1.2.14.1 03-Oct-2007  garbled Sync with HEAD
 1.2.8.1 11-Jul-2007  mjf Sync with head.
 1.2.6.4 03-Dec-2007  ad Sync with HEAD.
 1.2.6.3 15-Jul-2007  ad Sync with head.
 1.2.6.2 10-Apr-2007  ad Nuke the deferred kthread creation stuff, as it's no longer needed.
Pointed out by thorpej@.
 1.2.6.1 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.4.14.1 18-Nov-2007  bouyer Sync with HEAD
 1.4.10.2 09-Jan-2008  matt sync with HEAD
 1.4.10.1 06-Nov-2007  matt sync with HEAD
 1.4.8.1 21-Nov-2007  joerg Sync with HEAD.
 1.5.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.6.14.3 17-Jan-2009  mjf Sync with HEAD.
 1.6.14.2 28-Sep-2008  mjf Sync with HEAD.
 1.6.14.1 02-Jun-2008  mjf Sync with HEAD.
 1.7.10.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.7.10.1 19-Oct-2008  haad Sync with HEAD.
 1.7.6.1 10-Oct-2008  skrll Sync with HEAD.
 1.7.4.3 11-Aug-2010  yamt sync with head.
 1.7.4.2 19-Aug-2009  yamt sync with head.
 1.7.4.1 04-May-2009  yamt sync with head.
 1.8.4.1 06-Nov-2008  snj Pull up following revision(s) (requested by cegger in ticket #10):
sys/arch/x86/x86/ipmi.c: revision 1.23
sys/arch/x86/include/ipmivar.h: revision 1.9
The functions called from ipmi_match use the DEVNAME macro. But the
softc is allocated on the stack and the accessed sc_dev member is not
initialized.
Initialize the sc_dev.dv_xname in ipmi_match, which is enough to make
DEVNAME work. Finally this also allows the device_t/softc split.
 1.8.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.9.4.3 24-Oct-2010  jym Sync with HEAD
 1.9.4.2 01-Nov-2009  jym Sync with HEAD.
 1.9.4.1 23-Jul-2009  jym Sync with HEAD.
 1.10.4.1 05-Mar-2011  rmind sync with head
 1.10.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.11.60.1 10-Jun-2019  christos Sync with HEAD
 1.11.58.1 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.13 12-Dec-2021  andvar s/Miscellanous/Miscellaneous/ in copypasta comments.
 1.12 15-Oct-2016  jdolecek provide intr xname
 1.11 01-Jul-2011  dyoung branches: 1.11.12; 1.11.30; 1.11.34;
#include <sys/bus.h> instead of <machine/bus.h>.
 1.10 19-Aug-2009  dyoung isa_detach_hook() needs two arguments, the first an isa_chipset_tag_t.
 1.9 18-Aug-2009  dyoung These are stragglers from my last commit ("Let us safely detach
the ISA bus and devices attaching to the ISA bus"). Define
isa_detach_hook() in MD ISA implementations. Define isa_dmadestroy().
 1.8 25-Mar-2009  dyoung It is only by accident that these get the definitions they need from
<sys/device.h>, so explicitly #include <sys/device.h>.
 1.7 08-Feb-2009  bouyer branches: 1.7.2;
Apply patch proposed on port-amd64/port-i386, allowing to use a 64bit
bus_addr_t on i386PAE kernels:
change bus_addr_t to be a paddr_t (so its size follows paddr_t depending
on options PAE)
replace bus_addr_t with vaddr_t where the value is used as a virtual address.

Difference with the proposed patch: cast to uintmax_t and use %jx in
printf() as suggested by Joerg.
 1.6 27-Jun-2008  cegger branches: 1.6.4; 1.6.6; 1.6.12;
struct device * -> device_t
 1.5 28-Apr-2008  martin branches: 1.5.2; 1.5.4;
Remove clause 3 and 4 from TNF licenses
 1.4 16-Apr-2005  yamt branches: 1.4.82; 1.4.84; 1.4.86;
make multi inclusion protection macros consistent.
 1.3 07-Aug-2003  agc branches: 1.3.8; 1.3.14;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 09-May-2003  fvdl branches: 1.2.2;
A few ISA sound drivers like to share dma channels, and hence deferred
isa_dmamap_create() calls to their open/close entrypoints. This worked
with some luck, but broke on i386 when _bus_dmamap_create started
to allocate bounce buffers upfront, since memory below 16M may well
not be available when the sound device is opened for the Nth time.

To fix this, create a new simple interface, isa_drq_alloc/isa_drq_free,
wrappers around already existing bitmask macros. These are expected
to be used before an isa_dmamap_create call, and after an
isa_dmamap_destroy call, respectively. For the sb and ad1848 drivers,
they're deferred until open/close.

All isa_dmamap_create calls can now use BUS_DMA_ALLOCNOW and be done
at attach time.
 1.1 27-Feb-2003  fvdl Move a few more files to x86/include. Trim the list of files to install
in /usr/include a bit.
 1.2.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.14.1 21-Apr-2005  tron Pull up revision 1.4 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.3.8.1 29-Apr-2005  kent sync with -current
 1.4.86.3 19-Aug-2009  yamt sync with head.
 1.4.86.2 04-May-2009  yamt sync with head.
 1.4.86.1 16-May-2008  yamt sync with head.
 1.4.84.1 18-May-2008  yamt sync with head.
 1.4.82.2 29-Jun-2008  mjf Sync with HEAD.
 1.4.82.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.4.1 27-Jun-2008  simonb Sync with head.
 1.5.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.6.12.1 21-Apr-2010  matt sync to netbsd-5
 1.6.6.1 29-Sep-2009  snj Pull up following revision(s) (requested by bouyer in ticket #1040):
sys/arch/x86/include/bus.h: revision 1.18
sys/arch/x86/include/isa_machdep.h: revision 1.7
sys/arch/x86/x86/bus_space.c: revision 1.21
Apply patch proposed on port-amd64/port-i386, allowing the use of a 64-bit
bus_addr_t on i386PAE kernels:
change bus_addr_t to be a paddr_t (so its size follows paddr_t depending
on options PAE)
replace bus_addr_t with vaddr_t where the value is used as a virtual address.
Difference with the proposed patch: cast to uintmax_t and use %jx in
printf() as suggested by Joerg.
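The printing idiom mentioned above can be sketched as follows. The typedef and function name are hypothetical; the point is only that casting to uintmax_t and using %jx stays correct whether the address type is 32 or 64 bits wide.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an address type whose width depends on PAE. */
typedef uint64_t sketch_bus_addr_t;

/*
 * Format an address without hard-coding its width: the cast widens the
 * value to uintmax_t, and the "j" length modifier tells printf to expect
 * exactly that type.
 */
static int
sketch_format_addr(char *buf, size_t len, sketch_bus_addr_t addr)
{
	return snprintf(buf, len, "0x%jx", (uintmax_t)addr);
}
```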
 1.6.4.2 28-Apr-2009  skrll Sync with HEAD.
 1.6.4.1 03-Mar-2009  skrll Sync with HEAD.
 1.7.2.3 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.7.2.2 01-Nov-2009  jym Sync with HEAD.
 1.7.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.11.34.1 04-Nov-2016  pgoyette Sync with HEAD
 1.11.30.1 05-Dec-2016  skrll Sync with HEAD
 1.11.12.1 03-Dec-2017  jdolecek update from HEAD
 1.6 06-Aug-2014  joerg Consistently define WARN in a way that passes format string checks, i.e.
always uses the same number of arguments as given in the format string.
 1.5 06-Apr-2014  joerg x86_progress takes a format string.
 1.4 21-Mar-2009  ad branches: 1.4.12; 1.4.22; 1.4.26; 1.4.36;
Fix 'boot -z' bogons.
 1.3 25-Sep-2008  christos branches: 1.3.2; 1.3.8;
define a TEST mode.
 1.2 28-Apr-2008  martin branches: 1.2.2; 1.2.6;
Remove clause 3 and 4 from TNF licenses
 1.1 01-Oct-2007  ad branches: 1.1.2; 1.1.4; 1.1.6; 1.1.10; 1.1.14; 1.1.28; 1.1.30; 1.1.32;
Now that the bootblocks are the same, share loadfile_machdep.h between
amd64 and i386.
 1.1.32.2 04-May-2009  yamt sync with head.
 1.1.32.1 16-May-2008  yamt sync with head.
 1.1.30.1 18-May-2008  yamt sync with head.
 1.1.28.2 28-Sep-2008  mjf Sync with HEAD.
 1.1.28.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.14.2 06-Nov-2007  matt sync with HEAD
 1.1.14.1 01-Oct-2007  matt file loadfile_machdep.h was added on branch matt-armv6 on 2007-11-06 23:23:36 +0000
 1.1.10.2 27-Oct-2007  yamt sync with head.
 1.1.10.1 01-Oct-2007  yamt file loadfile_machdep.h was added on branch yamt-lazymbuf on 2007-10-27 11:28:55 +0000
 1.1.6.2 09-Oct-2007  ad Sync with head.
 1.1.6.1 01-Oct-2007  ad file loadfile_machdep.h was added on branch vmlocking on 2007-10-09 13:38:42 +0000
 1.1.4.2 06-Oct-2007  yamt sync with head.
 1.1.4.1 01-Oct-2007  yamt file loadfile_machdep.h was added on branch yamt-x86pmap on 2007-10-06 15:33:32 +0000
 1.1.2.2 02-Oct-2007  joerg Sync with HEAD.
 1.1.2.1 01-Oct-2007  joerg file loadfile_machdep.h was added on branch jmcneill-pm on 2007-10-02 18:27:49 +0000
 1.2.6.1 19-Oct-2008  haad Sync with HEAD.
 1.2.2.1 10-Oct-2008  skrll Sync with HEAD.
 1.3.8.2 01-Nov-2009  jym Sync with HEAD.
 1.3.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.3.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.4.36.1 10-Aug-2014  tls Rebase.
 1.4.26.1 18-May-2014  rmind sync with head
 1.4.22.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.4.12.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.29 12-Feb-2022  riastradh __cpu_simple_lock(9): Omit needless barriers in init.

It is, and always has been, the caller's responsibility to ensure the
lock is initialized before it can be used -- otherwise the memory
could hold garbage; it is nonsensical to even attempt locking
operations on it before initialization.

So there's no need to issue explicit barriers here. The barrier
seems to have been introduced in sys/arch/alpha/alpha/lock_machdep.c
rev. 1.1 (since moved to inline asm in alpha/include/lock.h) and then
copied & pasted into several other architectures.
 1.28 16-Sep-2017  christos more const
 1.27 22-Jan-2013  christos Allow for non inlined definitions for RUMP
 1.26 11-Oct-2012  apb Change "=r" to "=qQ" in a register constraint in an asm statement
for a register that is used with the "xchgb" instruction in the
definition of __cpu_simple_lock_try(). This fixes PR 45673, or at
least works around the gcc bug that might be behind PR 45673.

The output from "objdump -d" before and after this change is
identical, for the amd64 GENERIC kernel, the i386 GENERIC kernel,
and the i386 MONOLITHIC kernel.
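A try-lock built on "xchgb" with the narrower constraint might look like the sketch below. It is not the code from lock.h: the names are made up, and the portable fallback branch is an addition for non-x86 builds with GCC or Clang. The likely motivation for the constraint is that on i386 only the a, b, c and d registers have byte-sized forms, so a plain "r" could in principle select a register that "xchgb" cannot encode.

```c
#include <stdint.h>

#define SKETCH_UNLOCKED	0
#define SKETCH_LOCKED	1

/*
 * Try to take the lock once; returns nonzero on success.
 * Assumes a GCC-compatible compiler.
 */
static inline int
sketch_xchg_lock_try(volatile uint8_t *lockp)
{
	uint8_t val = SKETCH_LOCKED;

#if defined(__i386__) || defined(__x86_64__)
	/*
	 * "qQ" restricts val to a register that has a byte-sized name.
	 * xchg with a memory operand is implicitly locked on x86.
	 */
	__asm__ volatile("xchgb %0, %1"
	    : "+qQ" (val), "+m" (*lockp)
	    :
	    : "memory");
#else
	/* Fallback for other CPUs (not part of the original change). */
	val = __atomic_exchange_n(lockp, val, __ATOMIC_ACQUIRE);
#endif
	return val == SKETCH_UNLOCKED;
}
```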
 1.25 15-Jan-2009  pooka branches: 1.25.14; 1.25.20; 1.25.24; 1.25.26;
The last _KERNEL -> _HARDKERNEL in locking operations.
 1.24 28-Apr-2008  martin branches: 1.24.8;
Remove clause 3 and 4 from TNF licenses
 1.23 09-Jan-2008  yamt branches: 1.23.6; 1.23.8; 1.23.10;
fix SPINLOCK_BACKOFF_HOOK.
 1.22 04-Jan-2008  ad Start detangling lock.h from intr.h. This is likely to cause short term
breakage, but the mess of dependencies has been regularly breaking the
build recently anyhow.
 1.21 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.20 20-Dec-2007  ad - Make __cpu_simple_lock and similar real functions and patch at runtime.
- Remove old x86 atomic ops.
- Drop text alignment back to 16 on i386 (really, this time).
- Minor cleanup.
 1.19 07-Nov-2007  ad branches: 1.19.2; 1.19.6;
__cpu_simple_locks really should be simple, otherwise they can cause
problems for e.g. profiling.
 1.18 17-Oct-2007  garbled branches: 1.18.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.17 26-Sep-2007  ad branches: 1.17.2;
Make it build in userspace again.
 1.16 11-Sep-2007  skrll branches: 1.16.2;
Always provide __cpu_simple_lock_{set,clear}.

Fixes LOCKDEBUG kernel builds.
 1.15 10-Sep-2007  skrll Merge nick-csl-alignment.
 1.14 10-Feb-2007  ad branches: 1.14.6; 1.14.12; 1.14.14; 1.14.18; 1.14.22; 1.14.24;
NSPR builds seem to choke on 'inline'. Replace it with __inline.
 1.13 09-Feb-2007  ad Merge newlock2 to head.
 1.12 18-Dec-2006  ad __cpu_simple_unlock(): add a note about memory ordering and why this is
correct, contrary to Intel's documentation.
 1.11 28-Dec-2005  perry branches: 1.11.20; 1.11.22;
inline -> __inline
 1.10 24-Dec-2005  perry Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.9 16-Apr-2005  yamt branches: 1.9.2;
make multi inclusion protection macros consistent.
 1.8 25-Nov-2004  yamt branches: 1.8.4; 1.8.10;
remove __lockbarrier, which i forgot to remove in the previous.
 1.7 31-Oct-2004  yamt use __insn_barrier rather than homegrown equivalents.
 1.6 23-Oct-2004  yamt __cpu_simple_lock: loop without locking cache or asserting LOCK#.
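The idea in the entry above, spinning on plain reads and only issuing the locked exchange once the lock looks free, is commonly called test-and-test-and-set. A minimal sketch using C11 atomics follows; the `ttas_` names are hypothetical and this is not the inline assembly used in lock.h.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_uchar ttas_lock_t;	/* 0 = unlocked, 1 = locked */

/* One atomic exchange; this is the only bus-locked operation. */
static inline bool
ttas_try(ttas_lock_t *l)
{
	return atomic_exchange_explicit(l, 1, memory_order_acquire) == 0;
}

static inline void
ttas_lock(ttas_lock_t *l)
{
	while (!ttas_try(l)) {
		/*
		 * Wait with ordinary loads so the cache line can stay
		 * shared; retry the exchange only when it reads unlocked.
		 */
		while (atomic_load_explicit(l, memory_order_relaxed) != 0)
			continue;
	}
}

static inline void
ttas_unlock(ttas_lock_t *l)
{
	atomic_store_explicit(l, 0, memory_order_release);
}
```

On real x86 hardware the inner loop would normally also execute the "pause" instruction, as the x86_pause() entry further down in this log describes.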
 1.5 27-Oct-2003  junyoung Nuke __P().
 1.4 26-Oct-2003  yamt define SPINLOCK_SPIN_HOOK to let LK_SPIN lockmgr locks call x86_pause.
 1.3 26-Sep-2003  nathanw Move __cpu_simple_lock_t and __SIMPLELOCK_{UN,}LOCKED to machine/types.h
so that they can be used in a namespace-friendly way.
 1.2 08-May-2003  fvdl branches: 1.2.2;
Add x86_pause() inline function, containing the "pause" instruction
for i386, and nothing for amd64. Sprinkle it in various spinloops,
as recommended by Intel.
 1.1 27-Feb-2003  fvdl Move a few more files to x86/include. Trim the list of files to install
in /usr/include a bit.
 1.2.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.2.5 29-Nov-2004  skrll Sync with HEAD.
 1.2.2.4 02-Nov-2004  skrll Sync with HEAD.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.8.10.1 21-Apr-2005  tron Pull up revision 1.9 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.8.4.1 29-Apr-2005  kent sync with -current
 1.9.2.5 21-Jan-2008  yamt sync with head
 1.9.2.4 15-Nov-2007  yamt sync with head.
 1.9.2.3 27-Oct-2007  yamt sync with head.
 1.9.2.2 26-Feb-2007  yamt sync with head.
 1.9.2.1 30-Dec-2006  yamt sync with head.
 1.11.22.1 18-Dec-2006  yamt sync with head.
 1.11.20.4 02-Feb-2007  ad - Define memory barrier ops in lock_stubs.S.
- If lfence/mfence are available, patch them in at boot.
- Patch to a no-op if !MULTIPROCESSOR. XXX Should be determined at runtime.
 1.11.20.3 12-Jan-2007  ad Sync with head.
 1.11.20.2 29-Dec-2006  ad Checkpoint work in progress.
 1.11.20.1 20-Oct-2006  ad Define memory barriers: mb_read(), mb_write(), mb_memory()
 1.14.24.4 23-Mar-2008  matt sync with HEAD
 1.14.24.3 09-Jan-2008  matt sync with HEAD
 1.14.24.2 08-Nov-2007  matt sync with -HEAD
 1.14.24.1 06-Nov-2007  matt sync with HEAD
 1.14.22.2 11-Nov-2007  joerg Sync with HEAD.
 1.14.22.1 02-Oct-2007  joerg Sync with HEAD.
 1.14.18.2 15-Aug-2007  skrll Provide __SIMPLELOCK_{UN,}LOCKED_P and __cpu_simple_lock_{set,clear}
for all architectures.
 1.14.18.1 18-Jul-2007  skrll Initial work on providing a correctly aligned __cpu_simple_lock_t for hppa
and a first attempt at adapting i386 to the changes.

More to come.
 1.14.14.1 03-Oct-2007  garbled Sync with HEAD
 1.14.12.1 18-Apr-2007  thorpej Convert i386 and amd64 to the new atomic ops API.
 1.14.6.2 03-Dec-2007  ad Sync with HEAD.
 1.14.6.1 09-Oct-2007  ad Sync with head.
 1.16.2.1 06-Oct-2007  yamt sync with head.
 1.17.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.18.2.3 18-Feb-2008  mjf Sync with HEAD.
 1.18.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.18.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.19.6.3 10-Jan-2008  bouyer Sync with HEAD
 1.19.6.2 08-Jan-2008  bouyer Sync with HEAD
 1.19.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.19.2.1 26-Dec-2007  ad Sync with head.
 1.23.10.2 04-May-2009  yamt sync with head.
 1.23.10.1 16-May-2008  yamt sync with head.
 1.23.8.1 18-May-2008  yamt sync with head.
 1.23.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.23.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.24.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.25.26.1 19-Oct-2012  riz Pull up following revision(s) (requested by apb in ticket #606):
sys/arch/x86/include/lock.h: revision 1.26
Change "=r" to "=qQ" in a register constraint in an asm statement
for a register that is used with the "xchgb" instruction in the
definition of __cpu_simple_lock_try(). This fixes PR 45673, or at
least works around the gcc bug that might be behind PR 45673.
The output from "objdump -d" before and after this change is
identical, for the amd64 GENERIC kernel, the i386 GENERIC kernel,
and the i386 MONOLITHIC kernel.
 1.25.24.3 03-Dec-2017  jdolecek update from HEAD
 1.25.24.2 25-Feb-2013  tls resync with head
 1.25.24.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.25.20.1 17-Oct-2012  riz Pull up following revision(s) (requested by apb in ticket #606):
sys/arch/x86/include/lock.h: revision 1.26
Change "=r" to "=qQ" in a register constraint in an asm statement
for a register that is used with the "xchgb" instruction in the
definition of __cpu_simple_lock_try(). This fixes PR 45673, or at
least works around the gcc bug that might be behind PR 45673.
The output from "objdump -d" before and after this change is
identical, for the amd64 GENERIC kernel, the i386 GENERIC kernel,
and the i386 MONOLITHIC kernel.
 1.25.14.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.25.14.1 30-Oct-2012  yamt sync with head
 1.1 30-Nov-2024  christos branches: 1.1.4;
Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.1.4.2 02-Aug-2025  perseant Sync with HEAD
 1.1.4.1 30-Nov-2024  perseant file lwp_private.h was added on branch perseant-exfatfs on 2025-08-02 05:56:17 +0000
 1.12 28-Oct-2021  riastradh x86: Process bootloader rndseed much sooner.
 1.11 15-Nov-2020  bouyer remove unused x86_cpu_initclock_func()
 1.10 25-Apr-2020  bouyer branches: 1.10.2;
Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.9 26-Dec-2016  cherry branches: 1.9.26;
the i386 and amd64 boot time msgbuf init code is nearly identical.

Unify them into x86/x86_machdep.c:init_x86_msgbuf()

Boot tested on GENERIC (i386, amd64), XEN3_DOM0 (amd64)
 1.8 16-Jul-2016  maxv Simplify the way physical pages are internalized into the VM system on x86.
Only two functions are called now: init_x86_clusters, which initializes the
memory clusters from the bootinfo, and init_x86_vm, which inserts the pages
from the clusters into VM.
 1.7 12-Jun-2014  riastradh branches: 1.7.4; 1.7.8;
Tweak x86 page freelists and add x86_select_freelist.

- Add 4G freelist to i386 -- there may be higher addresses if PAE.
- Add 64G and 1T freelists to amd64.
- Simplify freelist setup code and condense it into a table.
- Add x86_select_freelist to get a freelist guaranteed to yield
addresses no greater than a prescribed maximum address.

x86_select_freelist takes a uint64_t, not a paddr_t or bus_addr_t, so
that you can pass in, e.g., a 36-bit maximum address without needing
to write conditionals for i386/PAE.

No objections on port-x86:

https://mail-index.netbsd.org/port-i386/2014/05/21/msg003277.html
https://mail-index.netbsd.org/port-amd64/2014/05/21/msg002062.html
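A table-driven selection of the kind described in this entry could look like the sketch below. The enum, the table contents, and the extra avail_end parameter are assumptions for illustration; only the 4G, 64G and 1T limits come from the message above, and the 16M entry is added as the usual ISA DMA boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical freelist identifiers. */
enum sketch_freelist {
	FL_DEFAULT,
	FL_FIRST16M,
	FL_FIRST4G,
	FL_FIRST64G,
	FL_FIRST1T
};

/* Limits in descending order; each list holds pages below its limit. */
static const struct {
	uint64_t limit;
	enum sketch_freelist fl;
} sketch_freelists[] = {
	{ UINT64_C(1) << 40, FL_FIRST1T },
	{ UINT64_C(1) << 36, FL_FIRST64G },
	{ UINT64_C(1) << 32, FL_FIRST4G },
	{ UINT64_C(1) << 24, FL_FIRST16M },
};

/*
 * Return a freelist whose pages all lie at or below maxaddr, or -1 if
 * none qualifies. avail_end is the end of physical memory.
 */
static int
sketch_select_freelist(uint64_t avail_end, uint64_t maxaddr)
{
	size_t i;

	/* All of memory is already within range: any list will do. */
	if (avail_end <= maxaddr)
		return FL_DEFAULT;

	for (i = 0; i < sizeof(sketch_freelists) / sizeof(sketch_freelists[0]); i++) {
		if (sketch_freelists[i].limit - 1 <= maxaddr)
			return sketch_freelists[i].fl;
	}
	return -1;
}
```

Taking the maximum address as a uint64_t means a caller can pass a 36-bit limit such as (1 << 36) - 1 on any configuration without conditionals on the width of paddr_t.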
 1.6 12-Apr-2013  christos branches: 1.6.8;
de-duplication police arrests sysctl.
 1.5 21-Oct-2010  yamt branches: 1.5.8; 1.5.18;
don't forget to call nmi_init.
 1.4 23-Aug-2010  jruoho Other entry points beyond x86_cpu_idle_halt() may use HLT as the
idle-mechanism. Send an IPI also for these in cpu_need_resched().
 1.3 18-Jul-2010  jruoho Merge a driver for ACPI CPUs with basic support for processor power states,
also known as C-states. The code is modular and provides an easy way to add
the remaining functionality later (namely throttling and P-states).

Remarks:

1. Commented out in the GENERICs; more testing exposure is needed.

2. The C3-state is disabled for the time being because it turns off
timers, among them the local APIC timer. This may not be universally
true on all x86 processors; define ACPICPU_ENABLE_C3 to test.

3. The algorithm used to choose a power state may need tuning. When
evaluating the appropriate state, the implementation uses the
previous sleep time as an indicator. Additional hints would include
for example the system load.

Also bus master activity is evaluated when choosing a state. The
usb(4) stack is notorious for such activity even when unused.
Typically it must be disabled in order to reach the C3-state,
but it may also prevent the use of C2.

4. While no extensive empirical measurements have been carried out, the
power savings are somewhere between 1-2 W with C1 and C2, depending
on the processor, firmware, and load. With C3 even up to 4 W can be
saved. The less something ticks, the more power is saved.

ok jmcneill@, joerg@, and discussed with various people.
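The selection heuristic from remark 3 can be illustrated with the sketch below. Everything in it is hypothetical: the state table, the latency figures, and the rule that only the deepest state requires a quiet bus are invented for the example and are not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one power state. */
struct sketch_cstate {
	int id;			/* 1 = C1, 2 = C2, 3 = C3 */
	uint32_t latency_us;	/* made-up wake-up cost */
	bool needs_idle_bus;	/* unusable while bus masters are active */
};

/* Ordered from shallowest to deepest. */
static const struct sketch_cstate sketch_cstates[] = {
	{ 1, 1, false },
	{ 2, 50, false },
	{ 3, 500, true },
};

/*
 * Pick the deepest state whose wake-up cost is smaller than the
 * previous sleep time, skipping states ruled out by bus master activity.
 */
static int
sketch_select_cstate(uint32_t prev_sleep_us, bool bm_active)
{
	int best = 1;	/* C1 is the fallback in this sketch */
	size_t i;

	for (i = 0; i < sizeof(sketch_cstates) / sizeof(sketch_cstates[0]); i++) {
		const struct sketch_cstate *cs = &sketch_cstates[i];

		if (cs->needs_idle_bus && bm_active)
			continue;
		if (prev_sleep_us > cs->latency_us)
			best = cs->id;
	}
	return best;
}
```

Additional hints such as system load, which the remark mentions as possible tuning inputs, would enter as further parameters to the selection function.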
 1.2 15-Dec-2008  cegger branches: 1.2.2; 1.2.4; 1.2.6; 1.2.8; 1.2.10; 1.2.12;
cleanup BIOS memmap code:
- get rid of some nested externs
- reduce dependency on global variables
- some preparations for upcoming pmem(9)
 1.1 14-Nov-2008  cegger branches: 1.1.4;
merge BIOS memmap code from i386/i386/machdep.c:init386() and amd64/amd64/machdep.c:init_x86_64 into x86/x86/x86_machdep.c
 1.1.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.4.1 14-Nov-2008  haad file machdep.h was added on branch haad-dm on 2008-12-13 01:13:38 +0000
 1.2.12.1 05-Mar-2011  rmind sync with head
 1.2.10.2 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.2.10.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.2.8.4 09-Oct-2010  yamt sync with head
 1.2.8.3 11-Aug-2010  yamt sync with head.
 1.2.8.2 04-May-2009  yamt sync with head.
 1.2.8.1 15-Dec-2008  yamt file machdep.h was added on branch yamt-nfs-mp on 2009-05-04 08:12:09 +0000
 1.2.6.1 24-Oct-2010  jym Sync with HEAD
 1.2.4.2 19-Jan-2009  skrll Sync with HEAD.
 1.2.4.1 15-Dec-2008  skrll file machdep.h was added on branch nick-hppapmap on 2009-01-19 13:17:09 +0000
 1.2.2.2 17-Jan-2009  mjf Sync with HEAD.
 1.2.2.1 15-Dec-2008  mjf file machdep.h was added on branch mjf-devfs2 on 2009-01-17 13:28:38 +0000
 1.5.18.3 03-Dec-2017  jdolecek update from HEAD
 1.5.18.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.5.18.1 23-Jun-2013  tls resync from head
 1.5.8.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.6.8.1 10-Aug-2014  tls Rebase.
 1.7.8.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.7.8.1 26-Jul-2016  pgoyette Sync with HEAD
 1.7.4.2 05-Feb-2017  skrll Sync with HEAD
 1.7.4.1 05-Oct-2016  skrll Sync with HEAD
 1.9.26.1 18-Apr-2020  bouyer Centralize initialisations of delay_func and initclock_func
in x86_machdep.c and export from <x86/machdep.h>
Introduce a x86_dummy_initclock() and a x86_cpu_initclock_func pointer,
to be used later for Xen HVM native clock support.
rename rtclock_tval to x86_rtclock_tval and export from <x86/machdep.h>,
for the benefit of lapic.c
 1.10.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.2 28-Oct-2003  kleink branches: 1.2.4;
#define __HAVE_LONG_DOUBLE on platforms which implement a distinct
`long double' type.
 1.1 22-Oct-2003  kleink Use a common <machine/math.h> for amd64 and i386.
 1.2.4.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.4.3 18-Sep-2004  skrll Sync with HEAD.
 1.2.4.2 03-Aug-2004  skrll Sync with HEAD
 1.2.4.1 28-Oct-2003  skrll file math.h was added on branch ktrace-lwp on 2004-08-03 10:43:04 +0000
 1.12 06-Oct-2025  riastradh x86: Wire up PCI resource manager if enabled.

Enable in your kernel config with `options PCI_RESOURCE'.

Adapted from a patch by mlelstv@.

PR port-amd64/59118: Thinkpad T495s - iwm PCI BAR is zero
 1.11 23-May-2017  nonaka branches: 1.11.48;
x86: Add preliminary x2APIC support.

x2APIC is used only when x2APIC is enabled in BIOS/UEFI.
LAPIC ID is not supported above 256.
 1.10 31-Mar-2013  chs branches: 1.10.12;
remove unused variable declaration.
 1.9 17-Apr-2009  dyoung branches: 1.9.12; 1.9.22;
Introduce sys/arch/x86/x86/mp.c for common x86 MP configuration code.
mpacpi_scan_pci() and mpbios_scan_pci() are identical code, so replace
them with mp_pci_scan().

Introduce mp_pci_childdetached(), which helps us to detach root PCI
buses that were enumerated either by MP BIOS or by ACPI.

Let us detach and re-attach PCI buses from mainbus0 on i386. This is
necessarily a work-in-progress, because testing detach and re-attach
is very difficult: to detach and re-attach the entire PCI tree on most
x86 computers that I own is not possible because some essential device
attaches under the PCI subtree: the console, com0, NIC, or storage
controller always attaches in the PCI tree.
 1.8 09-Nov-2008  cegger branches: 1.8.4;
struct device * -> device_t
 1.7 09-Nov-2008  cegger Nuke last parameter from mpacpi_scan_apics() and mpbios_scan().
It is unused.
 1.6 17-Oct-2007  garbled branches: 1.6.16; 1.6.20; 1.6.26; 1.6.28;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.5 06-Oct-2007  joerg Merge from mpacpi.h 1.4.32.1, acpi_machdep.c 1.13.22.5 and
mpacpi.c 1.48.12.2 from jmcneill-pm:

Don't process the MADT and modify the interrupt config at one moment and
later try to figure out if an entry was overridden and matches the
ACPI SCI. This is brain-dead and breaks in various situations.

Just check for each ISA override entry, if it matches the SCI. If it
does, remember it and use it for the interrupt setup. If there's no such
override assume that it is not changed, but override the polarity and
level from ISA settings to PCI settings.
 1.4 04-Jul-2006  christos branches: 1.4.8; 1.4.14; 1.4.22; 1.4.24; 1.4.32; 1.4.34; 1.4.36;
Apply fvdl's acpi pci interrupt configuration code.
- MPACPI is no more.
- MPACPI_SCANPCI -> ACPI_SCANPCI
 1.3 16-Apr-2005  yamt branches: 1.3.2; 1.3.12; 1.3.16; 1.3.24;
make multi inclusion protection macros consistent.
 1.2 29-May-2003  fvdl branches: 1.2.2; 1.2.10; 1.2.16;
Add the options MPBIOS_SCANPCI and MPACPI_SCANPCI to configure PCI roots
with the MPBIOS/ACPI bus information, by walking through the buses, and
descending down every bus that hasn't been marked configured yet.
 1.1 11-May-2003  fvdl Make this include file shareable (moved here from sys/arch/i386/include)
 1.2.16.1 21-Apr-2005  tron Pull up revision 1.3 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.2.10.1 29-Apr-2005  kent sync with -current
 1.2.2.1 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.3.24.1 13-Jul-2006  gdamore Merge from HEAD.
 1.3.16.1 11-Aug-2006  yamt sync with head
 1.3.12.1 09-Sep-2006  rpaulo sync with head
 1.3.2.2 27-Oct-2007  yamt sync with head.
 1.3.2.1 30-Dec-2006  yamt sync with head.
 1.4.36.1 14-Oct-2007  yamt sync with head.
 1.4.34.1 06-Nov-2007  matt sync with HEAD
 1.4.32.2 07-Oct-2007  joerg Sync with HEAD.
 1.4.32.1 02-Oct-2007  joerg Don't process the MADT and modify the interrupt config at one moment and
later try to figure out if an entry was overridden and matches the
ACPI SCI. This is brain-dead and breaks in various situations.

Just check for each ISA override entry, if it matches the SCI. If it
does, remember it and use it for the interrupt setup. If there's no such
override assume that it is not changed, but override the polarity and
level from ISA settings to PCI settings.
 1.4.24.1 29-Oct-2007  wrstuden Catch up with 4.0 RC3
 1.4.22.1 16-Oct-2007  garbled Sync with HEAD
 1.4.14.1 09-Oct-2007  ad Sync with head.
 1.4.8.1 14-Oct-2007  xtraeme Pull up following revision(s) (requested by joerg in ticket #925):
sys/arch/x86/x86/mpacpi.c: revision 1.50
sys/arch/x86/include/mpacpi.h: revision 1.5
sys/arch/x86/x86/acpi_machdep.c: revision 1.16

Merge from mpacpi.h 1.4.32.1, acpi_machdep.c 1.13.22.5 and
mpacpi.c 1.48.12.2 from jmcneill-pm:

Don't process the MADT and modify the interrupt config at one moment and
later try to figure out if an entry was overridden and matches the
ACPI SCI. This is brain-dead and breaks in various situations.
Just check for each ISA override entry, if it matches the SCI. If it
does, remember it and use it for the interrupt setup. If there's no such
override assume that it is not changed, but override the polarity and
level from ISA settings to PCI settings.
 1.6.28.2 28-Apr-2009  skrll Sync with HEAD.
 1.6.28.1 19-Jan-2009  skrll Sync with HEAD.
 1.6.26.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.6.20.1 04-May-2009  yamt sync with head.
 1.6.16.1 17-Jan-2009  mjf Sync with HEAD.
 1.8.4.2 01-Nov-2009  jym Sync with HEAD.
 1.8.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.9.22.2 03-Dec-2017  jdolecek update from HEAD
 1.9.22.1 23-Jun-2013  tls resync from head
 1.9.12.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.10.12.1 28-Aug-2017  skrll Sync with HEAD
 1.11.48.1 20-Oct-2025  martin Pull up following revision(s) (requested by riastradh in ticket #66):

sys/arch/x86/include/mpacpi.h: revision 1.12
sys/arch/x86/x86/mpacpi.c: revision 1.112
sys/arch/amd64/conf/ALL: revision 1.194
sys/arch/i386/conf/ALL: revision 1.524
sys/arch/x86/acpi/acpi_machdep.c: revision 1.40
sys/arch/i386/conf/GENERIC: revision 1.1261
sys/dev/acpi/acpi_mcfg.h: revision 1.6
sys/arch/amd64/conf/GENERIC: revision 1.618

x86: Wire up PCI resource manager if enabled.

Enable in your kernel config with `options PCI_RESOURCE'.

Adapted from a patch by mlelstv@.
PR port-amd64/59118: Thinkpad T495s - iwm PCI BAR is zero
 1.6 18-Apr-2010  jym This patch fixes the NX regression issue observed on amd64 kernels, where
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).

- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).

- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.

- replace checks against CPUID_TSC with the cpu_hascounter() function.

- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().

- use cpu_init_msrs() for i386. It will eventually be used for the NX
feature under i386 PAE kernels.

- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().

- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).

This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.

XXX Should kernel rev be bumped?

XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
 1.5 28-Apr-2008  martin branches: 1.5.14; 1.5.20; 1.5.22;
Remove clause 3 and 4 from TNF licenses
 1.4 16-Apr-2008  cegger branches: 1.4.2; 1.4.4;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.3 04-Mar-2003  fvdl branches: 1.3.104;
Make the apic address unsigned, as it should be.
 1.2 04-Mar-2003  fvdl Fix some fields that did not have explicit types yet.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.104.1 02-Jun-2008  mjf Sync with HEAD.
 1.4.4.2 11-Aug-2010  yamt sync with head.
 1.4.4.1 16-May-2008  yamt sync with head.
 1.4.2.1 18-May-2008  yamt sync with head.
 1.5.22.1 30-May-2010  rmind sync with head
 1.5.20.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.5.14.1 24-Oct-2010  jym Sync with HEAD
 1.8 17-Apr-2009  dyoung Introduce sys/arch/x86/x86/mp.c for common x86 MP configuration code.
mpacpi_scan_pci() and mpbios_scan_pci() are identical code, so replace
them with mp_pci_scan().

Introduce mp_pci_childdetached(), which helps us to detach root PCI
buses that were enumerated either by MP BIOS or by ACPI.

Let us detach and re-attach PCI buses from mainbus0 on i386. This is
necessarily a work-in-progress, because testing detach and re-attach
is very difficult: to detach and re-attach the entire PCI tree on most
x86 computers that I own is not possible because some essential device
attaches under the PCI subtree: the console, com0, NIC, or storage
controller always attaches in the PCI tree.
 1.7 09-Nov-2008  cegger branches: 1.7.4;
struct device * -> device_t
 1.6 09-Nov-2008  cegger Nuke last parameter from mpacpi_scan_apics() and mpbios_scan().
It is unused.
 1.5 28-Apr-2008  martin branches: 1.5.6; 1.5.8;
Remove clause 3 and 4 from TNF licenses
 1.4 04-Jul-2006  christos branches: 1.4.58; 1.4.60; 1.4.62;
Apply fvdl's acpi pci interrupt configuration code.
- MPACPI is no more.
- MPACPI_SCANPCI -> ACPI_SCANPCI
 1.3 29-May-2003  fvdl branches: 1.3.18; 1.3.32; 1.3.36; 1.3.44;
Add the options MPBIOS_SCANPCI and MPACPI_SCANPCI to configure PCI roots
with the MPBIOS/ACPI bus information, by walking through the buses, and
descending down every bus that hasn't been marked configured yet.
 1.2 02-Apr-2003  thorpej Use PAGE_SIZE rather than NBPG.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.44.1 13-Jul-2006  gdamore Merge from HEAD.
 1.3.36.1 11-Aug-2006  yamt sync with head
 1.3.32.1 09-Sep-2006  rpaulo sync with head
 1.3.18.1 30-Dec-2006  yamt sync with head.
 1.4.62.2 04-May-2009  yamt sync with head.
 1.4.62.1 16-May-2008  yamt sync with head.
 1.4.60.1 18-May-2008  yamt sync with head.
 1.4.58.2 17-Jan-2009  mjf Sync with HEAD.
 1.4.58.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.8.2 28-Apr-2009  skrll Sync with HEAD.
 1.5.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.5.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.7.4.2 01-Nov-2009  jym Sync with HEAD.
 1.7.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.15 27-Apr-2015  knakahara add intr_handle_t and let pci_intr_handle_t use it.
 1.14 15-Jun-2012  yamt branches: 1.14.2; 1.14.16;
comments
 1.13 01-Jul-2011  dyoung branches: 1.13.2;
#include <sys/bus.h> instead of <machine/bus.h>.
 1.12 09-Jan-2010  cegger add x2apic support.
patch presented on current-users@, port-i386@ and port-amd64@ on 2009-12-22

No comments.
 1.11 17-Apr-2009  dyoung Introduce sys/arch/x86/x86/mp.c for common x86 MP configuration code.
mpacpi_scan_pci() and mpbios_scan_pci() are identical code, so replace
them with mp_pci_scan().

Introduce mp_pci_childdetached(), which helps us to detach root PCI
buses that were enumerated either by MP BIOS or by ACPI.

Let us detach and re-attach PCI buses from mainbus0 on i386. This is
necessarily a work-in-progress, because testing detach and re-attach
is very difficult: to detach and re-attach the entire PCI tree on most
x86 computers that I own is not possible because some essential device
attaches under the PCI subtree: the console, com0, NIC, or storage
controller always attaches in the PCI tree.
 1.10 16-Apr-2008  cegger branches: 1.10.4; 1.10.12; 1.10.18;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.9 04-Jul-2006  christos branches: 1.9.58;
Apply fvdl's acpi pci interrupt configuration code.
- MPACPI is no more.
- MPACPI_SCANPCI -> ACPI_SCANPCI
 1.8 29-May-2005  christos branches: 1.8.2; 1.8.12; 1.8.16; 1.8.24;
Sprinkle const.
 1.7 16-Apr-2005  yamt make multi inclusion protection macros consistent.
 1.6 27-Oct-2003  junyoung branches: 1.6.8; 1.6.14;
Nuke __P().
 1.5 16-Oct-2003  fvdl Add hooks and structures to allow the MP table intr mapping code a
better shot at finding a mapping. For PCI interrupts, if a bus
has no mappings, try its parent, with the swizzled pin, and the
bridge's device number.
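The fallback described above follows the conventional PCI bridge swizzle, which can be sketched as below. The structure and function names are hypothetical; the rotation formula is the standard one, with pins numbered 1 (INTA) to 4 (INTD).

```c
#include <stddef.h>

/* Hypothetical view of a PCI bus for interrupt lookup. */
struct sketch_bus {
	const struct sketch_bus *parent;	/* NULL at the root */
	int bridge_dev;		/* bridge's device number on the parent bus */
	int has_map;		/* nonzero if the MP table maps this bus */
};

/* Rotate the pin by the device number, as seen on the parent bus. */
static int
sketch_swizzle(int pin, int dev)
{
	return ((pin - 1 + dev) % 4) + 1;
}

/*
 * Walk towards the root until a bus with mappings is found, rewriting
 * (dev, pin) at each step to the bridge's device number and the
 * swizzled pin. Returns NULL if no ancestor has mappings.
 */
static const struct sketch_bus *
sketch_find_mapped_bus(const struct sketch_bus *bus, int *dev, int *pin)
{
	while (bus != NULL && !bus->has_map) {
		*pin = sketch_swizzle(*pin, *dev);
		*dev = bus->bridge_dev;
		bus = bus->parent;
	}
	return bus;
}
```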
 1.4 06-Sep-2003  fvdl When establishing the ACPI SCI, make sure it's always active low (as well
as level-triggered). Do this by changing the MP config entry that was
set up for the interrupt. Do not change anything if there was an ACPI
interrupt source override, assume that this contains the correct
information already.
 1.3 29-May-2003  fvdl branches: 1.3.2;
Add the options MPBIOS_SCANPCI and MPACPI_SCANPCI to configure PCI roots
with the MPBIOS/ACPI bus information, by walking through the buses, and
descending down every bus that hasn't been marked configured yet.
 1.2 11-May-2003  fvdl Add a global_int field to the mp_intr_map structure, for use with ACPI.
XXX should probably just use an array directly indexed by global interrupt
numbers in that case.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.3.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.3.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.3.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.3.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.14.1 21-Apr-2005  tron Pull up revision 1.7 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.6.8.1 29-Apr-2005  kent sync with -current
 1.8.24.1 13-Jul-2006  gdamore Merge from HEAD.
 1.8.16.1 11-Aug-2006  yamt sync with head
 1.8.12.1 09-Sep-2006  rpaulo sync with head
 1.8.2.1 30-Dec-2006  yamt sync with head.
 1.9.58.1 02-Jun-2008  mjf Sync with HEAD.
 1.10.18.4 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.10.18.3 24-Oct-2010  jym Sync with HEAD
 1.10.18.2 01-Nov-2009  jym Sync with HEAD.
 1.10.18.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.10.12.1 28-Apr-2009  skrll Sync with HEAD.
 1.10.4.2 11-Mar-2010  yamt sync with head
 1.10.4.1 04-May-2009  yamt sync with head.
 1.13.2.1 30-Oct-2012  yamt sync with head
 1.14.16.1 06-Jun-2015  skrll Sync with HEAD
 1.14.2.1 03-Dec-2017  jdolecek update from HEAD
 1.6 31-Jan-2020  maxv constify
 1.5 15-Dec-2011  abs branches: 1.5.48; 1.5.54;
Increase MTRR_I686_NVAR_MAX from 8 to 16. Avoids
"FIXME: more than 8 MTRRs (10)" message on booting Thinkpad W520 and
similar. While here replace a magic number with MTRR_I686_NVAR_MAX * 2
 1.4 01-Jul-2008  mrg branches: 1.4.6; 1.4.30; 1.4.34;
hack around PR#38480:

- rename MTRR_I686_NVAR to MTRR_I686_NVAR_MAX, still set to 8
- store mtrr VCNT value into i686_mtrr_vcnt. if it is less than 8,
zero out the relevant parts of mtrr_raw[].msraddr
- replace all usage of MTRR_I686_NVAR with either i686_mtrr_vcnt or
with MTRR_I686_NVAR_MAX as appropriate
- in i686_mtrr_reload() and mtrr_init_first() don't use mtrr_raw[]
addresses of 0

still needs a bunch of reworking to handle VCNT > 8 case.
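
A rough sketch of the workaround described above, assuming a simplified table in which each variable-range MTRR occupies two consecutive MSRs (base and mask) starting at 0x200. The struct and function names are hypothetical and do not reproduce the layout of mtrr_raw[] in mtrr_i686.c.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NVAR_MAX	8	/* stand-in for MTRR_I686_NVAR_MAX at rev 1.4 */

/* Hypothetical, simplified table entry: one MSR address and its value. */
struct sketch_mtrr_raw {
	uint32_t msraddr;
	uint64_t msrval;
};

/*
 * Fill in the variable-range MSR addresses for the vcnt registers the CPU
 * reports, and zero the address of every slot beyond that.
 */
static void
sketch_mtrr_init(struct sketch_mtrr_raw raw[SKETCH_NVAR_MAX * 2], int vcnt)
{
	int i;

	assert(vcnt >= 0 && vcnt <= SKETCH_NVAR_MAX);
	for (i = 0; i < SKETCH_NVAR_MAX * 2; i++) {
		raw[i].msraddr = (i < vcnt * 2) ? 0x200 + (uint32_t)i : 0;
		raw[i].msrval = 0;
	}
}

/* Count the slots a reload loop would touch: those with a nonzero address. */
static int
sketch_mtrr_usable(const struct sketch_mtrr_raw raw[SKETCH_NVAR_MAX * 2])
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NVAR_MAX * 2; i++) {
		if (raw[i].msraddr != 0)
			n++;
	}
	return n;
}
```

Skipping zero addresses is what lets the table stay sized for the maximum while the loops only touch registers that exist.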
 1.3 28-Apr-2008  martin branches: 1.3.2; 1.3.4;
Remove clause 3 and 4 from TNF licenses
 1.2 28-Jul-2003  mrg branches: 1.2.52; 1.2.68; 1.2.102; 1.2.104; 1.2.106;
give >32 bit constants an "LL" suffix to appease gcc3.3
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.106.2 04-May-2009  yamt sync with head.
 1.2.106.1 16-May-2008  yamt sync with head.
 1.2.104.1 18-May-2008  yamt sync with head.
 1.2.102.2 02-Jul-2008  mjf Sync with HEAD.
 1.2.102.1 02-Jun-2008  mjf Sync with HEAD.
 1.2.68.1 04-Sep-2008  skrll Sync with netbsd-4.
 1.2.52.3 18-Nov-2008  bouyer Pull up following revision(s) (requested by sborrill in ticket #1173):
sys/arch/x86/include/mtrr.h: revision 1.4
sys/arch/amd64/amd64/netbsd32_machdep.c: revision 1.54
sys/arch/x86/x86/mtrr_i686.c: revision 1.18
hack around PR#38480:
- rename MTRR_I686_NVAR to MTRR_I686_NVAR_MAX, still set to 8
- store mtrr VCNT value into i686_mtrr_vcnt. if it is less than 8,
zero out the relevant parts of mtrr_raw[].msraddr
- replace all usage of MTRR_I686_NVAR with either i686_mtrr_vcnt or
with MTRR_I686_NVAR_MAX as appropriate
- in i686_mtrr_reload() and mtrr_init_first() don't use mtrr_raw[]
addresses of 0
still needs a bunch of reworking to handle VCNT > 8 case.
Ensure optional MTRR sections are built if MTRR is enabled (missing
Fix build due to changes in revision 1.4 of sys/arch/x86/include/mtrr.h
 1.2.52.2 23-Aug-2008  bouyer Back out ticket #1173, it breaks the build of amd64 kernels.
 1.2.52.1 20-Aug-2008  bouyer Pull up following revision(s) (requested by sborrill in ticket #1173):
sys/arch/x86/include/mtrr.h: revision 1.4
sys/arch/x86/x86/mtrr_i686.c: revision 1.18
hack around PR#38480:
- rename MTRR_I686_NVAR to MTRR_I686_NVAR_MAX, still set to 8
- store mtrr VCNT value into i686_mtrr_vcnt. if it is less than 8,
zero out the relevant parts of mtrr_raw[].msraddr
- replace all usage of MTRR_I686_NVAR with either i686_mtrr_vcnt or
with MTRR_I686_NVAR_MAX as appropriate
- in i686_mtrr_reload() and mtrr_init_first() don't use mtrr_raw[]
addresses of 0
still needs a bunch of reworking to handle VCNT > 8 case.
 1.3.4.1 03-Jul-2008  simonb Sync with head.
 1.3.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.4.34.1 18-Feb-2012  mrg merge to -current.
 1.4.30.1 17-Apr-2012  yamt sync with head
 1.4.6.1 19-Jun-2013  bouyer Pull up following revision(s) (requested by msaitoh in ticket #1847):
sys/arch/x86/include/mtrr.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.25
sys/arch/x86/include/specialreg.h: revision 1.55
Increase MTRR_I686_NVAR_MAX from 8 to 16. Avoids
"FIXME: more than 8 MTRRs (10)" message on booting Thinkpad W520 and
similar. While here replace a magic number with MTRR_I686_NVAR_MAX * 2
 1.5.54.1 29-Feb-2020  ad Sync with head.
 1.5.48.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.10 12-Jul-2023  riastradh machine/mutex.h: Sprinkle sys/types.h, omit machine/lock.h.

Turns out machine/lock.h is not needed for __cpu_simple_lock_t, which
always comes from sys/types.h. And, really, sys/types.h (or at least
sys/stdint.h) is needed for uintN_t and uintptr_t.
 1.9 05-Mar-2020  riastradh branches: 1.9.22;
Fix userland build by surrounding stuff with #ifdef _KERNEL.

(...Why does this header file get exposed to userland at all?)
 1.8 05-Mar-2020  riastradh Remove __MUTEX_PRIVATE conditional in definition of struct kmutex.

This doesn't buy us anything but the need to hack around it in
ctfmerge to avoid massive duplication of kernel types -- which only
worked for the x86 definition.

This changes only x86 and arm for now, pending compile-testing the
remaining architectures.
 1.7 29-Nov-2019  riastradh Nix now-unused definitions of MUTEX_GIVE/MUTEX_RECEIVE.
 1.6 24-Apr-2009  ad branches: 1.6.64; 1.6.68;
A workaround for a bug with some Opteron revisions where locked operations
sometimes do not serve as memory barriers, allowing memory references to
bleed outside of critical sections. It's possible that this is the
reason for pkgbuild's longstanding crashiness.
 1.5 28-Apr-2008  martin branches: 1.5.8; 1.5.10; 1.5.14; 1.5.16;
Remove clause 3 and 4 from TNF licenses
 1.4 09-Dec-2007  ad branches: 1.4.10; 1.4.12; 1.4.14;
Use atomic_cas_ulong().
 1.3 21-Nov-2007  yamt branches: 1.3.2; 1.3.4;
make kmutex_t and krwlock_t smaller by killing lock id.
ok'ed by Andrew Doran.
 1.2 09-Feb-2007  ad branches: 1.2.4; 1.2.8; 1.2.14; 1.2.24; 1.2.26; 1.2.30; 1.2.32;
Merge newlock2 to head.
 1.1 10-Sep-2006  ad branches: 1.1.2;
file mutex.h was initially added on branch newlock2.
 1.1.2.8 01-Feb-2007  ad Header file cleanup.
 1.1.2.7 30-Jan-2007  ad Don't expose the guts of struct kmutex unless _KERNEL.
 1.1.2.6 27-Jan-2007  ad Rename some functions to better describe what they do.
 1.1.2.5 12-Jan-2007  ad Sync with head.
 1.1.2.4 29-Dec-2006  ad Checkpoint work in progress.
 1.1.2.3 17-Nov-2006  ad Checkpoint work in progress.
 1.1.2.2 20-Oct-2006  ad - Don't need locked bus cycles on release from C code.
- Save an integer ID in the lock structures for LOCKDEBUG code.
 1.1.2.1 10-Sep-2006  ad Add updated locking primitives.
 1.2.32.2 27-Dec-2007  mjf Sync with HEAD.
 1.2.32.1 08-Dec-2007  mjf Sync with HEAD.
 1.2.30.1 21-Nov-2007  bouyer Sync with HEAD
 1.2.26.1 09-Jan-2008  matt sync with HEAD
 1.2.24.2 09-Dec-2007  jmcneill Sync with HEAD.
 1.2.24.1 21-Nov-2007  joerg Sync with HEAD.
 1.2.14.1 17-Apr-2007  thorpej G/C _lock_cas() -- the atomic ops API provides what the locking
primitives need.
 1.2.8.1 03-Dec-2007  ad Sync with HEAD.
 1.2.4.4 21-Jan-2008  yamt sync with head
 1.2.4.3 07-Dec-2007  yamt sync with head
 1.2.4.2 26-Feb-2007  yamt sync with head.
 1.2.4.1 09-Feb-2007  yamt file mutex.h was added on branch yamt-lazymbuf on 2007-02-26 09:08:49 +0000
 1.3.4.1 11-Dec-2007  yamt sync with head.
 1.3.2.1 26-Dec-2007  ad Sync with head.
 1.4.14.2 04-May-2009  yamt sync with head.
 1.4.14.1 16-May-2008  yamt sync with head.
 1.4.12.1 18-May-2008  yamt sync with head.
 1.4.10.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.16.1 13-May-2009  snj branches: 1.5.16.1.2;
Pull up following revision(s) (requested by ad in ticket #725):
sys/arch/x86/include/mutex.h: revision 1.6
A workaround for a bug with some Opteron revisions where locked operations
sometimes do not serve as memory barriers, allowing memory references to
bleed outside of critical sections. It's possible that this is the
reason for pkgbuild's longstanding crashiness.
 1.5.16.1.2.1 21-Apr-2010  matt sync to netbsd-5
 1.5.14.2 01-Nov-2009  jym Sync with HEAD.
 1.5.14.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.5.10.1 13-May-2009  snj Pull up following revision(s) (requested by ad in ticket #725):
sys/arch/x86/include/mutex.h: revision 1.6
A workaround for a bug with some Opteron revisions where locked operations
sometimes do not serve as memory barriers, allowing memory references to
bleed outside of critical sections. It is possible that this is the
reason for pkgbuild's longstanding crashiness.
 1.5.8.1 28-Apr-2009  skrll Sync with HEAD.
 1.6.68.1 13-May-2020  martin Pull up following revision(s) (requested by chs in ticket #904):

sys/arch/x86/include/mutex.h: revision 1.8
sys/arch/x86/include/mutex.h: revision 1.9
sys/arch/arm/include/mutex.h: revision 1.22
sys/arch/arm/include/mutex.h: revision 1.23

Remove __MUTEX_PRIVATE conditional in definition of struct kmutex.

This doesn't buy us anything but the need to hack around it in
ctfmerge to avoid massive duplication of kernel types -- which only
worked for the x86 definition.

This changes only x86 and arm for now, pending compile-testing the
remaining architectures.

Fix userland build by surrounding stuff with #ifdef _KERNEL.
(...Why does this header file get exposed to userland at all?)
 1.6.64.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.9.22.1 09-Aug-2023  martin Pull up following revision(s) (requested by maya in ticket #316):

sys/arch/m68k/include/mutex.h: revision 1.13
sys/arch/arm/include/cpu.h: revision 1.125
sys/arch/sun68k/include/intr.h: revision 1.21
sys/arch/arm/include/mutex.h: revision 1.28
sys/sys/rwlock.h: revision 1.18
sys/arch/powerpc/include/mutex.h: revision 1.7
sys/arch/arm/include/mutex.h: revision 1.29
sys/arch/powerpc/include/mutex.h: revision 1.8
sys/uvm/uvm_param.h: revision 1.42
sys/sys/ksem.h: revision 1.16
sys/arch/x86/include/mutex.h: revision 1.10
sys/sys/proc.h: revision 1.372
sys/sys/ksem.h: revision 1.17
sys/arch/ia64/include/mutex.h: revision 1.8
sys/arch/evbarm/include/intr.h: revision 1.29
sys/sys/lua.h: revision 1.9
sys/arch/next68k/include/intr.h: revision 1.23
sys/arch/ia64/include/mutex.h: revision 1.9
sys/arch/hp300/include/intr.h: revision 1.35
sys/arch/hp300/include/intr.h: revision 1.36
sys/arch/sparc/include/cpu.h: revision 1.111
sys/arch/hppa/include/mutex.h: revision 1.16
sys/arch/vax/include/intr.h: revision 1.31
sys/arch/hppa/include/mutex.h: revision 1.17
sys/arch/news68k/include/intr.h: revision 1.28
sys/arch/hppa/include/mutex.h: revision 1.18
sys/arch/hppa/include/intr.h: revision 1.3
sys/arch/hppa/include/mutex.h: revision 1.19
sys/arch/hppa/include/intr.h: revision 1.4
sys/sys/sched.h: revision 1.92
sys/opencrypto/cryptodev.h: revision 1.51
sys/arch/vax/include/mutex.h: revision 1.20
sys/arch/sparc64/include/mutex.h: revision 1.10
sys/arch/ia64/include/sapicvar.h: revision 1.2
sys/arch/riscv/include/mutex.h: revision 1.5
sys/arch/amiga/dev/grfabs_cc.c: revision 1.39
sys/external/bsd/drm2/include/linux/idr.h: revision 1.11
sys/arch/riscv/include/mutex.h: revision 1.6
sys/ddb/files.ddb: revision 1.16
sys/arch/mac68k/include/intr.h: revision 1.32
share/man/man4/ddb.4: revision 1.203
sys/ddb/db_command.c: revision 1.183
sys/arch/mips/include/mutex.h: revision 1.10
sys/ddb/db_command.c: revision 1.184
sys/arch/x68k/include/intr.h: revision 1.22
sys/arch/sparc/include/psl.h: revision 1.51
sys/arch/or1k/include/mutex.h: revision 1.4
sys/arch/mips/include/mutex.h: revision 1.11
sys/arch/arm/xscale/pxa2x0_intr.h: revision 1.16
sys/arch/sparc64/include/cpu.h: revision 1.134
sys/arch/sparc/include/psl.h: revision 1.52
sys/arch/or1k/include/mutex.h: revision 1.5
sys/arch/mvme68k/include/intr.h: revision 1.22
sys/arch/luna68k/include/intr.h: revision 1.16
external/cddl/osnet/sys/sys/kcondvar.h: revision 1.6
sys/arch/sparc/include/mutex.h: revision 1.12
sys/arch/sparc/include/mutex.h: revision 1.13
sys/arch/usermode/include/mutex.h: revision 1.5
sys/arch/usermode/include/mutex.h: revision 1.6
sys/kern/kern_core.c: revision 1.38
usr.sbin/crash/Makefile: revision 1.49
sys/arch/amiga/include/intr.h: revision 1.23
sys/arch/alpha/include/mutex.h: revision 1.12
sys/arch/alpha/include/mutex.h: revision 1.13
sys/arch/evbarm/lubbock/sacc_obio.c: revision 1.16
sys/ddb/ddb.h: revision 1.6
sys/arch/sparc64/include/mutex.h: revision 1.8
sys/arch/sh3/include/mutex.h: revision 1.12
sys/arch/evbarm/lubbock/sacc_obio.c: revision 1.17
sys/ddb/db_syncobj.c: revision 1.1
sys/arch/vax/include/mutex.h: revision 1.18
sys/arch/sparc64/include/psl.h: revision 1.63
sys/arch/sparc64/include/mutex.h: revision 1.9
sys/arch/sh3/include/mutex.h: revision 1.13
sys/arch/evbarm/lubbock/obio.c: revision 1.13
sys/arch/atari/include/intr.h: revision 1.23
sys/ddb/db_syncobj.c: revision 1.2
sys/arch/vax/include/mutex.h: revision 1.19
sys/arch/evbarm/g42xxeb/obio.c: revision 1.14
sys/arch/evbarm/g42xxeb/obio.c: revision 1.15
sys/arch/cesfic/include/intr.h: revision 1.14
sys/ddb/db_syncobj.h: revision 1.1
sys/arch/x86/include/cpu.h: revision 1.134
sys/arch/evbarm/g42xxeb/obio.c: revision 1.16
sys/arch/cesfic/include/intr.h: revision 1.15
sys/arch/arm/xscale/pxa2x0_intr.c: revision 1.26
sys/sys/cpu_data.h: revision 1.54
sys/arch/m68k/include/mutex.h: revision 1.12
sys/arch/ia64/acpi/madt.c: revision 1.6

sys/rwlock.h: Make this more self-contained for bool.

machine/mutex.h: Sprinkle includes so this can be used by crash(8).

ddb: New `show all tstiles' command.
Shows who's waiting for which locks and what the owner is up to.

Include psl.h for ipl_cookie_t if __MUTEX_PRIVATE

sys: Rip <sys/resourcevar.h> out of <uvm/uvm_param.h>.

And thus out of <sys/param.h>, which is exceedingly overused and
fragile and delenda est.

Should fix (some) issues with the recent inclusion of machine/lock.h
in various machine/mutex.h files.

arm/mutex.h: Need machine/intr.h, machine/lock.h.

For ipl_cookie_t and __cpu_simple_lock_t.
evbarm/intr.h: Define ipl_cookie_t before including ARM_INTR_IMPL.

Otherwise arm/mutex.h doesn't work, due to a cyclic dependency which
should really be fixed.
opencrypto/cryptodev.h: Fix includes.
- Move sys/condvar.h under #ifdef _KERNEL.
- Add some other necessary includes and forward declarations.
- Sort.

hp300/intr.h: Fix missing includes.
linux/idr.h: Need <sys/mutex.h> for kmutex_t.
amiga/intr.h: Don't define spl*() functions if !_KERNEL.

This is used by crash(8) now, and what's important is ipl_cookie_t.
cesfic/intr.h: Expose ipl_cookie_t to userland for crash(8).
cesfic/intr.h: Expose ipl_cookie_t to userland only with _KMEMUSER.

Probably not necessary but let's be a little more cautious about
this.

atari/intr.h: Expose ipl_cookie_t with _KMEMUSER for crash(8).

arm/cpu.h: Need sys/param.h for COHERENCY_UNIT.

Nix machine/param.h -- not meant to be used directly, pulled in by
sys/param.h.

Move the definition of ipl_cookie_t out of the kernel-only sections,
some _KMEMUSER applications need it.

ddb: Cast pointer to uintptr_t first before db_expr_t.

hppa/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

luna68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

mvme68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

news68k/intr.h: Fix includes. Put some definitions under _KERNEL.

next68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

sys/ksem.h: Hack around fstat(8) abuse of _KERNEL.

sun68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

vax/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).

x68k/intr.h: Put functions under _KERNEL so crash(8) can use this.

Make ipl_cookie_t visible for _KMEMUSER userland applications.

fix editor mishap in previous

Explicitly include <sys/mutex.h> for kmutex_t.

Replace kmutex_t * (which may be undefined here) with struct kmutex *,
suggested by Taylor.

hp300/intr.h: Put most of this under #ifdef _KERNEL.
Only ipl_cookie_t really needs to be exposed now, for crash(8).

mac68k/intr.h: Expose ipl_cookie_t to _KMEMUSER for crash(8).
Make inclusion of sys/intr.h explicit for spl*.

fix hppa and vax builds.

machine/lock.h isn't necessary for __cpu_simple_lock_t, it's in
sys/types.h. avoids cpu_data.h vs sched.h include order issues.

move the hppa ipl_t typedef with the moved usage of it.
machine/mutex.h: Sprinkle sys/types.h, omit machine/lock.h.

Turns out machine/lock.h is not needed for __cpu_simple_lock_t, which
always comes from sys/types.h. And, really, sys/types.h (or at least
sys/stdint.h) is needed for uintN_t and uintptr_t.

ddb: Cast pointer to uintptr_t, then to db_expr_t.
Avoids warnings about conversion between pointer and integer of
different size on some architectures.

re-fix hppa builds.

this file uses __cpu_simple_lock(), not just the underlying type,
so it does need machine/lock.h.

Break cycle by using `struct kmutex *' instead of `kmutex_t *'.
sys/sched.h included sys/mutex.h
which includes sys/intr.h
which includes machine/intr.h
which on cats includes arm/footbridge/footbridge_intr.h
which includes arm/cpu.h
which includes sys/cpu_data.h
which includes sys/sched.h

But there was never any real need for sys/mutex.h in sys/sched.h,
because it only uses pointers to the opaque struct kmutex. Cycle
broken by using `struct kmutex *' instead of pulling in sys/mutex.h
for the definition of kmutex_t.
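
The cycle-breaking technique described here can be sketched in isolation. The structure and function names below are hypothetical; only `struct kmutex` comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration only: no header defining the structure is needed. */
struct kmutex;

/* Hypothetical scheduler-side structure that merely stores a lock pointer. */
struct sketch_runqueue {
	struct kmutex *rq_lock;	/* pointer to an incomplete type is legal C */
	int rq_count;
};

static void
sketch_runqueue_init(struct sketch_runqueue *rq, struct kmutex *lock)
{
	assert(rq != NULL);
	rq->rq_lock = lock;	/* stored, never dereferenced here */
	rq->rq_count = 0;
}
```

Because the pointer is never dereferenced in the header, the compiler does not need the full definition, and the header no longer has to include the file that provides the kmutex_t typedef.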

Side effect: This revealed that sys/cpu_data.h needed sys/intr.h
(which was pulled in accidentally by sys/mutex.h via sys/sched.h) for
SOFTINT_COUNT. Also revealed some other machine/cpu.h header files
were missing includes of sys/mutex.h for kmutex_t.

ia64: Need sys/types.h for u_int, vaddr_t; sys/mutex.h for kmutex_t.

explicitly include no longer implicitly included sys/mutex.h.

arm/xscale: Use sys/bitops.h fls32 - 1 instead of 31 - __builtin_clz.
Sidesteps namespace collision with `#define bits ...' in net/zlib.c.

complete the previous - there were two calls to find_first_bit() to fix.

arm/xscale: Missed a spot with previous find_first_bit commit.

evbarm/g42xxeb: Fix off-by-one in previous.

The original find_first_bit(x) was 31 - __builtin_clz((uint32_t)x),
which is equivalent to fls32(x) - 1, not to fls32(x).

Note that fls32 is 1-based and returns 0 for x=0.
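
The off-by-one can be checked with a small sketch. sketch_fls32() below is a hypothetical stand-in written from the description above (1-based, 0 for x=0), not the sys/bitops.h implementation, and __builtin_clz is assumed to be available (GCC or Clang, 32-bit unsigned int).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for fls32(): 1-based index of the most significant
 * set bit, or 0 when x == 0.
 */
static int
sketch_fls32(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * The original find_first_bit(): 0-based index of the most significant set
 * bit. Only valid for x != 0, since __builtin_clz(0) is undefined.
 */
static int
old_find_first_bit(uint32_t x)
{
	assert(x != 0);
	return 31 - __builtin_clz(x);
}
```

For every nonzero x, sketch_fls32(x) - 1 equals old_find_first_bit(x), which is the substitution the fix restores.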
 1.1 24-Feb-2009  yamt branches: 1.1.2; 1.1.4; 1.1.6;
- rewrite x86 nmi dispatcher so that establish and disestablish are safe
on a running system.
- adapt existing users of the api. (elan)
- adapt tprof_pmi driver to use the api.
 1.1.6.3 01-Nov-2009  jym Sync with HEAD.
 1.1.6.2 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.6.1 24-Feb-2009  jym file nmi.h was added on branch jym-xensuspend on 2009-05-13 17:18:44 +0000
 1.1.4.2 04-May-2009  yamt sync with head.
 1.1.4.1 24-Feb-2009  yamt file nmi.h was added on branch yamt-nfs-mp on 2009-05-04 08:12:09 +0000
 1.1.2.2 03-Mar-2009  skrll Sync with HEAD.
 1.1.2.1 24-Feb-2009  skrll file nmi.h was added on branch nick-hppapmap on 2009-03-03 18:29:37 +0000
 1.1 20-Aug-2022  riastradh x86: Move page attribute table bits to x86/pat.h.
 1.18 27-Apr-2015  knakahara add x86 MD MSI/MSI-X support code.
 1.17 27-Apr-2015  knakahara add intr_handle_t and let pci_intr_handle_t use it.
 1.16 27-Apr-2015  knakahara add pci_intr_distribute(9) for x86.
 1.15 04-Mar-2015  knakahara add a comment for pci_intr_handle_t.
 1.14 14-Mar-2010  dyoung branches: 1.14.20; 1.14.38;
Add a new member, pc_super, to x86's pci_chipset_tag: pc.pc_super points
to the tag that pc inherits its behavior from. Add code to deal with
pc.pc_super.

Pull identical declarations out of xen/include/pci_machdep.h and
x86/include/pci_machdep.h into x86/include/pci_machdep_common.h.
 1.13 25-Feb-2010  dyoung Change the pci_attach_args definition to allow machine-dependent
code to override the default pci(9) behavior by creating a non-NULL
pci_attach_args_t (on x86, pci_attach_args_t is always NULL) containing
one or more non-NULL function pointers.
 1.12 24-Feb-2010  dyoung KNF: change spaces to tabs.
 1.11 24-Feb-2010  dyoung Don't bother to #define PCI_PREFER_IOSPACE, nothing uses it.
 1.10 24-Feb-2010  dyoung Change 'typedef void *pci_chipset_tag_t' to 'typedef struct
pci_chipset_tag *pci_chipset_tag_t' for an improvement in type safety.
(Back when I did the same for cardbus_chipset_tag_t, it helped to turn
up some bugs!)
 1.9 16-Feb-2010  dyoung Get rid of all PCI_CONF_MODE #ifdef'age except for the little bit
that initializes pci_mode, which I have moved to the top.

Make pci_mode private to pci_machdep.c.

Provide pci_mode_set() for pcibios.c to configure the PCI Configuration
Mechanism. KASSERT() in pci_mode_set() that the mechanism is not
changing from anything but the "don't know" value, -1.
 1.8 30-May-2008  joerg branches: 1.8.12; 1.8.18;
Add a function to extract the primary bus number of PCI host bridges,
as far as specific code for this already existed.
 1.7 16-Apr-2008  cegger branches: 1.7.2; 1.7.4; 1.7.6;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.6 20-Jun-2005  sekiya branches: 1.6.82;
pci_device_foreach(), pci_device_foreach_min(), pci_bridge_foreach(), and
pci_bridge_hook don't actually have any dependencies on PCIBIOS-specific code,
and they can be used to fixup PCI bus numbering in the absence of the BIOS.

To that end, decouple them from PCIBIOS.
 1.5 16-Apr-2005  yamt make multi inclusion protection macros consistent.
 1.4 29-Jul-2004  drochner branches: 1.4.4; 1.4.10;
remove now unnecessary "pci_enumerate_bus" definitions
 1.3 16-Oct-2003  fvdl Add hooks and structures to allow the MP table intr mapping code a
better shot at finding a mapping. For PCI interrupts, if a bus
has no mappings, try its parent, with the swizzled pin, and the
bridge's device number.
 1.2 15-Jun-2003  fvdl branches: 1.2.2;
Handle 64bit DMA addresses on PCI for platforms that can (currently only
enabled on amd64). Add a dmat64 field to various PCI attach structures,
and pass it down where needed. Implement a simple new function called
pci_dma64_available(pa) to test if 64bit DMA addresses may be used.
This returns 1 iff _PCI_HAVE_DMA64 is defined in <machine/pci_machdep.h>,
and there is more than 4G of memory.
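
The stated condition can be sketched as follows. This is not the kernel's pci_dma64_available(pa): the macro, the function name, and the explicit memory-size parameter are all stand-ins chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for _PCI_HAVE_DMA64; remove the define to model other ports. */
#define SKETCH_PCI_HAVE_DMA64	1

/*
 * Hypothetical predicate: 64-bit DMA addresses are worth using only when the
 * platform supports them and physical memory extends beyond 4 GiB.
 */
static int
sketch_dma64_available(uint64_t physmem_bytes)
{
#ifdef SKETCH_PCI_HAVE_DMA64
	return physmem_bytes > (UINT64_C(4) << 30);
#else
	(void)physmem_bytes;
	return 0;
#endif
}
```

With exactly 4 GiB or less, every physical address fits in 32 bits, so the predicate returns 0 and the ordinary DMA tag suffices.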
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.2.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.2 06-Aug-2004  skrll Fix merge mistakes.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.4.10.1 21-Apr-2005  tron Pull up revision 1.5 (requested by yamt in ticket #174):
make multi inclusion protection macros consistent.
 1.4.4.1 29-Apr-2005  kent sync with -current
 1.6.82.1 02-Jun-2008  mjf Sync with HEAD.
 1.7.6.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.7.4.3 11-Aug-2010  yamt sync with head.
 1.7.4.2 11-Mar-2010  yamt sync with head
 1.7.4.1 04-May-2009  yamt sync with head.
 1.7.2.1 04-Jun-2008  yamt sync with head
 1.8.18.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.8.12.1 24-Oct-2010  jym Sync with HEAD
 1.14.38.2 06-Jun-2015  skrll Sync with HEAD
 1.14.38.1 06-Apr-2015  skrll Sync with HEAD
 1.14.20.1 03-Dec-2017  jdolecek update from HEAD
 1.25 01-Aug-2020  jdolecek move __HAVE_PCI_MSI_MSIX to <x86/pci_machdep_common.h>
 1.24 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.23 11-Jul-2016  knakahara branches: 1.23.28;
pci_intr_type() requires a pci_chipset_tag_t argument on ports other than x86.

pointed out by nonaka@n.o.
 1.22 22-Oct-2015  knakahara add pci_intr_alloc related stubs to reduce ifdef from device drivers.
 1.21 17-Aug-2015  knakahara Add kernel code to support intrctl(8).
 1.20 13-Aug-2015  msaitoh - Don't take pci_attach_args as an argument in pci_msi[x]_count().
- Move prototypes of pci_msi[x]_count() from x86/x86/pci_machdep_common to
sys/dev/pci/pcivar.h.
- Move pci_msi[x]_count() from x86/pci/pci_msi_machdep.c to sys/dev/pci/pci.c
 1.19 21-Jul-2015  knakahara add pci_intr_alloc() API
 1.18 15-May-2015  knakahara pci_msi_string() must be used by MD code only.
 1.17 15-May-2015  knakahara unify INTx, MSI and MSI-X APIs without alloc. (alloc API is under discussion)
 1.16 08-May-2015  knakahara add a const qualifier to struct pci_attach_args *pa argument
 1.15 27-Apr-2015  knakahara add x86 MD MSI/MSI-X support code.
 1.14 27-Apr-2015  knakahara add pci_intr_distribute(9) for x86.
 1.13 29-Mar-2014  christos branches: 1.13.6;
make pci_intr_string and eisa_intr_string take a buffer and a length
instead of relying on local static storage.
 1.12 31-Jul-2013  soren Blocking memory space accesses on the SIS 85C496 chipset turned out to be
a bit too heavy-handed and similar cases are unlikely to crop up again,
so simplify by eliminating pci_bus_flags().

Closes PR port-i386/20410.
 1.11 09-Dec-2012  jakllsch branches: 1.11.2;
Reflect that this file is now for the x86 ports and not just i386 in comments.
 1.10 09-Dec-2012  jakllsch Remove trailing whitespace on blank lines.
 1.9 15-Jun-2012  yamt branches: 1.9.2;
comment
 1.8 28-Aug-2011  dyoung branches: 1.8.2;
Add some code for grovelling in the PCI configuration space for all
of the memory & I/O space reserved by the PCI BIOS for PCI devices
(including bridges) and recording that information for later use.

The code takes between 13k and 50k (depends on the architecture and,
bizarrely, the kernel configuration) so I am going to move it from
pci_machdep.c into its own module on Monday.
 1.7 01-Aug-2011  drochner add an experimental implementation of PCI MSIs (Message Signaled
Interrupts). Successfully tested with hdaudio and "wpi" wireless
ethernet.
notes:
-There seem to be buggy chips around which announce MSI support
but don't correctly implement it. Thus the final word whether MSIs
can be used should be by the driver.
-Only a single vector is supported. For multiple vectors, the IDT
allocation code would have to be changed. (And we would possibly
run into problems due to the limited number of vectors supported
by the current code.)
-The code is "#if NIOAPIC > 0" because it uses the ioapic_edge
interrupt stubs. These actually don't touch any ioapic, so this
is somewhat a misnomer.
-MSIs can't be identified by a "pin" but only by a cpu/vector
pair. Common intr code doesn't deal well with this yet.
-Drivers need to take care of saving/restoring MSI data in the device's
config space on suspend/resume.
 1.6 04-Apr-2011  dyoung Neither pci_dma64_available(), pci_probe_device(), pci_mapreg_map(9),
pci_find_rom(), pci_intr_map(9), pci_enumerate_bus(), nor the match
predicate passed to pciide_compat_intr_establish() should ever modify
their pci_attach_args argument, so make their pci_attach_args arguments
const and deal with the fallout throughout the kernel.

For the most part, these changes add a 'const' where there was no
'const' before, however, some drivers and MD code used to modify
pci_attach_args. Now those drivers either copy their pci_attach_args
and modify the copy, or refrain from modifying pci_attach_args:

Xen: according to Manuel Bouyer, writing to pci_attach_args in
pci_intr_map() was a leftover from Xen 2. Probably a bug. I
stopped writing it. I have not tested this change.

siside(4): sis_hostbr_match() needlessly wrote to pci_attach_args.
Probably a bug. I use a temporary variable. I have not tested this
change.

slide(4): sl82c105_chip_map() overwrote the caller's pci_attach_args.
Probably a bug. Use a local pci_attach_args. I have not tested
this change.

viaide(4): via_sata_chip_map() and via_sata_chip_map_new() overwrote the
caller's pci_attach_args. Probably a bug. Make a local copy of the
caller's pci_attach_args and modify the copy. I have not tested
this change.

While I'm here, make pci_mapreg_submap() static.

With these changes in place, I have tested the compilation of these
kernels:

alpha GENERIC
amd64 GENERIC XEN3_DOM0
arc GENERIC
atari HADES MILAN-PCIIDE
bebox GENERIC
cats GENERIC
cobalt GENERIC
evbarm-eb NSLU2
evbarm-el ADI_BRH ARMADILLO9 CP3100 GEMINI GEMINI_MASTER GEMINI_SLAVE GUMSTIX
HDL_G IMX31LITE INTEGRATOR IQ31244 IQ80310 IQ80321 IXDP425 IXM1200
KUROBOX_PRO LUBBOCK MARVELL_NAS NAPPI SHEEVAPLUG SMDK2800 TEAMASA_NPWR
TEAMASA_NPWR_FC TS7200 TWINTAIL ZAO425
evbmips-el AP30 DBAU1500 DBAU1550 MALTA MERAKI MTX-1 OMSAL400 RB153 WGT624V3
evbmips64-el XLSATX
evbppc EV64260 MPC8536DS MPC8548CDS OPENBLOCKS200 OPENBLOCKS266
OPENBLOCKS266_OPT P2020RDB PMPPC RB800 WALNUT
hp700 GENERIC
i386 ALL XEN3_DOM0 XEN3_DOMU
ibmnws GENERIC
macppc GENERIC
mvmeppc GENERIC
netwinder GENERIC
ofppc GENERIC
prep GENERIC
sandpoint GENERIC
sgimips GENERIC32_IP2x
sparc GENERIC_SUN4U KRUPS
sparc64 GENERIC

As of Sun Apr 3 15:26:26 CDT 2011, I could not compile these kernels
with or without my patches in place:

### evbmips-el GDIUM

nbmake: nbmake: don't know how to make /home/dyoung/pristine-nbsd/src/sys/arch/mips/mips/softintr.c. Stop

### evbarm-el MPCSA_GENERIC
src/sys/arch/evbarm/conf/MPCSA_GENERIC:318: ds1672rtc*: unknown device `ds1672rtc'

### ia64 GENERIC

/tmp/genassym.28085/assym.c: In function 'f111':
/tmp/genassym.28085/assym.c:67: error: invalid application of 'sizeof' to incomplete type 'struct pcb'
/tmp/genassym.28085/assym.c:76: error: dereferencing pointer to incomplete type

### sgimips GENERIC32_IP3x

crmfb.o: In function `crmfb_attach':
crmfb.c:(.text+0x2304): undefined reference to `ddc_read_edid'
crmfb.c:(.text+0x2304): relocation truncated to fit: R_MIPS_26 against `ddc_read_edid'
crmfb.c:(.text+0x234c): undefined reference to `edid_parse'
crmfb.c:(.text+0x234c): relocation truncated to fit: R_MIPS_26 against `edid_parse'
crmfb.c:(.text+0x2354): undefined reference to `edid_print'
crmfb.c:(.text+0x2354): relocation truncated to fit: R_MIPS_26 against `edid_print'
 1.5 06-Nov-2010  jakllsch branches: 1.5.2;
Unbreak Xen build, while not actually fixing the real problem.
NetBSD/xen doesn't implement disestablishing interrupts yet.
 1.4 06-Nov-2010  jakllsch Implement pciide_machdep_compat_intr_disestablish() to help enable
detachment of compatibility-mapped pciide(4)-family controllers.
 1.3 28-Apr-2010  dyoung branches: 1.3.2; 1.3.4; 1.3.6;
Provide an x86 implementation of pci_chipset_tag_create(9) and
pci_chipset_tag_destroy(9).
 1.2 20-Mar-2010  dyoung Add a prototype for pci_mmio_range_infer() that will infer the
range of memory forwarded by the host chipset to PCI.
 1.1 14-Mar-2010  dyoung branches: 1.1.2;
Add a new member, pc_super, to x86's pci_chipset_tag: pc.pc_super points
to the tag that pc inherits its behavior from. Add code to deal with
pc.pc_super.

Pull identical declarations out of xen/include/pci_machdep.h and
x86/include/pci_machdep.h into x86/include/pci_machdep_common.h.
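
The pc_super inheritance described in this commit can be pictured as a chain walk. The following is only a sketch under assumptions: struct chipset_tag, its pc_conf_read hook, chipset_conf_read() and the two sample hooks are hypothetical stand-ins and not the real pci_chipset_tag layout. Only the pc_super member name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for x86's pci_chipset_tag. */
struct chipset_tag {
	const struct chipset_tag *pc_super;	/* tag we inherit from, or NULL */
	int (*pc_conf_read)(int reg);		/* NULL means "not overridden" */
};

/* Sample hooks, for illustration only. */
static int base_read(int reg)  { return reg; }
static int child_read(int reg) { return reg + 100; }

/*
 * Walk the pc_super chain until some tag supplies an override;
 * fall back to the default behaviour when none does.
 */
static int
chipset_conf_read(const struct chipset_tag *pc, int reg, int (*fallback)(int))
{
	assert(fallback != NULL);
	for (; pc != NULL; pc = pc->pc_super) {
		if (pc->pc_conf_read != NULL)
			return pc->pc_conf_read(reg);
	}
	return fallback(reg);
}
```

A tag that overrides nothing inherits the nearest override found through pc_super, and a chain with no overrides at all ends up in the fallback.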
 1.1.2.3 21-Apr-2011  rmind sync with head
 1.1.2.2 05-Mar-2011  rmind sync with head
 1.1.2.1 30-May-2010  rmind sync with head
 1.3.6.5 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.3.6.4 02-May-2011  jym Sync with head.
 1.3.6.3 10-Jan-2011  jym Sync with HEAD
 1.3.6.2 24-Oct-2010  jym Sync with HEAD
 1.3.6.1 28-Apr-2010  jym file pci_machdep_common.h was added on branch jym-xensuspend on 2010-10-24 22:48:16 +0000
 1.3.4.2 11-Aug-2010  yamt sync with head.
 1.3.4.1 28-Apr-2010  yamt file pci_machdep_common.h was added on branch yamt-nfs-mp on 2010-08-11 22:52:55 +0000
 1.3.2.3 09-Nov-2010  uebayasi Sync with HEAD.
 1.3.2.2 30-Apr-2010  uebayasi Sync with HEAD.
 1.3.2.1 28-Apr-2010  uebayasi file pci_machdep_common.h was added on branch uebayasi-xip on 2010-04-30 14:39:57 +0000
 1.5.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.8.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.8.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.8.2.1 30-Oct-2012  yamt sync with head
 1.9.2.3 03-Dec-2017  jdolecek update from HEAD
 1.9.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.9.2.1 25-Feb-2013  tls resync with head
 1.11.2.2 18-May-2014  rmind sync with head
 1.11.2.1 28-Aug-2013  rmind sync with head
 1.13.6.4 05-Oct-2016  skrll Sync with HEAD
 1.13.6.3 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.13.6.2 22-Sep-2015  skrll Sync with HEAD
 1.13.6.1 06-Jun-2015  skrll Sync with HEAD
 1.23.28.1 16-Apr-2020  bouyer More #ifndef XEN -> #ifndef XENPV
 1.10 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.9 04-Nov-2017  cherry branches: 1.9.14;
Add a PIC_XEN abstraction to evtchn.c

This allows us to bring the XEN interrupt code closer to unification with x86/intr.c
 1.8 27-Apr-2015  knakahara add x86 MD MSI/MSI-X support code.
 1.7 19-Apr-2009  ad branches: 1.7.22; 1.7.40;
cpuctl:

- Add interrupt shielding (direct hardware interrupts away from the
specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
a bug where delivery is not accepted by an LAPIC after redistribution. It
also needs re-balancing to make things fair after interrupts are turned
back on for a CPU.
 1.6 02-Apr-2009  dyoung During shutdown, detach devices in an orderly fashion.

Call the detach routine for every device in the device tree, starting
with the leaves and moving toward the root, expecting that each
(pseudo-)device driver will use the opportunity to gracefully commit
outstanding transactions to the underlying (pseudo-)device and to
relinquish control of the hardware to the system BIOS.

Detaching devices is not suitable for every shutdown: in an emergency,
or if the system state is inconsistent, we should resort to a fast,
simple shutdown that uses only the pmf(9) shutdown hooks and the
(deprecated) shutdownhooks. For now, if the flag RB_NOSYNC is set in
boothowto, opt for the fast, simple shutdown.

Add a device flag, DVF_DETACH_SHUTDOWN, that indicates by its presence
that it is safe to detach a device during shutdown. Introduce macros
CFATTACH_DECL3() and CFATTACH_DECL3_NEW() for creating autoconf
attachments with default device flags. Add DVF_DETACH_SHUTDOWN
to configuration attachments for atabus(4), atw(4) at cardbus(4),
cardbus(4), cardslot(4), com(4) at isa(4), elanpar(4), elanpex(4),
elansc(4), gpio(4), npx(4) at isa(4), nsphyter(4), pci(4), pcib(4),
pcmcia(4), ppb(4), sip(4), wd(4), and wdc(4) at isa(4).

Add a device-detachment "reason" flag, DETACH_SHUTDOWN, that tells the
autoconf code and a device driver that the reason for detachment is
system shutdown.

Add a sysctl, kern.detachall, that tells the system to try to detach
every device at shutdown, regardless of any device's DVF_DETACH_SHUTDOWN
flag. The default for kern.detachall is 0. SET IT TO 1, PLEASE, TO
HELP TEST AND DEBUG DEVICE DETACHMENT AT SHUTDOWN.
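
The shutdown-time decision described so far might be sketched as a single predicate. This is an illustration under assumptions: the function name, the EX_-prefixed constants and their values are hypothetical, and the order in which RB_NOSYNC and kern.detachall are checked is a guess, since the message does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real constants live in kernel headers. */
#define EX_RB_NOSYNC		0x004
#define EX_DVF_DETACH_SHUTDOWN	0x080

/*
 * Decide whether a device should be detached during shutdown.
 * boothowto: boot/shutdown flags; dev_flags: the device's DVF_* flags;
 * detachall: the value of the kern.detachall sysctl.
 */
static bool
should_detach_at_shutdown(int boothowto, int dev_flags, int detachall)
{
	if (boothowto & EX_RB_NOSYNC)
		return false;	/* fast, simple shutdown: hooks only */
	if (detachall != 0)
		return true;	/* try every device, whatever its flags */
	return (dev_flags & EX_DVF_DETACH_SHUTDOWN) != 0;
}
```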

This is a work in progress. In future work, I aim to treat
pseudo-devices more thoroughly, and to gracefully tear down a stack of
(pseudo-)disk drivers and filesystems, including cgd(4), vnd(4), and
raid(4) instances at shutdown.

Also commit some changes that are not easily untangled from the rest:

(1) begin to simplify device_t locking: rename struct pmf_private to
device_lock, and incorporate device_lock into struct device.

(2) #include <sys/device.h> in sys/pmf.h in order to get some
definitions that it needs. Stop unnecessarily #including <sys/device.h>
in sys/arch/x86/include/pic.h to keep the amd64, xen, and i386 releases
building.
 1.5 03-Jul-2008  drochner branches: 1.5.4; 1.5.10;
Remove "struct device" from "struct pic", where it was only real
for ioapics and faked up for others. Add it to "struct ioapic_softc"
for now, until device/softc get split.
This required all typecasts between "struct pic" and "struct ioapic_softc"
to be replaced, I hope I got them all.
functionally tested on i386, compile-tested on xen, untested on amd64
 1.4 04-Jan-2008  ad branches: 1.4.6; 1.4.10; 1.4.12; 1.4.14;
Don't pull in sys/simplelock.h, it's not needed.
 1.3 12-Mar-2007  ad branches: 1.3.18; 1.3.24; 1.3.30;
Include sys/simplelock.h, not lock.h.
 1.2 04-Jul-2006  christos branches: 1.2.10; 1.2.14;
Apply fvdl's acpi pci interrupt configuration code.
- MPACPI is no more.
- MPACPI_SCANPCI -> ACPI_SCANPCI
 1.1 26-Feb-2003  fvdl branches: 1.1.18; 1.1.32; 1.1.36; 1.1.44;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.44.1 13-Jul-2006  gdamore Merge from HEAD.
 1.1.36.1 11-Aug-2006  yamt sync with head
 1.1.32.1 09-Sep-2006  rpaulo sync with head
 1.1.18.3 21-Jan-2008  yamt sync with head
 1.1.18.2 03-Sep-2007  yamt sync with head.
 1.1.18.1 30-Dec-2006  yamt sync with head.
 1.2.14.1 13-Mar-2007  ad Sync with head.
 1.2.10.1 24-Mar-2007  yamt sync with head.
 1.3.30.1 08-Jan-2008  bouyer Sync with HEAD
 1.3.24.1 18-Feb-2008  mjf Sync with HEAD.
 1.3.18.1 09-Jan-2008  matt sync with HEAD
 1.4.14.1 03-Jul-2008  simonb Sync with head.
 1.4.12.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.4.10.1 04-May-2009  yamt sync with head.
 1.4.6.1 28-Sep-2008  mjf Sync with HEAD.
 1.5.10.2 01-Nov-2009  jym Sync with HEAD.
 1.5.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.5.4.1 28-Apr-2009  skrll Sync with HEAD.
 1.7.40.1 06-Jun-2015  skrll Sync with HEAD
 1.7.22.1 03-Dec-2017  jdolecek update from HEAD
 1.9.14.1 19-Apr-2020  bouyer Add per-PIC callbacks for interrupt_get_devname(), interrupt_get_assigned()
and interrupt_get_count(). Implement Xen-specific callbacks for
PIC_XEN and use the x86 one for others.
In event_set_handler(), call intr_allocate_io_intrsource() so that
events appear in the interrupt list (intrctl list).
 1.10 15-Nov-2019  maxv Remove the ins* and outs* functions. Not sanitizer-friendly, and unused
anyway.
 1.9 22-May-2011  christos branches: 1.9.56;
remove _
 1.8 28-Apr-2008  martin branches: 1.8.14; 1.8.22; 1.8.28;
Remove clause 3 and 4 from TNF licenses
 1.7 17-Oct-2007  garbled branches: 1.7.16; 1.7.18; 1.7.20;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.6 26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.5 16-Feb-2006  perry branches: 1.5.24; 1.5.32; 1.5.42; 1.5.44; 1.5.46;
Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
 1.4 03-Feb-2006  bouyer branches: 1.4.2;
Change repne to rep for {ins,outs}{b,s,l} as proposed
to port-amd64, port-i386 and port-xen 2 weeks ago. Under Xen-3, a repne won't
loop (only the first value is read/written) while rep works as expected.
Linux and FreeBSD use rep, and documentation suggests that repne should
not be used with ins and outs instructions.
See http://mail-index.netbsd.org/port-xen/2006/01/22/0000.html for
details.
 1.3 24-Dec-2005  perry branches: 1.3.2; 1.3.4;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.2 27-Feb-2003  fvdl branches: 1.2.18;
Reinstate some const qualifiers I accidentally removed when moving this
file.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.2.18.2 27-Oct-2007  yamt sync with head.
 1.2.18.1 21-Jun-2006  yamt sync with head.
 1.3.4.1 09-Sep-2006  rpaulo sync with head
 1.3.2.1 18-Feb-2006  yamt sync with head.
 1.4.2.1 22-Apr-2006  simonb Sync with head.
 1.5.46.1 06-Oct-2007  yamt sync with head.
 1.5.44.1 06-Nov-2007  matt sync with HEAD
 1.5.42.1 02-Oct-2007  joerg Sync with HEAD.
 1.5.32.1 03-Oct-2007  garbled Sync with HEAD
 1.5.24.1 09-Oct-2007  ad Sync with head.
 1.7.20.1 16-May-2008  yamt sync with head.
 1.7.18.1 18-May-2008  yamt sync with head.
 1.7.16.1 02-Jun-2008  mjf Sync with HEAD.
 1.8.28.1 06-Jun-2011  jruoho Sync with HEAD.
 1.8.22.1 31-May-2011  rmind sync with head
 1.8.14.1 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.9.56.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.134 20-Aug-2022  riastradh x86: Move definition of struct pmap to pmap_private.h.

This makes pmap_resident_count and pmap_wired_count out-of-line
functions instead of inline. No functional change intended
otherwise.
 1.133 20-Aug-2022  riastradh x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9)
API, which loads of things in the kernel rely on, so changing x86
pmap internals no longer requires recompiling the entire kernel every
time.

Callers needing these internals must now use machine/pmap_private.h.
Note: This is not x86/pmap_private.h because it contains three parts:

1. CPU-specific (different for i386/amd64) definitions used by...

2. common definitions, including Xenisms like xpmap_ptetomach,
further used by...

3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h
for 2, and then defines 3. Maybe we should split that out into a new
pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must
include machine/pmap_private.h when previously uvm/uvm_pmap.h
polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h --
specifically the parts that are needed for several constants defined
in vmparam.h:

VM_MAXUSER_ADDRESS
VM_MAX_ADDRESS
VM_MAX_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64
too, just to keep things parallel.
 1.132 20-Aug-2022  riastradh x86: Move pl*_i, pl_i_roundup, and ptp_va2o out of x86/pmap.h.

- pl[1-4]_i -> x86/pte.h
- pl_i, pl_i_roundup, ptp_va2o -> x86/pmap.c
 1.131 20-Aug-2022  riastradh x86: Move struct vm_page_md to common x86/pmap.h.
 1.130 20-Aug-2022  riastradh x86: Split bootspace out of x86/pmap.h into new x86/bootspace.h.
 1.129 20-Aug-2022  riastradh x86: Move page attribute table bits to x86/pat.h.
 1.128 18-Jun-2022  andvar fix typos in word "functions" in comments, mainly s/fuctions/functions/.
 1.127 30-Apr-2021  christos Merge the x86 gdt function and constant definitions
 1.126 30-Apr-2021  christos Bump MAX_USERLDT_SIZE to the max size (wastes some memory). wine needs more
than PAGE_SIZE and fails spuriously.
XXX: Note the duplicate definition hacks. Should really create <x86/gdt.h>,
put just the constants there, and unify them.
This would also avoid the hack in: src/tests/lib/libi386/t_user_ldt.c#46
 1.125 19-Jul-2020  maxv branches: 1.125.6;
Revert most of ad's movs/stos change. Instead do something a lot simpler: declare
svs_quad_copy() used by SVS only, with no need for instrumentation, because
SVS is disabled when sanitizers are on.
 1.124 14-Jul-2020  yamaguchi Introduce per-cpu IDTs

This is realized by following modifications:
- Add IDT pages and its allocation maps for each cpu in "struct cpu_info"
- Load per-cpu IDTs at cpu_init_idt(struct cpu_info*)
- Copy the IDT entries for cpu0 to other CPUs at attach
- These are, for example, exceptions, db, system calls, etc.

And, added a kernel option named PCPU_IDT to enable the feature.
 1.123 24-Jun-2020  maxv remove unused x86_stos
 1.122 27-May-2020  ad - Add a couple of wrapper functions around STOS and MOVS and use them to zero
and copy PTEs in preference to memset()/memcpy().

- Remove related SSE / pageidlezero stuff.
 1.121 26-May-2020  bouyer Adjust pmap_enter_ma() for upcoming new Xen privcmd ioctl:
pass flags to xpq_update_foreign()
Introduce a pmap MD flag: PMAP_MD_XEN_NOTR, which causes xpq_update_foreign()
to use the MMU_PT_UPDATE_NO_TRANSLATE flag.
Make xpq_update_foreign() return the raw Xen error. This will cause
pmap_enter_ma() to return a negative error number in this case, but the
only user of this code path is privcmd.c and it can deal with it.

Add pmap_enter_gnt(), which maps a set of Xen grant entries at the
specified va in the specified pmap. Use the hooks implemented for EPT to
keep track of mapped grant entries in the pmap, and unmap them
when pmap_remove() is called. This requires pmap_remove() to be split
into a pmap_remove_locked(), to be called from pmap_remove_gnt().
 1.120 08-May-2020  riastradh Factor randomization out of slotspace_rand.

slotspace_rand becomes deterministic; the randomization moves into
the callers instead. Why?

There are two callers of slotspace_rand:

- x86/pmap.c pmap_bootstrap
- amd64/amd64.c init_slotspace

When the randomization was introduced, it used an x86-only
`cpu_earlyrng' abstraction that would hash rdseed/rdrand and rdtsc
output together. Except init_slotspace ran before cpu_probe, so
cpu_feature was not yet filled out, so during init_slotspace, the
only randomization was rdtsc.

In the course of the recent entropy overhaul, I replaced cpu_earlyrng
by entropy_extract, and moved cpu_init_rng much earlier -- but still
after cpu_probe -- in order to reduce the number of abstractions
lying around and the number of copies of rdrand/rdseed logic. In so
doing I added some annoying complication (see curcpu_available) to
kern_entropy.c to make it work early enough for init_slotspace, and
dropped the rdtsc.

For pmap_bootstrap that didn't substantively change anything. But
for init_slotspace, it removed the only randomization. To mitigate
this, this commit pulls the randomization out of slotspace_rand into
pmap_bootstrap and init_slotspace, so that

(a) init_slotspace can use rdtsc and a little private entropy pool in
order to restore the prior (weak) randomization it had, and

(b) pmap_bootstrap, which runs a little bit later, can continue to
use entropy_extract normally and get rdrand/rdseed too.
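
The design point above (a deterministic placement routine, with the caller supplying the randomness) can be sketched as follows. The helper place_area() and its parameters are hypothetical and much simpler than the real slotspace_rand; the sketch only shows the shape of the split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified placement helper: choose the first slot of an
 * area that needs `nslots` slots inside the window [lo, hi]. It is a pure
 * function of its arguments; the caller decides where randval comes from
 * (for example rdtsc early in boot, or an entropy pool later on).
 */
static size_t
place_area(size_t lo, size_t hi, size_t nslots, uint64_t randval)
{
	size_t choices;

	assert(lo <= hi && nslots >= 1 && nslots <= hi - lo + 1);
	choices = (hi - lo + 1) - nslots + 1;	/* valid starting slots */
	return lo + (size_t)(randval % choices);
}
```

Because the helper no longer draws its own random numbers, an early caller and a later caller can each use whatever entropy source is available to them at that point in boot.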

A subsequent commit will move cpu_init_rng just a wee bit later,
after cpu_init_msrs, so the kern_entropy.c complications can go away.
Perhaps someone else more wizardly with x86 can find a way to make
init_slotspace run a little later too, after cpu_probe and after
cpu_init_msrs and after cpu_rng_init, but I am not that wizardly.
 1.119 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.118 24-Apr-2020  maxv Give the ldt a fixed size of one page (512 slots), and drop the variable-
sized mechanism that was too complex.

This fixes a race between USER_LDT and SVS: during context switches, the
way SVS installs the new ldt relies on the ldt pointer AND the ldt size,
but both cannot be accessed atomically at the same time.
 1.117 05-Apr-2020  ad branches: 1.117.2;
Allocate PV entries in PAGE_SIZE chunks, and cache partially allocated PV
pages with the pmap. Worth about 2-3% sys time on build.sh for me.
 1.116 22-Mar-2020  ad x86 pmap:

- Give pmap_remove_all() its own version of pmap_remove_ptes() that on native
x86 does the bare minimum needed to clear out PTPs. Cuts ~4% sys time on
'build.sh release' for me.

- pmap_sync_pv(): there's no need to issue a redundant TLB shootdown. The
caller waits for the competing operation to finish.

- Bring 'options TLBSTATS' up to date.
 1.115 17-Mar-2020  ad Hallelujah, the bug has been found. Resurrect prior changes, to be fixed
with following commit.
 1.114 17-Mar-2020  ad Back out the recent pmap changes until I can figure out what is going on
with pmap_page_remove() (to pmap.c rev 1.365).
 1.113 14-Mar-2020  ad PR kern/55071 (Panic shortly after running X11 due to kernel diagnostic assertion "mutex_owned(&pp->pp_lock)")

- Fix a locking bug in pmap_pp_clear_attrs() and in pmap_pp_remove() do the
TLB shootdown while still holding the target pmap's lock.

Also:

- Finish PV list locking for x86 & update comments around same.

- Keep track of the min/max index of PTEs inserted into each PTP, and use
that to clip ranges of VAs passed to pmap_remove_ptes().

- Based on the above, implement a pmap_remove_all() for x86 that clears out
the pmap in a single pass. Makes exit() / fork() much cheaper.
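
The min/max tracking mentioned in this commit might look roughly like the sketch below. The ptp_track structure and helper names are hypothetical; the real pmap keeps this state elsewhere. The idea is only that a removal scan need not visit slots outside the range of indexes ever entered.

```c
#include <assert.h>
#include <stdbool.h>

#define PTES_PER_PTP 512	/* entries in one page-table page */

/* Hypothetical per-PTP record of which PTE indexes were ever entered. */
struct ptp_track {
	unsigned lo;	/* lowest index entered */
	unsigned hi;	/* highest index entered */
};

/* An empty tracker is encoded as lo > hi. */
static void
ptp_track_init(struct ptp_track *t)
{
	t->lo = PTES_PER_PTP;
	t->hi = 0;
}

static void
ptp_track_enter(struct ptp_track *t, unsigned idx)
{
	assert(idx < PTES_PER_PTP);
	if (idx < t->lo)
		t->lo = idx;
	if (idx > t->hi)
		t->hi = idx;
}

/*
 * Clip the half-open index range [*start, *end) to the tracked range.
 * Returns false when nothing is left to scan.
 */
static bool
ptp_track_clip(const struct ptp_track *t, unsigned *start, unsigned *end)
{
	if (t->lo > t->hi)
		return false;
	if (*start < t->lo)
		*start = t->lo;
	if (*end > t->hi + 1)
		*end = t->hi + 1;
	return *start < *end;
}
```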
 1.112 14-Mar-2020  ad pmap_remove_all(): Return a boolean value to indicate the behaviour. If
true, all mappings have been removed, the pmap is totally cleared out, and
UVM can then avoid doing the work to call pmap_remove() for each map entry.
If false, either nothing has been done, or some helpful arch-specific voodoo
has taken place.
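
A caller could use the boolean result roughly as below. This is a sketch with hypothetical names (struct fake_pmap, fake_pmap_remove_all(), fake_pmap_remove(), teardown()); it is not UVM's actual code, and it models each map entry as a single mapping for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pmap; not the kernel's struct pmap. */
struct fake_pmap {
	int  nmapped;		/* mappings still present */
	bool can_clear_all;	/* whether the MD code supports a full clear */
};

/* Returns true only when every mapping has been removed. */
static bool
fake_pmap_remove_all(struct fake_pmap *pm)
{
	if (!pm->can_clear_all)
		return false;
	pm->nmapped = 0;
	return true;
}

static void
fake_pmap_remove(struct fake_pmap *pm)
{
	assert(pm->nmapped > 0);
	pm->nmapped--;
}

/* Returns the number of per-entry removals that were still needed. */
static int
teardown(struct fake_pmap *pm, int nentries)
{
	int calls = 0;

	if (fake_pmap_remove_all(pm))
		return 0;	/* pmap already empty: skip the per-entry work */
	for (int i = 0; i < nentries; i++) {
		fake_pmap_remove(pm);
		calls++;
	}
	return calls;
}
```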
 1.111 10-Mar-2020  ad - pmap_check_inuse() is expensive so make it DEBUG not DIAGNOSTIC.

- Put PV locking back in place with only a minor performance impact.
pmap_enter() still needs more work - it's not easy to satisfy all the
competing requirements so I'll do that with another change.

- Use pmap_find_ptp() (lookup only) in preference to pmap_get_ptp() (alloc).
Make pm_ptphint indexed by VA not PA. Replace the per-pmap radixtree for
dynamic PV entries with a per-PTP rbtree. Cuts system time during kernel
build by ~10% for me.
 1.110 23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.109 12-Jan-2020  ad x86 pmap:

- It turns out that every page the pmap frees is necessarily zeroed. Tell
the VM system about this and use the pmap as a source of pre-zeroed pages.

- Redo deferred freeing of PTPs more elegantly, including the integration with
pmap_remove_all(). This fixes problems with nvmm, and possibly also a crash
discovered during fuzzing.

Reported-by: syzbot+a97186518c84f1d85c0c@syzkaller.appspotmail.com
 1.108 04-Jan-2020  ad branches: 1.108.2;
x86 pmap improvements, reducing system time during a build by about 15% on
my test machine:

- Replace the global pv_hash with a per-pmap record of dynamically allocated
pv entries. The data structure used for this can be changed easily, and
has no special concurrency requirements. For now go with radixtree.

- Change pmap_pdp_cache back into a pool; cache the page directory with the
pmap, and avoid contention on pmaps_lock by adjusting the global list in
the pool_cache ctor & dtor. Align struct pmap and its lock, and update
some comments.

- Simplify pv_entry lists slightly. Allow both PP_EMBEDDED and dynamically
allocated entries to co-exist on a single page. This adds a pointer to
struct vm_page on x86, but shrinks pv_entry to 32 bytes (which also gets
it nicely aligned).

- More elegantly solve the chicken-and-egg problem introduced into the pmap
with radixtree lookup for pages, where we need PTEs mapped and page
allocations to happen under a single hold of the pmap's lock. While here
undo some cut-n-paste.

- Don't adjust pmap_kernel's stats with atomics, because its mutex is now
held in the places the stats are changed.
 1.107 15-Dec-2019  ad uvm_pagerealloc() can now block because of radixtree manipulation, so defer
freeing PTPs until pmap_unmap_ptes(), where we still have the pmap locked
but can finally tolerate context switches again.

To be revisited soon: pmap_map_ptes() seems broken WRT other pmap load.

Reported-by: syzbot+689fb7dab41abff8e75a@syzkaller.appspotmail.com
Reported-by: syzbot+3e7bbf37d37d451b25d7@syzkaller.appspotmail.com
 1.106 08-Dec-2019  ad Merge x86 pmap changes from yamt-pagecache:

- Deal better with the multi-level pmap object locking kludge.
- Handle uvm_pagealloc() being able to block.
 1.105 14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.
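
The two shadows can be modelled in userland as two arrays that sit beside the tracked memory. The sketch below is a toy model under assumptions: the fixed 64-byte arena and the helper names are hypothetical, and real kMSan relies on compiler-inserted __msan_* calls rather than explicit helpers like these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64

static uint8_t  mem[MEM_SIZE];		/* the "real" memory */
static uint8_t  shad[MEM_SIZE];		/* 1 shadow bit per memory bit; 1 = uninitialized */
static uint32_t orig[MEM_SIZE / 4];	/* one origin cell per 4 bytes of memory */

/* Mark [off, off + len) as uninitialized and record where it came from. */
static void
mark_uninit(size_t off, size_t len, uint32_t origin)
{
	assert(len > 0 && off + len <= MEM_SIZE);
	memset(shad + off, 0xff, len);
	for (size_t i = off / 4; i <= (off + len - 1) / 4; i++)
		orig[i] = origin;
}

/* A store initializes the byte, so its shadow bits are cleared. */
static void
store_byte(size_t off, uint8_t v)
{
	assert(off < MEM_SIZE);
	mem[off] = v;
	shad[off] = 0;
}

/* True if any bit in [off, off + len) is still uninitialized. */
static bool
is_uninit(size_t off, size_t len)
{
	assert(off + len <= MEM_SIZE);
	for (size_t i = 0; i < len; i++)
		if (shad[off + i] != 0)
			return true;
	return false;
}

static uint32_t
origin_of(size_t off)
{
	assert(off < MEM_SIZE);
	return orig[off / 4];
}
```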

The memory consumption of these shadows is substantial, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.
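
Packing a type code and a compressed pointer into one 32-bit cell might be sketched like this. The 4-bit/28-bit split, the enum values and the base-relative compression are assumptions made for illustration; the commit message does not give the actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: top 4 bits = memory type, low 28 bits = offset from a base. */
#define ORIG_TYPE_SHIFT	28
#define ORIG_PTR_MASK	((UINT32_C(1) << ORIG_TYPE_SHIFT) - 1)

enum orig_type { ORIG_STACK = 1, ORIG_POOL = 2 };	/* hypothetical codes */

/* Compress ptr as an offset from base and pack it with the type code. */
static uint32_t
orig_encode(enum orig_type type, uintptr_t ptr, uintptr_t base)
{
	assert(ptr >= base && ptr - base <= ORIG_PTR_MASK);
	return ((uint32_t)type << ORIG_TYPE_SHIFT) | (uint32_t)(ptr - base);
}

static enum orig_type
orig_type_of(uint32_t cell)
{
	return (enum orig_type)(cell >> ORIG_TYPE_SHIFT);
}

static uintptr_t
orig_ptr_of(uint32_t cell, uintptr_t base)
{
	return base + (cell & ORIG_PTR_MASK);
}
```

Since the pointer is stored as an offset, no side table is needed to recover the string or .text address the cell refers to.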

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so
architectures other than amd64 can be supported.
 1.104 13-Nov-2019  maxv Rename:
PP_ATTRS_M -> PP_ATTRS_D
PP_ATTRS_U -> PP_ATTRS_A
For consistency.
 1.103 05-Oct-2019  maxv Switch to the new PTE naming. No binary diff (tested with MKREPRO).
 1.102 07-Aug-2019  maxv Add support for USER_LDT in SVS. This allows us to have both enabled at
the same time.

We allocate an LDT for each CPU in the GDT and map an area for it, in
addition to the default LDT already present. In context switches between
different processes, we choose between the default or the per-cpu LDT
selector: if the user set specific LDT entries, we memcpy them to the
per-cpu LDT and load the per-cpu selector.

Tested by Naveen Narayanan (with Wine on amd64).
 1.101 29-May-2019  maxv branches: 1.101.2;
Add PCID support in SVS. This avoids TLB flushes during kernel<->user
transitions, which greatly reduces the performance penalty introduced by
SVS.

We use two ASIDs, 0 (kern) and 1 (user), and use invpcid to flush pages
in both ASIDs.

The read-only machdep.svs.pcid={0,1} sysctl is added, and indicates whether
SVS+PCID is in use.
 1.100 10-Mar-2019  maxv Two changes:

* Allow large pages to be passed in pmap_pdes_valid, this happens under
DDB when it reads RIP (.text), called via pmap_extract.

* Invert a branch in pmap_extract, so that 'l_cpu' is not touched if we're
dealing with the kernel pmap.

This fixes 'boot -d'.
 1.99 09-Mar-2019  maxv Start replacing the x86 PTE bits.
 1.98 23-Feb-2019  maxv Move PATENTRY into pmap.h, will be used outside.
 1.97 13-Feb-2019  maxv Add the EPT pmap code, used by Intel-VMX.

The idea is that under NVMM, we don't want to implement the hypervisor page
tables manually in NVMM directly, because we want pageable guests; that is,
we want to allow UVM to unmap guest pages when the host comes under
pressure.

Contrary to AMD-SVM, Intel-VMX uses a different set of PTE bits from
native, and this has three important consequences:

- We can't use the native PTE bits, so each time we want to modify the
page tables, we need to know whether we're dealing with a native pmap
or an EPT pmap. This is accomplished with callbacks, that handle
everything PTE-related.

- There is no recursive slot possible, so we can't use pmap_map_ptes().
Rather, we walk down the EPT trees via the direct map, and that's
actually a lot simpler (and probably faster too...).

- The kernel is never mapped in an EPT pmap. An EPT pmap cannot be loaded
on the host. This has two sub-consequences: at creation time we must
zero out all of the top-level PTEs, and at destruction time we force
the page out of the pool cache and into the pool, to ensure that a next
allocation will invoke pmap_pdp_ctor() to create a native pmap and not
recycle some stale EPT entries.

To create an EPT pmap, the caller must invoke pmap_ept_transform() on a
newly-allocated native pmap. And that's about it, from then on the EPT
callbacks will be invoked, and the pmap can be destroyed via the usual
pmap_destroy(). The TLB shootdown callback is not initialized however,
it is the responsibility of the hypervisor (NVMM) to set it.

There are some twisted cases that we need to handle. For example if
pmap_is_referenced() is called on a physical page that is entered both by
a native pmap and by an EPT pmap, we take the Accessed bits from the
two pmaps using different PTE sets in each case, and combine them into a
generic PP_ATTRS_U flag (that does not depend on the pmap type).

Given that the EPT layout is a 4-Level tree with the same address space as
native x86_64, we allow ourselves to use a few native macros in EPT, such
as pmap_pa2pte(), rather than re-defining them with "ept" in the name.

Even though this EPT code is rather complex, it is not too intrusive: just
a few callbacks in a few pmap functions, predicted-false to give priority
to native. So this comes with no messy #ifdef or performance cost.
 1.96 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.95 01-Feb-2019  maxv Add the remaining pmap callbacks, will be used by NVMM-VMX.
 1.94 01-Feb-2019  maxv Change the format of the pp_attrs field: instead of using PTE bits
directly, use abstracted bits that are converted from/to PTE bits when
needed (in pmap_sync_pv).

This allows us to use the same pp_attrs for pmaps that have PTE bits at
different locations.
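
The conversion between PTE bits and abstracted attribute bits can be illustrated as follows. The ATTR_* names and values and the two helper functions are hypothetical. The bit positions are the architectural ones: Accessed and Dirty are bits 5 and 6 in a native x86 PTE, and bits 8 and 9 in an EPT entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstracted attribute bits, independent of the pmap type. */
#define ATTR_DIRTY	0x01
#define ATTR_ACCESSED	0x02

/* Architectural bit positions. */
#define NATIVE_PTE_A	(UINT64_C(1) << 5)
#define NATIVE_PTE_D	(UINT64_C(1) << 6)
#define EPT_PTE_A	(UINT64_C(1) << 8)
#define EPT_PTE_D	(UINT64_C(1) << 9)

/* Convert native PTE bits to abstracted attributes. */
static uint8_t
native_to_attrs(uint64_t pte)
{
	uint8_t attrs = 0;

	if (pte & NATIVE_PTE_A)
		attrs |= ATTR_ACCESSED;
	if (pte & NATIVE_PTE_D)
		attrs |= ATTR_DIRTY;
	return attrs;
}

/* Convert EPT entry bits to the same abstracted attributes. */
static uint8_t
ept_to_attrs(uint64_t pte)
{
	uint8_t attrs = 0;

	if (pte & EPT_PTE_A)
		attrs |= ATTR_ACCESSED;
	if (pte & EPT_PTE_D)
		attrs |= ATTR_DIRTY;
	return attrs;
}
```

Attributes gathered from both kinds of pmap can then be OR-ed together without caring which PTE format they came from.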
 1.93 17-Dec-2018  maxv Add two pmap fields, will be used by NVMM-VMX. Also apply a few cosmetic
changes.
 1.92 06-Dec-2018  maxv Fix inconsistency, these are indexes and not types, no real functional
change.
 1.91 19-Nov-2018  maxv Introduce pl_pi, will be used soon.
 1.90 19-Nov-2018  maxv Rename 'mask' -> 'frame', we will use the real 'mask' soon.
 1.89 07-Nov-2018  maxv Add two pmap fields, will be used by NVMM.
 1.88 29-Aug-2018  maxv clean up a little
 1.87 29-Aug-2018  maxv Remove the constants of the DMAP, they are unused, and move NL4_SLOT_DIRECT
into amd64/.
 1.86 29-Aug-2018  maxv Simplify the ASLR stuff, we don't care about resizable areas now, and it
makes the code more complicated for no good reason.
 1.85 20-Aug-2018  maxv Add support for kASan on amd64. Written by me, with some parts inspired
from Siddharth Muralee's initial work. This feature can detect several
kinds of memory bugs, and it's an excellent feature.

It can be enabled by uncommenting these three lines in GENERIC:

#makeoptions KASAN=1 # Kernel Address Sanitizer
#options KASAN
#no options SVS

The kernel is compiled without SVS, without DMAP and without PCPU area.
A shadow area is created at boot time, and it can cover the upper 128TB
of the address space. This area is populated gradually as we allocate
memory. With this design the memory consumption is kept at its lowest
level.

The compiler calls the __asan_* functions each time a memory access is
done. We verify whether this access is legal by looking at the shadow
area.

We declare our own special memcpy/memset/etc functions, because the
compiler's builtins don't add the __asan_* instrumentation.

Initially all the mappings are marked as valid. During dynamic
allocations, we add a redzone, which we mark as invalid. Any access on
it will trigger a kASan error message. Additionally, the compiler adds
a redzone on global variables, and we mark these redzones as invalid too.
The illegal-access detection works with a 1-byte granularity.
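
A redzone check with 1-byte granularity can be sketched with the shadow encoding commonly used by AddressSanitizer-style tools: one shadow byte per 8 bytes of memory, where 0 means all 8 bytes are valid, k in 1..7 means only the first k bytes are valid, and a negative value means none are. Whether the NetBSD code uses exactly this encoding is an assumption, and the arena, mark_alloc() and access_ok() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE	8
#define ARENA_SIZE	128

/* Zero-initialized, so everything starts out valid. */
static int8_t shadow[ARENA_SIZE / SHADOW_SCALE];

/*
 * Record an allocation of `size` bytes at `off`, followed by `redzone`
 * invalid bytes. Both ends must fall on shadow-granule boundaries.
 */
static void
mark_alloc(size_t off, size_t size, size_t redzone)
{
	size_t end = off + size + redzone;

	assert(off % SHADOW_SCALE == 0 && end % SHADOW_SCALE == 0);
	assert(end <= ARENA_SIZE);
	for (size_t i = off; i < end; i += SHADOW_SCALE) {
		if (i + SHADOW_SCALE <= off + size)
			shadow[i / SHADOW_SCALE] = 0;	/* fully valid */
		else if (i < off + size)
			shadow[i / SHADOW_SCALE] = (int8_t)(off + size - i);
		else
			shadow[i / SHADOW_SCALE] = -1;	/* redzone */
	}
}

/* Check a 1-byte access at `addr` against the shadow. */
static bool
access_ok(size_t addr)
{
	int8_t s;

	assert(addr < ARENA_SIZE);
	s = shadow[addr / SHADOW_SCALE];
	if (s == 0)
		return true;
	if (s < 0)
		return false;
	return (addr % SHADOW_SCALE) < (size_t)s;
}
```

For a 13-byte allocation at offset 16 with an 11-byte redzone, the last valid byte is at offset 28 and the first byte past the allocation, offset 29, is rejected.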

For now, we cover three areas:

- global variables
- kmem_alloc-ated areas
- malloc-ated areas

More will come, but that's a good start.
 1.84 12-Aug-2018  maxv Move the PCPU area from slot 384 to slot 510, to avoid creating too much
fragmentation in the slot space (384 is in the middle of the kernel half
of the VA).
 1.83 12-Aug-2018  maxv Randomize the main memory on Xen, same as native. Tested on amd64-dom0.
 1.82 12-Aug-2018  maxv Add a new area, SLAREA_HYPV, which indicates the slots used by the
hypervisor, in our case Xen.
 1.81 21-Jul-2018  maxv More ASLR. Randomize the location of the direct map at boot time on amd64.
This doesn't need "options KASLR" and works on GENERIC. Will soon be
enabled by default.

The location of the areas is abstracted in a slotspace structure. Ideally
we should always use this structure when touching the L4 slots, instead of
the current cocktail of global variables and constants.

machdep initializes the structure with the default values, and we then
randomize its dmap entry. Ideally machdep should randomize everything at
once, but in the case of the direct map its size is determined a little
later in the boot procedure, so we're forced to randomize its location
later too.
 1.80 20-Jun-2018  maxv branches: 1.80.2;
Add and use bootspace.smodule. Initialize it in locore/prekern to better
hide the specifics from the "upper" layers. This allows for greater
flexibility.
 1.79 19-May-2018  jakllsch remove some remaining uvm_emap(9)-related function prototypes
 1.78 19-May-2018  jdolecek Remove emap support. Unfortunately it never got to a state where it would
be used and usable, due to reliability problems and limited, complicated MD
support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as the direct map or KVA-less I/O),
rather than making those mappings cheaper to do.
 1.77 08-May-2018  maxv Mitigation for the SS bug, CVE-2018-8897. We disabled dbregs a month ago
in -current and -8 so we are not particularly affected anymore.

The #DB handler runs on ist3, if we decide to process the exception we
copy the iret frame on the correct non-ist stack and continue as usual.
 1.76 04-Mar-2018  jdolecek branches: 1.76.2;
drop pmap_update_2pg(), just call pmap_update_pg() separately for each
 1.75 18-Jan-2018  maxv Unmap the kernel heap from the user page tables (SVS).

This implementation is optimized and organized in such a way that we
don't need to copy the kernel stack to a safe place during user<->kernel
transitions. We create two VAs that point to the same physical page; one
will be mapped in userland and is offset in order to contain only the
trapframe, the other is mapped in the kernel and maps the entire stack.

Sent on tech-kern@ a week ago.
 1.74 11-Jan-2018  maxv Add ist0 to pcpu_entry.
 1.73 05-Jan-2018  maxv Add a __HAVE_PCPU_AREA option, enabled by default on native amd64 but not
Xen.

With this option, the CPU structures that must always be present in the
CPU's page tables are moved on L4 slot 384, which means address
0xffffc00000000000.
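
The slot-to-address relation quoted above follows from the amd64 4-level paging layout: each L4 slot covers 2^39 bytes, and bits 48..63 of a canonical address repeat bit 47. A small sketch of that arithmetic (the function and macro names are hypothetical and do not come from the NetBSD headers):

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_L4_SHIFT 39   /* each L4 slot spans 2^39 bytes (512GB) */
#define MODEL_L4_SLOTS 512  /* entries in the top-level page table */

/* Return the canonical base virtual address of an L4 slot. */
uint64_t
model_l4_slot_to_va(unsigned slot)
{
	uint64_t va;

	assert(slot < MODEL_L4_SLOTS);
	va = (uint64_t)slot << MODEL_L4_SHIFT;
	/* Canonical form: copy bit 47 into bits 48..63. */
	if (va & (UINT64_C(1) << 47))
		va |= UINT64_C(0xffff000000000000);
	return va;
}
```

With this model, slot 384 yields 0xffffc00000000000, matching the address given in the commit message.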

A new pcpu_area structure is defined. It contains shared structures (IDT,
LDT), and then an array of pcpu_entry structures, indexed by cpu_index(ci).
Theoretically the LDT should be in the array, but this will be done later.

During the boot procedure, cpu0 calls pmap_init_pcpu, which creates a
page tree that is able to map the pcpu_area structure entirely. cpu0 then
immediately maps the shared structures. Later, every CPU goes through
cpu_pcpuarea_init, which allocates physical pages and kenters the relevant
pcpu_entry to them. Finally, each pointer is replaced to point to pcpuarea.

The point of this change is to make sure that the structures that must
always be present in the page tables have their own L4 slot. Until now
their L4 slot was that of pmap_kernel, and making a distinction between
what must be mapped and what does not need to be was complicated.

Even in the non-speculative-bug case this change makes some sense: there
are several x86 instructions that leak the addresses of the CPU structures,
and putting these structures inside pmap_kernel actually offered a way to
compute the address of the kernel heap - which would have made ASLR on it
plainly useless, had we implemented that.

Note that, for now, pcpuarea does not contain rsp0.

Unfortunately this change adds many #ifdefs, and makes the code harder to
understand. There is also some duplication, but that will be solved later.
 1.72 28-Dec-2017  maxv Use variables in PMAP_DIRECT_*, so that the location of the direct map can
change.
 1.71 11-Nov-2017  maxv Modify the layout of the bootspace structure, in such a way that it can
contain several kernel segments of the same type (eg several .text
segments). Some parts are still a bit messy but will be cleaned up soon.

I cannot compile-test this change on i386, but it seems fine enough.

NOTE: you need to rebuild and reinstall a new prekern after this change.
 1.70 29-Oct-2017  maxv Add a fifth region, called "head". On kaslr kernels it contains the ELF
Header and the ELF Section Headers. On normal kernels it is empty (the
headers are in the "boot" region).

Note: if you're using GENERIC_KASLR, you also need to rebuild the prekern.
 1.69 30-Sep-2017  maxv Add a bootspace structure. It describes the physical and virtual space
layout created by the early kernel bootstrap code. Start using it, and
eliminate several references to KERNBASE and other global symbols. While
here clean up xen-i386, it's really tiring.
 1.68 29-Sep-2017  ozaki-r Fix build

sys/arch/x86/x86/cpu.c:920:20: error: 'pmap_largepages' undeclared (first use in this function)
smp_data.large = (pmap_largepages != 0);
^
 1.67 17-Jun-2017  maxv Actually, use slot 456 instead, so that it fits a cache line.
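
One plausible reading of the cache-line remark, offered as an assumption rather than as the author's stated reasoning: L4 entries are 8 bytes each, so a 64-byte cache line holds 8 of them, and a run of slots starts on a line boundary only if the first slot index is a multiple of 8. The sketch below checks that property; the names and the 64-byte line size are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PDE_SIZE   8   /* bytes per L4 entry on amd64 */
#define MODEL_CACHE_LINE 64  /* assumed cache-line size in bytes */

/* True if the L4 entry for 'slot' starts on a cache-line boundary. */
bool
model_slot_is_line_aligned(unsigned slot)
{
	assert(slot < 512);
	return ((size_t)slot * MODEL_PDE_SIZE) % MODEL_CACHE_LINE == 0;
}
```

Under these assumptions slot 456 is aligned (456 * 8 = 3648 = 57 * 64), while slot 460 is not.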
 1.66 14-Jun-2017  maxv Give the direct map 32 slots (16TB of va). This matches MAXPHYSMEM, in
such a way that the direct map is no longer the limiting factor for high
memory systems.
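
The "32 slots (16TB of va)" figure can be checked directly: one L4 slot covers 2^39 bytes (512GB), so 32 slots cover 32 * 512GB = 16TB. A minimal sketch with hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_L4_SLOT_BYTES (UINT64_C(1) << 39)  /* 512GB per L4 slot */

/* Virtual address space, in bytes, covered by 'nslots' L4 slots. */
uint64_t
model_dmap_va_bytes(unsigned nslots)
{
	assert(nslots <= 512);
	return (uint64_t)nslots * MODEL_L4_SLOT_BYTES;
}
```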
 1.65 14-Jun-2017  maxv Move the direct map from slot 509 to slot 460. We will increase its size
dynamically.
 1.64 23-Mar-2017  maxv branches: 1.64.6;
Remove PG_k completely.
 1.63 05-Mar-2017  maxv Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.

On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.

However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.

Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.

With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
 1.62 11-Feb-2017  maxv Instead of using a global array with per-cpu indexes, embed the tmp VAs
into cpu_info directly. This concerns only {i386, Xen-i386, Xen-amd64},
because amd64 already has a direct map that is way faster than that.

There are two major issues with the global array: maxcpus entries are
allocated while it is unlikely that common i386 machines have so many
cpus, and the base VA of these entries is not cache-line-aligned, which
mostly guarantees cache-line-thrashing each time the VAs are entered.

Now the number of tmp VAs allocated is proportionate to the number of CPUs
attached (which therefore reduces memory consumption), and the base is
properly aligned.

On my 3-core AMD, the number of DC_refills_L2 events triggered when
performing 5x10^6 calls to pmap_zero_page on two dedicated cores is on
average divided by two with this patch.

Discussed on tech-kern a little.
 1.61 08-Nov-2016  christos branches: 1.61.2;
PR/49691: KAMADA Ken'ichi: free deferred ptp mappings if present.
XXX: pullup-7
 1.60 19-Sep-2016  maya move function prototype to x86, so it is available to amd64 too
 1.59 25-Jul-2016  maxv The L1 entry of the first page of the data segment is overwritten for the
LAPIC page, and set as RWX+PG_N. The LAPIC pa is fixed, and its va resides
in the data segment. Because of this error-prone design, the kernel image
map is not linear, and I first thought it was a bug (as I vaguely said in
PR/51148). Using large pages for the data segment is therefore wrong, since
the first page does not actually belong to the data segment (even if its va
is in the range). This bug is not triggered currently, since local_apic is
not large-page-aligned.

We will certainly have to allocate a va dynamically instead of using the
first page of data; but for now, disable large pages on the data segment,
and map the LAPIC as RW.

This is the last x86-specific RWX page.
 1.58 01-Jul-2016  maxv branches: 1.58.2;
Define pmap_pg_nx globally. Will be used soon.
 1.57 11-Nov-2015  skrll Split out the pmap_pv_track stuff for use by others.

Discussed with riastradh@
 1.56 03-Apr-2015  riastradh Implement pmap_pv(9) for x86 for P->V tracking of unmanaged pages.

Proposed on tech-kern with no objections:

https://mail-index.netbsd.org/tech-kern/2015/03/26/msg018561.html
 1.55 17-Oct-2013  christos branches: 1.55.4; 1.55.6;
__USE() unused variables
 1.54 23-Jun-2013  uebayasi branches: 1.54.2;
Remove obsolete comment. OK'ed by rmind@.
 1.53 13-Nov-2012  chs add a pmap_kremove_local() that doesn't do TLB invalidations
on other CPUs. this is only intended for use while writing
kernel crash dumps. remove unused pmap_map().
 1.52 20-Apr-2012  rmind branches: 1.52.2;
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
 1.51 11-Mar-2012  jym Alternate PTEs got killed a few weeks ago. Clean up unused prototypes.
 1.50 17-Feb-2012  bouyer Apply patch proposed in PR port-xen/45975 (this does not solve the exact
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunately, all addresses disassemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.

2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.

To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.

to fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), with the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.

While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
 1.49 04-Dec-2011  chs branches: 1.49.2;
map all of physical memory using large pages.
ported from openbsd years ago by Murray Armfield,
updated for changes since then by me.
 1.48 23-Nov-2011  jym branches: 1.48.2;
No more users of xpmap_update(). Use pmap_pte_*() functions now.
 1.47 23-Nov-2011  jym Move Xen-specific functions to Xen pmap. Requested by cherry@.

Un'ifdef XEN in xen_pmap.c, it is always defined there.
 1.46 20-Nov-2011  jym Expose pmap_pdp_cache publicly to x86/xen pmap. Provide suspend/resume
callbacks for Xen pmap.

Turn static internal callbacks of pmap_pdp_cache.

XXX the implementation of pool_cache_invalidate(9) is still wrong, and
IMHO this needs fixing before -6. See
http://mail-index.netbsd.org/tech-kern/2011/11/18/msg011924.html
 1.45 08-Nov-2011  cherry Expose the PG_k #define pt/pd bit to both xen and "baremetal" x86. This is required because kernel pages are mapped with user permissions in XEN/amd64, where the VM kernel runs in ring3. Since XEN/i386 (including PAE) runs in ring1, supervisor mode is appropriate for these ports. We need to share this since the pmap implementation is still shared. Once the xen implementation is sufficiently independent of the x86 one, this can be made private to xen/include/xenpmap.h
 1.44 06-Nov-2011  cherry [merging from cherry-xenmp] Make the xen MMU op queue locking api private. Implement per-cpu queues.
 1.43 18-Oct-2011  jym branches: 1.43.2;
Make "pmaps" (list of non-kernel pmaps) and "pmaps_lock" externally
visible. Required by pmap MD code that could reside in other
files, notably Xen's pmap.
 1.42 20-Sep-2011  jym Merge jym-xensuspend branch in -current. ok bouyer@.

Goal: save/restore support in NetBSD domUs, for i386, i386 PAE and amd64.

Executive summary:
- split all Xen drivers (xenbus(4), grant tables, xbd(4), xennet(4))
in two parts: suspend and resume, and hook them to pmf(9).
- modify pmap so that Xen hypervisor does not cry out loud in case
it finds "unexpected" recursive memory mappings
- provide a sysctl(7), machdep.xen.suspend, to command suspend from
userland via powerd(8). Note: a suspend can only be handled correctly
when dom0 requested it, so provide a mechanism that will prevent
the kernel from blindly validating user commands

The code is still in experimental state, use at your own risk: restore
can corrupt backend communications rings; this can completely thrash
dom0 as it will loop at a high interrupt level trying to honor
all domU requests.

XXX PAE suspend does not work in amd64 currently, due to (yet again!)
page validation issues with hypervisor. Will fix.

XXX secondary CPUs are not suspended, I will write the handlers
in sync with cherry's Xen MP work.

Tested under i386 and amd64, bear in mind ring corruption though.

No build break expected, GENERICs and XEN* kernels should be fine.
./build.sh distribution still running. In any case: sorry if it does
break for you, contact me directly for reports.
 1.41 13-Aug-2011  cherry Add locking around ops to the hypervisor MMU "queue".
 1.40 13-Jun-2011  tls Fix Xen kernel builds (pmap_is_curpmap can't be static)
 1.39 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.38 07-May-2011  jym branches: 1.38.2;
Do as the comment says, use ilog2(). This gets optimized directly at
compile time, no call to fls() is needed.
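
To illustrate why a base-2 logarithm of a constant needs no run-time call: a simple inline helper such as the one below can typically be folded to a constant by an optimizing compiler when its argument is known at compile time. This is a sketch under that assumption. model_ilog2 is a hypothetical name and is not NetBSD's ilog2() definition.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of log2(x) for x > 0, by counting right shifts. */
static inline unsigned
model_ilog2(uint64_t x)
{
	unsigned n = 0;

	assert(x != 0);
	while (x >>= 1)
		n++;
	return n;
}
```

For example, model_ilog2(4096) evaluates to 12, the shift that corresponds to a 4096-byte page.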
 1.37 25-Apr-2011  yamt comment
 1.36 25-Apr-2011  yamt remove unused ptei
 1.35 11-Feb-2011  jmcneill add bus_space_mmap support for BUS_SPACE_MAP_PREFETCHABLE, ok matt@
 1.34 01-Feb-2011  chuck update license clauses on my code to match the new-style BSD licenses.
remove no-longer-valid wustl email address for me.
based on diff that rmind@ sent me.

no functional change with this commit.
 1.33 24-Jul-2010  jym branches: 1.33.2; 1.33.4;
Welcome PAE inside i386 current.

This patch is inspired by work previously done by Jeremy Morse, ported by me
to -current, merged with the work previously done for port-xen, together with
additional fixes and improvements.

PAE option is disabled by default in GENERIC (but will be enabled in ALL in
the next few days).

In short, PAE switches the CPU to a mode where physical addresses become
36 bits wide (64 GiB). The virtual address space remains at 32 bits (4 GiB).
To cope with the increased size, physical addresses are manipulated as
64-bit variables by the kernel and the MMU.
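
The sizes quoted above are plain powers of two, and they show why a 32-bit type can no longer hold a physical address. A short sketch of the arithmetic (helper names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of bytes addressable with 'bits' address bits. */
uint64_t
model_addr_space_bytes(unsigned bits)
{
	assert(bits >= 1 && bits <= 63);
	return UINT64_C(1) << bits;
}

/* True if a physical address still fits in a 32-bit variable. */
bool
model_pa_fits_in_32(uint64_t pa)
{
	return pa <= UINT32_MAX;
}
```

36 bits give 64 GiB and 32 bits give 4 GiB. Any physical address at or above 4 GiB fails the second check, which is why a 64-bit representation is needed.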

When supported by the CPU, it also allows the use of the NX/XD bit that
provides no-execution right enforcement on a per physical page basis.

Notes:

- reworked locore.S

- introduce cpu_load_pmap(), used to switch pmap for the curcpu. Due to the
different handling of pmap mappings with PAE vs !PAE, Xen vs native, details
are hidden within this function. This helps calling it from assembly,
as some features, like BIOS calls, switch to pmap_kernel before mapping
trampoline code in low memory.

- some changes in bioscall and kvm86_call, to reflect the above.

- the L3 is "pinned" per-CPU, and is only manipulated by a
reduced set of functions within pmap. To track the L3, I added two
elements to struct cpu_info, namely ci_l3_pdirpa (PA of the L3), and
ci_l3_pdir (the L3 VA). Rest of the code considers that it runs "just
like" a normal i386, except that the L2 is 4 pages long (PTP_LEVELS is
still 2).

- similar to the ci_pae_l3_pdir{,pa} variables, amd64's xen_current_user_pgd
becomes an element of cpu_info (slowly paving the way for MP world).

- bootinfo_source struct declaration is modified, to cope with paddr_t size
change with PAE (it is not correct to assume that bs_addr is a paddr_t when
compiled with PAE - it should remain 32 bits). bs_addrs is now a
void * array (in bootloader's code under i386/stand/, the bs_addrs
is a physaddr_t, which is an unsigned long).

- fixes in multiboot code (same reason as bootinfo): paddr_t size
change. I used Elf32_* types, use RELOC() where necessary, and move the
memcpy() functions out of the if/else if (I do not expect sym and str tables
to overlap with ELF).

- 64 bits atomic functions for pmap

- all pmap_pdirpa access are now done through the pmap_pdirpa macro. It
hides the L3/L2 stuff from PAE, as well as the pm_pdirpa change in
struct pmap (it now becomes a PDP_SIZE array, with or without PAE).

- manipulation of recursive mappings ( PDIR_SLOT_{,A}PTEs ) is done via
loops on PDP_SIZE.

See also http://mail-index.netbsd.org/port-i386/2010/07/17/msg002062.html

No objection raised on port-i386@ and port-xen@ for about a week.

XXX kvm(3) will be fixed in another patch to properly handle both PAE and !PAE
kernel dumps (VA => PA macros are slightly different, and need proper 64 bits
PA support in kvm_i386).

XXX Mixing PAE and !PAE modules may lead to unwanted/unexpected results. This
cannot be solved easily, and needs lots of thinking before being declared
safe (paddr_t/bus_addr_t size handling, PD/PT macros abstractions).
 1.32 15-Jul-2010  jym Make the comment about PDPpaddr more thorough.
 1.31 06-Jul-2010  cegger Turn PMAP_NOCACHE into MI flag.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.

hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.

x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.

Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html

No comments on this last version.
 1.30 10-May-2010  dyoung Provide pmap_enter_ma(), pmap_extract_ma(), pmap_kenter_ma() in all x86
kernels, and use them in the bus_space(9) implementation instead of ugly
Xen #ifdef-age. In a non-Xen kernel, the _ma() functions either call or
alias the equivalent _pa() functions.

Reviewed on port-xen@netbsd.org and port-i386@netbsd.org. Passes
rmind@'s and bouyer@'s inspection. Tested on i386 and on Xen DOMU /
DOM0.
 1.29 09-Feb-2010  jym branches: 1.29.2;
Fix typos in comments.
 1.28 11-Nov-2009  cegger branches: 1.28.2;
update comment: we use PMAP_NOCACHE for both pmap_enter and pmap_kenter_pa
 1.27 07-Nov-2009  cegger Add a flags argument to pmap_kenter_pa(9).
Patch shown on tech-kern@ http://mail-index.netbsd.org/tech-kern/2009/11/04/msg006434.html
No objections.
 1.26 19-Jul-2009  rmind pmap_emap_sync: add an argument, and do not perform pmap_load() during
context switch (pmap_destroy() path seems to be unsafe), instead just
perform tlbflush(). Slightly inefficient, but good enough for now.
 1.25 28-Jun-2009  rmind Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.
 1.24 22-Apr-2009  cegger change pmap flags argument from int to u_int.
forgot to commit this.
 1.23 18-Apr-2009  cegger Introduce PMAP_NOCACHE as first PMAP MD bit in x86. Make use of it in pmap_enter().
This saves one extra TLB flush when mapping dma-safe memory.
Presented on tech-kern@, port-i386@ and port-amd64@
ok ad@
 1.22 21-Mar-2009  ad PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.
 1.21 09-Dec-2008  pooka branches: 1.21.2;
Make pmap_kernel() a MI macro for struct pmap *kernel_pmap_ptr,
which is now the "API" provided by the pmap module. pmap_kernel()
remains as the syntactic sugar.

Bonus cosmetics round: move all the pmap_t pointer typedefs into
uvm_pmap.h.

Thanks to Greg Oster for providing cpu muscle for doing test builds.
 1.20 16-Sep-2008  bouyer branches: 1.20.2; 1.20.4;
Implement the arch-dependent p2m frame lists list. This adds support for
'xm dump-core' for NetBSD domUs.
From Jean-Yves Migeon (jean-yves dot migeon at espci dot fr)
 1.19 24-Jun-2008  jmcneill branches: 1.19.2;
Define PMAP_FORK -- this was lost in the vmlocking merge, and is required
by options USER_LDT.
 1.18 05-Jun-2008  ad branches: 1.18.2;
pmap_remove_all() for x86. Also, always defer freeing ptps to pmap_update().
There may be a better way to do this, but for now this is simple and avoids
potential bugs.

Proposed on tech-kern and discussed with chs@.
 1.17 04-Jun-2008  ad Revert unintentional change.
 1.16 04-Jun-2008  ad vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both.
 1.15 02-Jun-2008  ad - Don't bother using sse to copy/zero pages on demand. It turns out not
to be worth it.
- If the machine has sse, re-enable zeroing pages in the idle loop and
use the sse instructions so that we don't blow out the cache.
 1.14 03-May-2008  ad branches: 1.14.2;
Back out previous which was not thought through properly.
 1.13 03-May-2008  ad Implement pmap_remove_all().
 1.12 23-Jan-2008  bouyer branches: 1.12.6; 1.12.8; 1.12.10;
Merge the bouyer-xeni386 branch. This brings in PAE support to NetBSD xeni386
(domU only). PAE support is enabled by 'options PAE', see the new XEN3PAE_DOMU
and INSTALL_XEN3PAE_DOMU kernel config files.

See the comments in arch/i386/include/{pte.h,pmap.h} to see how it works.
In short, we still handle it as a 2-level MMU, with the second level page
directory being 4 pages in size. pmap switching is done by switching the
L2 pages in the L3 entries, instead of loading %cr3. This is almost required
by Xen, which handles the last L2 page (the one mapping 0xc0000000 - 0xffffffff)
in a very special way. But this approach should also work for native PAE
support if ever supported (in fact, the pmap should almost support native
PAE, what's missing is bootstrap code in locore.S).
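
The "second level page directory being 4 pages" layout can be illustrated with index arithmetic. Under PAE each page-directory entry maps 2MB and each directory page holds 512 entries, so one L2 page maps 1GB and four of them cover the 4GB virtual space. The helpers below are a sketch with hypothetical names, not the macros from the i386 headers.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAE_PD_SHIFT   21  /* each PD entry maps 2MB */
#define MODEL_PAE_L2_SHIFT   30  /* each L2 page maps 1GB */
#define MODEL_PAE_PD_ENTRIES 512 /* entries per L2 page */

/* Which of the 4 L2 pages (equivalently, which L3 entry) covers 'va'. */
unsigned
model_pae_l2_page(uint32_t va)
{
	return va >> MODEL_PAE_L2_SHIFT;
}

/* Index of 'va' in the 4-page directory viewed as one flat array. */
unsigned
model_pae_pdir_index(uint32_t va)
{
	unsigned idx = va >> MODEL_PAE_PD_SHIFT;

	assert(idx < 4 * MODEL_PAE_PD_ENTRIES);
	return idx;
}
```

In this model the range 0xc0000000 - 0xffffffff falls entirely in L2 page 3, the last one, which is the page the commit message singles out.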
 1.11 20-Jan-2008  yamt - rewrite P->V tracking.
- use a hash rather than SPLAY trees.
SPLAY tree is a wrong algorithm to use here.
will be revisited if it slows down anything other than
micro-benchmarks.
- optimize the single mapping case (it's a common case) by
embedding an entry into mdpage.
- don't keep a pmap pointer as it can be obtained from ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte.
but it needs some more thoughts as it can increase computational
costs.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.
 1.10 11-Jan-2008  bouyer Merge the bouyer-xeni386 branch to head, at tag bouyer-xeni386-merge1 (the
branch is still active and will see i386PAE support development).
Summary of changes:
- switch xeni386 to the x86/x86/pmap.c, and the xen/x86/x86_xpmap.c
pmap bootstrap.
- merge back most of xen/i386/ to i386/i386
- change the build to reduce diffs between i386 and amd64 in file locations
- remove include files that were identical to the i386/amd64 counterparts,
the build will find them via the xen-ma/machine link.
 1.9 08-Jan-2008  yamt kill unused PMF_USER_RELOAD.
 1.8 02-Jan-2008  yamt g/c pv_page stuffs.
 1.7 25-Dec-2007  perry Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
 1.6 09-Dec-2007  jmcneill branches: 1.6.2;
Merge jmcneill-pm branch.
 1.5 22-Nov-2007  bouyer branches: 1.5.2; 1.5.4;
Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.4 15-Nov-2007  ad Remove support for 80386 level CPUs. PR port-i386/36163.
 1.3 07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.2 18-Oct-2007  yamt branches: 1.2.2; 1.2.4; 1.2.6; 1.2.8; 1.2.10;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.1 08-Oct-2007  yamt branches: 1.1.2; 1.1.4;
file pmap.h was initially added on branch yamt-x86pmap.
 1.1.4.4 18-Nov-2007  bouyer Sync with HEAD
 1.1.4.3 13-Nov-2007  bouyer Sync with HEAD
 1.1.4.2 25-Oct-2007  bouyer Finish sync with HEAD. Especially use the new x86 pmap for xenamd64.
For this:
- rename pmap_pte_set() to pmap_pte_testset()
- make pmap_pte_set() a function or macro for non-atomic PTE write
- define and use pmap_pa2pte()/pmap_pte2pa() to read/write PTE entries
- define pmap_pte_flush() which is a nop in x86 case, and flush the
MMUops queue in the Xen case
 1.1.4.1 25-Oct-2007  bouyer Sync with HEAD.
 1.1.2.3 18-Oct-2007  yamt #ifdef out an unused member for x86_64.
 1.1.2.2 14-Oct-2007  yamt move pl_i_roundup to a header.
 1.1.2.1 08-Oct-2007  yamt merge some parts of x86 pmap.h.
 1.2.10.5 23-Mar-2008  matt sync with HEAD
 1.2.10.4 09-Jan-2008  matt sync with HEAD
 1.2.10.3 08-Nov-2007  matt sync with -HEAD
 1.2.10.2 06-Nov-2007  matt sync with HEAD
 1.2.10.1 18-Oct-2007  matt file pmap.h was added on branch matt-armv6 on 2007-11-06 23:23:38 +0000
 1.2.8.4 18-Feb-2008  mjf Sync with HEAD.
 1.2.8.3 27-Dec-2007  mjf Sync with HEAD.
 1.2.8.2 08-Dec-2007  mjf Sync with HEAD.
 1.2.8.1 19-Nov-2007  mjf Sync with HEAD.
 1.2.6.6 04-Feb-2008  yamt sync with head.
 1.2.6.5 21-Jan-2008  yamt sync with head
 1.2.6.4 07-Dec-2007  yamt sync with head
 1.2.6.3 15-Nov-2007  yamt sync with head.
 1.2.6.2 27-Oct-2007  yamt sync with head.
 1.2.6.1 18-Oct-2007  yamt file pmap.h was added on branch yamt-lazymbuf on 2007-10-27 11:28:56 +0000
 1.2.4.6 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.2.4.5 21-Nov-2007  joerg Sync with HEAD.
 1.2.4.4 11-Nov-2007  joerg Sync with HEAD.
 1.2.4.3 28-Oct-2007  joerg Cosmetic: reduce diff to HEAD.
 1.2.4.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.2.4.1 18-Oct-2007  joerg file pmap.h was added on branch jmcneill-pm on 2007-10-26 15:43:44 +0000
 1.2.2.4 03-Dec-2007  ad Sync with HEAD.
 1.2.2.3 24-Oct-2007  ad Use a pool_cache to allocate pv entries. PR port-i386/37193.
 1.2.2.2 23-Oct-2007  ad Sync with head.
 1.2.2.1 18-Oct-2007  ad file pmap.h was added on branch vmlocking on 2007-10-23 20:36:40 +0000
 1.5.4.1 11-Dec-2007  yamt sync with head.
 1.5.2.1 26-Dec-2007  ad Sync with head.
 1.6.2.7 20-Jan-2008  bouyer Sync with HEAD
 1.6.2.6 17-Jan-2008  bouyer - Fix L2_SLOT_APTE value (not sure how I got this value but it was definitely
wrong)
- Use global variable for the PAE L3 page addresses, so that pmap.c can get it
from the bootstrap code
- Extend the size of our virtual PDP from 3 to 4 pages, so that pmap->pm_pdir[]
is contiguous for the whole VA range. The last page is a shadow of
the kernel's real PDP (L3[3]).
- make pm_pdirpa an array of 4 paddr_t if using PAE. introduce a
pmap_pdirpa macro to get the physical address of a given PD entry.
- fix pmap_map_pte

The kernel now boots single-user. fsck will cause a kernel fault in
pmap_pdes_invalid() on exit.
 1.6.2.5 13-Jan-2008  bouyer Work in progress on xeni386 PAE support:
Make xeni386 build with a 64bit paddr_t. For this vaddr_t vs paddr_t vs
pointers usages had to be clarified.
If 'options PAE' is present in a Xen3 kernel, switch paddr_t, pd_entry_t
and pt_entry_t to 64bits, and add the PAE entry in the __xen_guest ELF section.
 1.6.2.4 10-Jan-2008  bouyer Sync with HEAD
 1.6.2.3 02-Jan-2008  bouyer Sync with HEAD
 1.6.2.2 13-Dec-2007  bouyer - make amd64 XEN3 kernels build again
- pin the pdp pages in the PDP cache constructor, and unpin them in the
destructor. garbage-collect PMF_USER_XPIN.
 1.6.2.1 11-Dec-2007  bouyer Switch i386 to x86/x86/pmap.c
 1.12.10.5 11-Aug-2010  yamt sync with head.
 1.12.10.4 11-Mar-2010  yamt sync with head
 1.12.10.3 19-Aug-2009  yamt sync with head.
 1.12.10.2 18-Jul-2009  yamt sync with head.
 1.12.10.1 04-May-2009  yamt sync with head.
 1.12.8.2 17-Jun-2008  yamt sync with head.
 1.12.8.1 04-Jun-2008  yamt sync with head
 1.12.6.4 17-Jan-2009  mjf Sync with HEAD.
 1.12.6.3 28-Sep-2008  mjf Sync with HEAD.
 1.12.6.2 29-Jun-2008  mjf Sync with HEAD.
 1.12.6.1 05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.14.2.3 24-Sep-2008  wrstuden Merge in changes between wrstuden-revivesa-base-2 and
wrstuden-revivesa-base-3.
 1.14.2.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.14.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.18.2.1 27-Jun-2008  simonb Sync with head.
 1.19.2.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.19.2.1 19-Oct-2008  haad Sync with HEAD.
 1.20.4.1 04-Apr-2009  snj Pull up following revision(s) (requested by ad in ticket #656):
sys/arch/amd64/amd64/gdt.c: revision 1.21 via patch
sys/arch/amd64/amd64/machdep.c: revision 1.129 via patch
sys/arch/i386/i386/gdt.c: revision 1.47 via patch
sys/arch/i386/i386/kvm86.c: revision 1.17 via patch
sys/arch/i386/i386/locore.S: revision 1.85 via patch
sys/arch/i386/i386/machdep.c: revision 1.666 via patch
sys/arch/i386/i386/vector.S: revision 1.45 via patch
sys/arch/i386/include/pcb.h: revision 1.47 via patch
sys/arch/x86/include/pmap.h: revision 1.22 via patch
sys/arch/x86/include/sysarch.h: revision 1.8 via patch
sys/arch/x86/x86/pmap.c: revision 1.80 via patch
sys/arch/x86/x86/sys_machdep.c: revision 1.17 via patch
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.143 via patch
sys/kern/init_main.c: revision 1.384 via patch
PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash
Fix numerous problems:
1. LDT updates are not atomic.
2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.
3. LDTR can be leaked over context switch.
4. GDT slot allocations can race, giving the same LDT slot to two procs.
5. Incomplete interrupt/trap frames can be stacked.
6. In some rare cases segment faults are not handled correctly.
 1.20.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.20.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.21.2.11 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.21.2.10 26-May-2011  jym Pull-up some modifications from -current to my branch.
 1.21.2.9 02-May-2011  jym Sync with head.
 1.21.2.8 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.21.2.7 24-Oct-2010  jym Sync with HEAD
 1.21.2.6 01-Nov-2009  jym - Upgrade suspend/resume code to comply with Xen2 removal.
- Add support for PAE domUs suspend/resume.
- Fix an issue regarding initialization of the xbd ring I/O that could end
badly during resume, with invalid block operations submitted to dom0 backend.

NetBSD supports PAE under x86_32 by considering the L2 page as being
4 pages long instead of 1.

Xen validates the page types during resume. Sadly, the hypervisor handles
alternative recursive mappings (== PG/PD entries pointing to pages other
than self) inadequately, which could lead to incorrect page pinning.

As a result, the important change with this patch is to clear these alternative
mappings during suspend, and reset them back to their former self upon
resume. For PAE, approx. all 4 PDIR_SLOT_PTEs could be considered as
alternative recursive mappings.

See comments in pmap.c for further details.

Now, let the testing and bug hunting begin.
 1.21.2.5 01-Nov-2009  jym Sync with HEAD.
 1.21.2.4 24-Jul-2009  jym - rework the page pinning API, so that now a function is provided for
each level of indirection encountered during virtual memory translations. Update
pmap accordingly. Pinning looks cleaner that way, and it offers the possibility
to pin lower level pages if necessary (NetBSD does not do it currently).

- some fixes and comments to explain how page validation/invalidation take
place during save/restore/migrate under Xen. L2 shadow entries from PAE are now
handled, so basically, suspend/resume works with PAE.

- fixes an issue reported by Christoph (cegger@) for xencons suspend/resume
in dom0.

TODO:

- PAE save/restore is currently limited to single-user only, multi-user
support requires modifications in PAE pmap that should be discussed first. See
the comments about the L2 shadow pages cached in pmap_pdp_cache in this commit.

- grant table bug is still there; do not use the kernels of this branch
to test suspend/resume, unless you want to experience bad crashes in dom0,
and push the big red button.

Now there is light at the end of the tunnel :)

Note: XEN2 kernels will neither build nor work with this branch.
 1.21.2.3 23-Jul-2009  jym Sync with HEAD.
 1.21.2.2 31-May-2009  jym Modifications for the Xen suspend/migrate/resume branch:

- introduce xenbus_device_{suspend,resume}() functions. These are routines
used to suspend/resume MI parts of the Xenbus device interfaces, like updating
frontend/backend devices' paths found in XenStore.

- introduce HYPERVISOR_sysctl(), a hypercall used only by Xentools to obtain
information from hypervisor (listing VMs, printing console, etc.). I use it
to query xenconsole from ddb(), as a last resort in case of a panic() in
dom0 (xm being not available). Currently unused in the branch; could be, if
requested.

- disable the rwlock(9) used to protect code that could use transient MFNs.
It could trigger nasty context switches in places where it should not.

- fix some bugs in the xennet/xbd suspend/resume pmf(9) handlers.

- following XenSource's design, talk_to_otherend() is now called
watch_otherend(), and free_otherend_details() is used by Xenbus device
suspend/resume routines.

- some slight modifications in pmap regarding APDP. Introduce an inline
function (pmap_unmap_apdp_pde()) that clears APDP entry for the current pmap.

- similarly, implement pmap_unmap_all_apdp_pdes() that iterates through all
pmaps and tears down APDP, as Xen does not handle them properly.

TODO/XXX:

- pmap_unmap_apdp_pde() does not handle APDP shadow entry of PAE. It will,
once I figure out how PAE uses it.

- revisit the pmap locking issue regarding transient MFNs. As NetBSD does not
use kernel preemption and MP for Xen, this could be skipped momentarily. See
http://mail-index.netbsd.org/port-xen/2009/04/27/msg004903.html for details.

- fix a bug regarding grant tables which could technically DoS a dom0 if
ridiculously high consumer/producer indexes are passed down in the ring during
a resume.

All in all, once the grant table index issue and APDP PAE are fixed, next step
is to torture test this branch.

Tested under i386 PAE and non-PAE, Xen3 dom0 and domU. amd64 is only compile
tested.
 1.21.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.28.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.28.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.29.2.11 31-May-2011  rmind sync with head
 1.29.2.10 19-May-2011  rmind Implement sharing of vnode_t::v_interlock amongst vnodes:
- Lock is shared amongst UVM objects using uvm_obj_setlock() or getnewvnode().
- Adjust vnode cache to handle unsharing, add VI_LOCKSHARE flag for that.
- Use sharing in tmpfs and layerfs for underlying object.
- Simplify locking in ubc_fault().
- Sprinkle some asserts.

Discussed with ad@.
 1.29.2.9 17-Mar-2011  rmind - Fix tlbflushg() to behave like tlbflush(), if page global extension (PGE)
is not (yet) enabled. This fixes the issue of stale TLB entry, experienced
early on boot, when PGE is not yet set on primary CPU.
- Rewrite i386/amd64 TLB interrupt handlers in C (only stubs are in assembly),
which simplifies and unifies (under x86) code, plus fixes few bugs.
- cpu_attach: remove assignment to cpus_running, as primary CPU might not be
attached first, which causes reset (and thus missed secondary CPUs).
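The tlbflushg() fallback described in the first item above can be sketched roughly as follows. This is only an illustrative model, not the kernel code: the enum and the function pick_global_flush() are hypothetical names, and the only hardware fact assumed is that page global enable is bit 7 of CR4.

```c
#include <stdint.h>

#define CR4_PGE 0x00000080u	/* CR4 bit 7: page global enable */

/* Hypothetical labels for the two flush strategies (illustration only). */
enum flush_kind {
	FLUSH_CR3_RELOAD,	/* plain flush, what tlbflush() would do */
	FLUSH_CR4_TOGGLE	/* clear and restore PGE to drop global entries */
};

/*
 * Choose how a "flush everything, including global entries" request
 * should be carried out for a given CR4 value. If PGE is not (yet)
 * enabled there is no PGE bit to toggle, so fall back to the plain
 * flush.
 */
static enum flush_kind
pick_global_flush(uint32_t cr4)
{
	if ((cr4 & CR4_PGE) == 0)
		return FLUSH_CR3_RELOAD;
	return FLUSH_CR4_TOGGLE;
}
```

Early in boot, before the primary CPU sets PGE, such a check would select the plain flush, which matches the stale-entry scenario the message describes.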
 1.29.2.8 08-Mar-2011  rmind struct pmap_tlb_mailbox: make tm_pending and tm_gen volatile.
 1.29.2.7 05-Mar-2011  rmind sync with head
 1.29.2.6 31-May-2010  rmind - Split off Xen versions of pmap_map_ptes/pmap_unmap_ptes into Xen pmap,
also move pmap_apte_flush() with pmap_unmap_apdp() there.
- Make Xen buildable.
 1.29.2.5 30-May-2010  rmind sync with head
 1.29.2.4 26-May-2010  rmind Split x86 TLB shootdown code into a separate file.
Code part is under TNF license, as per pmap.c 1.105.2.4 revision.
 1.29.2.3 26-Apr-2010  rmind Partly rewrite amd64 TLB shootdown handler for the changes in x86 pmap.
At this point, branch seems to pass preliminary stress tests on amd64.
 1.29.2.2 26-Apr-2010  rmind Apply renovated patch to significantly reduce TLB shootdowns in x86 pmap,
also provide TLBSTATS option to measure and track TLB shootdowns. Details:

http://mail-index.netbsd.org/port-i386/2009/01/11/msg001018.html

Patch from Andrew Doran, proposed on tech-x86 [sic], in January 2009.

XXX: amd64 and xen are not yet done; work in progress.
 1.29.2.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.33.4.2 17-Feb-2011  bouyer Sync with HEAD
 1.33.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.33.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.38.2.3 20-Sep-2011  cherry Remove the "xpq lock", since we have per-cpu mmu queues now. This may need further testing. Also add some preliminary locking around queue-ops in the network backend driver
 1.38.2.2 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.38.2.1 03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap related panics are normal
 1.43.2.6 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.43.2.5 16-Jan-2013  yamt sync with (a bit old) head
 1.43.2.4 23-May-2012  yamt sync with head.
 1.43.2.3 17-Apr-2012  yamt sync with head
 1.43.2.2 18-Nov-2011  yamt share a lock among pmap uobjs
 1.43.2.1 10-Nov-2011  yamt sync with head
 1.48.2.3 29-Apr-2012  mrg sync to latest -current.
 1.48.2.2 05-Apr-2012  mrg sync to latest -current.
 1.48.2.1 18-Feb-2012  mrg merge to -current.
 1.49.2.3 06-Mar-2017  snj Pull up following revision(s) (requested by bouyer in ticket #1441):
sys/arch/x86/x86/pmap.c: revision 1.241 via patch
sys/arch/x86/include/pmap.h: revision 1.63 via patch
Should be PG_k, doesn't change anything.
--
Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.
On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.
However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.
Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.
With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
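The PG_u behaviour described in the message above can be modelled in a few lines. This is a simplified sketch of the rule as the commit message states it, not Xen's actual code: model_xen_adjust_pte() and survives_nonglobal_flush() are hypothetical names, and the bit values assumed are the standard x86 PTE user bit (bit 2) and global bit (bit 8).

```c
#include <stdbool.h>
#include <stdint.h>

#define PG_u 0x004u	/* x86 PTE user/supervisor bit (bit 2) */
#define PG_G 0x100u	/* x86 PTE global bit (bit 8) */

/*
 * Model of the adjustment the message attributes to Xen: a PTE with
 * PG_u is treated as a userland page and tagged global; any other PTE
 * gets PG_u added but stays local (non-global).
 */
static uint64_t
model_xen_adjust_pte(uint64_t pte)
{
	if (pte & PG_u)
		return pte | PG_G;
	return (pte | PG_u) & ~(uint64_t)PG_G;
}

/* Global entries are the ones a non-global TLB flush leaves in place. */
static bool
survives_nonglobal_flush(uint64_t pte)
{
	return (pte & PG_G) != 0;
}
```

Under this model a kernel PTE that already carries PG_u ends up global and so stays in the TLB after the return to userland, which is the window the message describes; dropping PG_u from kernel PTEs keeps them local.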
 1.49.2.2 09-May-2012  riz branches: 1.49.2.2.4; 1.49.2.2.6;
Pull up following revision(s) (requested by rmind in ticket #202):
sys/arch/x86/include/cpuvar.h: revision 1.46
sys/arch/xen/include/xenpmap.h: revision 1.34
sys/arch/i386/include/param.h: revision 1.77
sys/arch/x86/x86/pmap_tlb.c: revision 1.5
sys/arch/x86/x86/pmap_tlb.c: revision 1.6
sys/arch/i386/i386/genassym.cf: revision 1.92
sys/arch/xen/x86/cpu.c: revision 1.91
sys/arch/x86/x86/pmap.c: revision 1.177
sys/arch/xen/x86/xen_pmap.c: revision 1.21
sys/arch/x86/acpi/acpi_wakeup.c: revision 1.31
sys/kern/subr_kcpuset.c: revision 1.5
sys/arch/amd64/include/param.h: revision 1.18
sys/sys/kcpuset.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.26
sys/arch/x86/x86/mtrr_i686.c: revision 1.27
sys/arch/xen/x86/x86_xpmap.c: revision 1.43
sys/arch/x86/x86/cpu.c: revision 1.98
sys/arch/amd64/amd64/mptramp.S: revision 1.14
sys/kern/sys_sched.c: revision 1.42
sys/arch/amd64/amd64/genassym.cf: revision 1.50
sys/arch/i386/i386/mptramp.S: revision 1.24
sys/arch/x86/include/pmap.h: revision 1.52
sys/arch/x86/include/cpu.h: revision 1.50
- Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
- pmap_tlb_shootdown: do not overwrite tp_cpumask with pm_cpus, but merge
like pm_kernel_cpus. Remove unnecessary intersection with kcpuset_running.
Do not reset tp_userpmap if pmap_kernel().
- Remove pmap_tlb_mailbox_t wrapping, which is pointless after recent changes.
- pmap_tlb_invalidate, pmap_tlb_intr: constify for packet structure.
i686_mtrr_init_first: handle the case when there are no variable-size MTRR
registers available (i686_mtrr_vcnt == 0).
 1.49.2.1 22-Feb-2012  riz Pull up following revision(s) (requested by bouyer in ticket #29):
sys/arch/xen/x86/x86_xpmap.c: revision 1.39
sys/arch/xen/include/hypervisor.h: revision 1.37
sys/arch/xen/include/intr.h: revision 1.34
sys/arch/xen/x86/xen_ipi.c: revision 1.10
sys/arch/x86/x86/cpu.c: revision 1.97
sys/arch/x86/include/cpu.h: revision 1.48
sys/uvm/uvm_map.c: revision 1.315
sys/arch/x86/x86/pmap.c: revision 1.165
sys/arch/xen/x86/cpu.c: revision 1.81
sys/arch/x86/x86/pmap.c: revision 1.167
sys/arch/xen/x86/cpu.c: revision 1.82
sys/arch/x86/x86/pmap.c: revision 1.168
sys/arch/xen/x86/xen_pmap.c: revision 1.17
sys/uvm/uvm_km.c: revision 1.122
sys/uvm/uvm_kmguard.c: revision 1.10
sys/arch/x86/include/pmap.h: revision 1.50
Apply patch proposed in PR port-xen/45975 (this does not solve the exact
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunately, all addresses disassemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.
2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.
To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.
To fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), with the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.
While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
When using uvm_km_pgremove_intrsafe() make sure mappings are removed
before returning the pages to the free pool. Otherwise, under Xen,
a page which still has a writable mapping could be allocated for
a PDP by another CPU and the hypervisor would refuse it (this is
PR port-xen/45975).
For this, move the pmap_kremove() calls inside uvm_km_pgremove_intrsafe(),
and do pmap_kremove()/uvm_pagefree() in batches of (at most) 16 entries
(as suggested by Chuck Silvers on tech-kern@, see also
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012727.html and
followups).
Avoid early use of xen_kpm_sync(); locks are not available at this time.
Don't call cpu_init() twice.
Makes LOCKDEBUG kernels boot again
Revert pmap_pte_flush() -> xpq_flush_queue() in previous.
 1.49.2.2.6.1 06-Mar-2017  snj Pull up following revision(s) (requested by bouyer in ticket #1441):
sys/arch/x86/x86/pmap.c: revision 1.241 via patch
sys/arch/x86/include/pmap.h: revision 1.63 via patch
Should be PG_k, doesn't change anything.
--
Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.
On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.
However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.
Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.
With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
 1.49.2.2.4.1 06-Mar-2017  snj Pull up following revision(s) (requested by bouyer in ticket #1441):
sys/arch/x86/x86/pmap.c: revision 1.241 via patch
sys/arch/x86/include/pmap.h: revision 1.63 via patch
Should be PG_k, doesn't change anything.
--
Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.
On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.
However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.
Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.
With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
 1.52.2.3 03-Dec-2017  jdolecek update from HEAD
 1.52.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.52.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.54.2.1 18-May-2014  rmind sync with head
 1.55.6.6 28-Aug-2017  skrll Sync with HEAD
 1.55.6.5 05-Dec-2016  skrll Sync with HEAD
 1.55.6.4 05-Oct-2016  skrll Sync with HEAD
 1.55.6.3 09-Jul-2016  skrll Sync with HEAD
 1.55.6.2 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.55.6.1 06-Apr-2015  skrll Sync with HEAD
 1.55.4.3 06-Mar-2017  snj Pull up following revision(s) (requested by bouyer in ticket #1388):
sys/arch/x86/x86/pmap.c: revision 1.241
Should be PG_k, doesn't change anything.
--
Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.
On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.
However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.
Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.
With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
 1.55.4.2 18-Dec-2016  snj Pull up following revision(s) (requested by riastradh in ticket #1316):
sys/arch/x86/x86/pmap.c: revision 1.223
sys/arch/x86/x86/vm_machdep.c: revision 1.26
sys/arch/x86/include/pmap.h: revision 1.61
PR/49691: KAMADA Ken'ichi: free deferred ptp mappings if present.
XXX: pullup-7
 1.55.4.1 23-Apr-2015  snj branches: 1.55.4.1.2; 1.55.4.1.4;
Pull up following revision(s) (requested by mrg in ticket #718):
sys/arch/x86/include/pmap.h: revision 1.56
sys/arch/x86/x86/pmap.c: revision 1.188
sys/dev/pci/agp_amd64.c: revision 1.8
sys/dev/pci/agp_i810.c: revision 1.118
sys/external/bsd/drm2/dist/drm/i915/i915_dma.c: revision 1.16
sys/external/bsd/drm2/dist/drm/i915/i915_gem.c: revision 1.29
sys/external/bsd/drm2/dist/drm/nouveau/nouveau_agp.c: revision 1.3
sys/external/bsd/drm2/dist/drm/nouveau/nouveau_ttm.c: revision 1.4
sys/external/bsd/drm2/dist/drm/radeon/atombios_crtc.c: revision 1.3
sys/external/bsd/drm2/dist/drm/radeon/radeon_agp.c: revision 1.3
sys/external/bsd/drm2/dist/drm/radeon/radeon_display.c: revision 1.3
sys/external/bsd/drm2/dist/drm/radeon/radeon_legacy_crtc.c: revision 1.2
sys/external/bsd/drm2/dist/drm/radeon/radeon_object.c: revision 1.3
sys/external/bsd/drm2/dist/drm/radeon/radeon_ttm.c: revision 1.7
sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c: revisions 1.7-1.10
sys/external/bsd/drm2/dist/drm/ttm/ttm_bo_util.c: revision 1.5
sys/external/bsd/drm2/i915drm/intelfb.c: revision 1.13
sys/external/bsd/drm2/include/drm/drm_wait_netbsd.h: revisions 1.12, 1.13
sys/external/bsd/drm2/include/linux/mm.h: revision 1.5
sys/external/bsd/drm2/include/linux/pci.h: revisions 1.16, 1.17
sys/external/bsd/drm2/nouveau/nouveaufb.c: revision 1.2
sys/external/bsd/drm2/radeon/radeon_pci.c: revisions 1.8, 1.9
sys/uvm/uvm_init.c: revision 1.46
Hack against the blank console problem:
Leave the CLUT alone on ancient cards. At least this leaves us with a
semi working console (red and blue are flipped). Leave an example of what
seems to be happening but disable it because colors are better than 444 bit
greyscale.
--
Initialize P->V tracking for unmanaged device pages in uvm_init.

Conditional on __HAVE_PMAP_PV_TRACK until we add it to all pmaps.

MI part of pmap_pv(9) change proposed on tech-kern:

https://mail-index.netbsd.org/tech-kern/2015/03/26/msg018561.html
--
Implement pmap_pv(9) for x86 for P->V tracking of unmanaged pages.

Proposed on tech-kern with no objections:

https://mail-index.netbsd.org/tech-kern/2015/03/26/msg018561.html
--
Use pmap_pv(9) to remove mappings of Intel graphics aperture pages.

Proposed on tech-kern with no objections:

https://mail-index.netbsd.org/tech-kern/2015/03/26/msg018561.html

Further background at:

https://mail-index.netbsd.org/tech-kern/2014/07/23/msg017392.html
--
Use pmap_pv(9) to remove mappings of device pages in TTM.

Adapt nouveau and radeon to do pmap_pv_track for their device pages.

Proposed on tech-kern with no objections:

https://mail-index.netbsd.org/tech-kern/2015/03/26/msg018561.html

Further background at:

https://mail-index.netbsd.org/tech-kern/2014/07/23/msg017392.html
--
Fix error branches in agp_amd64.c.

- agp_generic_detach always.
- Free asc if it was allocated. (Found by Brainy, noted by maxv@.)
- Free the GATT if it was allocated.
--
pmf_device_register returns false on failure, not true
--
In DRM_SPIN_WAIT_ON, don't stop after waiting only one tick.

Continue the loop to recheck the condition and count the whole
duration.
--
Don't use the video BIOS memory as an i915 flush page!
--
Don't let anyone else allocate the video BIOS either.
--
Missed a zero: it's 0x100000, not 0x10000.
--
Don't reserve if atomic -- caller must have pre-pinned the buffer.
--
Don't reserve if atomic -- caller must have pre-pinned the buffer.
--
almost add radeondrmkms suspend/resume support. it unfortunately doesn't work.
--
Need the page's uvm object lock to do pmap_page_protect.
--
Use KASSERTMSG to show bad base/offset.
--
KASSERT about page-alignment on initialization too.
--
Don't break when hardclock_ticks wraps around.

Since we now only count time spent in wait, rather than determining
the end time and checking whether we've passed it, timeouts might be
marginally longer in effect. Unlikely to be an issue.
--
Remove broken drm2 vm_mmap stub. Can't possibly have ever worked.
--
apply some of the additional changes from Arto Huusko in PR#49645:
- call pmf_device_deregister on detach.

i've kept the "resume = true" for radeon_resume_kms() call as it
seems to work for me (indeed, code inspection shows it is unused
on netbsd :-)

my old nforce4 box that can resume old drm (or could, last i tried
several years ago) while X and GL apps were running, can at least
survive a resume if X hasn't started. my one attempt so far with
X exited, but having run, did not work.
--
First attempt to make ttm_buffer_object_transfer less bogus.
--
Make sure mem.bus.is_iomem is initialized. PR 49833
 1.55.4.1.4.2 13-Mar-2017  skrll Sync with netbsd-7-1-RELEASE
 1.55.4.1.4.1 18-Jan-2017  skrll Sync with netbsd-5
 1.55.4.1.2.2 06-Mar-2017  snj Pull up following revision(s) (requested by bouyer in ticket #1388):
sys/arch/x86/include/pmap.h: revision 1.63 via patch
sys/arch/x86/x86/pmap.c: revision 1.241 via patch
Should be PG_k, doesn't change anything.
--
Remove PG_u from the kernel pages on Xen. Otherwise there is no privilege
separation between the kernel and userland.
On Xen-amd64, the kernel runs in ring3 just like userland, and the
separation is guaranteed by the hypervisor - each syscall/trap is
intercepted by Xen and sent manually to the kernel. Before that, the
hypervisor modifies the page tables so that the kernel becomes accessible.
Later, when returning to userland, the hypervisor removes the kernel pages
and flushes the TLB.
However, TLB flushes are costly, and in order to reduce the number of pages
flushed Xen marks the userland pages as global, while keeping the kernel
ones as local. This way, when returning to userland, only the kernel pages
get flushed - which makes sense since they are the only ones that got
removed from the mapping.
Xen differentiates the userland pages by looking at their PG_u bit in the
PTE; if a page has this bit then Xen tags it as global, otherwise Xen
manually adds the bit but keeps the page as local. The thing is, since we
set PG_u in the kernel pages, Xen believes our kernel pages are in fact
userland pages, so it marks them as global. Therefore, when returning to
userland, the kernel pages indeed get removed from the page tree, but are
not flushed from the TLB. Which means that they are still accessible.
With this - and depending on the DTLB size - userland has a small window
where it can read/write to the last kernel pages accessed, which is enough
to completely escalate privileges: the sysent structure systematically gets
read when performing a syscall, and chances are that it will still be
cached in the TLB. Userland can then use this to patch a chosen syscall,
make it point to a userland function, retrieve %gs and compute the address
of its credentials, and finally grant itself root privileges.
 1.55.4.1.2.1 18-Dec-2016  snj Pull up following revision(s) (requested by riastradh in ticket #1316):
sys/arch/x86/x86/pmap.c: revision 1.223
sys/arch/x86/x86/vm_machdep.c: revision 1.26
sys/arch/x86/include/pmap.h: revision 1.61
PR/49691: KAMADA Ken'ichi: free deferred ptp mappings if present.
XXX: pullup-7
 1.58.2.5 26-Apr-2017  pgoyette Sync with HEAD
 1.58.2.4 20-Mar-2017  pgoyette Sync with HEAD
 1.58.2.3 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.58.2.2 04-Nov-2016  pgoyette Sync with HEAD
 1.58.2.1 26-Jul-2016  pgoyette Sync with HEAD
 1.61.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.64.6.2 22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.64.6.1 16-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in #635:

sys/arch/amd64/amd64/gdt.c 1.39-1.45 (patch)
sys/arch/amd64/amd64/amd64/machdep.c 1.284,1.287,1.288 (patch)
sys/arch/amd64/amd64/include/param.h 1.23 (patch)
sys/arch/amd64/include/types.h 1.53 (patch)
sys/arch/x86/include/cpu.h 1.87 (patch)
sys/arch/x86/include/pmap.h 1.73,1.74 (patch)
sys/arch/x86/x86/cpu.c 1.142 (patch)
sys/arch/x86/x86/intr.c 1.117 (partial),1.120 (patch)
sys/arch/x86/x86/pmap.c 1.276 (patch)

Initialize ist0 in cpu_init_tss.
Backport __HAVE_PCPU_AREA.
 1.76.2.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.76.2.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.76.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.76.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.76.2.2 25-Jun-2018  pgoyette Sync with HEAD
 1.76.2.1 21-May-2018  pgoyette Sync with HEAD
 1.80.2.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.80.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.80.2.1 10-Jun-2019  christos Sync with HEAD
 1.101.2.1 31-May-2020  martin Pull up following revision(s) (requested by bouyer in ticket #935):

sys/arch/xen/x86/x86_xpmap.c: revision 1.89
sys/arch/x86/include/pmap.h: revision 1.121
sys/arch/xen/xen/privcmd.c: revision 1.58
sys/external/mit/xen-include-public/dist/xen/include/public/memory.h: revision 1.2
sys/arch/xen/include/xenpmap.h: revision 1.44
sys/arch/xen/include/xenio.h: revision 1.12
sys/arch/x86/x86/pmap.c: revision 1.394
(all via patch)

Adjust pmap_enter_ma() for upcoming new Xen privcmd ioctl:
pass flags to xpq_update_foreign()

Introduce a pmap MD flag: PMAP_MD_XEN_NOTR, which causes xpq_update_foreign()
to use the MMU_PT_UPDATE_NO_TRANSLATE flag.
Make xpq_update_foreign() return the raw Xen error. This will cause
pmap_enter_ma() to return a negative error number in this case, but the
only user of this code path is privcmd.c and it can deal with it.

Add pmap_enter_gnt(), which maps a set of Xen grant entries at the
specified va in the specified pmap. Use the hooks implemented for EPT to
keep track of mapped grant entries in the pmap, and unmap them
when pmap_remove() is called. This requires pmap_remove() to be split
into a pmap_remove_locked(), to be called from pmap_remove_gnt().

Implement new ioctl, needed by Xen 4.13:
IOCTL_PRIVCMD_MMAPBATCH_V2
IOCTL_PRIVCMD_MMAP_RESOURCE
IOCTL_GNTDEV_MMAP_GRANT_REF
IOCTL_GNTDEV_ALLOC_GRANT_REF

Always enable declarations needed by privcmd.c
 1.108.2.2 29-Feb-2020  ad Sync with head.
 1.108.2.1 17-Jan-2020  ad Sync with head.
 1.117.2.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.125.6.1 13-May-2021  thorpej Sync with HEAD.
 1.5 04-Oct-2023  ad Eliminate l->l_ncsw and l->l_nivcsw. From memory, I think they were added
before we had per-LWP struct rusage; the same is now tracked there.
 1.4 24-Sep-2022  riastradh x86: Support EFI runtime services.

This creates a special pmap, efi_runtime_pmap, which avoids setting
PTE_U but allows mappings to lie in what would normally be user VM --
this way we don't fall afoul of SMAP/SMEP when executing EFI runtime
services from CPL 0. SVS does not apply to the EFI runtime pmap.

The mechanism is intended to work with either physical addressing or
virtual addressing; currently the bootloader does physical addressing
but in principle it could be modified to do virtual addressing
instead, if it allocated virtual pages, assigned them in the memory
map, and issued RT->SetVirtualAddressMap.

Not sure pmap_activate_sync and pmap_deactivate_sync are correct,
need more review from an x86 wizard.

If this causes fallout, it can be disabled temporarily without
reverting anything by just making efi_runtime_init return immediately
without doing anything, or by removing options EFI_RUNTIME.

amd64-only for now pending type fixes and testing on i386.
 1.3 13-Sep-2022  riastradh x86/pmap.h: Need machine/cpufunc.h for invlpg.
 1.2 20-Aug-2022  riastradh x86: Move definition of struct pmap to pmap_private.h.

This makes pmap_resident_count and pmap_wired_count out-of-line
functions instead of inline. No functional change intended
otherwise.
 1.1 20-Aug-2022  riastradh x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9)
API, which loads of things in the kernel rely on, so changing x86
pmap internals no longer requires recompiling the entire kernel every
time.

Callers needing these internals must now use machine/pmap_private.h.
Note: This is not x86/pmap_private.h because it contains three parts:

1. CPU-specific (different for i386/amd64) definitions used by...

2. common definitions, including Xenisms like xpmap_ptetomach,
further used by...

3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h
for 2, and then defines 3. Maybe we should split that out into a new
pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must
include machine/pmap_private.h when previously uvm/uvm_pmap.h
polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h --
specifically the parts that are needed for several constants defined
in vmparam.h:

VM_MAXUSER_ADDRESS
VM_MAX_ADDRESS
VM_MAX_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64
too, just to keep things parallel.
 1.17 17-Mar-2020  ad Hallelujah, the bug has been found. Resurrect prior changes, to be fixed
with following commit.
 1.16 17-Mar-2020  ad Back out the recent pmap changes until I can figure out what is going on
with pmap_page_remove() (to pmap.c rev 1.365).
 1.15 15-Mar-2020  ad - pmap_enter(): Remove cosmetic differences between the EPT & native cases.
Remove old code to free PVEs that should not be there that caused panics
(merge error moving between source trees on my part).

- pmap_destroy(): pmap_remove_all() doesn't work for EPT yet, so need to catch
up on deferred PTP frees manually in the EPT case.

- pp_embedded: Remove it. It's one more variable to go wrong and another
store to be made. Just check for non-zero PTP pointer & non-zero VA
instead.
 1.14 14-Mar-2020  ad PR kern/55071 (Panic shortly after running X11 due to kernel diagnostic assertion "mutex_owned(&pp->pp_lock)")

- Fix a locking bug in pmap_pp_clear_attrs(), and in pmap_pp_remove() do the
TLB shootdown while still holding the target pmap's lock.

Also:

- Finish PV list locking for x86 & update comments around same.

- Keep track of the min/max index of PTEs inserted into each PTP, and use
that to clip ranges of VAs passed to pmap_remove_ptes().

- Based on the above, implement a pmap_remove_all() for x86 that clears out
the pmap in a single pass. Makes exit() / fork() much cheaper.
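The min/max tracking mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in pmap.c: the structure, the function names and the 512-entry table size are all hypothetical, and the window only ever grows as entries are inserted.

```c
#define DEMO_NPTE_PER_PTP 512u /* assumed number of PTE slots per page table page */

/* Hypothetical per-PTP record of the lowest and highest slot ever filled. */
struct demo_ptp_track {
	unsigned min_idx;
	unsigned max_idx;
};

/* Start with an empty window (min > max means "nothing inserted yet"). */
static void
demo_ptp_track_init(struct demo_ptp_track *t)
{
	t->min_idx = DEMO_NPTE_PER_PTP;
	t->max_idx = 0;
}

/* Widen the window to cover a newly inserted slot. */
static void
demo_ptp_track_insert(struct demo_ptp_track *t, unsigned idx)
{
	if (idx < t->min_idx)
		t->min_idx = idx;
	if (idx > t->max_idx)
		t->max_idx = idx;
}

/*
 * Clip the half-open slot range [*start, *end) to the tracked window and
 * return how many slots remain to be scanned.
 */
static unsigned
demo_ptp_track_clip(const struct demo_ptp_track *t, unsigned *start,
    unsigned *end)
{
	if (t->min_idx > t->max_idx) {
		*end = *start;
		return 0;
	}
	if (*start < t->min_idx)
		*start = t->min_idx;
	if (*end > t->max_idx + 1)
		*end = t->max_idx + 1;
	if (*start >= *end) {
		*end = *start;
		return 0;
	}
	return *end - *start;
}
```

A removal pass over a whole page table page then only visits the slots between the lowest and highest index that were ever populated, instead of all of them.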
 1.13 10-Mar-2020  ad - pmap_check_inuse() is expensive so make it DEBUG not DIAGNOSTIC.

- Put PV locking back in place with only a minor performance impact.
pmap_enter() still needs more work - it's not easy to satisfy all the
competing requirements so I'll do that with another change.

- Use pmap_find_ptp() (lookup only) in preference to pmap_get_ptp() (alloc).
Make pm_ptphint indexed by VA not PA. Replace the per-pmap radixtree for
dynamic PV entries with a per-PTP rbtree. Cuts system time during kernel
build by ~10% for me.
 1.12 23-Feb-2020  ad The PV locking changes are expensive and not needed yet, so back them
out for the moment. I want to find a cheaper approach.
 1.11 23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.10 12-Jan-2020  ad x86 pmap:

- It turns out that every page the pmap frees is necessarily zeroed. Tell
the VM system about this and use the pmap as a source of pre-zeroed pages.

- Redo deferred freeing of PTPs more elegantly, including the integration with
pmap_remove_all(). This fixes problems with nvmm, and possibly also a crash
discovered during fuzzing.

Reported-by: syzbot+a97186518c84f1d85c0c@syzkaller.appspotmail.com
 1.9 04-Jan-2020  ad branches: 1.9.2;
x86 pmap improvements, reducing system time during a build by about 15% on
my test machine:

- Replace the global pv_hash with a per-pmap record of dynamically allocated
pv entries. The data structure used for this can be changed easily, and
has no special concurrency requirements. For now go with radixtree.

- Change pmap_pdp_cache back into a pool; cache the page directory with the
pmap, and avoid contention on pmaps_lock by adjusting the global list in
the pool_cache ctor & dtor. Align struct pmap and its lock, and update
some comments.

- Simplify pv_entry lists slightly. Allow both PP_EMBEDDED and dynamically
allocated entries to co-exist on a single page. This adds a pointer to
struct vm_page on x86, but shrinks pv_entry to 32 bytes (which also gets
it nicely aligned).

- More elegantly solve the chicken-and-egg problem introduced into the pmap
with radixtree lookup for pages, where we need PTEs mapped and page
allocations to happen under a single hold of the pmap's lock. While here
undo some cut-n-paste.

- Don't adjust pmap_kernel's stats with atomics, because its mutex is now
held in the places the stats are changed.
 1.8 02-Jan-2020  ad Back the pv_hash stuff out. Now seeing errors from ATOMIC_*.
For another day.
 1.7 02-Jan-2020  ad Replace the pv_hash_locks with atomic ops.

Leave the hash table at the same size for now: with the hash table size
doubled, system time for a build drops 10-15%, but user time starts to rise
suspiciously, presumably because the cache is wrecked. Need to try another
data structure.
 1.6 13-Nov-2019  maxv Rename:
PP_ATTRS_M -> PP_ATTRS_D
PP_ATTRS_U -> PP_ATTRS_A
For consistency.
 1.5 09-Mar-2019  maxv Start replacing the x86 PTE bits.
 1.4 01-Feb-2019  maxv Change the format of the pp_attrs field: instead of using PTE bits
directly, use abstracted bits that are converted from/to PTE bits when
needed (in pmap_sync_pv).

This allows us to use the same pp_attrs for pmaps that have PTE bits at
different locations.
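The idea of abstracted attribute bits can be illustrated with a short C sketch. Everything named DEMO_* or demo_* here is hypothetical; the bit positions are the accessed/dirty bits of ordinary x86 PTEs and of Intel EPT entries, used only to show two layouts sharing one attribute format.

```c
#include <stdint.h>

/* Accessed/dirty bit positions in a native x86 PTE. */
#define DEMO_X86_PTE_A 0x020ULL
#define DEMO_X86_PTE_D 0x040ULL

/* Accessed/dirty bit positions in an Intel EPT entry. */
#define DEMO_EPT_PTE_A 0x100ULL
#define DEMO_EPT_PTE_D 0x200ULL

/* Layout-independent attribute bits (hypothetical values). */
#define DEMO_ATTR_A 0x01u
#define DEMO_ATTR_D 0x02u

/* Convert native PTE bits to the abstract format. */
static uint8_t
demo_x86_pte_to_attrs(uint64_t pte)
{
	uint8_t attrs = 0;

	if (pte & DEMO_X86_PTE_A)
		attrs |= DEMO_ATTR_A;
	if (pte & DEMO_X86_PTE_D)
		attrs |= DEMO_ATTR_D;
	return attrs;
}

/* Convert EPT entry bits to the same abstract format. */
static uint8_t
demo_ept_pte_to_attrs(uint64_t pte)
{
	uint8_t attrs = 0;

	if (pte & DEMO_EPT_PTE_A)
		attrs |= DEMO_ATTR_A;
	if (pte & DEMO_EPT_PTE_D)
		attrs |= DEMO_ATTR_D;
	return attrs;
}

/* Convert abstract attributes back to native PTE bits. */
static uint64_t
demo_attrs_to_x86_pte(uint8_t attrs)
{
	uint64_t pte = 0;

	if (attrs & DEMO_ATTR_A)
		pte |= DEMO_X86_PTE_A;
	if (attrs & DEMO_ATTR_D)
		pte |= DEMO_X86_PTE_D;
	return pte;
}
```

Because both converters produce the same attribute values, code that consumes the attributes does not need to know which page table format a given mapping uses.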
 1.3 12-Jun-2011  rmind branches: 1.3.54;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.2 28-Jan-2008  yamt branches: 1.2.2; 1.2.10; 1.2.28; 1.2.36; 1.2.46;
save a word in pv_entry by making pv_hash SLIST.

although this can slow down pmap_sync_pv if hash lists get long,
we should keep them short anyway.
 1.1 20-Jan-2008  yamt branches: 1.1.2; 1.1.4;
- rewrite P->V tracking.
- use a hash rather than SPLAY trees.
SPLAY tree is a wrong algorithm to use here.
will be revisited if it slows down anything other than
micro-benchmarks.
- optimize the single mapping case (it's a common case) by
embedding an entry into mdpage.
- don't keep a pmap pointer as it can be obtained from ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte.
but it needs some more thoughts as it can increase computational
costs.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.
 1.1.4.3 04-Feb-2008  yamt sync with head.
 1.1.4.2 21-Jan-2008  yamt sync with head
 1.1.4.1 20-Jan-2008  yamt file pmap_pv.h was added on branch yamt-lazymbuf on 2008-01-21 09:40:09 +0000
 1.1.2.2 20-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 20-Jan-2008  bouyer file pmap_pv.h was added on branch bouyer-xeni386 on 2008-01-20 17:51:26 +0000
 1.2.46.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.2.36.1 25-Apr-2010  rmind Drop per-"MD page" (i.e. struct pmap_page) locking i.e. pp_lock/pp_unlock
and rely on locking provided by upper layer, UVM. Sprinkle asserts.
 1.2.28.1 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.2.10.2 23-Mar-2008  matt sync with HEAD
 1.2.10.1 28-Jan-2008  matt file pmap_pv.h was added on branch matt-armv6 on 2008-03-23 02:04:28 +0000
 1.2.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.2.2.1 28-Jan-2008  mjf file pmap_pv.h was added on branch mjf-devfs on 2008-02-18 21:05:17 +0000
 1.3.54.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.3.54.1 10-Jun-2019  christos Sync with HEAD
 1.9.2.2 29-Feb-2020  ad Sync with head.
 1.9.2.1 17-Jan-2020  ad Sync with head.
 1.13 24-Feb-2011  jruoho Move PowerNow! to the cpufeaturebus.
 1.12 26-Oct-2010  jruoho branches: 1.12.2; 1.12.4;
Remove some unused (ACPI) constants.
 1.11 20-Aug-2010  jruoho Revert all previous changes that were made naively believing that the
existing CPU power management implementations could peacefully coexist with
the acpicpu(4) driver. The following options can not be used with acpicpu(4):
ENHANCED_SPEEDSTEP, INTEL_ONDEMAND_CLOCKMOD, POWERNOW_K7, and POWERNOW_K8.
 1.10 19-Aug-2010  jruoho Add sysctl-glue for interaction with the acpicpu(4).
 1.9 24-Mar-2007  xtraeme branches: 1.9.52; 1.9.58; 1.9.60;
* Remove the WRITE_FIDVID macro from powernow.h and use it in the
powernow_k8 driver (much better than undeffing and write it again).
* Fix the WRITE_FIDVID macro, I changed it to use the third argument
for the bitmask, but it's not correct.

Last change should fix the problem reported by FUKUMOTO Atsushi.
 1.8 18-Mar-2007  xtraeme There's no need to run est_init or k8_powernow_init on each CPU.
Just run it once (in the first cpu probed) with the RUN_ONCE(9)
framework.

Change the argument of est_init and k8_powernow_init to void, we don't
need cpu_info * anymore.

Suggested by tls@ and mrg@.
 1.7 18-Mar-2007  xtraeme Change k8_powernow_init to accept a struct cpu_info * as argument,
so that in the informative messages it prints the correct cpu
and not curcpu().

This fixes the first part of PR kern/35676.
 1.6 04-Oct-2006  cube branches: 1.6.2; 1.6.4; 1.6.6; 1.6.10; 1.6.12; 1.6.14;
Rework the way PowerNow! and Cool'n'Quiet features are detected and
displayed, to make the code much simpler and easier to follow. Also, use
bitmask_printf() to make output consistent with other stuff. Use
CPUID2FAMILY() where appropriate.
 1.5 27-Aug-2006  xtraeme branches: 1.5.2; 1.5.4; 1.5.6;
Update powernow module with POWERNOW_K7 and POWERNOW_K8 support.
Works fine on amd64 cpus running in 32-bit mode.

Tested by Joel Carnat.
 1.4 23-Aug-2006  xtraeme - Move k7_powernow_* prototypes from i386/include/cpu.h to
x86/include/powernow.h
- Protect k[78]_powernow_init() functions with #ifdef POWERNOW_K[78] to
make it build without these options.

This fixes the problem reported by hubertf.
 1.3 08-Aug-2006  cube branches: 1.3.2;
files.x86 isn't included by Xen kernels, so opt_powernow_k8.h never gets
created by config(1), and thus it's not safe to use it in cpuvar.h.

Simply declare the prototype for k8_powernow_init in powernow.h. No need
to #ifdef protect a prototype, after all, only its users.

Un-breaks build of Xen kernels.
 1.2 07-Aug-2006  xtraeme branches: 1.2.2;
* Do not change struct powernow_pst_s (I added another member in my
previous patch) and this MUST be of that size, otherwise the tables
won't be found.

* powernow_k8.c moved into x86/x86, it should work both i386 and amd64.

* Added more DPRINTFs needed to find the first problem.

* Create "machdep.powernow.frequency" again, I can't remember why I
removed frequency... it should work with estd now.

* Do not try to call k[78]_powernow_init() if cpu is not AMD (thanks
to christos).

And more things I can't remember, but this time it will work in
Athlon 64 cpus and it won't crash in EM64T cpus.
 1.1 06-Aug-2006  xtraeme AMD PowerNow!/Cool`n'Quiet driver for NetBSD/amd64,
adapted from OpenBSD.

Tested on a few machines:

http://bigbird.dohd.org:3021/NetBSD/dmesg
http://www.bsd.org.il/netbsd/acpi/dmesg

Thanks to cube, elad and others for testing and fixes.

Enabled by default on GENERIC.
 1.2.2.3 30-Aug-2006  tron Pull up following revision(s) (requested by xtraeme in ticket #74):
sys/lkm/arch/i386/powernow/Makefile: revision 1.3
sys/arch/x86/x86/powernow_k8.c: revision 1.5
sys/arch/x86/include/powernow.h: revision 1.5
sys/lkm/arch/i386/powernow/lkminit_powernow.c: revision 1.6
Update powernow module with POWERNOW_K7 and POWERNOW_K8 support.
Works fine on amd64 cpus running in 32-bit mode.
Tested by Joel Carnat.
 1.2.2.2 27-Aug-2006  tron Pull up following revision(s) (requested by xtraeme in ticket #57):
sys/arch/i386/i386/identcpu.c: revision 1.37
sys/arch/x86/include/powernow.h: revision 1.4
sys/arch/i386/include/cpu.h: revision 1.127
- Move k7_powernow_* prototypes from i386/include/cpu.h to
x86/include/powernow.h
- Protect k[78]_powernow_init() functions with #ifdef POWERNOW_K[78] to
make it build without these options.
This fixes the problem reported by hubertf.
 1.2.2.1 08-Aug-2006  tron Pull up following revision(s) (requested by cube in ticket #7):
sys/arch/x86/include/cpuvar.h: revision 1.5
sys/arch/x86/include/powernow.h: revision 1.3
files.x86 isn't included by Xen kernels, so opt_powernow_k8.h never gets
created by config(1), and thus it's not safe to use it in cpuvar.h.
Simply declare the prototype for k8_powernow_init in powernow.h. No need
to #ifdef protect a prototype, after all, only its users.
Un-breaks build of Xen kernels.
 1.3.2.3 03-Sep-2006  yamt sync with head.
 1.3.2.2 11-Aug-2006  yamt sync with head
 1.3.2.1 08-Aug-2006  yamt file powernow.h was added on branch yamt-pdpolicy on 2006-08-11 15:43:16 +0000
 1.5.6.1 22-Oct-2006  yamt sync with head
 1.5.4.2 09-Sep-2006  rpaulo sync with head
 1.5.4.1 27-Aug-2006  rpaulo file powernow.h was added on branch rpaulo-netinet-merge-pcb on 2006-09-09 02:44:36 +0000
 1.5.2.1 18-Nov-2006  ad Sync with head.
 1.6.14.1 29-Mar-2007  reinoud Pullup to -current
 1.6.12.1 11-Jul-2007  mjf Sync with head.
 1.6.10.1 10-Apr-2007  ad Sync with head.
 1.6.6.2 15-Apr-2007  yamt sync with head.
 1.6.6.1 24-Mar-2007  yamt sync with head.
 1.6.4.3 03-Sep-2007  yamt sync with head.
 1.6.4.2 30-Dec-2006  yamt sync with head.
 1.6.4.1 04-Oct-2006  yamt file powernow.h was added on branch yamt-lazymbuf on 2006-12-30 20:47:22 +0000
 1.6.2.1 20-Apr-2007  bouyer Pull up following revision(s) (requested by mlelstv in ticket #575):
sys/arch/i386/i386/est.c sync with 1.37
sys/arch/i386/i386/ipifuncs.c sync with 1.16
sys/arch/x86/include/cpu_msr.h sync with 1.4
sys/arch/x86/include/intrdefs.h sync with 1.8
sys/arch/x86/include/powernow.h sync with 1.9
sys/arch/x86/x86/powernow_k8.c sync with 1.20
sys/arch/x86/x86/msr_ipifuncs.c sync with 1.8
sys/arch/amd64/amd64/ipifuncs.c sync with 1.9
sys/arch/i386/i386/identcpu.c patch
sys/arch/i386/i386/machdep.c patch
sys/arch/i386/include/cpu.h patch
sys/arch/x86/conf/files.x86 patch
sys/arch/x86/x86/x86_machdep.c patch
sys/arch/amd64/amd64/machdep.c patch
Add MSR write IPI handler for x86. Use it and the RUN_ONCE framework
to make est and powernow drivers work properly with SMP.
 1.9.60.1 05-Mar-2011  rmind sync with head
 1.9.58.1 06-Nov-2010  uebayasi Sync with HEAD.
 1.9.52.3 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.9.52.2 10-Jan-2011  jym Sync with HEAD
 1.9.52.1 24-Oct-2010  jym Sync with HEAD
 1.12.4.1 05-Mar-2011  bouyer Sync with HEAD
 1.12.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.10 12-Aug-2017  maxv Don't include opt_vm86.h.
 1.9 12-Aug-2017  maxv Remove vm86.

Pass 3.
 1.8 04-Oct-2012  dsl branches: 1.8.14;
Remove references to VM86 from the amd64 kernel configs.
VM86 mode isn't supported while in long mode.
 1.7 20-Apr-2012  jym branches: 1.7.2;
PSL_AC is user-settable.
 1.6 18-Sep-2008  dsl branches: 1.6.28; 1.6.32; 1.6.34;
Remove PSL_MBO (the bits that Must Be One) from PSL_USER - which are the
bits that the 'user' can change.
Who knows what the effect of a user signal handler (which I think might have
access to the bits) changing these bits might be!
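Why the must-be-one bit does not belong in a user-changeable mask can be shown with a small C sketch. The flag values are the architectural EFLAGS bit positions, but the mask contents and the merge function are assumptions made for illustration; the kernel's own PSL_USER definition and its context-restore path may differ (for example by rejecting a bad value rather than merging).

```c
#include <stdint.h>

/* Architectural EFLAGS bits, prefixed DEMO_ to mark them as illustration. */
#define DEMO_PSL_C   0x00000001u /* carry */
#define DEMO_PSL_MBO 0x00000002u /* reserved, must be one */
#define DEMO_PSL_T   0x00000100u /* trap (single step) */
#define DEMO_PSL_I   0x00000200u /* interrupt enable */
#define DEMO_PSL_D   0x00000400u /* direction */
#define DEMO_PSL_AC  0x00040000u /* alignment check */

/*
 * Hypothetical mask of bits a user context is allowed to change.
 * DEMO_PSL_MBO and DEMO_PSL_I are deliberately left out.
 */
#define DEMO_PSL_USER \
	(DEMO_PSL_C | DEMO_PSL_T | DEMO_PSL_D | DEMO_PSL_AC)

/*
 * Take the user-changeable bits from the user-supplied value and every
 * other bit from the flags the kernel currently holds.
 */
static uint32_t
demo_merge_user_flags(uint32_t kernel_flags, uint32_t user_flags)
{
	return (kernel_flags & ~DEMO_PSL_USER) | (user_flags & DEMO_PSL_USER);
}
```

With this mask a user-supplied value of zero cannot clear the must-be-one or interrupt-enable bits, while setting the direction or trap flag is honoured.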
 1.5 18-Sep-2008  christos Define a PSL_CLEARSIG macro for the psl flags to be cleared on signal delivery
and use it everywhere.
 1.4 17-Sep-2008  christos Include PSL_D in the flags to be able to be set by the user. Since setmcontext
is used to restore context from a signal handler, this will allow restoring
PSL_D to what it was before the user code entered the signal handler allowing
programs to work.
 1.3 30-Nov-2004  nathanw branches: 1.3.96; 1.3.100; 1.3.102; 1.3.106;
Add PSL_T to PSL_USER; it's fine for a program to want to trap itself.
 1.2 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.4 18-Dec-2004  skrll Sync with HEAD.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.3.106.1 19-Oct-2008  haad Sync with HEAD.
 1.3.102.1 24-Sep-2008  wrstuden Merge in changes between wrstuden-revivesa-base-2 and
wrstuden-revivesa-base-3.
 1.3.100.1 04-May-2009  yamt sync with head.
 1.3.96.1 28-Sep-2008  mjf Sync with HEAD.
 1.6.34.1 20-Apr-2012  riz Pull up following revision(s) (requested by jym in ticket #189):
sys/arch/x86/include/psl.h: revision 1.7
sys/arch/i386/i386/locore.S: revision 1.98
sys/arch/amd64/acpi/acpi_wakecode.S: revision 1.11
sys/arch/amd64/amd64/mptramp.S: revision 1.13
sys/arch/i386/acpi/acpi_wakecode.S: revision 1.15
sys/arch/i386/i386/mptramp.S: revision 1.23
sys/arch/amd64/amd64/locore.S: revision 1.68
Set the CR0_AM bit so processes can enable alignment check errors under
x86 through PSL_AC bit.
ATF test incoming shortly.
PSL_AC is user-settable.
 1.6.32.1 29-Apr-2012  mrg sync to latest -current.
 1.6.28.2 30-Oct-2012  yamt sync with head
 1.6.28.1 23-May-2012  yamt sync with head.
 1.7.2.2 03-Dec-2017  jdolecek update from HEAD
 1.7.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.8.14.1 28-Aug-2017  skrll Sync with HEAD
 1.7 20-Aug-2022  riastradh x86: Forbid using x86/pte.h directly; use machine/pte.h.

machine/pte.h already used outside sys/arch, so let's make it the
primary thing and make sure to use x86/pte.h only as a subroutine.
 1.6 20-Aug-2022  riastradh x86: Move pl*_i, pl_i_roundup, and ptp_va2o out of x86/pmap.h.

- pl[1-4]_i -> x86/pte.h
- pl_i, pl_i_roundup, ptp_va2o -> x86/pmap.c
 1.5 05-Sep-2020  maxv x86: rename PGEX_X -> PGEX_I

To match the x86 specification and the other OSes.
 1.4 14-Mar-2020  maxv style
 1.3 09-Oct-2019  maxv Add new bits.
 1.2 05-Oct-2019  maxv Switch to the new PTE naming. No binary diff (tested with MKREPRO).
 1.1 06-Jul-2010  cegger branches: 1.1.2; 1.1.4; 1.1.6; 1.1.12; 1.1.68;
Turn PMAP_NOCACHE into MI flag.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.

hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.

x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.

Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html

No comments on this last version.
 1.1.68.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.12.2 05-Mar-2011  rmind sync with head
 1.1.12.1 06-Jul-2010  rmind file pte.h was added on branch rmind-uvmplock on 2011-03-05 20:52:28 +0000
 1.1.6.2 24-Oct-2010  jym Sync with HEAD
 1.1.6.1 06-Jul-2010  jym file pte.h was added on branch jym-xensuspend on 2010-10-24 22:48:16 +0000
 1.1.4.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.1.4.1 06-Jul-2010  uebayasi file pte.h was added on branch uebayasi-xip on 2010-08-17 06:45:31 +0000
 1.1.2.2 11-Aug-2010  yamt sync with head.
 1.1.2.1 06-Jul-2010  yamt file pte.h was added on branch yamt-nfs-mp on 2010-08-11 22:52:55 +0000
 1.1 16-Jun-2009  bouyer branches: 1.1.2; 1.1.4; 1.1.6; 1.1.12;
Split mc146818-related functions from clock.c into rtc.c.
Call rtc_set_ymdhms() from xen/xen/clock.c:xen_rtc_set() for xen3 dom0
kernels as the Xen3 hypervisor doesn't write the new date/time to the CMOS
by itself.
Now a XEN3_DOM0 kernel properly updates the CMOS time.
 1.1.12.2 21-Apr-2010  matt sync to netbsd-5
 1.1.12.1 16-Jun-2009  matt file rtc.h was added on branch matt-nb5-mips64 on 2010-04-21 00:33:45 +0000
 1.1.6.2 01-Nov-2009  jym Sync with HEAD.
 1.1.6.1 16-Jun-2009  jym file rtc.h was added on branch jym-xensuspend on 2009-11-01 13:58:16 +0000
 1.1.4.2 20-Jun-2009  yamt sync with head
 1.1.4.1 16-Jun-2009  yamt file rtc.h was added on branch yamt-nfs-mp on 2009-06-20 07:20:12 +0000
 1.1.2.2 19-Jun-2009  snj Pull up following revision(s) (requested by bouyer in ticket #816):
sys/arch/amd64/conf/files.amd64: revision 1.68
sys/arch/i386/conf/files.i386: revision 1.350
sys/arch/x86/include/rtc.h: revision 1.1
sys/arch/x86/isa/clock.c: revision 1.33
sys/arch/x86/isa/rtc.c: revision 1.1
sys/arch/xen/conf/files.xen: revision 1.100
sys/arch/xen/xen/clock.c: revision 1.50 via patch
Split mc146818-related functions from clock.c into rtc.c.
Call rtc_set_ymdhms() from xen/xen/clock.c:xen_rtc_set() for xen3 dom0
kernels as the Xen3 hypervisor doesn't write the new date/time to the CMOS
by itself.
Now a XEN3_DOM0 kernel properly updates the CMOS time.
 1.1.2.1 16-Jun-2009  snj file rtc.h was added on branch netbsd-5 on 2009-06-19 21:22:10 +0000
 1.6 29-Nov-2019  riastradh branches: 1.6.2;
Largely eliminate the MD rwlock.h header file.

This was full of definitions that have been obsolete for over a
decade. The file still remains for __HAVE_RW_STUBS but that's all.
Used only internally in kern_rwlock.c now, not by <sys/rwlock.h>.
 1.5 28-Apr-2008  martin branches: 1.5.88;
Remove clause 3 and 4 from TNF licenses
 1.4 09-Dec-2007  ad branches: 1.4.10; 1.4.12; 1.4.14;
Use atomic_cas_ulong().
 1.3 21-Nov-2007  yamt branches: 1.3.2; 1.3.4;
make kmutex_t and krwlock_t smaller by killing lock id.
ok'ed by Andrew Doran.
 1.2 09-Feb-2007  ad branches: 1.2.4; 1.2.8; 1.2.14; 1.2.24; 1.2.26; 1.2.30; 1.2.32;
Merge newlock2 to head.
 1.1 10-Sep-2006  ad branches: 1.1.2;
file rwlock.h was initially added on branch newlock2.
 1.1.2.3 29-Dec-2006  ad Checkpoint work in progress.
 1.1.2.2 20-Oct-2006  ad - Don't need locked bus cycles on release from C code.
- Save an integer ID in the lock structures for LOCKDEBUG code.
 1.1.2.1 10-Sep-2006  ad Add updated locking primitives.
 1.2.32.2 27-Dec-2007  mjf Sync with HEAD.
 1.2.32.1 08-Dec-2007  mjf Sync with HEAD.
 1.2.30.1 21-Nov-2007  bouyer Sync with HEAD
 1.2.26.1 09-Jan-2008  matt sync with HEAD
 1.2.24.2 09-Dec-2007  jmcneill Sync with HEAD.
 1.2.24.1 21-Nov-2007  joerg Sync with HEAD.
 1.2.14.1 17-Apr-2007  thorpej G/C _lock_cas() -- the atomic ops API provides what the locking
primitives need.
 1.2.8.1 03-Dec-2007  ad Sync with HEAD.
 1.2.4.4 21-Jan-2008  yamt sync with head
 1.2.4.3 07-Dec-2007  yamt sync with head
 1.2.4.2 26-Feb-2007  yamt sync with head.
 1.2.4.1 09-Feb-2007  yamt file rwlock.h was added on branch yamt-lazymbuf on 2007-02-26 09:08:49 +0000
 1.3.4.1 11-Dec-2007  yamt sync with head.
 1.3.2.1 26-Dec-2007  ad Sync with head.
 1.4.14.1 16-May-2008  yamt sync with head.
 1.4.12.1 18-May-2008  yamt sync with head.
 1.4.10.1 02-Jun-2008  mjf Sync with HEAD.
 1.5.88.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.6.2.2 22-Jan-2020  ad Back out previous.
 1.6.2.1 19-Jan-2020  ad empty these; remove later.
 1.1 21-Jul-2021  jmcneill branches: 1.1.4;
Separate MI smbios interface from MD specific code.
 1.1.4.2 01-Aug-2021  thorpej Sync with HEAD.
 1.1.4.1 21-Jul-2021  thorpej file smbios_machdep.h was added on branch thorpej-i2c-spi-conf on 2021-08-01 22:42:19 +0000
 1.7 21-Jul-2021  jmcneill Separate MI smbios interface from MD specific code.
 1.6 21-Aug-2019  msaitoh branches: 1.6.12;
Fix typo (s/controler/controller/).
 1.5 25-Dec-2018  mlelstv Expose more DMI variables via sysctl.
 1.4 11-Mar-2017  nonaka branches: 1.4.12; 1.4.14;
Search for SMBIOS in the UEFI configuration table when booting with UEFI.
 1.3 16-Apr-2008  cegger branches: 1.3.48; 1.3.68; 1.3.72; 1.3.76;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.2 30-Mar-2008  ad If SMBIOS is present and there seems to be good expansion slot info,
note the number of ISA compatible slots.
 1.1 01-Oct-2006  bouyer branches: 1.1.2; 1.1.4; 1.1.8; 1.1.10; 1.1.60;
Add ipmi(4) driver, from OpenBSD. This requires SMBios support, so add
SMBios detection and mapping to bios32.c, also from OpenBSD (for now this
is only compiled in if ipmi(4) is configured). The sensors and watchdog are
accessible through envsys(4).
Works on i386; some work is needed on amd64 to access the BIOS. It would
eventually work on Xen if the SMBios is accessible (to be tested).
 1.1.60.2 02-Jun-2008  mjf Sync with HEAD.
 1.1.60.1 03-Apr-2008  mjf Sync with HEAD.
 1.1.10.2 08-Jan-2007  ghen Pull up following revision(s) (requested by bouyer in ticket #1621):
sys/arch/i386/conf/GENERIC: revision 1.787 via patch
share/man/man4/Makefile: revision 1.407 via patch
distrib/sets/lists/man/mi: revision 1.936 via patch
share/man/man4/ipmi.4: revision 1.1 via patch
sys/arch/i386/i386/bios32.c: revision 1.11 via patch
sys/dev/DEVNAMES: revision 1.221 via patch
sys/arch/x86/x86/ipmi.c: revision 1.1 via patch
sys/arch/i386/i386/mainbus.c: revision 1.65 via patch
sys/arch/x86/include/smbiosvar.h: revision 1.1 via patch
sys/arch/x86/include/ipmivar.h: revision 1.1 via patch
sys/arch/x86/conf/files.x86: revision 1.20 via patch
sys/arch/i386/conf/files.i386: revision 1.293 via patch
Add ipmi(4) driver, from OpenBSD. This requires SMBios support, so add
SMBios detection and mapping to bios32.c, also from OpenBSD (for now this
is only compiled in if ipmi(4) is configured). The sensors and watchdog are
accessible through envsys(4).
Works on i386; some work is needed on amd64 to access the BIOS. It would
eventually work on Xen if the SMBios is accessible (to be tested).
Add manpage for new ipmi driver.
Claim ipmi.
 1.1.10.1 01-Oct-2006  ghen file smbiosvar.h was added on branch netbsd-3 on 2007-01-08 16:36:20 +0000
 1.1.8.2 30-Dec-2006  yamt sync with head.
 1.1.8.1 01-Oct-2006  yamt file smbiosvar.h was added on branch yamt-lazymbuf on 2006-12-30 20:47:22 +0000
 1.1.4.2 18-Nov-2006  ad Sync with head.
 1.1.4.1 01-Oct-2006  ad file smbiosvar.h was added on branch newlock2 on 2006-11-18 21:29:38 +0000
 1.1.2.2 22-Oct-2006  yamt sync with head
 1.1.2.1 01-Oct-2006  yamt file smbiosvar.h was added on branch yamt-splraiseipl on 2006-10-22 06:05:16 +0000
 1.3.76.1 21-Apr-2017  bouyer Sync with HEAD
 1.3.72.1 20-Mar-2017  pgoyette Sync with HEAD
 1.3.68.1 28-Aug-2017  skrll Sync with HEAD
 1.3.48.1 03-Dec-2017  jdolecek update from HEAD
 1.4.14.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.4.14.1 10-Jun-2019  christos Sync with HEAD
 1.4.12.1 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.6.12.1 01-Aug-2021  thorpej Sync with HEAD.
 1.220 24-Aug-2025  rillig x86/specialreg.h: remove redundant '\0' from snprintb format
 1.219 28-Apr-2025  riastradh xen: Stop-gap FPU PCB fix; disable Intel AMX for now.

Since the custom cpu_uarea_alloc/free are disabled under XENPV,
nothing would initialize struct pcb::pcb_savefpu to point either to
struct pcb::pcb_savefpusmall, or to a separately allocated large area
on machines with Intel AMX TILECFG/TILEDATA requiring it. So the
memset in fpu_lwp_fork would crash on null pointer dereference:

[ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e
[ 1.0000030] fatal page fault in supervisor mode
[ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38
[ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0
kernel: page fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi)
memset() at netbsd:memset+0x2c
lwp_create() at netbsd:lwp_create+0x2f1
fork1() at netbsd:fork1+0x42c
main() at netbsd:main+0x44f

In order to support Intel AMX TILECFG/TILEDATA, or any other CPU
extensions that increase the XSAVE area beyond what fits in a single
page after struct pcb, we would need to enable the custom
cpu_uarea_alloc/free. Currently that would imply allocating stack
guard pages (`redzone') under XENPV; if there's some reason the stack
guard pages don't work, we could also push #ifdef XENPV conditionals
into cpu_uarea_alloc/free to cover the guard pages -- to be
considered.

PR kern/59371: Xen domU uvm_fault since FPU state allocation patch

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
 1.218 24-Apr-2025  riastradh amd64: Enable TILECFG and TILEDATA registers.

This allows processes to use the registers, and NetBSD will save and
restore them in context switches. But it does not expose them to
ptrace(2) or debuggers like all the other extended CPU state
(xmm/ymm/zmm) -- that will require more work.

PR kern/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu
PR port-amd64/59299: Support Intel AMX CPU state (TILECFG/TILEDATA)
 1.217 24-Apr-2025  riastradh x86: Add some more XCR0 bits and references.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
 1.216 19-Oct-2024  msaitoh x86/specialreg.h: Update AMD CPUID definitions.

- Add AMD Hetero Workload Classification.
- Extend the number of UMC PMCs field from 6bit to 8bit.
- Add Guest Intercept Control for SEV-ES.
- Add Segmented RMP
 1.215 17-Oct-2024  msaitoh x86/specialreg.h: Update AMD CPUID definitions.

Update definitions from the following PPR:
- PPR for AMD Family 19h Model 11h, Revision B2 Processors
(Doc ID 55901 rev. 0.47)
- PPR for AMD Family 1Ah Model 02h, Revision C1 Processors
(Doc ID 57238 rev.0.24)
- PPR for AMD Family 1Ah Model 24h, Revision B0 Processors
(Doc ID 57274 rev. 3.00)

- Rename CPUID Fn8000001b EDX bit 11 from IbsL3MissFiltering to
Zen4IbsExtension.
- Add some CPUID bits.
 1.214 06-Oct-2024  msaitoh Add some unknown CPUID bits for AMD.
 1.213 06-Oct-2024  msaitoh Add some CPUID bits for AMD.
 1.212 01-Jul-2024  andvar Disable the VIA Alternate Instructions according to the VIA documentation:
* C7 and above do not support ALTINST, do not check or attempt to disable them.
* For VIA C3 Nehemiah check extended feature flags for support and status,
do not attempt to disable when AIS is not supported or enabled.
* For pre-Nehemiah models explicitly disable, if they are in the range
of documented models; flags aren't present to check the status on these models.
Note: for pre-Nehemiah models there may be other functional side effects depending
on the version and stepping.

Explicit disabling of ALTINST was introduced with rev. 1.84 following
the discovery of some VIA CPUs having these instructions enabled by default
leading to the potential backdoor (aka rosenbridge).

Unfortunately, the implementation used the wrong check (ACE supported flag),
which can be true for the later models, still supporting padlock features.
Setting ALTINST bit on those may have unexpected side effects like VIA C7 CPUID
instruction for temperature sensor not reporting correct value or
`cpuctl identify' not reporting certain CPU features. Similar side effects
can be observed even for Nehemiah models not supporting AIS instructions. This
change should limit possibility of such issues to only the pre-Nehemiah models,
not covered at all in the previous implementation.

Feature Control Register (FCR) macros were unified under one group and
consistent naming while implementing the change. Few comments updated as well.

patch reviewed by Riastradh@ (thank you)

need pullups to netbsd-9, 10.

PR kern/58370
 1.211 12-May-2024  msaitoh branches: 1.211.2;
s/RPMQUERY/RMPQUERY/
 1.210 08-Mar-2024  rillig cpuctl: fix i386 bit descriptions for CPUID_SEF_FLAGS1

warning: non-printing character '\31' in description
'BUS_LOCK_DETECT""b\31' [363]
 1.209 27-Oct-2023  mrg add MSR stuff for AMD errata 1474.
 1.208 27-Jul-2023  msaitoh Add AMD IBPB_RET and BusLockThreshold.
 1.207 25-Jul-2023  mrg x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.206 11-Apr-2023  msaitoh Fix compile error.
 1.205 11-Apr-2023  msaitoh Add CPUID 0x07 %ecx bit 24 BUS_LOCK_DETECT.
 1.204 25-Mar-2023  andvar s/Predective/Predictive/ and s/dedected/detected/ in comments.
 1.203 17-Feb-2023  msaitoh Add AMD CPUID Fn0000_0008 %ebx bit 3 INVLPGB.
 1.202 14-Feb-2023  msaitoh Add some CPUID bits from PPR for AMD Family 19h Model 61h Revision B1.
 1.201 30-Dec-2022  msaitoh Fix comment.
 1.200 30-Dec-2022  msaitoh Update definitions from the latest Intel SDM.

- Rename HW_FEEDBACK to HWI (Hardware Feedback Interface).
- Add CPUID Fn0000_0006 %eax bit 24 IA32_THERM_INTERRUPT MSR bit 25 Hardware
Feedback Notification support.
- Add CPUID Fn0000_0007 %ecx bit 29 ENQCMD.
- Add CPUID Fn0000_0007 %edx bit 1 SGX-KEYS.
- Add CPUID Fn0000_0007 %edx bit 5 UINTR(User INTeRrupts).
- Add CPUID Fn0000_0007 %edx bit 11 RTM_ALWAYS_ABORT.
- Rename TSX_FORCE_ABORT to RTM_FORCE_ABORT.
- Add CPUID Fn0000_0007 %edx bit 22 AMX_BF16.
- Add CPUID Fn0000_0007 %edx bit 23 AVX512_FP16.
- Add CPUID Fn0000_0007 %edx bit 24 AMX_TILE.
- Add CPUID Fn0000_0007 %edx bit 25 AMX_INT8.
- Add CPUID Fn0000_0007 sub-leaf 1 %edx bit 18 CET_SSS.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 0 PSFD.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 1 IPRED_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 2 RRSBA_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 3 DDPD_U.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 4 BHI_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 5 MCDT_NO.
- Modify comment. Both Intel and AMD support CPUID Fn0000000b.
- Add CPUID Fn0000_000d sub-leaf 1 %eax bit 4 XFD.
- Modify comment. Hybrid Information -> Native Model ID Information.
- Add CPUID Fn0000_001d Tile Information.
- Add CPUID Fn0000_001e TMUL Information.
 1.199 27-Dec-2022  msaitoh Use __BIT(). Add comment. Whitespace. No functional change.
 1.198 21-Nov-2022  msaitoh branches: 1.198.2;
Update AMD CPUID Fn8000_001b

- Add IbsFetchCtlExtd and IbsOpData4.
- Fix typo (lbs -> Ibs).
 1.197 16-Nov-2022  msaitoh Add CPUID Fn8000_0022 AMD Extended Performance Monitoring and Debug.
 1.196 16-Nov-2022  msaitoh Add CPUID Fn8000_0021 AMD Extended Features Identification 2.
 1.195 16-Nov-2022  msaitoh Add Some definitions from AMD APM:

- Add CPUID Fn8000_0007 %eax RAS capabilities.
- Add CPUID Fn8000_001b Instruction-Based Sampling capabilities.
- Add BTC_NO, ROGPT, RPMQUERY, VmplSSS, TscAuxVirt, VmgexitParam,
VirtualTomMsr, bsVirtGuest, SmtProtection, vsmCommPageMSR and
NestedVirtSnpMsr.
 1.194 19-Oct-2022  msaitoh Add AMD cpuid Fn8000_000a x2AVIC, VNMI and IBSVIRT from APM Vol. 3 Rev. 3.34.
 1.193 12-Oct-2022  msaitoh Add CPUID Fn8000_001e Processor Topology Information.
 1.192 06-Oct-2022  msaitoh Update some AMD CPUID bits:

- Rename FSREP_MOV to FSRM.
- Add Memory Bandwidth Enforcement (MBE)
- Add AMD's PPIN. Rename CPUID_SEF_PPIN to CPUID_SEF_INTEL_PPIN.
- Add Collaborative Processor Performance Control (CPPC).
- Add HOST_MCE_OVERRIDE.
- Add some unknown bits as Bxx.
- Add comments.
- Use __BIT().
 1.191 15-Jun-2022  msaitoh Modify CPUID Fn0000000a %ebx's string. Add new string for %ecx.
 1.190 13-Jun-2022  msaitoh Add top-down slots event bit of architectural performance monitoring leaf.
 1.189 01-Feb-2022  msaitoh s/shareing/sharing/. No functional change.
 1.188 29-Jan-2022  msaitoh Add Intel Hybrid Information Enumeration (CPUID Fn0000_001a).
 1.187 17-Jan-2022  andvar fix typos in comments, mainly s/foward/forward/.
 1.186 15-Jan-2022  msaitoh Add Some definitions from AMD APM:

- CPUID Fn80000001 %ecx bit 30 AddrMaskExt.
- CPUID Fn80000008 %ebx bit 13 INT_WBINVD.
- CPUID Fn80000008 %ebx bit 19 IbrsSameMode.
- CPUID Fn80000008 %ebx bit 20 EferLmsleUnsupported.
- CPUID Fn80000008 %ebx bit 28 PSFD.
- CPUID Fn80000008 %edx bit 30 as "B30". Not documented.
- CPUID Fn8000001f %eax bit 8 SecureTSC.
- CPUID Fn8000001f %eax bit 24 VmsaRegProt.
- Tested by nonaka@.
 1.185 15-Jan-2022  msaitoh Whitespace. No functional change.
 1.184 15-Jan-2022  msaitoh Move CPUID_CAPEX_FLAGS next to %eax because it's for %eax.
 1.183 15-Jan-2022  msaitoh No functional change.

- Modify comment. Add comment. Fix typo. Mainly taken from dragonfly.
- Use __BIT().
 1.182 14-Jan-2022  msaitoh Add Architectural LBR and Linear Address Masking.
 1.181 14-Jan-2022  msaitoh Both Intel and AMD say the name of CPUID 0x01 %edx bit 19 is "CLFSH".
 1.180 13-Jan-2022  msaitoh Add some CPUID bits from the latest Intel SDM.

- Last Branch Record.
- Thread Director.
- AVX version of VNNI.
- Fast short REP MOV.
- HRESET.
- PPIN.
 1.179 13-Jan-2022  msaitoh Use __BIT(). KNF. No functional change.
 1.178 30-Sep-2021  msaitoh Print CPUID_PBE (Pending Break Enable) with "PBE".
 1.177 10-Jul-2021  msaitoh Add some definitions from Intel SDM:

- CPUID leaf 7:0 %ecx bit 13 TME_EN (Total Memory Encryption)
- CPUID leaf 7:0 %edx bit 18 PCONFIG (Platform CONFIGuration)
 1.176 24-Nov-2020  msaitoh branches: 1.176.4;
Add some definitions from the latest Intel SDM:

- Add CPUID leaf 7 %edx bit 23 "KL" (Key Locker).
- Add CPUID leaf 7 subleaf 1 %eax bit 5 "AVX512_BF16".
 1.175 07-Sep-2020  jakllsch branches: 1.175.2;
Fix printb string for LA57
 1.174 07-Sep-2020  msaitoh Add CPUID(EAX=07H, ECX=0) ECX bit 16 LA57 from maxv.
 1.173 05-Sep-2020  maxv x86: fix several CPUID flags

- Rename: CPUID_PN -> CPUID_PSN
CPUID_CFLUSH -> CPUID_CLFSH
CPUID_SBF -> CPUID_PBE
CPUID_LZCNT -> CPUID_ABM
CPUID_P1GB -> CPUID_PAGE1GB
CPUID2_PCLMUL -> CPUID2_PCLMULQDQ
CPUID2_CID -> CPUID2_CNXTID
CPUID2_xTPR -> CPUID2_XTPR
CPUID2_AES -> CPUID2_AESNI
To match the x86 specification and the other OSes.

- Remove: CPUID_B10, CPUID_B20, CPUID_IA64. They do not exist.
 1.172 04-Sep-2020  maxv Add a few more CPUID flags.
 1.171 05-Aug-2020  maxv Add new fields here and there.
 1.170 20-Jul-2020  maxv Revert previous, to unbreak the build (NVMM declares the macro too).

There are hundreds of MSRs, we're not going to list them all, especially
when the majority are unused.
 1.169 19-Jul-2020  jdolecek add definition for MSR_IA32_FEATURE_CONTROL, just for information
 1.168 18-Jun-2020  maxv style and fix typo
 1.167 10-Jun-2020  msaitoh Add SRBDS_CTRL bit.
 1.166 01-Jun-2020  msaitoh Add some definitions from the latest Intel SDM plus small fix:

- Add CPUID leaf 6 %eax bit 19 for HW_FEEDBACK* and IA32_PACKAGE_TERM* MSRs.
- Add CPUID leaf 7 %ecx bit 31 for Protection Keys.
- Add definition of Load only TLB and Store only TLB.
- Add IF_PSCHANGE_MC_NO bit of IA32_ARCH_CAPABILITIES
- Fix HWP_IGNIDL.
 1.165 28-May-2020  msaitoh Add AMD MSR_DE_CFG's bit 1 as DE_CFG_LFENCE_SERIALIZE.
This bit makes lfence instruction serializing.
 1.164 01-May-2020  msaitoh - Add AMD INVLPGB/TLBSYNC hypervisor enable in VMCB and TLBSYNC intercept bit.
- Modify comment.
 1.163 25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.162 24-Apr-2020  msaitoh - AMD CPUID Fn8000_000a %edx bit 20 is "SPEC_CTRL".
- Add some bit definitions of AMD's CPUID Fn8000_001f Encrypted Memory
features.
 1.161 06-Apr-2020  msaitoh branches: 1.161.2;
Rename CPUID_APM_TSC to CPUID_APM_ITSC. No functional change.
 1.160 06-Apr-2020  msaitoh CPUID Fn00000001 %edx bit 8 is printed as "TSC", so rename CPUID Fn8000_0007
%edx bit 8 from "TSC" to "ITSC" (Invariant TSC) to avoid confusion.
 1.159 01-Apr-2020  msaitoh Add AVX512_VP2INTERSECT, SERIALIZE and TSXLDTRK(TSX suspend load addr tracking)
 1.158 17-Nov-2019  msaitoh Add the following bit definitions from the latest Intel SDM:
- CET shadow stack
- Fast Short REP MOV
- Hybrid part
- CET Indirect Branch Tracking
 1.157 12-Nov-2019  maxv Mitigation for CVE-2019-11135: TSX Asynchronous Abort (TAA).

Two sysctls are added:

machdep.taa.mitigated = {0/1} user-settable
machdep.taa.method = {string} constructed by the kernel

There are two cases:

(1) If the CPU is affected by MDS, then the MDS mitigation will also
mitigate TAA, and we have nothing else to do. We make the 'mitigated' leaf
read-only, and force:
machdep.taa.mitigated = machdep.mds.mitigated
machdep.taa.method = [MDS]
The kernel already enables the MDS mitigation by default.

(2) If the CPU is not affected by MDS but is affected by TAA, then we use
the new TSX_CTRL MSR to disable RTM. This MSR is provided via a microcode
update, now available on the Intel website. The kernel will automatically
enable the TAA mitigation if the updated microcode is present. If the new
microcode is not present, the user can load it via cpuctl, and set
machdep.taa.mitigated=1.
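The two cases above can be sketched as a small decision function. This is only an illustration of the logic described in the message, not the kernel's code: the enum labels, the function name `taa_choose` and its three boolean inputs are hypothetical, and the sketch assumes that a CPU not affected by TAA needs no action regardless of its MDS status.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; the real machdep.taa.method strings may differ. */
enum taa_action {
	TAA_NOT_AFFECTED,	/* nothing to do */
	TAA_VIA_MDS,		/* case (1): the MDS mitigation also covers TAA */
	TAA_DISABLE_RTM,	/* case (2): disable RTM through the TSX_CTRL MSR */
	TAA_NEED_MICROCODE	/* case (2), but TSX_CTRL is not available yet */
};

/*
 * Pick a TAA mitigation strategy.  has_tsx_ctrl stands for "the updated
 * microcode providing the TSX_CTRL MSR is loaded".
 */
static enum taa_action
taa_choose(bool mds_affected, bool taa_affected, bool has_tsx_ctrl)
{
	if (!taa_affected)
		return TAA_NOT_AFFECTED;	/* assumption of this sketch */
	if (mds_affected)
		return TAA_VIA_MDS;
	return has_tsx_ctrl ? TAA_DISABLE_RTM : TAA_NEED_MICROCODE;
}
```

In the TAA_NEED_MICROCODE case the message says the user can load the microcode via cpuctl and then set machdep.taa.mitigated=1.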
 1.156 30-Oct-2019  msaitoh - GMET is not bit 11 but 17.
- Add unknown CPUID Fn8000_000a %edx bit 20.
 1.155 08-Oct-2019  msaitoh Fix AMD Fn8000_001f %eax bit 0's name.
 1.154 03-Oct-2019  msaitoh - Add definitions of AMD's CPUID Fn8000_001f Encrypted Memory features.
- Add definition of AMD's CPUID Fn8000_000a %edx bit 11 "GMET".
- Define CPUID_AMD_SVM_PFThreshold correctly.
- Modify comment a bit for consistency.
 1.153 26-Sep-2019  msaitoh Define CPUID_CAPEX_FLAGS's bit 10 correctly.
 1.152 09-Sep-2019  msaitoh Add MCOMMIT instruction.
 1.151 30-Aug-2019  msaitoh Add definitions of AMD's CPUID Fn8000_0008 %ebx.
 1.150 26-Jul-2019  msaitoh branches: 1.150.2;
- AMD CPUID Fn8000_001d Cache Topology Information leaf is almost the same as
Intel Deterministic Cache Parameter Leaf(0x04), so make new
cpu_dcp_cacheinfo() and share it.
- AMD's L2 and L3's cache descriptor's definition is the same, so use one
common definition.
- KNF.

XXX Split some common functions to new identcpu_subr.c or use #ifdef _KERNEL
... #endif in identcpu.c to share from both kernel and cpuctl?
 1.149 13-Jul-2019  msaitoh Define some new bits of CPUID Fn8000_0007 %edx AMD Advanced Power Management
leaf.
 1.148 26-Jun-2019  mgorny Fetch XSAVE area component offsets and sizes when initializing x86 CPU

Introduce two new arrays, x86_xsave_offsets and x86_xsave_sizes,
and initialize them with XSAVE area component offsets and sizes queried
via CPUID. This will be needed to implement getters and setters for
additional register types.

While at it, add XSAVE_* constants corresponding to specific XSAVE
components.
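A rough sketch of how such arrays could be filled follows. CPUID leaf 0xd, sub-leaf i (for i >= 2) reports the size of XSAVE component i in %eax and its offset in the standard-format area in %ebx. Everything else here is an assumption made for illustration: the names `xsave_query_layout`, `cpuid_fn` and `XSAVE_NCOMP` are hypothetical, and `fake_cpuid` is a stub standing in for the real instruction so the sketch does not depend on the host CPU.

```c
#include <stdint.h>

#define XSAVE_NCOMP	8	/* hypothetical array length for this sketch */

struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };
typedef void (*cpuid_fn)(uint32_t leaf, uint32_t subleaf,
    struct cpuid_regs *out);

/* Stub for CPUID: reports only the AVX (YMM_Hi128) component, index 2. */
static void
fake_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_regs *out)
{
	out->eax = out->ebx = out->ecx = out->edx = 0;
	if (leaf == 0xd && subleaf == 2) {
		out->eax = 256;	/* component size in bytes */
		out->ebx = 576;	/* offset in the standard-format area */
	}
}

/* Fill offsets[] and sizes[] for every component enabled in the mask. */
static void
xsave_query_layout(cpuid_fn cpuid, uint64_t enabled,
    uint32_t offsets[XSAVE_NCOMP], uint32_t sizes[XSAVE_NCOMP])
{
	for (unsigned i = 0; i < XSAVE_NCOMP; i++) {
		struct cpuid_regs r;

		offsets[i] = sizes[i] = 0;
		/* Sub-leaves 0 and 1 are not per-component descriptors. */
		if (i < 2 || (enabled & (UINT64_C(1) << i)) == 0)
			continue;
		cpuid(0xd, i, &r);
		sizes[i] = r.eax;
		offsets[i] = r.ebx;
	}
}
```

Passing the lookup as a function pointer keeps the layout logic separate from the instruction itself; a real caller would supply a wrapper around CPUID instead of the stub.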
 1.147 29-May-2019  maxv Add PCID support in SVS. This avoids TLB flushes during kernel<->user
transitions, which greatly reduces the performance penalty introduced by
SVS.

We use two ASIDs, 0 (kern) and 1 (user), and use invpcid to flush pages
in both ASIDs.

The read-only machdep.svs.pcid={0,1} sysctl is added, and indicates whether
SVS+PCID is in use.
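To illustrate why PCID avoids the flushes: with CR4.PCIDE set, the low 12 bits of the value loaded into %cr3 select the PCID, and bit 63 of that value asks the CPU not to invalidate the TLB entries tagged with it. The sketch below composes such a value for the two ASIDs named above. The macro and function names are hypothetical and this is not the SVS code itself.

```c
#include <stdint.h>

#define CR3_PCID_MASK	UINT64_C(0xfff)		/* bits 11:0 select the PCID */
#define CR3_NOFLUSH	(UINT64_C(1) << 63)	/* keep TLB entries on load */

#define PCID_KERN	0	/* ASID 0 (kern), as in the message */
#define PCID_USER	1	/* ASID 1 (user) */

/*
 * Build a %cr3 value from the physical address of the top-level page
 * table, a PCID, and the no-flush request.
 */
static inline uint64_t
cr3_make(uint64_t pml4_pa, unsigned pcid, int noflush)
{
	uint64_t cr3;

	cr3 = pml4_pa & ~CR3_PCID_MASK & ~CR3_NOFLUSH;
	cr3 |= pcid & CR3_PCID_MASK;
	if (noflush)
		cr3 |= CR3_NOFLUSH;
	return cr3;
}
```

For example, `cr3_make(0x1234000, PCID_USER, 1)` yields 0x8000000001234001: the user page tables, tagged with PCID 1, loaded without flushing.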
 1.146 18-May-2019  maxv Clean up a little, add new XCR0 bits, remove a few unused MSRs, and fix
typos.
 1.145 14-May-2019  msaitoh Add snprintb's string for cpuid7 edx bit 10 "MD_CLEAR".
 1.144 14-May-2019  maxv Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).

It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
 1.143 13-Mar-2019  msaitoh Add TSX_FORCE_ABORT related definitions.
 1.142 09-Mar-2019  maxv Start replacing the x86 PTE bits.
 1.141 16-Feb-2019  maxv Handle MSR_MISC_ENABLE on NVMM-Intel (Intel-specific).
 1.140 11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.139 08-Feb-2019  msaitoh Fix bitstring format of Intel CPUID Architectural Performance Monitoring
Fn0000000a %ebx.
 1.138 05-Feb-2019  msaitoh Add new CPUID flags WAITPKG, CLDEMOTE, MOVDIRI, MOVDIR64B and
IA32_CORE_CAPABILITIES from the latest Intel SDM.
 1.137 13-Jan-2019  maxv Forgot to commit file along with identcpu.c::rev1.86.
 1.136 26-Nov-2018  msaitoh Add Intel CPUID Architectural Performance Monitoring leaf Fn0000000a.
 1.135 22-Nov-2018  msaitoh Add Intel/AMD MONITOR/MWAIT leaf.
 1.134 21-Nov-2018  msaitoh Add Intel CPUID Extended Topology Enumeration Fn0000000b definitions.
 1.133 21-Nov-2018  msaitoh Modify comment. No functional change:
- AMD also has CPUID 0x06 and 0x0d.
- PCOMMIT was obsoleted.
 1.132 15-Nov-2018  msaitoh Add MAWAU (for BND{LD,ST}X instruction) from the latest Intel SDM.
 1.131 10-Nov-2018  maxv Declare the MSR_VIA_ACE values as macros, and use a consistent naming,
similar to the rest of the file.

I'm wondering if I'm not fixing a huge bug here. The ECX8 value we were
using was wrong: ECX8 is bit 1, not bit 0. Bit 0 is ALTINST, an alternate
ISA, which is now known to be backdoored.

So it looks like we were explicitly enabling the backdoor.

Not tested, because I don't have a VIA cpu.
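The off-by-one described above is easy to see once the bits are written as named masks. The following is a minimal sketch with hypothetical names (`BIT` stands in for `__BIT()`, and the `VIA_ACE_*` macros are illustrative rather than the header's actual identifiers); only the bit numbers come from the message.

```c
#include <stdint.h>

#define BIT(n)			(UINT64_C(1) << (n))	/* stand-in for __BIT() */

#define VIA_ACE_ALTINST		BIT(0)	/* alternate ISA */
#define VIA_ACE_ECX8		BIT(1)	/* ECX8, the bit that was intended */

/* The value previously used for ECX8, which is really ALTINST. */
#define VIA_ACE_ECX8_OLD	BIT(0)

/* Return the MSR value with ECX8 set. */
static inline uint64_t
via_ace_enable_ecx8(uint64_t msr)
{
	return msr | VIA_ACE_ECX8;
}

/* Report whether the alternate instruction set bit is set. */
static inline int
via_ace_altinst_enabled(uint64_t msr)
{
	return (msr & VIA_ACE_ALTINST) != 0;
}
```

OR-ing `VIA_ACE_ECX8_OLD` into the MSR value makes `via_ace_altinst_enabled()` return true, which is exactly the accidental enabling the message worries about.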
 1.130 20-Aug-2018  msaitoh OK'd by maxv:
- Add cpuid 7 edx L1D_FLUSH bit.
- Add IA32_ARCH_SKIP_L1DFL_VMENTRY bit.
- Add IA32_FLUSH_CMD MSR.
 1.129 07-Aug-2018  maxv Add five errata for AMD Family 17h (Ryzen etc), tested by Patrick Welche,
thanks. Also add two errata for Family 16h, not yet tested, so not yet
enabled.
 1.128 13-Jul-2018  maxv Remove the X86PMC code I had written, replaced by tprof. Many defines
become unused in specialreg.h, so remove them. We don't want to add
defines all the time, there are countless PMCs on many generations, and
it's better to just inline the event/unit values.
 1.127 04-Jul-2018  maya Disable MWAIT/MONITOR on Apollo Lake CPUs to work around the APL30 erratum.

We use MWAIT/MONITOR to hatch secondary CPUs. The erratum means that
the wakeup may not happen, so SMP boot fails.
Use wrmsr to disable it in hardware too, for extra paranoia.

PR port-amd64/53420,
also reported on netbsd-users by joern clausen and ssartor.
 1.126 31-May-2018  msaitoh branches: 1.126.2;
Fix the bit location of SSBD in the macro for snprintb.
 1.125 23-May-2018  maxv Clean up the FPU headers.
 1.124 22-May-2018  maxv Extend the AMD NONARCH method to family 17h. The AMD spec states that for
17h care must be taken when handling sibling threads.

The concern is that if we have a protected two-thread process running on
two siblings, and context switch one thread to another unprotected thread,
disabling the SSB protection on one logical core will disable SSB on its
sibling too (which is still running the protected thread).

All of that doesn't matter to us, because the SSB value we set is
system-wide, not per-process.
 1.123 22-May-2018  maxv Implement a mitigation for SpectreV4 on AMD families 15h and 16h. We use
a non-architectural MSR. This MSR is also available on 17h, but there SMT
is involved, and it needs more investigation.

Not tested (I have only 10h).
 1.122 22-May-2018  maxv Add RSBA. When set, it indicates that the CPU is vulnerable to SpectreV2
via the RSB.
 1.121 22-May-2018  maxv Mitigation for SpectreV4, based on SSBD. The following sysctl branches
are added:

machdep.spectre_v4.mitigated = {0/1} user-settable
machdep.spectre_v4.affected = {0/1} set by the kernel

The mitigation is not enabled by default yet. It is not tested either,
because no microcode update has been published yet.

On current CPUs a microcode/bios update must be applied for SSBD to be
available. The user can then set mitigated=1. Even with an update applied
the kernel will set affected=1.

On future CPUs, where the problem will presumably be fixed by default,
the CPU will report SSB_NO, and the kernel will set affected=0. In this
case we also have mitigated=0, but the mitigation is not needed.

For now the feature is system-wide. Perhaps we will want a more
fine-grained, per-process approach in the future.
 1.120 30-Mar-2018  maxv Add RDCL_NO and IBRS_ALL.
 1.119 30-Mar-2018  msaitoh Add Some bit definitions of AMD Fn80000001 %edx:
- MMX
- FXSR
 1.118 30-Mar-2018  msaitoh From the latest Intel SDM:
- Add Intel Fn0000_0006 %eax new bit 14-20 (HWP stuff).
- Intel Fn0000_0007 %ecx bit 22 is for both RDPID and IA32_TSC_AUX.
 1.117 14-Mar-2018  maxv ... and also add IBPB ...
 1.116 14-Mar-2018  maxv Add the IBRS and STIBP MSRs.
 1.115 14-Mar-2018  maxv Add IC_CFG.DIS_IND: "Disable Indirect Branch Predictor". Available (at
least) on AMD Families 10h, 12h and 16h.
 1.114 12-Mar-2018  msaitoh s/CLFUSH/CLFLUSH/
No functional change.
 1.113 08-Mar-2018  msaitoh Sort entries. No functional change.
 1.112 05-Mar-2018  msaitoh branches: 1.112.2;
Add Intel Deterministic Address Translation Parameter Leaf(0x18) definitions.
 1.111 15-Jan-2018  msaitoh Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.
 1.110 15-Jan-2018  msaitoh Add MSR_IA32_ARCH_CAPABILITIES definition.
 1.109 15-Jan-2018  msaitoh - Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add comment.
 1.108 13-Jan-2018  jdolecek fix swapped comments for EFER LME and LMA
 1.107 10-Jan-2018  msaitoh Add Intel cpuid 7 %edx IBRS(IBPB Speculation Control) and
STIBP(STIBP Speculation Control) from OpenBSD.
 1.106 10-Jan-2018  msaitoh Add comment.
 1.105 19-Oct-2017  msaitoh Add the following bits in AMD Fn8000000a %edx features (SVM features):
PFThreshold (PAUSE filter threshold)
AVIC (AMD virtual interrupt controller)
V_VMSAVE_VMLOAD (virtualized VMSAVE and VMLOAD)
vGIF (virtualized GIF)
 1.104 18-Oct-2017  msaitoh Add Turbo Boost Max Technology 3.0 bit.
 1.103 13-Oct-2017  msaitoh Add the following instruction bits in Structured Extended Flags Enumeration
Leaf from "Intel Architecture Instruction Set Extensions and Future Features
Programming Reference" (319433-030):
AVX512_IFMA
AVX512_VBMI
AVX512_VBMI2
GFNI
VAES
VPCLMULQDQ
AVX512_VNNI
AVX512_BITALG
AVX512_VPOPCNTDQ
AVX512_4VNNIW
AVX512_4FMAPS
 1.102 07-Sep-2017  msaitoh Define CPUID Fn00000001 %ebx bits and use them. No functional change.
 1.101 11-Aug-2017  maxv Add a comment about APICBASE_PHYSADDR. Has to do with PR/42597.
 1.100 11-Jul-2017  gson Fix typo in comment
 1.99 14-Jun-2017  maxv Add EFER_TCE. This would be an interesting feature to have, since it
reduces the indirect cost of invlpg; but I'm not convinced the way we
flush upper-levels is correct for this yet.
 1.98 15-May-2017  msaitoh branches: 1.98.2;
CPUID_CFLUSH bit is not for CFLUSH insn but CLFLUSH insn, so modify comments
and snprintb() string.
 1.97 22-Apr-2017  nonaka branches: 1.97.2;
move LAPIC_MSR* to specialreg.h.
 1.96 22-Apr-2017  nonaka Add x2APIC register definitions.
 1.95 11-Mar-2017  maxv Add the AMD 10h family, with additional events that I believe are useful,
the DTLB misses on large pages for example.

While here, remove a few K7 flags that do not actually exist on K7 (there
must have been a confusion between K7 and K8); and make the 'pmc list'
command a little more user-friendly.
 1.94 18-Feb-2017  maxv Add the AMD 10h family PMC values. Some values depend on the CPU revision,
they are commented out. Several other values are common with K7, we could
merge them later.

This family of CPUs has a 12bit event selector, contrary to K7 (8bit). The
thing is, i386's PMC interface takes as argument a uint8_t from userland,
so these counters are not accessible (yet).
 1.93 11-Feb-2017  maxv Fix a few (unused) MSR values, and add some others that I believe are
relevant.

From Murray Armfield (PR/42861).
 1.92 02-Feb-2017  msaitoh Modify comment. Use long form.
 1.91 08-Dec-2016  msaitoh branches: 1.91.2;
Add CLWB bit.
 1.90 05-Dec-2016  msaitoh Fix CPUID_SEF_FLAGS. Octal value has no 8.
 1.89 19-Aug-2016  maxv KNF so NXR likes it, and some typos
 1.88 16-Jul-2016  maxv Add the cr4 flags for PKE and UMIP.
 1.87 27-Apr-2016  msaitoh branches: 1.87.2;
Add some bit definitions mainly taken from the latest Intel SDM:
- Add SGX, UMIP, RDPID and SGXLC.
- Add avx512dq, avx512bw and avx512vl.
Fix the bit location of CLFLUSHOPT.
 1.86 13-Jan-2016  msaitoh Add some AMD bit definitions from "BIOS and Kernel Developer's Guide (BKDG) for AMD
Family 15h Models 60h-6Fh Processors".
 1.85 08-Jan-2016  msaitoh Add CLFLUSHOPT bit.
 1.84 08-Jan-2016  msaitoh Add x86 FPU Data Pointer Updated Only bit from Intel SDM.
 1.83 14-Aug-2015  msaitoh - Add Hardware-Controlled Performance States (HWP) bits.
- Use __BIT()
 1.82 08-May-2015  msaitoh From Intel SDM:
- Add the Silicon Debug bit in CPUID Fn00000001 %ecx
- Add CPUID Fn0000_0007 %ecx bits
- Add comments.
 1.81 12-Dec-2014  msaitoh Use specialreg.h's definitions.
 1.80 11-Sep-2014  msaitoh branches: 1.80.2;
- Add two more bit definitions
- XINUSE -> XGETBV
 1.79 09-Sep-2014  msaitoh Update CPUID(EAX=0x0d, ECX=1) from Intel SDM:
- XSAVEC(bit1)
- XGETBV(bit2)
- XSAVES(bit3)
 1.78 25-Feb-2014  dsl branches: 1.78.4;
Add the XCR bits for snazzy upcoming features.
Define a mask for the fpu related ones - only these will be enabled.
The memory bound ones will need saving on every context switch.
 1.77 04-Jan-2014  msaitoh Add Energy Performance Bias bit.
 1.76 04-Jan-2014  msaitoh Remove duplicated entry. Modify comments a bit.
 1.75 25-Dec-2013  msaitoh move XCR0 definitions to next to CR0's.
 1.74 08-Dec-2013  dsl Add some definitions for cpu 'extended state'.
These are needed for support of the AVX SIMD instructions.
Nothing yet uses them.
 1.73 20-Nov-2013  msaitoh - Add some AMD Fn80000001 extended features %ecx bits definitions from
the document (AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and
System Instructions. Document revision 3.20)

- "s/MXX/MMXX/" because this bit is "MMX eXtention".
 1.72 15-Nov-2013  msaitoh Modify some macros and add some new macros for CPU family and model
to reduce code duplication and to avoid bug.

CPUID_TO_STEPPING(cpuid) (not changed)

CPUID_TO_FAMILY(cpuid) (new)
CPUID_TO_MODEL(cpuid) (new)

Return the display family and the display model.
The macro names are the same as FreeBSD.

CPUID_TO_BASEFAMILY(cpuid) (The old name was CPUID2FAMILY)
CPUID_TO_BASEMODEL(cpuid) (The old name was CPUID2MODEL)

Only for the base field.

CPUID_TO_EXTFAMILY(cpuid) (The old name was CPUID2EXTFAMILY)
CPUID_TO_EXTMODEL(cpuid) (The old name was CPUID2EXTMODEL)

Only for the extended field.

See http://mail-index.netbsd.org/port-amd64/2013/11/12/msg001978.html
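As a sketch of what "display family" and "display model" mean, the functions below decode a CPUID leaf 1 %eax signature following the rule in the Intel SDM: the extended family is added when the base family is 0xf, and the extended model supplies the high nibble when the base family is 6 or 0xf. The function names are hypothetical; the header's CPUID_TO_* macros may be written differently.

```c
#include <stdint.h>

/* Raw fields of the CPUID leaf 1 %eax signature. */
static inline uint32_t sig_stepping(uint32_t sig)   { return sig & 0xf; }
static inline uint32_t sig_basemodel(uint32_t sig)  { return (sig >> 4) & 0xf; }
static inline uint32_t sig_basefamily(uint32_t sig) { return (sig >> 8) & 0xf; }
static inline uint32_t sig_extmodel(uint32_t sig)   { return (sig >> 16) & 0xf; }
static inline uint32_t sig_extfamily(uint32_t sig)  { return (sig >> 20) & 0xff; }

/* Display family: add the extended family only when the base is 0xf. */
static inline uint32_t
sig_family(uint32_t sig)
{
	uint32_t base = sig_basefamily(sig);

	return base == 0xf ? base + sig_extfamily(sig) : base;
}

/* Display model: the extended model is the high nibble for 6 and 0xf. */
static inline uint32_t
sig_model(uint32_t sig)
{
	uint32_t base = sig_basefamily(sig);
	uint32_t model = sig_basemodel(sig);

	if (base == 0x6 || base == 0xf)
		model |= sig_extmodel(sig) << 4;
	return model;
}
```

For the signature 0x000906ea this gives family 6, model 0x9e, stepping 0xa; for 0x00870f10 it gives family 0x17, model 0x71, stepping 0.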
 1.71 21-Oct-2013  msaitoh - Add Intel Deterministic Cache Parameter Leaf (CPUID leaf 4).
These definitions are required to obtain cache information on
newer Intel CPUs.
- Fix comment.
 1.70 04-Oct-2013  msaitoh Sort definitions. No functional change.
- CPUID_FEAT_BLACKLIST is for Fn00000001 %edx, so move it.
- Sort CPUID definitions with initial EAX value.
 1.69 04-Oct-2013  msaitoh Add comment about CPUID Processor extended state Enumeration Fn0000000d %eax.
 1.68 14-Sep-2013  msaitoh Add some definitions of Intel's cpuid feature from the latest document.
 1.67 12-Aug-2013  drochner add feature flag definitions for the last round of Intel instruction
set extensions (AVX512 et al.)
 1.66 26-Jul-2013  msaitoh Style change.
 1.65 25-Jul-2013  msaitoh Add some new bit definitions of Structured Extended Feature Flags Enumeration
Leaf from the document (Intel 64 and IA-32 Architectures Software Developer's
Manual).
 1.64 25-Jul-2013  msaitoh Fix the bit positions in CPUID_SEF_FLAGS macro. On snprintb(), position 1
means LSB(bit0). The bit position from HLE to SMAP was 1 bit right shifted.
The bit position of BMI1 was completely wrong.
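The position rule is easier to see with a toy decoder. The sketch below handles only the simplest old-style description string: one byte for the numeric base, then repeated pairs of a 1-based bit position byte and a name. It is a hypothetical helper (`decode_bits`) written for illustration, not the snprintb(3) implementation, and it emits only the names of the set bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the comma-separated names of the bits set in val into buf.
 * fmt is "<base byte>" followed by "<position byte><name>" entries;
 * position 1 refers to bit 0 (the LSB).  Returns the string length.
 */
static size_t
decode_bits(char *buf, size_t len, const char *fmt, uint32_t val)
{
	size_t used = 0;
	int first = 1;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	fmt++;				/* skip the numeric base byte */
	while (*fmt != '\0') {
		unsigned pos = (unsigned char)*fmt++;
		const char *name = fmt;
		int n;

		while ((unsigned char)*fmt > ' ')
			fmt++;		/* the name ends at the next position byte */
		if (pos < 1 || pos > 32)
			continue;
		if ((val & (UINT32_C(1) << (pos - 1))) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%.*s",
		    first ? "" : ",", (int)(fmt - name), name);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* output truncated */
		used += (size_t)n;
		first = 0;
	}
	return used;
}
```

With the description `"\20" "\1FPU" "\2VME" "\4PSE"` and the value 0x9 (bits 0 and 3 set), the buffer ends up holding `FPU,PSE`. Writing `\4` where `\5` was meant would attach the name to the bit one position to the right, which is the kind of shift this revision corrects.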
 1.63 06-Mar-2013  yamt branches: 1.63.6;
some more definitions
 1.62 06-Jan-2013  dsl Correct the comment about the extended family and model bits.
Add some definitions related to the process extended state enumeration.
 1.61 03-Jan-2013  dsl Add some missing bit definitions to CPUID2 and those for XCR0.
Taken from the August 2012 Intel SDM (intel_x86_325462.pdf).
Split all the snprintb() format strings to make them (almost) readable.
Fix CPUID_AMD_FLAGS4 to not try to print bits \41 and \42.
 1.60 17-Oct-2012  drochner recognize the P1GB and RDTSCP which were AMD-only on Intel HW too
 1.59 05-May-2012  jym branches: 1.59.2;
Add latest CR4 bits:
- CR4_VMXE: VMX operations, used for hardware virtualization.
- CR4_SMXE: SMX operations, used for safer Mode Extensions (ground for
Intel's TXT - Trusted Execution Technology - platform).
- CR4_FSGSBASE: enable *FSBASE and *GSBASE instructions, for R/W access
to FS/GS segment base addresses.
- CR4_PCIDE: enable Process Context IDentifiers (other architectures may call
these "address space identifiers").
- CR4_OSXSAVE: enable xsave and xrstor instructions
- CR4_SMEP: Supervisor Mode Execution Prevention. Allows enforcing --x rights
from cpl 0.
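
As a rough sketch of what the bits listed above look like as masks: the
values below follow the CR4 layout in the Intel SDM and are not copied from
specialreg.h, and ex_cr4_enabled() is a hypothetical helper that tests a CR4
value supplied by the caller (reading CR4 itself requires ring 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the Intel SDM CR4 layout (assumed, not from the header). */
#define CR4_VMXE	0x00002000u	/* bit 13: VMX operations */
#define CR4_SMXE	0x00004000u	/* bit 14: SMX operations */
#define CR4_FSGSBASE	0x00010000u	/* bit 16: *FSBASE and *GSBASE insns */
#define CR4_PCIDE	0x00020000u	/* bit 17: process context identifiers */
#define CR4_OSXSAVE	0x00040000u	/* bit 18: XSAVE and XRSTOR enabled */
#define CR4_SMEP	0x00100000u	/* bit 20: supervisor mode exec prevention */

/* Hypothetical helper: true if every bit in mask is set in the CR4 value. */
static bool
ex_cr4_enabled(uint32_t cr4, uint32_t mask)
{
	return (cr4 & mask) == mask;
}
```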

From Intel® 64 and IA-32 Architectures Software Developer’s Manual,
March 2012.

Align declarations.

CPUID_* bits for these features follow.
 1.58 30-Apr-2012  christos Add VIA Eden FCR MSR.
 1.57 06-Apr-2012  chs bring in this change from openbsd:
Implement the AMD suggested workaround for family 10h & 12h errata 721
"Processor May Incorrectly Update Stack Pointer" by setting a bit
marked 'reserved' in an MSR that is only "documented" to exist on 12h.
 1.56 02-Mar-2012  bouyer Don't mask out CPUID_FXSR. If not set, the kernel won't handle SSE and SSE2
registers on context switches; leading to data corruption when running
binaries using these instructions (like e.g. binaries built with a
-mcpu newer than pentium 4, which enables these instructions in gcc).
 1.55 15-Dec-2011  abs branches: 1.55.2;
Increase MTRR_I686_NVAR_MAX from 8 to 16. Avoids
"FIXME: more than 8 MTRRs (10)" message on booting Thinkpad W520 and
similar. While here replace a magic number with MTRR_I686_NVAR_MAX * 2
 1.54 09-Dec-2011  cegger add AMD ucode MSRs
 1.53 03-Oct-2011  njoly branches: 1.53.2; 1.53.6;
Do not redefine CPUID_LAHF.
 1.52 26-Jul-2011  yamt - add PCID
- comment
 1.51 20-Feb-2011  jruoho Add MSR_TEMPERATURE_TARGET.
 1.50 15-Feb-2011  cegger update cpuid bits
 1.49 12-Oct-2010  jakllsch branches: 1.49.2; 1.49.4;
Correct another off-by-one-bit error. This time for Erratum 97.
 1.48 18-Sep-2010  jakllsch AMD publication 25759 rev 3.69 says that DisIOReqLock in NB_CFG is "bit 3".
They probably mean "bit 3" and not "the third bit" (or bit 2).
This change should prevent superfluous warnings of errata 89.
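
The two readings of "bit 3" differ by a factor of two in the mask. A minimal
sketch, using hypothetical macro names rather than the ones in specialreg.h:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_BIT(n)		(UINT64_C(1) << (n))

/* Hypothetical names for the two interpretations of the AMD text. */
#define EX_DISIOREQLOCK_BIT3	EX_BIT(3)	/* zero-based "bit 3": 0x8 */
#define EX_DISIOREQLOCK_THIRD	EX_BIT(2)	/* "the third bit":    0x4 */

/* True if the given mask is set in a register value read elsewhere. */
static bool
ex_bit_is_set(uint64_t reg, uint64_t mask)
{
	return (reg & mask) != 0;
}
```

A register value of 0x8 tests as set under the first reading and clear under
the second, so a check written with the wrong mask looks at a neighbouring bit.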
 1.47 25-Aug-2010  jruoho Add definitions for Intel Digital Thermal Sensor and Power Management, at
CPUID Fn0000_0006, %eax, %ecx. Use these instead of magic numbers.
 1.46 21-Aug-2010  jruoho Add IA32_MPERF (E7h) and IA32_APERF (E8h) as MSR_MPERF and MSR_APERF.
 1.45 21-Aug-2010  jruoho Add CPUID_APM_CPB at Fn8000_0007 %edx, for core performance boost.
 1.44 29-Jul-2010  cegger add RDTSCP_AUX MSR
 1.43 24-Jul-2010  cegger add AMD OSVW MSRs
 1.42 06-Jul-2010  cegger Turn PMAP_NOCACHE into MI flag.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.

hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.

x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.

Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html

No comments on this last version.
 1.41 04-May-2010  jym Enable the NX bit feature for Xen i386pae and amd64 kernels.

Tested with Xen 3.1 and Xen 3.3, dom0 and domU, by bouyer@ and jym@.

Ok bouyer@.
 1.40 18-Apr-2010  jym This patch fixes the NX regression issue observed on amd64 kernels, where
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).

- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).

- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.

- replace checks against CPUID_TSC with the cpu_hascounter() function.

- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().

- use cpu_init_msrs() for i386. It will be eventually used later for NX
feature under i386 PAE kernels.

- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().

- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).

This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.

XXX Should kernel rev be bumped?

XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
 1.39 03-Apr-2010  jym Fix the comments about cpuid flags, according to the cpuid documentation by
Intel and AMD.
 1.38 13-Jan-2010  cegger branches: 1.38.2; 1.38.4;
recognize SVM PauseFilter
 1.37 13-Aug-2009  cegger recognize virtual cpu feature indicating guest state.
 1.36 26-May-2009  rmind Add CPU topology detection support for AMD processors.
Tested on the following AMD CPUs:
- Family 15, model 65
- Family 15, model 67
- Family 15, model 75
- Family 16, model 2
- Family 17, model 3

Reviewed (slightly older version of patch) by <yamt>.
 1.35 16-May-2009  pgoyette Correctly identify flag bit for SSSE3 (one of the 'S' was missing). Also
rename AMD bit from SCALL/RET to SYSCALL/SYSRET to match Intel bit name.
 1.34 13-May-2009  pgoyette 1. Extend CPU probe of Intel processors to handle extended-models. This
allows us to properly identify new Intel 45nm processors, Core i7,
Atom, and the 45nm Xeon MP.

2. Properly decode several new Intel cache descriptors, as listed in the
most recent (March 2009) edition of Intel's Application Note 485.

3. Convert decode of the various features masks to use the newly added
snprintb_m(3) routine.

Addresses my PR bin/41289
Addresses my PR bin/41290
 1.33 12-Mar-2009  yamt add definitions for SVM features.
 1.32 12-Mar-2009  yamt comments
 1.31 14-Oct-2008  cegger branches: 1.31.2; 1.31.4; 1.31.8; 1.31.12;
do correct octal counting and use CPUID_APM_FLAGS in cpuctl
 1.30 14-Oct-2008  cegger add cpuid fn 80000007 %edx: AMD Power Management feature flags
 1.29 14-Oct-2008  cegger fix output of 3DNOWPREFETCH feature flag
 1.28 13-Oct-2008  cegger Add cpuid 0x80000001 %ecx features flags. Rename CPUID_MASK4 to CPUID_INTEL_MASK4 for consistency with new CPUID_AMD_MASK4
 1.27 26-Aug-2008  pgoyette Clean up previous: add bit definitions for some new fields, and use "old"
style bitmask_printf(9) format string for consistency with the rest of the
file. No functional change.

OK cegger@
 1.26 24-Aug-2008  pgoyette Shorten SYSCALL/SYSRET to SCALL/RET bit definition so it fits on one line.
 1.25 24-Aug-2008  pgoyette 1. For non-Intel vendors, don't overload cpuflags with the extended
flags from CPUID 80000001_EDX. Instead, keep the extended flags
separate, in ci_feature3_flags (Intel processors already kept a
separate ci_feature3_flag value).

2. Decode/display ci_feature3_flag in a vendor-specific manner, since
the definitions are vendor-specific.

OK cegger@
 1.24 25-May-2008  chris branches: 1.24.4;
Add detection of errata for AMD Family 10h steppings A and 2. Covering
errata:
254: Internal Resource Livelock Involving Cached TLB Reload
261: Processor May Stall Entering Stop-Grant Due to Pending Data
Cache Scrub
298: L2 Eviction May Occur During Processor Operation To Set
Accessed or Dirty Bit
309: Processor Core May Execute Incorrect Instructions on
Concurrent L2 and Northbridge Response
 1.23 03-Feb-2008  xtraeme branches: 1.23.6; 1.23.8; 1.23.10; 1.23.12;
Add DTES64 and SSE4 related bits to CPUID2_FLAGS, from FreeBSD.
 1.22 21-Dec-2007  drochner define the SSSE3 feature flag bit and print out all known bits
 1.21 29-Oct-2007  xtraeme branches: 1.21.2; 1.21.4; 1.21.8;
Add coretemp(4). A new driver for Intel Core's on-die thermal sensor,
available on Intel Core or newer CPUs.

Ported from FreeBSD. Tested by rmind on i386 and joerg on amd64.

Enabled with "options INTEL_CORETEMP".
 1.20 17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.19 26-Sep-2007  ad branches: 1.19.2;
x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.18 11-Jul-2007  njoly branches: 1.18.8; 1.18.10; 1.18.12;
Display RDTSCP bit on AMD processors (Read Serialized TSC Pair).

ok by xtraeme
 1.17 03-Jul-2007  christos Support for VIA Esther (From FreeBSD)
 1.16 04-Jun-2007  xtraeme Add four missing bits for CPUID2_FLAGS, from FreeBSD.
 1.15 17-Feb-2007  daniel branches: 1.15.2; 1.15.6; 1.15.8; 1.15.14;
Add an opencrypto provider for the AES xcrypt instructions found on VIA
C5P and later cores (also known as 'ACE', which is part of the VIA PadLock
security engine). Ported from OpenBSD.

Reviewed on tech-crypto and port-i386, no objections to committing this.
 1.14 16-Jan-2007  christos PR/35430: Izumi Tsutsui: Identify amd64 CPU on NetBSD/i386
 1.13 11-Jan-2007  ad x86_errata: correct the definition of MSR_HWCR and re-enable. Problem
noted and debugged by Murray Armfield (murray at river-styx.org).
 1.12 01-Jan-2007  ad Report on and, where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.11 03-Sep-2006  xtraeme branches: 1.11.2; 1.11.6;
Update the enhanced speedstep driver and sync the code with OpenBSD:

est.c:

* Use a quintuplet (vendor, MHz_hi, mV_hi, MHz_lo, mV_lo) to match
CPUs more correctly than parsing the brand string.
* Add support for a bunch of models.
* Create a fake table on the fly if the CPU is unknown (there's no
table for it) with the current/highest/lowest frequency.

specialreg.h:

* Add some MSRs needed to get the bus clock value.

identcpu.c:

* Add functions specific to Pentium III, Pentium M and Pentium 4 to
get the bus clock value.

Note that the new fake table code from Simon Burge is not included on
this commit.

Ok'ed by simonb and dogcow.
 1.10 24-Aug-2006  cube Display XD for Intel processors (Execution Disable bit support).
 1.9 02-Dec-2005  christos branches: 1.9.4; 1.9.8; 1.9.18;
PR/32216: Nicolas Joly: Missing HTT feature display for Opterons dual-core CPUs
 1.8 21-Feb-2005  he branches: 1.8.4;
Probe and print the Intel Extended Feature Bits, as documented
in the CPUID instruction description in the "Intel Extended Memory 64
Technology Software Developer's Guide, Volume 1 of 2" available at
ftp://download.intel.com/technology/64bitextensions/30083402.pdf

This presently consists of the SYSCALL/SYSRET and the EM64T features.
CPUs with the EM64T feature available should be able to run amd64 code.

Reviewed by fvdl
 1.7 10-Feb-2005  drochner Recognize an obscure cpu feature flag bit "xTPR"
which indicates that Task Priority Messages might
be disabled. Not relevant for the kernel for now
(related to interrupt distribution on the APIC bus
afaict), but present on one of my boxes.
Being here, also recognise the future "Vanderpool"
extension.
 1.6 17-May-2004  joda branches: 1.6.4; 1.6.6;
the EST and TM2 flags in the second cpuid register were swapped
(according to AP-485); while here add a few more flags
 1.5 19-Feb-2004  drochner define AMD64's CPUID_NOX bit (I'm curious where Intel puts this bit in the
ia32 extension just announced)
XXX there should be a better separation between generic and vendor
specific feature flags
 1.4 02-Feb-2004  soren Add Pentium M MSR definitions from Michael Eriksson.
 1.3 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.2 25-Apr-2003  fvdl branches: 1.2.2;
Share some common cache info cpuid code between i386 and x86_64.
 1.1 26-Feb-2003  fvdl Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.2.2.6 11-Dec-2005  christos Sync with head.
 1.2.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.2.2.4 15-Feb-2005  skrll Sync with HEAD.
 1.2.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.2.1 03-Aug-2004  skrll Sync with HEAD
 1.6.6.2 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.6.6.1 12-Feb-2005  yamt sync with head.
 1.6.4.1 29-Apr-2005  kent sync with -current
 1.8.4.8 04-Feb-2008  yamt sync with head.
 1.8.4.7 21-Jan-2008  yamt sync with head
 1.8.4.6 15-Nov-2007  yamt sync with head.
 1.8.4.5 27-Oct-2007  yamt sync with head.
 1.8.4.4 03-Sep-2007  yamt sync with head.
 1.8.4.3 26-Feb-2007  yamt sync with head.
 1.8.4.2 30-Dec-2006  yamt sync with head.
 1.8.4.1 21-Jun-2006  yamt sync with head.
 1.9.18.1 06-Sep-2006  riz Pull up following revision(s) (requested by xtraeme in ticket #111):
sys/arch/x86/include/specialreg.h: revision 1.11
sys/arch/i386/i386/identcpu.c: revision 1.39
sys/arch/i386/include/cpu.h: revision 1.128
sys/arch/i386/i386/est.c: revision 1.26
Update the enhanced speedstep driver and sync the code with OpenBSD:
est.c:
* Use a quintuplet (vendor, MHz_hi, mV_hi, MHz_lo, mV_lo) to match
CPUs more correctly than parsing the brand string.
* Add support for a bunch of models.
* Create a fake table on the fly if the CPU is unknown (there's no
table for it) with the current/highest/lowest frequency.
specialreg.h:
* Add some MSRs needed to get the bus clock value.
identcpu.c:
* Add functions specific to Pentium III, Pentium M and Pentium 4 to
get the bus clock value.
Note that the new fake table code from Simon Burge is not included on
this commit.
Ok'ed by simonb and dogcow.
 1.9.8.1 03-Sep-2006  yamt sync with head.
 1.9.4.1 09-Sep-2006  rpaulo sync with head
 1.11.6.1 10-Feb-2007  tron Pull up following revision(s) (requested by chs in ticket #411):
sys/arch/x86/include/specialreg.h: revision 1.14
sys/arch/i386/i386/identcpu.c: revision 1.53
PR/35430: Izumi Tsutsui: Identify amd64 CPU on NetBSD/i386
 1.11.2.2 01-Feb-2007  ad Sync with head.
 1.11.2.1 12-Jan-2007  ad Sync with head.
 1.15.14.2 03-Oct-2007  garbled Sync with HEAD
 1.15.14.1 26-Jun-2007  garbled Sync with HEAD.
 1.15.8.1 11-Jul-2007  mjf Sync with head.
 1.15.6.4 03-Dec-2007  ad Sync with HEAD.
 1.15.6.3 09-Oct-2007  ad Sync with head.
 1.15.6.2 15-Jul-2007  ad Sync with head.
 1.15.6.1 09-Jun-2007  ad Sync with head.
 1.15.2.2 17-Feb-2007  daniel Add an opencrypto provider for the AES xcrypt instructions found on VIA
C5P and later cores (also known as 'ACE', which is part of the VIA PadLock
security engine). Ported from OpenBSD.

Reviewed on tech-crypto and port-i386, no objections to committing this.
 1.15.2.1 17-Feb-2007  daniel file specialreg.h was added on branch yamt-idlelwp on 2007-02-17 00:28:26 +0000
 1.18.12.1 06-Oct-2007  yamt sync with head.
 1.18.10.3 23-Mar-2008  matt sync with HEAD
 1.18.10.2 09-Jan-2008  matt sync with HEAD
 1.18.10.1 06-Nov-2007  matt sync with HEAD
 1.18.8.2 29-Oct-2007  joerg Sync with HEAD.
 1.18.8.1 02-Oct-2007  joerg Sync with HEAD.
 1.19.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.21.8.1 02-Jan-2008  bouyer Sync with HEAD
 1.21.4.1 26-Dec-2007  ad Sync with head.
 1.21.2.1 18-Feb-2008  mjf Sync with HEAD.
 1.23.12.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.23.12.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.23.10.7 09-Oct-2010  yamt sync with head
 1.23.10.6 11-Aug-2010  yamt sync with head.
 1.23.10.5 11-Mar-2010  yamt sync with head
 1.23.10.4 19-Aug-2009  yamt sync with head.
 1.23.10.3 20-Jun-2009  yamt sync with head
 1.23.10.2 16-May-2009  yamt sync with head
 1.23.10.1 04-May-2009  yamt sync with head.
 1.23.8.1 04-Jun-2008  yamt sync with head
 1.23.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.23.6.2 28-Sep-2008  mjf Sync with HEAD.
 1.23.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.24.4.1 19-Oct-2008  haad Sync with HEAD.
 1.31.12.1 21-Apr-2010  matt sync to netbsd-5
 1.31.8.6 27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.31.8.5 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.31.8.4 24-Oct-2010  jym Sync with HEAD
 1.31.8.3 01-Nov-2009  jym Sync with HEAD.
 1.31.8.2 31-May-2009  jym Sync with HEAD.
 1.31.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.31.4.4 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1968):
sys/arch/x86/include/specialreg.h: revision 1.72 via patch

Backport CPUID_TO_*() macros. Old macros are kept for compatibility.
 1.31.4.3 19-Jun-2013  bouyer Pull up following revision(s) (requested by msaitoh in ticket #1847):
sys/arch/x86/include/mtrr.h: revision 1.5
sys/arch/x86/x86/mtrr_i686.c: revision 1.25
sys/arch/x86/include/specialreg.h: revision 1.55
Increase MTRR_I686_NVAR_MAX from 8 to 16. Avoids
"FIXME: more than 8 MTRRs (10)" message on booting Thinkpad W520 and
similar. While here replace a magic number with MTRR_I686_NVAR_MAX * 2
 1.31.4.2 28-Nov-2012  riz branches: 1.31.4.2.2;
Pull up following revision(s) (requested by christos in ticket #1819):
sys/arch/x86/include/specialreg.h: revision 1.58
Add VIA Eden FCR MSR.
 1.31.4.1 16-Jun-2009  snj branches: 1.31.4.1.2;
Pull up following revision(s) (requested by rmind in ticket #789):
sys/arch/x86/include/specialreg.h: revision 1.36
sys/arch/x86/x86/cpu_topology.c: revision 1.2
Add CPU topology detection support for AMD processors.
Tested on the following AMD CPUs:
- Family 15, model 65
- Family 15, model 67
- Family 15, model 75
- Family 16, model 2
- Family 17, model 3
Reviewed (slightly older version of patch) by <yamt>.
 1.31.4.2.2.1 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1968):
sys/arch/x86/include/specialreg.h: revision 1.72 via patch

Backport CPUID_TO_*() macros. Old macros are kept for compatibility.
 1.31.4.1.2.1 01-Jun-2015  sborrill Pull up the following revision(s) (requested by msaitoh in ticket #1968):
sys/arch/x86/include/specialreg.h: revision 1.72 via patch

Backport CPUID_TO_*() macros. Old macros are kept for compatibility.
 1.31.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.38.4.2 05-Mar-2011  rmind sync with head
 1.38.4.1 30-May-2010  rmind sync with head
 1.38.2.3 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.38.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.38.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.49.4.2 05-Mar-2011  bouyer Sync with HEAD
 1.49.4.1 17-Feb-2011  bouyer Sync with HEAD
 1.49.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.53.6.6 02-Jun-2012  mrg sync to latest -current.
 1.53.6.5 29-Apr-2012  mrg sync to latest -current.
 1.53.6.4 06-Mar-2012  mrg sync to -current
 1.53.6.3 06-Mar-2012  mrg sync to -current
 1.53.6.2 04-Mar-2012  mrg sync to latest -current.
 1.53.6.1 18-Feb-2012  mrg merge to -current.
 1.53.2.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.53.2.4 23-Jan-2013  yamt sync with head
 1.53.2.3 30-Oct-2012  yamt sync with head
 1.53.2.2 23-May-2012  yamt sync with head.
 1.53.2.1 17-Apr-2012  yamt sync with head
 1.55.2.5 26-Jan-2015  martin Pull up the following, requested by msaitoh in ticket #1240:

sys/arch/x86/include/specialreg.h 1.72 via patch

Add CPUID_TO_*() macros to avoid bug. Old macros are kept for compatibility.
See http://mail-index.netbsd.org/port-amd64/2013/11/12/msg001978.html
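
For orientation, a sketch of how a CPUID leaf 1 %eax signature is usually
split into family, model and stepping. The EX_* names and the two helper
functions are hypothetical stand-ins, not the CPUID_TO_*() macros themselves;
the combination rule follows the generally documented leaf 1 layout (extended
family is added only when the base family is 0xf, extended model is used only
for base family 0x6 or 0xf).

```c
#include <stdint.h>

/* Raw fields of the CPUID leaf 1 %eax signature (hypothetical names). */
#define EX_STEPPING(sig)	((sig) & 0xfu)
#define EX_BASEMODEL(sig)	(((sig) >> 4) & 0xfu)
#define EX_BASEFAMILY(sig)	(((sig) >> 8) & 0xfu)
#define EX_EXTMODEL(sig)	(((sig) >> 16) & 0xfu)
#define EX_EXTFAMILY(sig)	(((sig) >> 20) & 0xffu)

/* Display family: the extended family only counts when base family is 0xf. */
static unsigned
ex_family(uint32_t sig)
{
	unsigned fam = EX_BASEFAMILY(sig);

	if (fam == 0xf)
		fam += EX_EXTFAMILY(sig);
	return fam;
}

/* Display model: the extended model supplies the upper four bits. */
static unsigned
ex_model(uint32_t sig)
{
	unsigned base = EX_BASEFAMILY(sig);
	unsigned model = EX_BASEMODEL(sig);

	if (base == 0x6 || base == 0xf)
		model |= EX_EXTMODEL(sig) << 4;
	return model;
}
```

With the example signature 0x000906ea this yields family 0x6, model 0x9e and
stepping 0xa; with 0x00870f10 it yields family 0x17 and model 0x71.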
 1.55.2.4 29-Dec-2014  martin Pull up the following revisions, requested by msaitoh in #1220:

sys/arch/x86/include/specialreg.h 1.59-1.71, 1.73-1.81 (patch)

Update x86 special register definitions:
- Add latest CR4 bits.
- Recognize the P1GB and RDTSCP which were AMD-only on Intel HW too.
- Add some missing bit definitions for CPUID2 and those for XCR0.
- Fix CPUID_AMD_FLAGS4 to not try to print bits \41 and \42.
- Correct the comment about the extended family and model bits.
- Add some definitions related to the processor extended state
enumeration.
- Add Intel Structured Extended Feature leaf (Fn0000_0007).
- Sort CPUID definitions in initial EAX value.
- Add Intel Deterministic Cache Parameter Leaf (CPUID leaf 4).
- Add some AMD Fn80000001 extended features %ecx bits definitions.
- "s/MXX/MMXX/" because this bit is "MMX eXtension".
- Add some definitions for cpu 'extended state' enumeration
(Fn0000000d).
- Add Energy Performance Bias bit of Fn0000_0006 %ecx.
- Add MSR_IA32_PLATFORM_ID (0x017)
- Modify comment.
- Style fix.
 1.55.2.3 07-May-2012  riz Pull up following revision(s) (requested by christos in ticket #220):
sys/arch/x86/x86/identcpu.c: revision 1.31
sys/arch/x86/include/specialreg.h: revision 1.58
PR/41267: Andrius V: 5.0 RC4 does not detect second CPU in VIA. VIA Eden cpuid
lies about its ability to do cmpxchg8b. Turn the feature on using the FCR MSR.
Needs pullup to both 5 and 6.
Add VIA Eden FCR MSR.
 1.55.2.2 09-Apr-2012  riz Pull up following revision(s) (requested by chs in ticket #168):
sys/arch/x86/include/specialreg.h: revision 1.57
sys/arch/x86/x86/errata.c: revision 1.20
bring in this change from openbsd:
Implement the AMD suggested workaround for family 10h & 12h errata 721
"Processor May Incorrectly Update Stack Pointer" by setting a bit
marked 'reserved' in an MSR that is only "documented" to exist on 12h.
 1.55.2.1 05-Mar-2012  sborrill Pull up the following revision(s) (requested by bouyer in ticket #80):
sys/arch/xen/x86/x86_xpmap.c: revision 1.42
sys/arch/x86/include/specialreg.h: revision 1.56
sys/arch/amd64/amd64/machdep.c: revision 1.179
sys/arch/i386/i386/locore.S: revision 1.97
sys/arch/i386/i386/machdep.c: revision 1.723 via patch
sys/arch/x86/include/cpu.h: revision 1.49

Fix possible FPU registers corruption on context switches.
Fix type of pointers passed to some hypercalls.
 1.59.2.5 03-Dec-2017  jdolecek update from HEAD
 1.59.2.4 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.59.2.3 23-Jun-2013  tls resync from head
 1.59.2.2 25-Feb-2013  tls resync with head
 1.59.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.63.6.2 18-May-2014  rmind sync with head
 1.63.6.1 28-Aug-2013  rmind sync with head
 1.78.4.6 09-Oct-2018  snj Pull up following revision(s) (requested by msaitoh in ticket #1636):
sys/arch/x86/include/cacheinfo.h: 1.23-1.26
sys/arch/x86/include/cpu.h: 1.70
sys/arch/x86/include/specialreg.h: 1.91-1.93,1.98,1.100,1.102-1.124,1.126,1.130 via patch
sys/arch/x86/x86/cpu_topology.c: 1.10
sys/arch/x86/x86/identcpu.c: 1.56-1.57,1.70 via patch
usr.sbin/cpuctl/arch/i386.c: 1.71,1.75-1.79,1.81-1.85 via patch
Add some register definitions for x86:
- Add CLWB bit.
- Fix a few (unused) MSR values, and add some bit definitions of
MSR_EFER from Murray Armfield in PR#42861.
- CPUID_CFLUSH bit is not for CFLUSH insn but CLFLUSH insn, so modify
comments and snprintb() string.
- Define CPUID Fn00000001 %ebx bits and use them.
No functional change.
- Add Structured Extended Flags Enumeration Leaf's bit definitions:
AVX512_{IFMA,VBMI2,VNNI,BITALG,VPOPCNTDQ,4VNNIW,4FMAPS},GFNI&VAES.
- Add Turbo Boost Max Technology 3.0 bit.
- Add AMD SVM features definitions.
- Add Intel cpuid 7 %edx IBRS and STIBP bit definitions.
- Fix swapped comments for EFER LME and LMA
- Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add MSR_IA32_ARCH_CAPABILITIES definition.
- Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.
- Add Intel Deterministic Address Translation Parameter Leaf(0x18)
definitions.
- s/CLFUSH/CLFLUSH/
- Add AMD's Disable Indirect Branch Predictor bit definition.
- Add the MSR bits definitions for IBRS, STIBP and IBPB.
- Add Intel Fn0000_0006 %eax new bit 14-20 (HWP stuff).
- Intel Fn0000_0007 %ecx bit 22 is for both RDPID and IA32_TSC_AUX.
- Add AMD's CPUID Fn80000001 %edx MMX and FXSR bit definitions.
- Add RDCL_NO and IBRS_ALL.
- Add SSBD and RSBA bit definitions.
- Add AMD's SSB bit definitions for F15H, F16H and F17H.
- Add cpuid 7 edx L1D_FLUSH bit.
- Add IA32_ARCH_SKIP_L1DFL_VMENTRY bit.
- Add IA32_FLUSH_CMD MSR.
- Add yet another Shared L2 TLB (2M/4M pages).
- Add 3way and 6way of L2 cache or TLB on AMD CPU.
- AMD L3 cache association bitfield is not 8bit but 4bit like the other
association bitfields.
- Sort entries. No functional change.
- Modify comment, fix typo in comment and add comment.
cpuctl(8):
- Add detection for Quark X1000, Xeon E5 v4, E7 v4,
Core i7-69xx Extreme Edition, Xeon Scalable (Skylake),
Xeon Phi [357]200 (Knights Landing), Atom (Goldmont),
Atom (Denverton), Future Core (Cannon Lake), Atom (Goldmont Plus),
Xeon Phi 7215, 7285 and 7295 (Knights Mill) and
7th or 8th gen Core (Kaby Lake, Coffee Lake).
- Print Structured Extended Feature leaf Fn0000_0007 %ebx on AMD, too.
- Print Fn0000_0007 %ecx on Intel.
- Print Intel cpuid 7 %edx.
- Parse the TLB info from `cpuid leaf 18H' on Intel processor.
- Use aprint_error_dev() for error output.
 1.78.4.5 08-Dec-2016  snj Pull up following revision(s) (requested by msaitoh in ticket #1285):
sys/arch/x86/include/cacheinfo.h: revision 1.22
sys/arch/x86/include/specialreg.h: revisions 1.87 and 1.90
usr.sbin/cpuctl/arch/i386.c: revisions 1.72-1.74
Changes for x86's cpuctl(8):
- Add Quark X1000, Xeon E[57] v4, Core i7-69xx Extreme, 7th gen Core,
Denverton, Xeon Phi [357]200, Future Xeon and Future Xeon Phi.
- Add SGX, UMIP, RDPID, SGXLC, AVX512DQ, AVX512BW and AVX512VL bit.
- Fix the bit location of CLFLUSHOPT.
- Add new TLB descriptor 0x64 and 0xc4.
 1.78.4.4 06-Mar-2016  martin branches: 1.78.4.4.2;
Pull up the following changes, requested by msaitoh in #1117:

sys/arch/x86/include/cacheinfo.h 1.20-1.21
sys/arch/x86/include/specialreg.h 1.83-1.86
usr.sbin/cpuctl/arch/i386.c 1.67-1.70

Changes for x86's cpuctl(8):
- Add some TLB information (index 0x6a-0x6d).
- Add Hardware-Controlled Performance States (HWP) bits, FPU Data
Pointer Updated Only bit and CLFLUSHOPT bit.
- Add some AMD bit definitions from "BIOS and Kernel Developer's Guide (BKDG)
for AMD Family 15h Models 60h-6Fh Processors".
- Add Xeon E5-4600 v3,
- Add Xeon E3-1200 v4 and v5.
- Add 6th gen Core, Xeon E3-1500 v5 and Xeon D-1500.
- Change CPU family 0x1c from "Atom Family" to "45nm Atom Family"
 1.78.4.3 09-May-2015  snj Pull up following revision(s) (requested by msaitoh in ticket #739):
sys/arch/x86/include/specialreg.h: revision 1.82
usr.sbin/cpuctl/arch/i386.c: revision 1.66
From Intel SDM:
- Add the Silicon Debug bit in CPUID Fn00000001 %ecx
- Add CPUID Fn0000_0007 %ecx bits
- Add comments.
--
Update some Intel CPU models (Sky Lake, Broadwell and Atom X[357]).
 1.78.4.2 09-Jan-2015  martin Pull up following revision(s) (requested by msaitoh in ticket #396):
sys/arch/x86/x86/cpu_ucode_intel.c: revision 1.6
sys/arch/x86/include/specialreg.h: revision 1.81
Use specialreg.h's definitions.
 1.78.4.1 12-Dec-2014  martin Pull up following revision(s) (requested by msaitoh in ticket #310):
sys/arch/x86/include/specialreg.h: revision 1.79-1.80
usr.sbin/cpuctl/arch/i386.c: revision 1.59
sys/arch/x86/include/cacheinfo.h: revision 1.19

Update some cpuid related values:
- Add XSAVECC, XGETBV, XSAVES, SMAP and PQE
- Change XINUSE to XGETBV
- Add new cache descriptor value (0xc3)
- Update signatures for the following CPUs:
- Core M-5xxx
- Core i7 Extreme
- Future Core (0x4e)
- Future Xeon (0x56)
 1.78.4.4.2.1 18-Jan-2017  skrll Sync with netbsd-5
 1.80.2.8 28-Aug-2017  skrll Sync with HEAD
 1.80.2.7 05-Feb-2017  skrll Sync with HEAD
 1.80.2.6 05-Oct-2016  skrll Sync with HEAD
 1.80.2.5 29-May-2016  skrll Sync with HEAD
 1.80.2.4 19-Mar-2016  skrll Sync with HEAD
 1.80.2.3 22-Sep-2015  skrll Sync with HEAD
 1.80.2.2 06-Jun-2015  skrll Sync with HEAD
 1.80.2.1 06-Apr-2015  skrll Sync with HEAD
 1.87.2.4 26-Apr-2017  pgoyette Sync with HEAD
 1.87.2.3 20-Mar-2017  pgoyette Sync with HEAD
 1.87.2.2 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.87.2.1 26-Jul-2016  pgoyette Sync with HEAD
 1.91.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.97.2.1 19-May-2017  pgoyette Resolve conflicts from previous merge (all resulting from $NetBSD
keyword expansion)
 1.98.2.28 29-Jul-2023  martin Pull up the following revisions, all via patch, requested by msaitoh
in ticket #1853:

sys/arch/x86/include/specialreg.h 1.204-1.206, 1.208

- Add Intel CPUID 0x07 %ecx bit 24 BUS_LOCK_DETECT.
- Add AMD CPUID 0x80000008 %ebx bit 30 IBPB_RET and CPUID 0x8000000a
%edx bit 29 BusLockThreshold.
- Fix typo in comment.
 1.98.2.27 25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1851):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.98.2.26 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #1827):

sys/arch/x86/include/specialreg.h: revision 1.202
sys/arch/x86/include/specialreg.h: revision 1.203
usr.sbin/cpuctl/arch/i386.c: revision 1.136

Add some CPUID bits from PPR for AMD Family 19h Model 61h Revision B1.

Add AMD CPUID Fn0000_0008 %ebx bit 3 INVLPGB.
 1.98.2.25 23-Jan-2023  martin Pull up the following revisions, requested by msaitoh in ticket #1791:

sys/arch/x86/include/specialreg.h 1.193-1.198

- Add CPUID Fn0000_0006 %eax bit 24 IA32_THERM_INTERRUPT MSR bit 25
Hardware Feedback Notification support.
- Add CPUID Fn0000_0007 %ecx bit 29 ENQCMD.
- Add CPUID Fn0000_0007 %edx bit 1 SGX-KEYS.
- Add CPUID Fn0000_0007 %edx bit 5 UINTR(User INTeRrupts).
- Add CPUID Fn0000_0007 %edx bit 11 RTM_ALWAYS_ABORT.
- Add CPUID Fn0000_0007 %edx bit 22 AMX_BF16.
- Add CPUID Fn0000_0007 %edx bit 23 AVX512_FP16.
- Add CPUID Fn0000_0007 %edx bit 24 AMX_TILE.
- Add CPUID Fn0000_0007 %edx bit 25 AMX_INT8.
- Add CPUID Fn0000_0007 sub-leaf 1 %edx bit 18 CET_SSS.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx definitions.
- Add CPUID Fn0000_000d sub-leaf 1 %eax bit 4 XFD.
- Add CPUID Fn0000_001d Tile Information.
- Add CPUID Fn0000_001e TMUL Information.
- Add CPUID Fn8000_0007 %eax RAS capabilities.
- Add CPUID Fn8000_0008 %ebx BTC_NO.
- Add cpuid Fn8000_000a x2AVIC, VNMI, IBSVIRT and ROGPT.
- Add CPUID Fn8000_001b Instruction-Based Sampling.
- Add CPUID Fn8000_001e Processor Topology Information.
- Add CPUID Fn8000_001f %eax RPMQUERY, VmplSSS, TscAuxVirt,
VmgexitParam, VirtualTomMsr, IbsVirtGuest, SmtProtection,
vsmCommPageMSR and NestedVirtSnpMsr.
- Add CPUID Fn8000_0021 AMD Extended Features Identification 2.
- Add CPUID Fn8000_0022 AMD Extended Performance Monitoring and Debug.
- Rename HW_FEEDBACK to HWI (Hardware Feedback Interface).
- Rename TSX_FORCE_ABORT to RTM_FORCE_ABORT.
- Modify comment. Both Intel and AMD support CPUID Fn0000000b.
- Modify comment. Hybrid Information -> Native Model ID Information.
- Use __BIT(). Add comment. Whitespace fix.
 1.98.2.24 15-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1775):

sys/arch/x86/include/specialreg.h: revision 1.189
usr.sbin/cpuctl/arch/i386.c: revision 1.128
sys/arch/x86/include/specialreg.h: revision 1.190
sys/arch/x86/include/specialreg.h: revision 1.191
sys/arch/x86/include/specialreg.h: revision 1.192

s/shareing/sharing/. No functional change.

Add top-down slots event bit of architectural performance monitoring leaf.

Modify CPUID Fn0000000a %ebx's string. Add new string for %ecx.

Modify output of CPUID Fn0000000a.
old:
cpu0: Perfmon-eax 0x8300805<VERSION=0x5,GPCounter=0x8,GPBitwidth=0x30>
cpu0: Perfmon-eax 0x8300805<Vectorlen=0x8>
cpu0: Perfmon-edx 0x8604<FixedFunc=0x4,FFBitwidth=0x30,ANYTHREADDEPR>
new:
cpu0: Perfmon: Ver. 5
cpu0: Perfmon: General: bitwidth 48, 8 counters
cpu0: Perfmon: General: avail 0xff<CORECYCL,INST,REFCYCL,LLCREF,LLCMISS,BRINST>
cpu0: Perfmon: General: avail 0xff<BRMISPR,TOPDOWNSLOT>
cpu0: Perfmon: Fixed: bitwidth 48, 4 counters
cpu0: Perfmon: Fixed: avail 0xf<INST,CLK_CORETHREAD,CLK_REF_TSC,TOPDOWNSLOT>
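
The old-format lines above pack several fields into one register value. A minimal sketch of how 0x8300805 (%eax) and 0x8604 (%edx) unpack into the numbers shown in the new output, assuming the CPUID leaf 0x0a field layout from the Intel SDM. The struct and function names are hypothetical and are not taken from specialreg.h or cpuctl:

```c
#include <stdint.h>

/* Hypothetical holder for the decoded fields; not a NetBSD structure. */
struct perfmon_info {
	unsigned version;	/* %eax bits 7:0 */
	unsigned gp_counters;	/* %eax bits 15:8 */
	unsigned gp_bitwidth;	/* %eax bits 23:16 */
	unsigned ebx_veclen;	/* %eax bits 31:24, length of the %ebx bit vector */
	unsigned fixed_counters;/* %edx bits 4:0 */
	unsigned fixed_bitwidth;/* %edx bits 12:5 */
	int anythread_depr;	/* %edx bit 15 */
};

/* Split the raw %eax and %edx of CPUID leaf 0x0a into their fields. */
static struct perfmon_info
perfmon_decode(uint32_t eax, uint32_t edx)
{
	struct perfmon_info pi;

	pi.version = eax & 0xff;
	pi.gp_counters = (eax >> 8) & 0xff;
	pi.gp_bitwidth = (eax >> 16) & 0xff;
	pi.ebx_veclen = (eax >> 24) & 0xff;
	pi.fixed_counters = edx & 0x1f;
	pi.fixed_bitwidth = (edx >> 5) & 0xff;
	pi.anythread_depr = (edx >> 15) & 1;
	return pi;
}
```

Calling perfmon_decode(0x08300805, 0x8604) gives version 5, 8 general counters of 48 bits, and 4 fixed counters of 48 bits, which matches the "new:" lines.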

Update some AMD CPUID bits:
- Rename FSREP_MOV to FSRM.
- Add Memory Bandwidth Enforcement (MBE)
- Add AMD's PPIN. Rename CPUID_SEF_PPIN to CPUID_SEF_INTEL_PPIN.
- Add Collaborative Processor Performance Control (CPPC).
- Add HOST_MCE_OVERRIDE.
- Add some unknown bits as Bxx.
- Add comments.
- Use __BIT().
 1.98.2.23 31-Jan-2022  martin Pull up the following revisions (all via patch), requested by
msaitoh in ticket #1731:

sys/arch/x86/include/specialreg.h 1.179-1.188

- Add CPUID definitions of Last Branch Record, Thread Director,
AVX version of VNNI, Fast short REP MOV, HRESET, PPIN, Architectural
LBR, Linear Address Masking and Hybrid Information from the latest
Intel SDM.
- Add CPUID definitions of AddrMaskExt, INT_WBINVD, IbrsSameMode,
EferLmsleUnsupported, PSFD and SecureTSC from AMD APM.
- Print CLFSH instead of CLFLUSH because both Intel and AMD documents
say so.
- Modify comment. Add comment. Fix typo. Use __BIT(). KNF. Sort lines.
No functional change.
 1.98.2.22 08-Dec-2021  martin Pull up the following, requested by msaitoh in ticket #1720:

sys/arch/x86/include/specialreg.h 1.146, 1.171,
1.173-1.178 via patch
sys/arch/x86/x86/identcpu.c 1.106, 1.117,
1.122 via patch
sys/arch/x86/x86/pmap.c patch
sys/external/bsd/drm2/drm/drm_cache.c 1.14
usr.sbin/cpuctl/arch/i386.c 1.114-1.117


- Add PT, PKRU, HDC, LA57, PKE, PKS, CET, CET_U, CET_S, HWP, KL,
AVX512_BF16, TME_EN and PCONFIG.
- Rename some macros to match the x86 specification and the other OSes.
- Print CPUID 0x80000008 %ebx on Intel, too.
- Print CPUID leaf 7 subleaf 1.
- Identify Tiger Lake, 3rd gen Xeon Scalable (Ice Lake), Elkhart Lake
and Jasper Lake.
- Remove a few unused MSRs.
- Add comment.
- KNF. Whitespace fix.
 1.98.2.21 05-Aug-2020  martin Accidentally not committed for ticket #1595:

sys/arch/x86/include/specialreg.h 1.129 via patch

Add six errata for AMD Family 17h (Ryzen etc).
 1.98.2.20 05-Aug-2020  martin Pull up the following revisions, requested by msaitoh in ticket #1588:

sys/arch/x86/include/specialreg.h 1.162-1.168 via patch

- AMD CPUID Fn8000_000a %edx bit 20 is "SPEC_CTRL".
- Add some bit definitions of AMD's CPUID Fn8000_001f Encrypted Memory
features.
- Add AMD INVLPGB/TLBSYNC hypervisor enable in VMCB and TLBSYNC
intercept bit.
- Add AMD MSR_DE_CFG's bit 1 as DE_CFG_LFENCE_SERIALIZE.
- Add some definitions for Intel:
- Add CPUID leaf 6 %eax bit 19 for HW_FEEDBACK* and
IA32_PACKAGE_THERM* MSRs.
- Add CPUID leaf 7 %ecx bit 31 for Protection Keys.
- Add definition of Load only TLB and Store only TLB.
- Add IF_PSCHANGE_MC_NO bit of IA32_ARCH_CAPABILITIES
- Fix HWP_IGNIDL.
- Add CPUID 7 %edx bit 9 "SRBDS_CTRL"
- Modify comment. Style and fix typo.
 1.98.2.19 15-Apr-2020  martin Pull up the following, requested by msaitoh in ticket #1530:

sys/arch/x86/x86/procfs_machdep.c 1.33-1.36
sys/arch/x86/x86/tsc.c 1.40
sys/arch/x86/x86/specialreg.h 1.159-1.161
usr.sbin/cpuctl/arch/i386.c 1.109-1.110 via patch

- Print avx512ifma, cqm_mbm_total, cqm_mbm_local, waitpkg, rdpru,
Fast Short Rep Mov(fsrm), AVX512_VP2INTERSECT, SERIALIZE and
TSXLDTRK.
- Rename CPUID Fn8000_0007 %edx bit 8 from "TSC" to "ITSC"
(Invariant TSC) to avoid confusion.
- Print CPUID 0x80000007 %edx on both Intel and AMD.
- Remove ci_max_ext_cpuid from usr.sbin/cpuctl/arch/i386.c because it's
the same as ci_cpuid_extlevel.
- Use unsigned to avoid undefined behavior in procfs_getonefeatreg().
 1.98.2.18 31-Jan-2020  martin Pull up the following, requested by msaitoh in ticket #1494:

sys/arch/x86/include/specialreg.h 1.146, 1.151-1.154, 1.156 via patch
usr.sbin/cpuctl/arch/i386.c 1.105-1.107 via patch

- Add definitions of AMD's CPUID Fn8000_0008 %ebx.
- Add definitions of AMD's CPUID Fn8000_001f Encrypted Memory features.
- Add definition of AMD's CPUID Fn8000_000a %edx bit 11 "GMET".
- Define CPUID_AMD_SVM_PFThreshold correctly.
- Modify comment a bit for consistency.
- Call cpu_dcp_cacheinfo() only when the cpuid Topology Extension flag
is set on AMD processor.
- Fix typos.
 1.98.2.17 19-Nov-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1450):

usr.sbin/cpuctl/arch/i386.c: revision 1.108
sys/arch/x86/include/specialreg.h: revision 1.158

Add the following bit definitions from the latest Intel SDM:
- CET shadow stack
- Fast Short REP MOV
- Hybrid part
- CET Indirect Branch Tracking

0x7d and 0x7e are for 10th generation Core (Ice Lake).
 1.98.2.16 12-Nov-2019  martin Pull up following revision(s) (requested by maxv in ticket #1433):

sys/arch/x86/include/specialreg.h: revision 1.157
sys/arch/x86/x86/spectre.c: revision 1.31

Mitigation for CVE-2019-11135: TSX Asynchronous Abort (TAA).

Two sysctls are added:
machdep.taa.mitigated = {0/1} user-settable
machdep.taa.method = {string} constructed by the kernel

There are two cases:

(1) If the CPU is affected by MDS, then the MDS mitigation will also
mitigate TAA, and we have nothing else to do. We make the 'mitigated' leaf
read-only, and force:

machdep.taa.mitigated = machdep.mds.mitigated
machdep.taa.method = [MDS]

The kernel already enables the MDS mitigation by default.

(2) If the CPU is not affected by MDS but is affected by TAA, then we use
the new TSX_CTRL MSR to disable RTM. This MSR is provided via a microcode
update, now available on the Intel website. The kernel will automatically
enable the TAA mitigation if the updated microcode is present. If the new
microcode is not present, the user can load it via cpuctl, and set
machdep.taa.mitigated=1.
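
The two cases above can be sketched as a small decision function. This is an illustration only, not the code in spectre.c: the enum labels, struct and function names are hypothetical (the log names only the "[MDS]" method string), and the behaviour for a CPU that is affected by neither issue is an assumption.

```c
#include <stdbool.h>

/* Hypothetical labels; only the "[MDS]" case is named in the log above. */
enum taa_method {
	TAA_METHOD_NONE,	/* nothing applied */
	TAA_METHOD_MDS,		/* covered by the MDS mitigation */
	TAA_METHOD_RTM_DISABLE	/* RTM disabled through TSX_CTRL */
};

struct taa_state {
	enum taa_method method;
	bool mitigated;
	bool read_only;		/* 'mitigated' leaf is not user-settable */
};

static struct taa_state
taa_decide(bool mds_affected, bool mds_mitigated, bool taa_affected,
    bool have_tsx_ctrl)
{
	struct taa_state st = { TAA_METHOD_NONE, false, false };

	if (mds_affected) {
		/* Case (1): mirror the MDS state and lock the leaf. */
		st.method = TAA_METHOD_MDS;
		st.mitigated = mds_mitigated;
		st.read_only = true;
	} else if (taa_affected && have_tsx_ctrl) {
		/* Case (2): the updated microcode provides TSX_CTRL. */
		st.method = TAA_METHOD_RTM_DISABLE;
		st.mitigated = true;
	}
	/* Otherwise: unaffected (assumed), or affected without new microcode. */
	return st;
}
```

In the last branch the state stays unmitigated, which corresponds to the situation where the user loads microcode via cpuctl and then sets machdep.taa.mitigated=1.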
 1.98.2.15 16-Aug-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1338):

usr.sbin/cpuctl/arch/i386.c: revision 1.104
sys/arch/x86/x86/identcpu.c: revision 1.93
sys/arch/x86/include/cacheinfo.h: revision 1.28
sys/arch/x86/include/specialreg.h: revision 1.150

- AMD CPUID Fn8000_001d Cache Topology Information leaf is almost the same as
Intel Deterministic Cache Parameter Leaf(0x04), so make new
cpu_dcp_cacheinfo() and share it.
- AMD's L2 and L3's cache descriptor's definition is the same, so use one
common definition.
- KNF.

XXX Split some common functions to new identcpu_subr.c or use #ifdef _KERNEL
... #endif in identcpu.c to share from both kernel and cpuctl?
 1.98.2.14 17-Jul-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1293):

sys/arch/x86/include/specialreg.h: revision 1.149

Define some new bits of CPUID Fn8000_0007 %edx AMD Advanced Power Management
leaf.
 1.98.2.13 29-May-2019  martin Pullup the following, requested by msaitoh in ticket #1270:

sys/arch/x86/include/specialreg.h 1.143, 1.145 via patch
sys/arch/x86/x86/procfs_machdep.c 1.30

Add TSX_FORCE_ABORT related definitions.
Add cpuid7 edx bit 10 "MD_CLEAR".
 1.98.2.12 14-May-2019  martin Pull up following revision(s) (requested by maxv in ticket #1269):

sys/arch/amd64/amd64/locore.S: revision 1.181 (adapted)
sys/arch/amd64/amd64/amd64_trap.S: revision 1.47 (adapted)
sys/arch/x86/include/specialreg.h: revision 1.144 (adapted)
sys/arch/amd64/include/frameasm.h: revision 1.43 (adapted)
sys/arch/x86/x86/spectre.c: revision 1.27 (adapted)

Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).
It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
 1.98.2.11 12-Feb-2019  martin Actually pull up rev 1.139 (as claimed, but not done in previous),
requested by msaitoh in ticket #1187:

Fix bitstring format of Intel CPUID Architectural Performance Monitoring
Fn0000000a %ebx.
 1.98.2.10 11-Feb-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #1187):

usr.sbin/cpuctl/arch/i386.c: revision 1.92
sys/arch/x86/include/specialreg.h: revision 1.138

Add new CPUID flags WAITPKG, CLDEMOTE, MOVDIRI, MOVDIR64B and
IA32_CORE_CAPABILITIES from the latest Intel SDM.

Add Ice Lake and Tremont from the latest Intel SDM.

Fix bitstring format of Intel CPUID Architectural Performance Monitoring
Fn0000000a %ebx.
 1.98.2.9 27-Dec-2018  martin Pull up following revision(s) (requested by maxv in ticket #1148):

sys/arch/x86/x86/identcpu.c: revision 1.81
sys/arch/x86/x86/identcpu.c: revision 1.82
sys/arch/x86/x86/identcpu.c: revision 1.84
sys/arch/x86/include/specialreg.h: revision 1.131

Declare the MSR_VIA_ACE values as macros, and use a consistent naming,
similar to the rest of the file.

I'm wondering if I'm not fixing a huge bug here. The ECX8 value we were
using was wrong: ECX8 is bit 1, not bit 0. Bit 0 is ALTINST, an alternate
ISA, which is now known to be backdoored.

So it looks like we were explicitly enabling the backdoor.

Not tested, because I don't have a VIA cpu.

-

Merge the VIA detection code into cpu_probe_c3.

-

Explicitly disable ALTINST on VIA, in case it isn't disabled by default
already (the 'VIA cpu backdoor').
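
A minimal sketch of the off-by-one-bit problem described above, working on a plain integer rather than a real MSR. The macro and function names are hypothetical; only the bit positions (ALTINST is bit 0, ECX8 is bit 1) come from the log entry:

```c
#include <stdint.h>

/* Hypothetical names; bit positions as stated in the log entry above. */
#define VIA_FCR_ALTINST	(UINT64_C(1) << 0)	/* alternate ISA */
#define VIA_FCR_ECX8	(UINT64_C(1) << 1)	/* the bit the old code meant */

/* Old behaviour: the "ECX8" constant was really bit 0, so ALTINST got set. */
static uint64_t
via_fcr_old(uint64_t fcr)
{
	return fcr | UINT64_C(1);
}

/* Intended behaviour: set ECX8 and make sure ALTINST is clear. */
static uint64_t
via_fcr_fixed(uint64_t fcr)
{
	return (fcr | VIA_FCR_ECX8) & ~VIA_FCR_ALTINST;
}
```

With a starting value of 0, via_fcr_old() yields a value with ALTINST set, while via_fcr_fixed() yields only ECX8.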
 1.98.2.8 04-Dec-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #1120):

usr.sbin/cpuctl/arch/i386.c: revision 1.85
usr.sbin/cpuctl/arch/i386.c: revision 1.86
usr.sbin/cpuctl/arch/i386.c: revision 1.87
usr.sbin/cpuctl/arch/i386.c: revision 1.88
usr.sbin/cpuctl/arch/i386.c: revision 1.89
usr.sbin/cpuctl/arch/i386.c: revision 1.90
sys/arch/x86/include/specialreg.h: revision 1.132
sys/arch/x86/include/specialreg.h: revision 1.133
sys/arch/x86/include/specialreg.h: revision 1.134
sys/arch/x86/include/specialreg.h: revision 1.135
sys/arch/x86/include/specialreg.h: revision 1.136
sys/arch/x86/x86/cpu_topology.c: revision 1.14

Add MAWAU (for BND{LD,ST}X instruction) from the latest Intel SDM.

Whitespace fix. No functional change.

Modify comment. No functional change:
- AMD also has CPUID 0x06 and 0x0d.
- PCOMMIT was obsoleted.
- Use ci_feat_val[7] as CPUID 7 %edx to match x86/cpu.h
- AMD also has CPUID 6.
- Remove unused code for coretemp.
- Consistently use descs[] instead of data[].
- AMD also reports CPUID 7's highest subleaf. Print it.
- Use macro.
Add Intel CPUID Extended Topology Enumeration Fn0000000b definitions.
Decode package, core and SMT id if CPUID 0x0b is available on Intel processor.

If the value is different from the kernel value, we should fix the kernel code.

TODO: Use 0x1f if it's available.

Add Intel/AMD MONITOR/MWAIT leaf.
Decode Intel/AMD MONITOR/MWAIT leaf.

Add Intel CPUID Architectural Performance Monitoring leaf Fn0000000a.

Print Intel CPUID Architectural Performance Monitoring leaf Fn0000000a.
 1.98.2.7 23-Sep-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #1026):

sys/arch/x86/x86/procfs_machdep.c: revision 1.24
sys/arch/x86/include/specialreg.h: revision 1.130

OK'd by maxv:
- Add cpuid 7 edx L1D_FLUSH bit.
- Add IA32_ARCH_SKIP_L1DFL_VMENTRY bit.
- Add IA32_FLUSH_CMD MSR.
 1.98.2.6 13-Jul-2018  martin Pull up following revision(s) (requested by maya in ticket #912):

sys/arch/x86/x86/identcpu.c: revision 1.79
sys/arch/x86/include/specialreg.h: revision 1.127

Disable MWAIT/MONITOR on Apollo Lake CPUs to work around the APL30 erratum.

We use MWAIT/MONITOR to hatch secondary CPUs. The erratum means that
the wakeup may not happen, so SMP boot fails.
Use wrmsr to disable it in hardware too, for extra paranoia.

PR port-amd64/53420,
also reported on netbsd-users by joern clausen and ssartor.
 1.98.2.5 09-Jun-2018  martin Pullup the following revisions, requested by maxv in ticket #865:

sys/arch/amd64/amd64/machdep.c 1.303 (patch)
sys/arch/amd64/conf/GENERIC 1.492 (patch)
sys/arch/amd64/conf/files.amd64 1.103 (patch)
sys/arch/i386/i386/machdep.c 1.806 (patch)
sys/arch/i386/conf/GENERIC 1.1179 (patch)
sys/arch/i386/conf/files.i386 1.393 (patch)
sys/arch/x86/include/cpu.h 1.91 (patch)
sys/arch/x86/include/specialreg.h upto 1.126 (patch)
sys/arch/x86/x86/x86_machdep.c upto 1.115 (patch, adapted)
sys/arch/x86/x86/spectre.c upto 1.19 (patch, adapted,
no IBRS,
SpectreV2 mitigations not
enabled by default)

Backport the hardware SpectreV2 and SpectreV4 mitigations.
 1.98.2.4 18-Apr-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #778):

sys/arch/x86/include/specialreg.h: revision 1.118,1.119

From the latest Intel SDM:
- Add Intel Fn0000_0006 %eax new bit 14-20 (HWP stuff).
- Intel Fn0000_0007 %ecx bit 22 is for both RDPID and IA32_TSC_AUX.

Add some bit definitions of AMD Fn80000001 %edx:
- MMX
- FXSR
 1.98.2.3 31-Mar-2018  martin Pull up following revision(s) (requested by maxv in ticket #678):

sys/arch/x86/include/specialreg.h: revision 1.115-1.117,1.120

Add IC_CFG.DIS_IND: "Disable Indirect Branch Predictor". Available (at
least) on AMD Families 10h, 12h and 16h.

Add the IBRS and STIBP MSRs.

... and also add IBPB ...

Add RDCL_NO and IBRS_ALL.
 1.98.2.2 16-Mar-2018  martin Pull up following revision(s) (requested by msaitoh in ticket #633):
sys/arch/x86/include/specialreg.h: revision 1.107
sys/arch/x86/include/specialreg.h: revision 1.108
sys/arch/x86/include/specialreg.h: revision 1.109
sys/arch/x86/include/cacheinfo.h: revision 1.23
sys/arch/x86/include/specialreg.h: revision 1.110
sys/arch/x86/include/specialreg.h: revision 1.111
sys/arch/x86/include/specialreg.h: revision 1.112
sys/arch/x86/include/specialreg.h: revision 1.113
sys/arch/x86/include/specialreg.h: revision 1.114
usr.sbin/cpuctl/arch/i386.c: revision 1.79
sys/arch/x86/x86/identcpu.c: revision 1.70
sys/arch/x86/include/specialreg.h: revision 1.106

Add comment.

Add Intel cpuid 7 %edx IBRS(IBPB Speculation Control) and
STIBP(STIBP Speculation Control) from OpenBSD.

Print Intel cpuid 7 %edx.

Example output of cpuctl -v identify 0:
+cpu0: 00000007: 00000000 000027ab 00000000 0c000000
(snip)
+cpu0: SEF edx 0xc000000<IBRS,STIBP>
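
The %edx value 0x0c000000 in the example output has bits 26 and 27 set. A minimal sketch of that decoding, assuming the Intel SDM bit positions for CPUID leaf 7 %edx (IBRS/IBPB is bit 26, STIBP is bit 27, IA32_ARCH_CAPABILITIES is bit 29). The macro names, the "ARCH_CAP" label and the helper are hypothetical and may differ from specialreg.h and cpuctl:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical macro names; bit positions per the Intel SDM. */
#define SEF_EDX_IBRS		(UINT32_C(1) << 26)
#define SEF_EDX_STIBP		(UINT32_C(1) << 27)
#define SEF_EDX_ARCH_CAP	(UINT32_C(1) << 29)

/* Write a comma-separated list of the known flag names set in edx. */
static void
sef_edx_names(uint32_t edx, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} tab[] = {
		{ SEF_EDX_IBRS, "IBRS" },
		{ SEF_EDX_STIBP, "STIBP" },
		{ SEF_EDX_ARCH_CAP, "ARCH_CAP" },
	};
	size_t i, used = 0;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < sizeof(tab) / sizeof(tab[0]); i++) {
		int n;

		if ((edx & tab[i].bit) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used != 0 ? "," : "", tab[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* output truncated, stop here */
		used += (size_t)n;
	}
}
```

For 0x0c000000 the buffer ends up holding "IBRS,STIBP", the same pair that the cpuctl line reports.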

fix swapped comments for EFER LME and LMA

- Add Intel cpuid 7 %edx bit 29 IA32_ARCH_CAPABILITIES supported bit.
- Add comment.
Add MSR_IA32_ARCH_CAPABILITIES definition.

Add IA32_SPEC_CTRL MSR and IA32_PRED_CMD MSR.

Add Intel Deterministic Address Translation Parameter Leaf(0x18) definitions.

Sort entries. No functional change.

s/CLFUSH/CLFLUSH/
No functional change.
 1.98.2.1 21-Nov-2017  martin Pull up following revision(s) (requested by msaitoh in ticket #365):
sys/arch/x86/include/specialreg.h: revision 1.99
usr.sbin/cpuctl/arch/i386.c: revision 1.75
usr.sbin/cpuctl/arch/i386.c: revision 1.76
usr.sbin/cpuctl/arch/i386.c: revision 1.77
usr.sbin/cpuctl/arch/i386.c: revision 1.78
sys/arch/x86/x86/identcpu.c: revision 1.56
sys/arch/x86/x86/identcpu.c: revision 1.57
sys/arch/x86/x86/cpu_topology.c: revision 1.10
sys/arch/x86/include/specialreg.h: revision 1.100
sys/arch/x86/include/specialreg.h: revision 1.101
sys/arch/x86/include/specialreg.h: revision 1.102
sys/arch/x86/include/specialreg.h: revision 1.103
sys/arch/x86/include/specialreg.h: revision 1.104
sys/arch/x86/include/specialreg.h: revision 1.105
Add EFER_TCE. This would be an interesting feature to have, since it
reduces the indirect cost of invlpg; but I'm not convinced the way we
flush upper-levels is correct for this yet.
Fix typo in comment
Add a comment about APICBASE_PHYSADDR. Has to do with PR/42597.
Define CPUID Fn00000001 %ebx bits and use them. No functional change.
Set ci->ci_cflush_lsize correctly. This bug was added in the last commit(1.56).
Add the following instruction bits in Structured Extended Flags Enumeration
Leaf from "Intel Architecture Instruction Set Extensions and Future Features
Programming Reference" (319433-030):
AVX512_IFMA
AVX512_VBMI
AVX512_VBMI2
GFNI
VAES
VPCLMULQDQ
AVX512_VNNI
AVX512_BITALG
AVX512_VPOPCNTDQ
AVX512_4VNNIW
AVX512_4FMAPS
- Print ci_feat_val[5] (Structured Extended Feature leaf Fn0000_0007 %ebx) on
AMD, too.
- Print ci_feat_val[6] (Fn0000_0007 %ecx) on Intel.
Update from the latest Intel SDM:
0x5c: Atom (Goldmont)
0x5f: Atom (Goldmont, Denverton)
0x7a: Atom (Goldmont Plus)
Add Turbo Boost Max Technology 3.0 bit.
Update from Intel SDM:
0x55: Xeon Scalable (Skylake)
0x57: Xeon Phi [357]200 (Knights Landing)
0x66: Future Core (Cannon Lake)
0x85: Future Xeon Phi (Knights Mill)
Add the following bits in AMD Fn8000000a %edx features (SVM features):
PFThreshold (PAUSE filter threshold)
AVIC (AMD virtual interrupt controller)
V_VMSAVE_VMLOAD (virtualized VMSAVE and VMLOAD)
vGIF (virtualized GIF)
 1.112.2.8 18-Jan-2019  pgoyette Synch with HEAD
 1.112.2.7 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.112.2.6 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.112.2.5 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.112.2.4 28-Jul-2018  pgoyette Sync with HEAD
 1.112.2.3 25-Jun-2018  pgoyette Sync with HEAD
 1.112.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.112.2.1 15-Mar-2018  pgoyette Synch with HEAD
 1.126.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.126.2.1 10-Jun-2019  christos Sync with HEAD
 1.150.2.16 20-Jul-2024  martin Pull up following revision(s) (requested by andvar in ticket #1855):

sys/arch/x86/x86/identcpu.c: revision 1.129
sys/arch/x86/include/specialreg.h: revision 1.212
sys/arch/x86/x86/identcpu.c: revision 1.130

Disable the VIA Alternate Instructions according to the VIA documentation:
* C7 and above do not support ALTINST, do not check or attempt to disable them.
* For VIA C3 Nehemiah check extended feature flags for support and status,
do not attempt to disable when AIS is not supported or enabled.
* For pre-Nehemiah models explicitly disable, if they are in the range
of documented models, flags aren't present to check the status on
these models.

Note: for pre-Nehemiah models there may be other functional side effects
depending on the version and stepping.

Explicit disabling of ALTINST was introduced with rev. 1.84 following
the discovery of some VIA CPUs having these instructions enabled by default
leading to the potential backdoor (aka rosenbridge).

Unfortunately, implementation used a wrong check (ACE supported flag),
which can be true for the later models, still supporting padlock features.

Setting ALTINST bit on those may have unexpected side effects like VIA C7 CPUID
instruction for temperature sensor not reporting correct value or
`cpuctl identify' not reporting certain CPU features. Similar side effects
can be observed even for Nehemiah models not supporting AIS instructions. This
change should limit possibility of such issues to only the pre-Nehemiah models,
not covered at all in the previous implementation.

Feature Control Register (FCR) macros were unified under one group and
consistent naming while implementing the change. A few comments were updated as well.
patch reviewed by Riastradh@ (thank you)

PR kern/58370

Move determination of the largest VIA CPU extended function value
to the intended place where the checks are performed.
Currently the value can be overridden while checking for the padlock features,
causing the check for the max function value to fail as a result.
 1.150.2.15 29-Jul-2023  martin Pull up the following revisions, all via patch, requested by msaitoh
in ticket #1669:

sys/arch/x86/include/specialreg.h 1.204-1.206, 1.208

- Add Intel CPUID 0x07 %ecx bit 24 BUS_LOCK_DETECT.
- Add AMD CPUID 0x80000008 %ebx bit 30 IBPB_RET and CPUID 0x8000000a
%edx bit 29 BusLockThreshold.
- Fix typo in comment.
 1.150.2.14 25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1664):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.150.2.13 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #1646):

sys/arch/x86/include/specialreg.h: revision 1.202
sys/arch/x86/include/specialreg.h: revision 1.203
usr.sbin/cpuctl/arch/i386.c: revision 1.136

Add some CPUID bits from PPR for AMD Family 19h Model 61h Revision B1.

Add AMD CPUID Fn8000_0008 %ebx bit 3 INVLPGB.
 1.150.2.12 23-Jan-2023  martin Pull up the following revisions, requested by msaitoh in ticket #1574:

sys/arch/x86/include/specialreg.h 1.193-1.198

- Add CPUID Fn0000_0006 %eax bit 24 IA32_THERM_INTERRUPT MSR bit 25
Hardware Feedback Notification support.
- Add CPUID Fn0000_0007 %ecx bit 29 ENQCMD.
- Add CPUID Fn0000_0007 %edx bit 1 SGX-KEYS.
- Add CPUID Fn0000_0007 %edx bit 5 UINTR(User INTeRrupts).
- Add CPUID Fn0000_0007 %edx bit 11 RTM_ALWAYS_ABORT.
- Add CPUID Fn0000_0007 %edx bit 22 AMX_BF16.
- Add CPUID Fn0000_0007 %edx bit 23 AVX512_FP16.
- Add CPUID Fn0000_0007 %edx bit 24 AMX_TILE.
- Add CPUID Fn0000_0007 %edx bit 25 AMX_INT8.
- Add CPUID Fn0000_0007 sub-leaf 1 %edx bit 18 CET_SSS.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx definitions.
- Add CPUID Fn0000_000d sub-leaf 1 %eax bit 4 XFD.
- Add CPUID Fn0000_001d Tile Information.
- Add CPUID Fn0000_001e TMUL Information.
- Add CPUID Fn8000_0007 %eax RAS capabilities.
- Add CPUID Fn8000_0008 %ebx BTC_NO.
- Add CPUID Fn8000_000a x2AVIC, VNMI, IBSVIRT and ROGPT.
- Add CPUID Fn8000_001b Instruction-Based Sampling.
- Add CPUID Fn8000_001e Processor Topology Information.
- Add CPUID Fn8000_001f %eax RPMQUERY, VmplSSS, TscAuxVirt,
VmgexitParam, VirtualTomMsr, IbsVirtGuest, SmtProtection,
vsmCommPageMSR and NestedVirtSnpMsr.
- Add CPUID Fn8000_0021 AMD Extended Features Identification 2.
- Add CPUID Fn8000_0022 AMD Extended Performance Monitoring and Debug.
- Rename HW_FEEDBACK to HWI (Hardware Feedback Interface).
- Rename TSX_FORCE_ABORT to RTM_FORCE_ABORT.
- Modify comment. Both Intel and AMD support CPUID Fn0000000b.
- Modify comment. Hybrid Information -> Native Model ID Information.
- Use __BIT(). Add comment. Whitespace fix.
 1.150.2.11 15-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1542):

sys/arch/x86/include/specialreg.h: revision 1.189
sys/dev/nvmm/x86/nvmm_x86.c: revision 1.23
usr.sbin/cpuctl/arch/i386.c: revision 1.128
sys/arch/x86/include/specialreg.h: revision 1.190
sys/arch/x86/include/specialreg.h: revision 1.191
sys/arch/x86/include/specialreg.h: revision 1.192

s/shareing/sharing/. No functional change.

Add top-down slots event bit of architectural performance monitoring leaf.

Modify CPUID Fn0000000a %ebx's string. Add new string for %ecx.

Modify output of CPUID Fn0000000a.
old:
cpu0: Perfmon-eax 0x8300805<VERSION=0x5,GPCounter=0x8,GPBitwidth=0x30>
cpu0: Perfmon-eax 0x8300805<Vectorlen=0x8>
cpu0: Perfmon-edx 0x8604<FixedFunc=0x4,FFBitwidth=0x30,ANYTHREADDEPR>
new:
cpu0: Perfmon: Ver. 5
cpu0: Perfmon: General: bitwidth 48, 8 counters
cpu0: Perfmon: General: avail 0xff<CORECYCL,INST,REFCYCL,LLCREF,LLCMISS,BRINST>
cpu0: Perfmon: General: avail 0xff<BRMISPR,TOPDOWNSLOT>
cpu0: Perfmon: Fixed: bitwidth 48, 4 counters
cpu0: Perfmon: Fixed: avail 0xf<INST,CLK_CORETHREAD,CLK_REF_TSC,TOPDOWNSLOT>

Update some AMD CPUID bits:
- Rename FSREP_MOV to FSRM.
- Add Memory Bandwidth Enforcement (MBE)
- Add AMD's PPIN. Rename CPUID_SEF_PPIN to CPUID_SEF_INTEL_PPIN.
- Add Collaborative Processor Performance Control (CPPC).
- Add HOST_MCE_OVERRIDE.
- Add some unknown bits as Bxx.
- Add comments.
- Use __BIT().
 1.150.2.10 31-Jan-2022  martin Pull up the following revisions (all via patch), requested by msaitoh
in ticket #1417:

sys/arch/x86/include/specialreg.h 1.179-1.188

- Add CPUID definitions of Last Branch Record, Thread Director,
AVX version of VNNI, Fast short REP MOV, HRESET, PPIN, Architectural
LBR, Linear Address Masking and Hybrid Information from the latest
Intel SDM.
- Add CPUID definitions of AddrMaskExt, INT_WBINVD, IbrsSameMode,
EferLmsleUnsupported, PSFD and SecureTSC from AMD APM.
- Print CLFSH instead of CLFLUSH because both Intel and AMD documents
say so.
- Modify comment. Add comment. Fix typo. Use __BIT(). KNF. Sort lines.
No functional change.
 1.150.2.9 08-Dec-2021  martin Pull up the following revisions, requested by msaitoh in ticket #1391:

sys/arch/x86/include/specialreg.h 1.171, 1.173-1.178
sys/arch/x86/x86/identcpu.c 1.106, 1.117,
1.122 via patch
sys/dev/nvmm/x86/nvmm_x86.c 1.18
sys/external/bsd/drm2/drm/drm_cache.c 1.14
sys/external/bsd/drm2/include/asm/cpufeature.h 1.5
usr.sbin/cpuctl/arch/i386.c 1.114-1.117


- Add LA57, PKE, PKS, CET, CET_U, CET_S, HWP, KL, AVX512_BF16, TME_EN
and PCONFIG.
- Rename some macros to match the x86 specification and the other OSes.
- Print CPUID 0x80000008 %ebx on Intel, too.
- Print CPUID leaf 7 subleaf 1.
- Identify Tiger Lake, 3rd gen Xeon Scalable (Ice Lake), Elkhart Lake
and Jasper Lake.
- Add comment.
- KNF. Whitespace fix.
 1.150.2.8 04-Sep-2020  martin Pull up following revision(s) (requested by maxv in ticket #1076):

sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.75
sys/arch/x86/include/specialreg.h: revision 1.172
sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.72

nvmm-x86-vmx: fix detection of the BIOS lock

If it's locked, ensure it's locked with VMX enabled. If it's not locked,
then lock it ourselves with VMX enabled.
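
The lock rule above can be sketched as follows. This is an illustration on a plain variable, not the nvmm code: it assumes the Intel SDM layout of the IA32_FEATURE_CONTROL MSR (bit 0 is the lock bit, bit 2 enables VMX outside SMX), and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names; bit positions per the Intel SDM. */
#define FEATCTL_LOCK		(UINT64_C(1) << 0)
#define FEATCTL_VMX_OUTSIDE_SMX	(UINT64_C(1) << 2)

/*
 * Returns 0 if VMX is usable, updating *msr when we had to lock it
 * ourselves; returns -1 if the firmware locked it with VMX disabled.
 */
static int
vmx_featctl_prepare(uint64_t *msr)
{
	if ((*msr & FEATCTL_LOCK) != 0) {
		/* Already locked: usable only if VMX was enabled first. */
		return (*msr & FEATCTL_VMX_OUTSIDE_SMX) != 0 ? 0 : -1;
	}
	/* Not locked: enable VMX and lock it ourselves. */
	*msr |= FEATCTL_VMX_OUTSIDE_SMX | FEATCTL_LOCK;
	return 0;
}
```

A real implementation would read the value with rdmsr and write it back with wrmsr; that part is left out here so the logic can be followed in isolation.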

Should fix NetBSD PR/55596.

-

Add a few more CPUID flags.

-

nvmm-x86-svm: check the SVM revision
Only revision 1 exists, but check it, for future-proofness.
 1.150.2.7 13-Jul-2020  martin Pull up following revision(s) (requested by msaitoh in ticket #998):

sys/arch/x86/include/specialreg.h: revision 1.162
sys/arch/x86/include/specialreg.h: revision 1.164
sys/arch/x86/include/specialreg.h: revision 1.165
sys/arch/x86/include/specialreg.h: revision 1.166
sys/arch/x86/include/specialreg.h: revision 1.167
sys/arch/x86/include/specialreg.h: revision 1.168

- AMD CPUID Fn8000_000a %edx bit 20 is "SPEC_CTRL".
- Add some bit definitions of AMD's CPUID Fn8000_001f Encrypted Memory
features.
- Add AMD INVLPGB/TLBSYNC hypervisor enable in VMCB and TLBSYNC intercept bit.
- Modify comment.
Add AMD MSR_DE_CFG's bit 1 as DE_CFG_LFENCE_SERIALIZE.
This bit makes lfence instruction serializing.
Add some definitions from the latest Intel SDM plus small fix:
- Add CPUID leaf 6 %eax bit 19 for HW_FEEDBACK* and IA32_PACKAGE_THERM* MSRs.
- Add CPUID leaf 7 %ecx bit 31 for Protection Keys.
- Add definition of Load only TLB and Store only TLB.
- Add IF_PSCHANGE_MC_NO bit of IA32_ARCH_CAPABILITIES
- Fix HWP_IGNIDL.
Add SRBDS_CTRL bit.
style and fix typo
 1.150.2.6 14-Apr-2020  martin Pull up following revision(s) (requested by msaitoh in ticket #833):

usr.sbin/cpuctl/arch/i386.c: revision 1.109
sys/arch/x86/include/specialreg.h: revision 1.159
usr.sbin/cpuctl/arch/i386.c: revision 1.110
sys/arch/x86/include/specialreg.h: revision 1.160
sys/arch/x86/include/specialreg.h: revision 1.161
sys/arch/x86/x86/tsc.c: revision 1.40
sys/arch/x86/x86/procfs_machdep.c: revision 1.35
sys/arch/x86/x86/procfs_machdep.c: revision 1.36

Add Fast Short Rep Mov(fsrm).

Add AVX512_VP2INTERSECT, SERIALIZE and TSXLDTRK(TSX suspend load addr tracking)

CPUID Fn00000001 %edx bit 8 is printed as "TSC", so rename CPUID Fn8000_0007
%edx bit 8 from "TSC" to "ITSC" (Invariant TSC) to avoid confusion.

Rename CPUID_APM_TSC to CPUID_APM_ITSC. No functional change.

Remove ci_max_ext_cpuid because it's the same as ci_cpuid_extlevel.

Print CPUID 0x80000007 %edx on both Intel and AMD.
 1.150.2.5 19-Nov-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #452):

usr.sbin/cpuctl/arch/i386.c: revision 1.108
sys/arch/x86/include/specialreg.h: revision 1.158

Add the following bit definitions from the latest Intel SDM:
- CET shadow stack
- Fast Short REP MOV
- Hybrid part
- CET Indirect Branch Tracking
0x7d and 0x7e are for 10th generation Core (Ice Lake).
 1.150.2.4 12-Nov-2019  martin Pull up following revision(s) (requested by maxv in ticket #419):

sys/arch/x86/include/specialreg.h: revision 1.157
sys/arch/x86/x86/spectre.c: revision 1.31

Mitigation for CVE-2019-11135: TSX Asynchronous Abort (TAA).

Two sysctls are added:
machdep.taa.mitigated = {0/1} user-settable
machdep.taa.method = {string} constructed by the kernel

There are two cases:

(1) If the CPU is affected by MDS, then the MDS mitigation will also
mitigate TAA, and we have nothing else to do. We make the 'mitigated' leaf
read-only, and force:

machdep.taa.mitigated = machdep.mds.mitigated
machdep.taa.method = [MDS]

The kernel already enables the MDS mitigation by default.

(2) If the CPU is not affected by MDS but is affected by TAA, then we use
the new TSX_CTRL MSR to disable RTM. This MSR is provided via a microcode
update, now available on the Intel website. The kernel will automatically
enable the TAA mitigation if the updated microcode is present. If the new
microcode is not present, the user can load it via cpuctl, and set
machdep.taa.mitigated=1.
 1.150.2.3 10-Nov-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #407):

sys/arch/x86/include/specialreg.h: revision 1.156

- GMET is not bit 11 but 17.
- Add unknown CPUID Fn8000_000a %edx bit 20.
 1.150.2.2 17-Oct-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #344):

sys/arch/x86/include/specialreg.h: revision 1.154
sys/arch/x86/include/specialreg.h: revision 1.155
usr.sbin/cpuctl/arch/i386.c: revision 1.107
sys/arch/x86/x86/procfs_machdep.c: revision 1.34

- Add definitions of AMD's CPUID Fn8000_001f Encrypted Memory features.
- Add definition of AMD's CPUID Fn8000_000a %edx bit 11 "GMET".
- Define CPUID_AMD_SVM_PFThreshold correctly.
- Modify comment a bit for consistency.

Fix AMD Fn8000_001f %eax bit 0's name.

Add rdpru.
 1.150.2.1 26-Sep-2019  martin Pull up following revision(s) (requested by msaitoh in ticket #241):

sys/arch/x86/include/specialreg.h: revision 1.152
sys/arch/x86/include/specialreg.h: revision 1.153
usr.sbin/cpuctl/arch/i386.c: revision 1.105
sys/arch/x86/x86/spectre.c: revision 1.30
sys/arch/x86/include/specialreg.h: revision 1.151

Add definitions of AMD's CPUID Fn8000_0008 %ebx.
Decode AMD's CPUID Fn8000_0008 %ebx.
Use macro.
Add MCOMMIT instruction.
Define CPUID_CAPEX_FLAGS's bit 10 correctly.
 1.161.2.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.175.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.176.4.1 01-Aug-2021  thorpej Sync with HEAD.
 1.198.2.6 03-Oct-2024  martin Pull up following revision(s) (requested by rin in ticket #919):

sys/arch/x86/x86/errata.c: revision 1.28
sys/arch/x86/x86/errata.c: revision 1.29
sys/arch/x86/include/specialreg.h: revision 1.209
usr.sbin/cpuctl/arch/i386.c: revision 1.144
sys/arch/x86/x86/errata.c: revision 1.30
sys/arch/x86/x86/errata.c: revision 1.33
sys/arch/x86/x86/errata.c: revision 1.34
sys/arch/x86/x86/errata.c: revision 1.35
sys/arch/x86/include/specialreg.h: revision 1.210
sys/arch/x86/include/specialreg.h: revision 1.211

x86/errata.c: Link to original AMD errata guide.

This one is no longer updated; need to link to newer ones for
individual families too. That's where all the cryptic nomenclature
comes from here.

x86/errata.c: Say what revision we're searching for.

x86/errata.c: Only say the errata revision search for cpu0.

x86: make the CPUID list for errata be far less confusing
the 0x80000001 CPUID result needs some parsing to match against
actual family/model/stepping values. 4-bit 'family' values of
15 or 6 change how to parse the 4-bit extended model and 8-bit
extended family value - for family 6 or 15, the extended model
bits (4) are concatenated with the base 4-bits to create an
8-bit value, and for family 15, the family value is the sum
of the base family value and the 8-bit extended-family value, giving
a range of 0 to 15 + 0xff aka 270.

use a CPUREV(family, model, stepping) macro that builds the
relevant bit-representation of a CPUID, making it far easier
to understand what each entry means, and to add new ones too.
i have confirmed that the emitted cpurevs[] array has the same
values before/after this change, ie, NFCI or observed.
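The family/model parsing rules described above can be sketched as follows. This is an illustration only: the function names are hypothetical and are not the CPUREV() macro from errata.c. It assumes the standard CPUID signature layout (stepping in bits 3:0, model in 7:4, family in 11:8, extended model in 19:16, extended family in 27:20).

```c
#include <stdint.h>

/* Build a CPUID signature from "display" family/model/stepping values. */
static inline uint32_t
cpuid_sig_encode(unsigned family, unsigned model, unsigned stepping)
{
	/* Families above 15 are stored as base 15 plus an extended part. */
	unsigned base_family = family > 15 ? 15 : family;
	unsigned ext_family = family > 15 ? family - 15 : 0;
	/* The extended model is only meaningful for base family 6 or 15. */
	unsigned ext_model =
	    (base_family == 6 || base_family == 15) ? (model >> 4) & 0xf : 0;

	return ((uint32_t)ext_family << 20) | ((uint32_t)ext_model << 16) |
	    ((uint32_t)base_family << 8) | ((uint32_t)(model & 0xf) << 4) |
	    (uint32_t)(stepping & 0xf);
}

/* Recover the display family: base + extended, but only for base 15. */
static inline unsigned
cpuid_sig_family(uint32_t sig)
{
	unsigned base = (sig >> 8) & 0xf;
	unsigned ext = (sig >> 20) & 0xff;

	return base == 15 ? base + ext : base;
}

/* Recover the display model: concatenate ext:base for family 6 or 15. */
static inline unsigned
cpuid_sig_model(uint32_t sig)
{
	unsigned family = (sig >> 8) & 0xf;
	unsigned base = (sig >> 4) & 0xf;
	unsigned ext = (sig >> 16) & 0xf;

	return (family == 6 || family == 15) ? (ext << 4) | base : base;
}
```

For example, family 0x17, model 0x71, stepping 0 encodes to 0x00870f10, and decoding that value gives back 0x17 and 0x71.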

x86: add names for errata that don't have actual numbers
zenbleed is reported as "erratum 65535" currently, this adds a name
for it, and enables the name for any others as well.
pull logging into a function with a tag message.

x86: handle AMD errata 1474: A CPU core may hang after about 1044 days
from the new comment:
* This requires disabling CC6 power level, which can be a performance
* issue since it stops full turbo in some implementations (eg, half the
* cores must be in CC6 to achieve the highest boost level.) Set a timer
* to fire in 1000 days -- except NetBSD timers end up having a signed
* 32-bit hz-based value, which rolls over in under 25 days with HZ=1000,
* and doing xcall(9) or kthread(9) from a callout is not allowed anyway,
* so just have a kthread wait 1 day for 1000 times.
documented in:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/56323-PUB_1_01.pdf
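The rollover arithmetic in the comment above can be checked with a short sketch. The helper names are hypothetical; the only assumption is that a timeout is stored as a signed 32-bit count of ticks at hz ticks per second.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDS_PER_DAY	86400

/* Largest whole number of days representable as a signed 32-bit tick count. */
static inline int32_t
max_days_in_int32_ticks(int32_t hz)
{
	return INT32_MAX / hz / SECONDS_PER_DAY;
}

/* Does a timeout of 'days' days fit in a signed 32-bit tick count? */
static inline bool
timeout_fits_int32(int64_t days, int64_t hz)
{
	/* 64-bit arithmetic so the product itself cannot overflow here. */
	return days * SECONDS_PER_DAY * hz <= INT32_MAX;
}
```

With hz = 1000, max_days_in_int32_ticks() returns 24, so a single 1000-day timeout cannot be expressed, while a 1-day wait repeated 1000 times fits comfortably.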

add MSR stuff for AMD errata 1474.

cpuctl: fix i386 bit descriptions for CPUID_SEF_FLAGS1
warning: non-printing character '\31' in description
'BUS_LOCK_DETECT""b\31' [363]
s/RPMQUERY/RMPQUERY/
 1.198.2.5 20-Jul-2024  martin Pull up following revision(s) (requested by andvar in ticket #738):

sys/arch/x86/x86/identcpu.c: revision 1.129
sys/arch/x86/include/specialreg.h: revision 1.212
sys/arch/x86/x86/identcpu.c: revision 1.130

Disable the VIA Alternate Instructions according to the VIA documentation:
* C7 and above do not support ALTINST, do not check or attempt to disable them.
* For VIA C3 Nehemiah check extended feature flags for support and status,
do not attempt to disable when AIS is not supported or not enabled.
* For pre-Nehemiah models explicitly disable, if they are in the range
of documented models, flags aren't present to check the status on
these models.

Note: for pre-Nehemiah models there may be other functional side effects
depending on the version and stepping.

Explicit disabling of ALTINST was introduced with rev. 1.84 following
the discovery of some VIA CPUs having these instructions enabled by default
leading to the potential backdoor (aka rosenbridge).

Unfortunately, implementation used a wrong check (ACE supported flag),
which can be true for the later models, still supporting padlock features.

Setting ALTINST bit on those may have unexpected side effects like VIA C7 CPUID
instruction for temperature sensor not reporting correct value or
`cpuctl identify' not reporting certain CPU features. Similar side effects
can be observed even for Nehemiah models not supporting AIS instructions. This
change should limit possibility of such issues to only the pre-Nehemiah models,
not covered at all in the previous implementation.

Feature Control Register (FCR) macros were unified under one group and
consistent naming while implementing the change. Few comments updated as well.
patch reviewed by Riastradh@ (thank you)

PR kern/58370

Move determination of the largest VIA CPU extended function value
to the intended place where the checks are performed.
Currently the value can be overridden while checking for the padlock features,
and failing the check for max function value as a result.
 1.198.2.4 29-Jul-2023  martin Pull up the following revisions, all via patch, requested by msaitoh
in ticket #250:

sys/arch/x86/include/specialreg.h 1.204-1.206, 1.208

- Add Intel CPUID 0x07 %ecx bit 24 BUS_LOCK_DETECT.
- Add AMD CPUID 0x80000008 %ebx bit 30 IBPB_RET and CPUID 0x8000000a
%edx bit 29 BusLockThreshold.
- Fix typo in comment.
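As an illustration of what such definitions look like, the bits named above can be written as single-bit masks. This is a sketch only: BIT() is a simplified stand-in for NetBSD's __BIT(), and the X_ macro names and the helper function are hypothetical, not the identifiers used in specialreg.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for __BIT(): a mask with only bit n set. */
#define BIT(n)	(UINT32_C(1) << (n))

/* Hypothetical names for the bits listed in this pull-up. */
#define X_CPUID_7_ECX_BUS_LOCK_DETECT		BIT(24)	/* Intel 0x07 %ecx */
#define X_CPUID_8000_0008_EBX_IBPB_RET		BIT(30)	/* AMD 0x80000008 %ebx */
#define X_CPUID_8000_000A_EDX_BUSLOCKTHRESHOLD	BIT(29)	/* AMD 0x8000000a %edx */

/* Test a feature mask against a register value returned by CPUID. */
static inline bool
cpu_feature_present(uint32_t reg, uint32_t mask)
{
	return (reg & mask) != 0;
}
```

A caller would pass the raw register from the matching CPUID leaf, for example cpu_feature_present(ecx, X_CPUID_7_ECX_BUS_LOCK_DETECT).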
 1.198.2.3 25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #243):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.198.2.2 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #200):

sys/arch/x86/include/specialreg.h: revision 1.202
sys/arch/x86/include/specialreg.h: revision 1.203
usr.sbin/cpuctl/arch/i386.c: revision 1.136

Add some CPUID bits from PPR for AMD Family 19h Model 61h Revision B1.

Add AMD CPUID Fn8000_0008 %ebx bit 3 INVLPGB.
 1.198.2.1 23-Jan-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #56):

sys/arch/x86/include/specialreg.h: revision 1.200
sys/arch/x86/include/specialreg.h: revision 1.201
sys/arch/x86/include/specialreg.h: revision 1.199

Use __BIT(). Add comment. Whitespace. No functional change.

Update definitions from the latest Intel SDM.
- Rename HW_FEEDBACK to HWI (Hardware Feedback Interface).
- Add CPUID Fn0000_0006 %eax bit 24 IA32_THERM_INTERRUPT MSR bit 25 Hardware
Feedback Notification support.
- Add CPUID Fn0000_0007 %ecx bit 29 ENQCMD.
- Add CPUID Fn0000_0007 %edx bit 1 SGX-KEYS.
- Add CPUID Fn0000_0007 %edx bit 5 UINTR(User INTeRrupts).
- Add CPUID Fn0000_0007 %edx bit 11 RTM_ALWAYS_ABORT.
- Rename TSX_FORCE_ABORT to RTM_FORCE_ABORT.
- Add CPUID Fn0000_0007 %edx bit 22 AMX_BF16.
- Add CPUID Fn0000_0007 %edx bit 23 AVX512_FP16.
- Add CPUID Fn0000_0007 %edx bit 24 AMX_TILE.
- Add CPUID Fn0000_0007 %edx bit 25 AMX_INT8.
- Add CPUID Fn0000_0007 sub-leaf 1 %edx bit 18 CET_SSS.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 0 PSFD.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 1 IPRED_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 2 RRSBA_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 3 DDPD_U.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 4 BHI_CTRL.
- Add CPUID Fn0000_0007 sub-leaf 2 %edx bit 5 MCDT_NO.
- Modify comment. Both Intel and AMD support CPUID Fn0000_000b.
- Add CPUID Fn0000_000d sub-leaf 1 %eax bit 4 XFD.
- Modify comment. Hybrid Information -> Native Model ID Information.
- Add CPUID Fn0000_001d Tile Information.
- Add CPUID Fn0000_001e TMUL Information.

Fix comment.
 1.211.2.1 02-Aug-2025  perseant Sync with HEAD
 1.15 19-Jun-2020  maxv localify
 1.14 13-Jul-2018  maxv Remove the X86PMC code I had written, replaced by tprof. Many defines
become unused in specialreg.h, so remove them. We don't want to add
defines all the time, there are countless PMCs on many generations, and
it's better to just inline the event/unit values.
 1.13 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.12 12-Jul-2017  maxv branches: 1.12.4; 1.12.6;
Properly handle overflows, and take them into account in userland.
 1.11 10-Mar-2017  maxv branches: 1.11.6;
Switch to per-CPU PMC results, and completely rewrite the pmc(1) tool. Now
the PMCs are system-wide, fine-grained and more tunable by the user.

We don't do application tracking, since it would require storing the PMC
values in mdproc and starting/stopping the counters on each context switch.
While this doesn't seem to be particularly difficult to achieve, I don't
think it is really interesting; and if someone really wants to measure
the performance of an application, they can simply schedctl it to a cpu
and look at the PMC results for this cpu.

Note that several options are implemented but not yet used.
 1.10 08-Mar-2017  maxv Add a version argument, set to 1, and check it in usr.bin/pmc. Use uint32_t
instead of uint8_t since we now need 12-bit selectors (10h family). And while
here KNF.
 1.9 07-Jul-2010  chs branches: 1.9.18; 1.9.36; 1.9.40; 1.9.44;
add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.8 21-Mar-2009  ad branches: 1.8.2; 1.8.4;
PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.
 1.7 28-Apr-2008  martin branches: 1.7.8; 1.7.10; 1.7.14;
Remove clause 3 and 4 from TNF licenses
 1.6 10-Nov-2007  ad branches: 1.6.14; 1.6.16; 1.6.18;
- When computing the TSC frequency, call i8254_delay() and not DELAY().
- Use atomics to adjust the pmap reference count, instead of taking locks.
- Implement I386_{SET,GET}_{FS,GS}BASE, allowing %fs and %gs to be used
as per-thread registers. This is compatible with FreeBSD.
- Run patches after we have attached CPUs, since we then know if the
system is uniprocessor or not. Eliminates a lot of #ifdef MULTIPROCESSOR
and makes running MP kernels on UP systems cheaper.
- Patch out many of the 'lock' prefixes to nops if uniprocessor.
- Do a wbinvd after patching to ensure that the trace/instruction cache
is up to date.
 1.5 17-Oct-2007  garbled branches: 1.5.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4 16-Sep-2007  ragge branches: 1.4.4;
i386 -> __i386__
 1.3 23-Jun-2007  dsl branches: 1.3.2; 1.3.10; 1.3.12; 1.3.14;
Split x86_set/get_ldt() so they are callable with kernel buffers.
For linux emulation code.
 1.2 16-Apr-2007  ad branches: 1.2.2; 1.2.4; 1.2.6;
Fix error in previous.
 1.1 16-Apr-2007  ad Share the sysarch stuff between the x86 ports. PR kern/36046.
 1.2.6.5 03-Dec-2007  ad Sync with HEAD.
 1.2.6.4 09-Oct-2007  ad Sync with head.
 1.2.6.3 15-Jul-2007  ad Sync with head.
 1.2.6.2 09-Jun-2007  ad Sync with head.
 1.2.6.1 16-Apr-2007  ad file sysarch.h was added on branch vmlocking on 2007-06-09 21:37:04 +0000
 1.2.4.2 07-May-2007  yamt sync with head.
 1.2.4.1 16-Apr-2007  yamt file sysarch.h was added on branch yamt-idlelwp on 2007-05-07 10:55:05 +0000
 1.2.2.2 03-Oct-2007  garbled Sync with HEAD
 1.2.2.1 26-Jun-2007  garbled Sync with HEAD.
 1.3.14.4 15-Nov-2007  yamt sync with head.
 1.3.14.3 27-Oct-2007  yamt sync with head.
 1.3.14.2 03-Sep-2007  yamt sync with head.
 1.3.14.1 23-Jun-2007  yamt file sysarch.h was added on branch yamt-lazymbuf on 2007-09-03 14:31:21 +0000
 1.3.12.2 09-Jan-2008  matt sync with HEAD
 1.3.12.1 06-Nov-2007  matt sync with HEAD
 1.3.10.2 11-Nov-2007  joerg Sync with HEAD.
 1.3.10.1 02-Oct-2007  joerg Sync with HEAD.
 1.3.2.2 11-Jul-2007  mjf Sync with head.
 1.3.2.1 23-Jun-2007  mjf file sysarch.h was added on branch mjf-ufs-trans on 2007-07-11 20:03:16 +0000
 1.4.4.1 13-Nov-2007  bouyer Sync with HEAD
 1.5.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.6.18.3 11-Aug-2010  yamt sync with head.
 1.6.18.2 04-May-2009  yamt sync with head.
 1.6.18.1 16-May-2008  yamt sync with head.
 1.6.16.1 18-May-2008  yamt sync with head.
 1.6.14.1 02-Jun-2008  mjf Sync with HEAD.
 1.7.14.3 24-Oct-2010  jym Sync with HEAD
 1.7.14.2 01-Nov-2009  jym Sync with HEAD.
 1.7.14.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.7.10.1 04-Apr-2009  snj Pull up following revision(s) (requested by ad in ticket #656):
sys/arch/amd64/amd64/gdt.c: revision 1.21 via patch
sys/arch/amd64/amd64/machdep.c: revision 1.129 via patch
sys/arch/i386/i386/gdt.c: revision 1.47 via patch
sys/arch/i386/i386/kvm86.c: revision 1.17 via patch
sys/arch/i386/i386/locore.S: revision 1.85 via patch
sys/arch/i386/i386/machdep.c: revision 1.666 via patch
sys/arch/i386/i386/vector.S: revision 1.45 via patch
sys/arch/i386/include/pcb.h: revision 1.47 via patch
sys/arch/x86/include/pmap.h: revision 1.22 via patch
sys/arch/x86/include/sysarch.h: revision 1.8 via patch
sys/arch/x86/x86/pmap.c: revision 1.80 via patch
sys/arch/x86/x86/sys_machdep.c: revision 1.17 via patch
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.143 via patch
sys/kern/init_main.c: revision 1.384 via patch
PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash
Fix numerous problems:
1. LDT updates are not atomic.
2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.
3. LDTR can be leaked over context switch.
4. GDT slot allocations can race, giving the same LDT slot to two procs.
5. Incomplete interrupt/trap frames can be stacked.
6. In some rare cases segment faults are not handled correctly.
 1.7.8.1 28-Apr-2009  skrll Sync with HEAD.
 1.8.4.1 05-Mar-2011  rmind sync with head
 1.8.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.9.44.1 21-Apr-2017  bouyer Sync with HEAD
 1.9.40.1 20-Mar-2017  pgoyette Sync with HEAD
 1.9.36.1 28-Aug-2017  skrll Sync with HEAD
 1.9.18.1 03-Dec-2017  jdolecek update from HEAD
 1.11.6.1 01-Aug-2017  snj Pull up following revision(s) (requested by maxv in ticket #164):
distrib/sets/lists/base/md.amd64: revision 1.269
distrib/sets/lists/debug/md.amd64: revision 1.97
sys/arch/amd64/conf/GENERIC: revision 1.460
sys/arch/amd64/conf/files.amd64: revision 1.89
sys/arch/i386/conf/GENERIC: revision 1.1157
sys/arch/i386/conf/files.i386: revision 1.379
sys/arch/i386/i386/i386_trap.S: revision 1.7-1.8
sys/arch/i386/include/frameasm.h: revision 1.16
sys/arch/x86/include/sysarch.h: revision 1.12
sys/arch/x86/x86/pmc.c: revision 1.8-1.10
sys/arch/x86/x86/sys_machdep.c: revision 1.36
sys/arch/xen/conf/files.compat: revision 1.26
sys/secmodel/suser/secmodel_suser.c: revision 1.43
sys/sys/kauth.h: revision 1.74
usr.bin/pmc/Makefile: revision 1.5
usr.bin/pmc/pmc.1: revision 1.12-1.13
usr.bin/pmc/pmc.c: revision 1.24-1.25
style
--
style
--
Disable interrupts for T_NMI (inline calltrap). Note that there's still a
way to evade the NMI mode here, if a segment register faults in
INTRFASTEXIT; but we don't care. I didn't test this change, but it seems
fine enough.
--
Make the PMC syscalls privileged.
--
Check argc, and add a message.
--
include opt_pmc.h
--
Build the pmc tool on amd64.
--
Properly handle overflows, and take them into account in userland.
--
Update.
--
Enable PMCs by default.
--
Sort sections. Fix macro usage.
 1.12.6.1 10-Jun-2019  christos Sync with HEAD
 1.12.4.1 28-Jul-2018  pgoyette Sync with HEAD
 1.3 15-Jul-2018  maxv Remove unused x86/include/tprof.h, there should be no need for this kind
of includes.
 1.2 24-Feb-2009  yamt branches: 1.2.62; 1.2.64;
- rewrite x86 nmi dispatcher so that establish and disestablish are safe
on a running system.
- adapt existing users of the api. (elan)
- adapt tprof_pmi driver to use the api.
 1.1 01-Jan-2008  yamt branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.18; 1.1.26; 1.1.32;
a simple performance monitor based profiler, inspired from linux oprofile.
 1.1.32.2 01-Nov-2009  jym Sync with HEAD.
 1.1.32.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.26.1 03-Mar-2009  skrll Sync with HEAD.
 1.1.18.1 04-May-2009  yamt sync with head.
 1.1.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.8.1 01-Jan-2008  mjf file tprof.h was added on branch mjf-devfs on 2008-02-18 21:05:17 +0000
 1.1.6.2 21-Jan-2008  yamt sync with head
 1.1.6.1 01-Jan-2008  yamt file tprof.h was added on branch yamt-lazymbuf on 2008-01-21 09:40:09 +0000
 1.1.4.2 09-Jan-2008  matt sync with HEAD
 1.1.4.1 01-Jan-2008  matt file tprof.h was added on branch matt-armv6 on 2008-01-09 01:49:49 +0000
 1.1.2.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 01-Jan-2008  bouyer file tprof.h was added on branch bouyer-xeni386 on 2008-01-02 21:51:21 +0000
 1.2.64.1 10-Jun-2019  christos Sync with HEAD
 1.2.62.1 28-Jul-2018  pgoyette Sync with HEAD
 1.3 14-Mar-2020  maxv style
 1.2 07-Aug-2003  agc branches: 1.2.192;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.1 26-Feb-2003  fvdl branches: 1.1.2;
Move some files out of i386 into x86, so that they can be shared with
other ports.
 1.1.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.1.2.1 03-Aug-2004  skrll Sync with HEAD
 1.2.192.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3 24-Aug-2009  jmcneill Add vga_post_set_vbe for setting video mode.
 1.2 29-Mar-2008  jmcneill branches: 1.2.4; 1.2.18;
Add RCSID to top of file.
 1.1 25-Dec-2007  joerg branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.10; 1.1.16;
Add initial version of calling VGA POST from vga_resume. This is the
equivalent to "vbetool post" using x86emu in the kernel.
 1.1.16.1 03-Apr-2008  mjf Sync with HEAD.
 1.1.10.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.10.1 25-Dec-2007  mjf file vga_post.h was added on branch mjf-devfs on 2008-02-18 21:05:17 +0000
 1.1.8.2 21-Jan-2008  yamt sync with head
 1.1.8.1 25-Dec-2007  yamt file vga_post.h was added on branch yamt-lazymbuf on 2008-01-21 09:40:10 +0000
 1.1.6.2 09-Jan-2008  matt sync with HEAD
 1.1.6.1 25-Dec-2007  matt file vga_post.h was added on branch matt-armv6 on 2008-01-09 01:49:49 +0000
 1.1.4.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.4.1 25-Dec-2007  bouyer file vga_post.h was added on branch bouyer-xeni386 on 2008-01-02 21:51:21 +0000
 1.1.2.2 26-Dec-2007  ad Sync with head.
 1.1.2.1 25-Dec-2007  ad file vga_post.h was added on branch vmlocking2 on 2007-12-26 19:17:17 +0000
 1.2.18.1 01-Nov-2009  jym Sync with HEAD.
 1.2.4.1 16-Sep-2009  yamt sync with head
 1.10 29-Jun-2020  riastradh padlock(4): Remove legacy rijndael API use.

This doesn't actually need to compute AES -- it just needs the
standard AES key schedule, so use the BearSSL constant-time key
schedule implementation.

XXX Compile-tested only.
XXX The byte-order business here seems highly questionable.
 1.9 27-Feb-2016  tls Remove callout-based RNG support in VIA crypto driver; add VIA RNG backend for cpu_rng.
 1.8 13-Apr-2015  riastradh Convert arch/x86 to use <sys/rnd*.h>. Omit needless includes.
 1.7 19-Nov-2011  tls branches: 1.7.8; 1.7.26;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
 1.6 19-Feb-2011  jmcneill branches: 1.6.4;
modularize VIA PadLock support
- retire options VIA_PADLOCK, replace with 'padlock0 at cpu0'
- driver supports attach & detach
- support building as a module
 1.5 01-Apr-2009  drochner branches: 1.5.4; 1.5.6; 1.5.8;
sort out what is needed for crash(8) and what not, should fix
recent build errors
 1.4 01-Apr-2009  tls Fix probe for VIA C3 and successors -- these are CPU family 6, not 5.
The broken probe was causing the VIA padlock driver to never attach!
Now we can see that its AES appears to be broken -- it makes FAST_IPSEC
ESP not work, on systems where it works fine with cryptosoft.

Rework code to detect and (if necessary) enable VIA crypto and RNG.
Add RNG support to VIA padlock driver. In the process, have a quick
go at debugging the AES support but no luck thus far.
 1.3 07-Mar-2009  ad Expose more stuff if _KMEMUSER is defined.
 1.2 16-Apr-2008  cegger branches: 1.2.4; 1.2.12; 1.2.18;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.1 17-Feb-2007  daniel branches: 1.1.2; 1.1.4; 1.1.46;
Add an opencrypto provider for the AES xcrypt instructions found on VIA
C5P and later cores (also known as 'ACE', which is part of the VIA PadLock
security engine). Ported from OpenBSD.

Reviewed on tech-crypto and port-i386, no objections to committing this.
 1.1.46.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.4.2 26-Feb-2007  yamt sync with head.
 1.1.4.1 17-Feb-2007  yamt file via_padlock.h was added on branch yamt-lazymbuf on 2007-02-26 09:08:49 +0000
 1.1.2.2 17-Feb-2007  daniel Add an opencrypto provider for the AES xcrypt instructions found on VIA
C5P and later cores (also known as 'ACE', which is part of the VIA PadLock
security engine). Ported from OpenBSD.

Reviewed on tech-crypto and port-i386, no objections to committing this.
 1.1.2.1 17-Feb-2007  daniel file via_padlock.h was added on branch yamt-idlelwp on 2007-02-17 00:28:26 +0000
 1.2.18.3 28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently thrashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.2.18.2 01-Nov-2009  jym Sync with HEAD.
 1.2.18.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.2.12.1 28-Apr-2009  skrll Sync with HEAD.
 1.2.4.1 04-May-2009  yamt sync with head.
 1.5.8.1 05-Mar-2011  bouyer Sync with HEAD
 1.5.6.1 06-Jun-2011  jruoho Sync with HEAD.
 1.5.4.1 05-Mar-2011  rmind sync with head
 1.6.4.1 17-Apr-2012  yamt sync with head
 1.7.26.2 19-Mar-2016  skrll Sync with HEAD
 1.7.26.1 06-Jun-2015  skrll Sync with HEAD
 1.7.8.1 03-Dec-2017  jdolecek update from HEAD
