History log of /src/sys/arch/aarch64
Revision  Date  Author  Comments
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file Makefile was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.10 30-Oct-2021  skrll Remove an item
 1.9 03-Aug-2020  ryo Implement MD ucas(9) (__HAVE_UCAS_FULL)
 1.8 04-Dec-2019  jmcneill remove DTrace from TODO list
 1.7 08-May-2019  msaitoh Add DTrace.
 1.6 12-Apr-2019  ryo Make COMPAT_NETBSD32 also work in Thumb mode
 1.5 10-Apr-2019  ryo some items are already done. update and sync with reality.
 1.4 06-Apr-2019  thorpej Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
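
The benefit of returning proper error codes can be illustrated in plain C. The following is a minimal userland sketch, not kernel code: `old_fetch_word` and `new_fetch_word` are hypothetical stand-ins rather than the real fuword() or ufetch(9) implementations, and a NULL pointer stands in for a faulting user address. It shows why an in-band -1 error return is ambiguous, and how an out-parameter plus a separate error code removes the ambiguity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old-style fetch: failure is reported in-band as -1. */
static long
old_fetch_word(const long *addr)
{
	if (addr == NULL)	/* stand-in for a faulting user address */
		return -1;
	return *addr;		/* a stored -1 looks exactly like a fault */
}

/* Hypothetical new-style fetch: value via out-parameter, error via return. */
static int
new_fetch_word(const uint32_t *addr, uint32_t *valp)
{
	assert(valp != NULL);
	if (addr == NULL)	/* stand-in for a faulting user address */
		return EFAULT;
	*valp = *addr;
	return 0;
}
```

With the old style, a caller cannot tell a stored value of -1 from a fault; with the new style, an all-ones value is returned with error code 0.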
 1.3 26-Aug-2018  ryo update TODO
* done: kernel text/rodata mapping with correct permission. implemented.
* add: kernel preemption
 1.2 09-Jul-2018  ryo check off SMP
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file TODO was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.9 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.8 05-Oct-2023  ad Arrange to update cached LWP credentials in userret() rather than during
syscall/trap entry, eliminating a test+branch on every syscall/trap.

This wasn't possible in the 3.99.x timeframe when l->l_cred came about
because there wasn't a reliable/timely way to force an ONPROC LWP running on
a remote CPU into the kernel (which is just about the only new thing in
this scheme).
 1.7 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.6 25-Nov-2021  ryo add support COMPAT_LINUX32 for aarch64
 1.5 15-May-2021  rin Wrap long line. No binary changes.
 1.4 15-May-2021  rin Fix __syscall(2) for COMPAT_NETBSD32 on aarch64{,eb}.

The 1st argument for __syscall(2) is quad_t, which is stored in r0 and r1.

Now, tests/lib/libc/t_syscall:mmap___syscall passes for COMPAT_NETBSD32.
 1.3 12-Apr-2019  ryo branches: 1.3.4; 1.3.18; 1.3.20;
Make COMPAT_NETBSD32 also work in Thumb mode
 1.2 01-Mar-2019  mrg no need to include opt_multiprocessor.h here.
 1.1 12-Oct-2018  ryo branches: 1.1.2;
add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.1.2.2 20-Oct-2018  pgoyette Sync with head
 1.1.2.1 12-Oct-2018  pgoyette file aarch32_syscall.c was added on branch pgoyette-compat on 2018-10-20 06:58:23 +0000
 1.3.20.1 31-May-2021  cjep sync with head
 1.3.18.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.3.4.2 10-Jun-2019  christos Sync with HEAD
 1.3.4.1 12-Apr-2019  christos file aarch32_syscall.c was added on branch phil-wifi on 2019-06-10 22:05:42 +0000
 1.71 06-Sep-2025  thorpej Refactor the "platform" definitions into fdt_platform.h
 1.70 16-Jul-2023  riastradh aarch64: Omit needless xcfunc_t casts by using xcfunc_t correctly.

No functional change intended, except for avoiding possible undefined
behaviour that could have made demons come flying out your nose.
 1.69 18-Apr-2023  skrll G/C an outdated comment.
 1.68 16-Apr-2023  skrll Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's less letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.67 07-Apr-2023  skrll Rename ARM_PLATFORM to FDT_PLATFORM and make it available outside arm.
 1.66 19-Aug-2022  ryo Fixed a bug that pte's __BIT(63,48) could be set when accessing addresses above 0x0001000000000000 in /dev/mem with mmap().
 1.65 12-Mar-2022  skrll No need to call arm_fdt_platform twice.
 1.64 31-Jan-2022  ryo add support Hardware updates to Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
 1.63 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.62 08-Oct-2021  ryo Use BOOT_FLAG() to recognize standard boot options.
 1.61 03-Jun-2021  skrll Two fixes for loading free pages into UVM

- Only consider a boot_physmem (inner loop) range that has its end
(bp_end) after the bootconfig.dram (outer loop) range start (start).
This was harmless as a later condition correctly checks there is only
something to do if start < bp_end.

- Stop processing boot_physmem ranges if all the bootconfig.dram range has
been passed to UVM. This fixes a boot problem for simon@
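
The two conditions described in this entry can be sketched as a nested range walk. This is a minimal illustration under stated assumptions, not the code from aarch64_machdep.c: `struct range`, `clip_ranges`, and the half-open `[start, end)` convention are hypothetical, the inner ranges are assumed to be sorted by address, and `out` is assumed to have room for `nbp` entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {			/* hypothetical: half-open [start, end) */
	uint64_t start;
	uint64_t end;
};

/*
 * Clip the sorted ranges bp[] against one dram range and store the
 * overlapping pieces in out[].  Returns the number of pieces stored.
 */
static size_t
clip_ranges(struct range dram, const struct range *bp, size_t nbp,
    struct range *out)
{
	uint64_t start = dram.start;
	size_t n = 0;

	assert(dram.start <= dram.end);
	for (size_t i = 0; i < nbp; i++) {
		/* Second fix: stop once the whole dram range is consumed. */
		if (start >= dram.end)
			break;
		/* First fix: only consider ranges ending after start. */
		if (bp[i].end <= start)
			continue;
		uint64_t s = bp[i].start > start ? bp[i].start : start;
		uint64_t e = bp[i].end < dram.end ? bp[i].end : dram.end;
		if (s >= e)	/* sorted input: nothing further overlaps */
			break;
		out[n].start = s;
		out[n].end = e;
		n++;
		start = e;
	}
	return n;
}
```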
 1.60 25-Mar-2021  skrll branches: 1.60.2; 1.60.6;
More debug
 1.59 25-Mar-2021  skrll Update a comment to reflect reality
 1.58 21-Mar-2021  skrll Adjust the kernel virtual address space so that KASAN will map the kernel
separately from managed kernel virtual memory and not map the unused space
between the two.
 1.57 21-Mar-2021  skrll Tweak a comment
 1.56 12-Dec-2020  skrll branches: 1.56.2;
Move evbarm/fdt/fdt_memory.[ch] to sys/dev/fdt and simplify the api and
some operations. This allows other architectures to use it.
 1.55 09-Dec-2020  skrll Remove unnecessary aarch64_dcache_wbinv_all now that pmapboot_enter does
dsb(ish)
 1.54 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.53 22-Oct-2020  skrll branches: 1.53.2;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.52 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.51 04-Oct-2020  skrll KNF
 1.50 03-Oct-2020  skrll G/C
 1.49 30-Sep-2020  skrll Improve a comment
 1.48 16-Sep-2020  skrll G/C AARCH64_KMEMORY_BASE
 1.47 16-Sep-2020  skrll Fix a comment
 1.46 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.45 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.44 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
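
Computing a system-wide minimum cache line size from per-CPU CTR_EL0 values can be sketched as follows. This is a hedged illustration, not the committed code: the function and macro names are hypothetical, only the IminLine (bits 3:0) and DminLine (bits 19:16) fields are merged, and the DIC/IDC bits mentioned above are left untouched. Both fields encode log2 of the line size in 4-byte words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_CTR_IMINLINE_SHIFT	0	/* CTR_EL0.IminLine, bits [3:0] */
#define SK_CTR_DMINLINE_SHIFT	16	/* CTR_EL0.DminLine, bits [19:16] */
#define SK_CTR_LINE_MASK	0xfULL

/* Replace the 4-bit field at `shift` in a with min(field of a, field of b). */
static uint64_t
ctr_field_min(uint64_t a, uint64_t b, unsigned shift)
{
	uint64_t fa = (a >> shift) & SK_CTR_LINE_MASK;
	uint64_t fb = (b >> shift) & SK_CTR_LINE_MASK;
	uint64_t lo = fa < fb ? fa : fb;

	return (a & ~(SK_CTR_LINE_MASK << shift)) | (lo << shift);
}

/* Merge per-CPU CTR_EL0 values into one value with minimum line sizes. */
static uint64_t
ctr_system_min(const uint64_t *ctr, size_t ncpu)
{
	assert(ncpu > 0);
	uint64_t v = ctr[0];

	for (size_t i = 1; i < ncpu; i++) {
		v = ctr_field_min(v, ctr[i], SK_CTR_IMINLINE_SHIFT);
		v = ctr_field_min(v, ctr[i], SK_CTR_DMINLINE_SHIFT);
	}
	return v;
}

/* Smallest data cache line size in bytes encoded in a CTR_EL0 value. */
static unsigned
ctr_dline_bytes(uint64_t ctr)
{
	return 4u << ((ctr >> SK_CTR_DMINLINE_SHIFT) & SK_CTR_LINE_MASK);
}
```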
 1.43 21-Jun-2020  jmcneill Add support for installing modules that were loaded by the bootloader.
 1.42 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.41 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.40 29-Feb-2020  ryo branches: 1.40.4;
add support userspace tagged address for aarch64 (experimental)

'sysctl machdep.tagged_address' to set/clear TCR_EL1.TBI0 to enable/disable address tagging.
with 'machdep.tagged_address=1', some syscalls may cause problems?
 1.39 29-Feb-2020  ryo replace KSEG pages mapping code with generic function pmapboot_enter_range()
 1.38 22-Jan-2020  skrll Fixup a comment
 1.37 08-Jan-2020  ryo branches: 1.37.2;
fix panic when modload.

>panic: kernel diagnostic assertion "!pmap_extract(pmap_kernel(), loopva, NULL)" failed: file "../../../../uvm/uvm_km.c", line 674 loopva=0xffffffc001000000'

The space allocated by bootpage_alloc() is only used as a physical page
for pagetable pages, so there is no need to map it with KVA.
And kernend_extra should not have consumed any KVA space.
 1.36 30-Dec-2019  skrll Flush the cache and disable TTBR0 translations once we're done with
them in cpu_kernel_vm_init

The Cortex A72s in RPI4 need the cache flush for some reason.
 1.35 18-Dec-2019  riastradh New function cpu_startup_hook on arm.

Called at end of cpu_startup. Can be defined in, e.g., evbarm to do
additional stuff after cpu_startup. Defined as a weak alias to a
function that does nothing, so optional.

ok jmcneill
 1.34 14-Nov-2019  maxv Mark several kASan functions with __nothing, to avoid annoying #ifdefs.
Same as kCSan and kMSan.
 1.33 14-Nov-2019  maxv Don't include "opt_kasan.h" when there's already <sys/asan.h> included.
 1.32 28-Sep-2019  skrll Whitespace
 1.31 11-Sep-2019  ryo L3 pages were used even if an L2 block could cover the range. Fix to use a larger block where possible (good enough).
Pointed out by jmcneill@. Thanks.
 1.30 09-Sep-2019  ryo use L1-L3 blocks/pages for KSEG mappings to fit dramblocks exactly.
r1.29 and this change avoid a cache over-prefetch problem (perhaps) with PMAP_MAP_POOLPAGE/KSEG on Cortex-A72, and are more stable on rockpro64.
 1.29 06-Sep-2019  jmcneill Do not assume that DRAM is linear when creating KSEG mappings. Instead,
create L2 blocks to cover all ranges specified in the memory map.
 1.28 27-Jan-2019  pgoyette branches: 1.28.4;
Merge the [pgoyette-compat] branch
 1.27 18-Jan-2019  skrll KNF
 1.26 27-Dec-2018  mrg redo the previous using ptoa(). also apply to another instance of
the same integer overflow, and now savecore actually does something
in the OD1K.
 1.25 27-Dec-2018  mrg avoid integer overflow when calculating the end address of a ram
block. fixes a bug when a PhysMem range covers more than 4GB.

with this, my OD1K (8GB ram) is almost able to properly coredump.
savecore finds the core, but can't read it properly.
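
The class of overflow described here can be reproduced in a few lines. This is a minimal sketch assuming a 4 KiB page size and a 32-bit page count; the function names and `SK_PAGE_SHIFT` are hypothetical and the exact types in the original code may differ. The fix widens the page count before shifting, which is the same idea as converting pages to bytes in a wide type.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumed 4 KiB pages */

/* Buggy variant: the shift is done in 32 bits and wraps at 4 GiB. */
static uint64_t
ram_end_overflowing(uint64_t start, uint32_t npages)
{
	uint32_t bytes = npages << SK_PAGE_SHIFT;	/* truncated */

	return start + bytes;
}

/* Fixed variant: widen to 64 bits before shifting. */
static uint64_t
ram_end_widened(uint64_t start, uint32_t npages)
{
	uint64_t bytes = (uint64_t)npages << SK_PAGE_SHIFT;

	assert(start <= UINT64_MAX - bytes);	/* no 64-bit wrap either */
	return start + bytes;
}
```

For an 8 GiB block (0x200000 pages) starting at 0x80000000, the first variant computes a zero length, while the second yields an end address of 0x280000000.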
 1.24 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.23 28-Nov-2018  ryo support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.
 1.22 20-Nov-2018  mrg rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.
 1.21 13-Nov-2018  jmcneill Physical end should be the end of the last range, not the first.
 1.20 10-Nov-2018  ryo need to invalidate TLBs after pmapboot_enter(). pmapboot_protect() requires KSEG mappings.
 1.19 09-Nov-2018  mrg implement dumpsys() and friends for arm64.

this is almost a direct copy of the arm code, which is simply
as the basic structures about physical memory are the same
between arm and arm64. the main change i made was to use
the direct map instead of a virtual dump page that is remapped
to whatever physical page is being dumped.

i also changed the existing cpu_kcore_hdr_t to include the
missing number of ram segments.

note that this is not a complete solution for crash dumps yet,
as the libkvm code needs some work. i'm fairly positive that
this side is correct, as i can see the data i expect to see,
but libkvm's _kvm_kvtop() function returns garbage so far.

there is no "minidump" support here yet, ala amd64, but we
probably want it eventually.


ok skrll@.
 1.18 01-Nov-2018  maxv Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.17 31-Oct-2018  jmcneill Implement parse_mi_bootargs for aarch64
 1.16 20-Oct-2018  ryo changes of r1.14 was incomplete. use bootconfig.dram[] to resolve valid memory range.
pmap(1) failed to access kvm on some environment.
 1.15 14-Oct-2018  skrll Use __nothing
 1.14 13-Oct-2018  ryo - define PMAP_{MAP,UNMAP}_POOLPAGE for performance
- define __HAVE_MM_MD_KERNACC and add mm_md_kernacc()
 1.13 12-Oct-2018  jmcneill Add optional ap_startup callback to struct arm_platform. This allows for
late (post-UVM init) initialization of platform specific stuff.
 1.12 04-Oct-2018  ryo cleanup locore, and changed the way to map memories during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt various layout
of physical memory map. especially for 64bit physical memory layout.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing page permission for kernel image, and making KSEG mapping are done
at cpu_kernel_vm_init() instead of at locore.
- optimize PTE entries with PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.11 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.10 24-Aug-2018  jmcneill Sprinkle __unused
 1.9 15-Aug-2018  ryo MODULAR support
 1.8 05-Aug-2018  skrll Refactor code to split aarch{32,64} kernel page tables and VM setup. This
will help re-build the kernel page tables on aarch64 with correct section
mappings.
 1.7 17-Jul-2018  christos Keep in .data by using a section attribute.
 1.6 17-Jul-2018  ryo kern_vtopdiff is stored in fdt_start.S, that is, before bss is cleared.
declare "kern_vtopdiff = 0" to keep it in the .data section.
 1.5 17-Jul-2018  christos remove unused variables, add missing casts
 1.4 23-May-2018  ryo branches: 1.4.2;
style
 1.3 03-May-2018  ryo add sysctl for machdep.{cpu_id,id_revidr,id_mvfr,id_mpidr,id_aa64isar,id_aa64mmfr,id_aa64pfr}
each corresponding to system registers MIDR_EL1, REVIDR_EL1, MVFR*_EL1, MPIDR_EL1, ID_AA64ISAR*_EL1, ID_AA64MMFR*_EL1, ID_AA64PFR*_EL1.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.9 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.8 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.7 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.6 20-Oct-2018  pgoyette Sync with head
 1.1.28.5 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.4 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.3 25-Jun-2018  pgoyette Sync with HEAD
 1.1.28.2 21-May-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file aarch64_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.3 21-Apr-2020  martin Sync with HEAD
 1.4.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.4.2.1 10-Jun-2019  christos Sync with HEAD
 1.28.4.5 19-Oct-2025  martin Pull up following revision(s) (requested by riastradh in ticket #60):

sys/arch/sparc/sparc/locore.s: revision 1.287
share/man/man9/Makefile: revision 1.475
sys/arch/mips/mips/cpu_subr.c: revision 1.65
sys/arch/mips/mips/cpu_subr.c: revision 1.66
sys/arch/amd64/amd64/cpufunc.S: revision 1.70
sys/arch/hppa/hppa/support.S: revision 1.9
sys/arch/alpha/alpha/locore.s: revision 1.145
share/man/man9/paravirt_membar_sync.9: revision 1.1
sys/arch/sparc64/sparc64/locore.s: revision 1.436
distrib/sets/lists/comp/mi: revision 1.2499
sys/arch/i386/i386/cpufunc.S: revision 1.54
sys/sys/paravirt_membar.h: revision 1.1
sys/arch/arm/arm/cpu_subr.c: revision 1.6
(all via patch)

paravirt_membar_sync(9): New memory barrier.

For use in paravirtualized drivers which require store-before-load
ordering -- irrespective of whether the kernel is built for a single
processor, or whether the (virtual) machine is booted with a single
processor.

This is even required on architectures that don't even have a
store-before-load ordering barrier, like m68k; adding, e.g., a virtio
bus is _as if_ the architecture has been extended with relaxed memory
ordering when talking with that new bus. Such architectures need
some way to request the hypervisor enforce that ordering -- on m68k,
that's done by issuing a CASL instruction, which qemu maps to an
atomic r/m/w with sequential consistency ordering in the host.

PR kern/59618: occasional virtio block device lock ups/hangs

mips: Fix asm arch options in new paravirt_membar_sync.
Need to explicitly enable mips2 (MIPS-II) instructions in order to
use sync. Fixes:
/tmp/ccxgOmXc.s: Assembler messages:
/tmp/ccxgOmXc.s:3576: Error: opcode not supported on this processor: mips1 (mips1) `sync'
--- cpu_subr.o ---
*** Failed target: cpu_subr.o

PR kern/59618: occasional virtio block device lock ups/hangs
 1.28.4.4 04-Jun-2021  martin Pull up following revision(s) (requested by skrll in ticket #1278):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.50
sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.60
sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.61

G/C

-

More debug

-

Two fixes for loading free pages into UVM

- Only consider a boot_physmem (inner loop) range that has its end
(bp_end) after the bootconfig.dram (outer loop) range start (start).
This was harmless as a later condition correctly checks there is only
something to do if start < bp_end.

- Stop processing boot_physmem ranges if all the bootconfig.dram range has
been passed to UVM. This fixes a boot problem for simon@
 1.28.4.3 12-Feb-2020  martin Pull up following revision(s) (requested by riastradh in ticket #705):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.35
sys/stand/efiboot/efifdt.c: revision 1.20
sys/stand/efiboot/efifdt.h: revision 1.7
sys/arch/aarch64/include/machdep.h: revision 1.9
sys/stand/efiboot/efiboot.h: revision 1.11
sys/arch/arm/arm32/arm32_machdep.c: revision 1.129
sys/arch/arm/include/arm32/machdep.h: revision 1.30
sys/stand/efiboot/exec.c: revision 1.12
sys/arch/evbarm/fdt/fdt_machdep.c: revision 1.65
sys/stand/efiboot/version: revision 1.14
sys/stand/efiboot/boot.c: revision 1.19

New function cpu_startup_hook on arm.

Called at end of cpu_startup. Can be defined in, e.g., evbarm to do
additional stuff after cpu_startup. Defined as a weak alias to a
function that does nothing, so optional.
ok jmcneill

Implement rndseed support in efiboot and fdt arm.

The EFI environment variable `rndseed' specifies the path to the
random seed. It is loaded only for fdt platforms at the moment.
Since the rndseed (an rndsave_t object as defined in <sys/rndio.h>)
is 536 bytes long (for hysterical raisins), and to avoid having to
erase parts of the fdt tree, we load it into a physical page whose
address is passed in the fdt tree, rather than passing the content of
the file as an fdt node directly; the kernel then reserves the page
from uvm, and maps it into kva to call rnd_seed.

For now, the only kernel that does use efiboot with fdt is evbarm,
which knows to handle the rndseed. Any new kernels that use efiboot
with fdt must do the same; otherwise uvm may hand out the page with
the secret key on it for a normal page allocation in the kernel --
which should be OK if there are no kernel memory disclosure bugs, but
would lead to worse consequences than simply loading the seed late in
userland with /etc/rc.d/random_seed otherwise.

ok jmcneill
 1.28.4.2 21-Jan-2020  martin Pull up following revision(s) (requested by ryo in ticket #617):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.37
sys/arch/aarch64/aarch64/locore.S: revision 1.50

fix panic when modload.

panic: kernel diagnostic assertion "!pmap_extract(pmap_kernel(), loopva, NULL)" failed: file "../../../../uvm/uvm_km.c", line 674 loopva=0xffffffc001000000'

The space allocated by bootpage_alloc() is only used as a physical page
for pagetable pages, so there is no need to map it with KVA.
And kernend_extra should not have consumed any KVA space.
 1.28.4.1 22-Sep-2019  martin Pull up following revision(s) (requested by ryo in ticket #215):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.30
sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.31
sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.29

Do not assume that DRAM is linear when creating KSEG mappings. Instead,
create L2 blocks to cover all ranges specified in the memory map.

-

use L1-L3 blocks/pages for KSEG mappings to fit dramblocks exactly.
r1.29 and this change avoid a cache over-prefetch problem (perhaps) with PMAP_MAP_POOLPAGE/KSEG on Cortex-A72, and are more stable on rockpro64.

-

L3 pages were used even if an L2 block could cover the range. Fix to use a larger block where possible (good enough).
Pointed out by jmcneill@. Thanks.
 1.37.2.1 25-Jan-2020  ad Sync with head.
 1.40.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.53.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.53.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.56.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.60.6.1 06-Jun-2021  cjep sync with head
 1.60.2.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.5 05-Mar-2024  thorpej Move the at-shutdown call to resettodr() from cpu_reboot() to kern_reboot().

It's a small step, but it's a step.
 1.4 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.3 09-Nov-2018  mrg branches: 1.3.12;
implement dumpsys() and friends for arm64.

this is almost a direct copy of the arm code, which is simply
as the basic structures about physical memory are the same
between arm and arm64. the main change i made was to use
the direct map instead of a virtual dump page that is remapped
to whatever physical page is being dumped.

i also changed the existing cpu_kcore_hdr_t to include the
missing number of ram segments.

note that this is not a complete solution for crash dumps yet,
as the libkvm code needs some work. i'm fairly positive that
this side is correct, as i can see the data i expect to see,
but libkvm's _kvm_kvtop() function returns garbage so far.

there is no "minidump" support here yet, ala amd64, but we
probably want it eventually.


ok skrll@.
 1.2 31-May-2018  mrg branches: 1.2.2;
docpureset() doesn't return anything, so mark it void.
(probably could also be __dead.)
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.2.3 25-Jun-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file aarch64_reboot.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.3.12.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.2 29-Oct-2021  skrll Fix length of memset in tlb_record_asids
 1.1 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file bus_dma.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.18 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.17 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.16 14-Apr-2021  ryo Fix the problem "pcictl pci0 list" causes "panic: trap_el1h_error" on rockpro64.

The panic occurs in bus_space_barrier() in rk3399_pcie.c:rkpcie_conf_read().
We expected bus_space_peek_4() to trap and recover in the path
trap_el1h_sync() -> data_abort_handler(), but in fact, the read is delayed
until bus_space_barrier(), and we get an SError interrupt (trap_el1h_error)
instead of a Synchronous Exception (trap_el1h_sync).

To catch this correctly, an implicit barrier has been added to bus_space_peek,
and the SError interrupt is now trapped so that we can recover from it.
 1.15 14-Dec-2020  skrll branches: 1.15.2;
Add a note about completion vs ordering barrier as well.
 1.14 14-Dec-2020  skrll Add a big comment in generic_bs_barrier about mappings and what barriers
are really required and why we cheat. Inspired by a similar comment in
x86/bus_space.c
 1.13 14-Dec-2020  jmcneill Use full system DSB ops for bs barrier.
 1.12 14-Dec-2020  jmcneill The bus_space(9) man page is not clear whether barriers should enforce
ordering or completion. To be safe, use dsb here instead of dmb.
 1.11 15-Oct-2020  jmcneill branches: 1.11.2;
Reduce scope of memory barriers use in bus_space_barrier() implementation.

Instead of always "dsb sy", use "dsb ishld" for reads, "dsb ishst" for
writes, and "dsb ish" for reads and writes.

Ok skrll@
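
The flag-to-barrier selection described in revision 1.11 can be modelled as a small lookup. This is only a sketch of the wording of that commit message, not the kernel code: the flag values and the function name are assumptions made for illustration, and the combined case is taken to be the inner-shareable full barrier "dsb ish". Revisions 1.12 and 1.13 in this log changed the barrier again.

```python
# Assumed flag values for this sketch; check the bus_space headers for the real ones.
BUS_SPACE_BARRIER_READ = 0x01
BUS_SPACE_BARRIER_WRITE = 0x02


def barrier_for(flags):
    """Return the barrier named in the 1.11 commit message for a flag set."""
    rw = flags & (BUS_SPACE_BARRIER_READ | BUS_SPACE_BARRIER_WRITE)
    if rw == BUS_SPACE_BARRIER_READ:
        return "dsb ishld"   # order earlier loads only
    if rw == BUS_SPACE_BARRIER_WRITE:
        return "dsb ishst"   # order earlier stores only
    # Reads and writes together (or no flag at all): use the widest of the three.
    return "dsb ish"


print(barrier_for(BUS_SPACE_BARRIER_READ))
print(barrier_for(BUS_SPACE_BARRIER_READ | BUS_SPACE_BARRIER_WRITE))
```

Falling back to the full barrier when no flag is set is a conservative choice of this sketch, not something the commit message states.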
 1.10 05-Sep-2020  jakllsch Adjust aarch64 bus_space tags to also work on aarch64eb
 1.9 28-Dec-2019  jmcneill Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.8 27-Jan-2019  pgoyette branches: 1.8.4;
Merge the [pgoyette-compat] branch
 1.7 19-Nov-2018  jmcneill On second thought, get rid of "bs_base" from struct bus_space and use a
custom bs_map for acpipchb instead.
 1.6 18-Nov-2018  jmcneill Add a "bs_base" field to struct bus_space. If present, use it to translate
mappings by appending the value to the pa passed to bus_space_map.
 1.5 16-Jun-2018  jmcneill branches: 1.5.2;
initialize bs_cookie for generic_dsb tags
 1.4 08-Jun-2018  jmcneill Provide bs_mmap implementations for bcm283x based boards.

PR: port-arm/53283
Submitted by: Nick Hudson
 1.3 09-Apr-2018  jmcneill Fix encoding of MMAP flags for generic_bs_mmap
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.3 25-Jun-2018  pgoyette Sync with HEAD
 1.1.28.2 16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file bus_space.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.5.2.1 10-Jun-2019  christos Sync with HEAD
 1.8.4.1 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.11.2.2 03-Jan-2021  thorpej Sync w/ HEAD.
 1.11.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.15.2.1 17-Apr-2021  thorpej Sync with HEAD.
 1.7 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.6 12-Jan-2023  ryo fixed a bug that bus_space_read_region_{2,4,8}_swap() accesses wrong address.
 1.5 12-Nov-2020  jmcneill branches: 1.5.18;
Fix typo in comment
 1.4 24-Sep-2020  ryo branches: 1.4.2;
fix *_bs_rm_4_swap(). it was only reading 2 bytes, not 4 bytes.

pointed out by skrll@ thanks.
 1.3 24-Sep-2020  ryo fix bugs in *_bs_rm_8_swap(). it was only reading 4 bytes, not 8 bytes.
 1.2 13-Jan-2020  ryo Fix mis-incrementing pointer size in bus_space_read_region_{4,8}

pointed out by jmcneill@. thanks.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4; 1.1.8; 1.1.10;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.10.1 17-Jan-2020  ad Sync with head.
 1.1.8.1 21-Jan-2020  martin Pull up following revision(s) (requested by ryo in ticket #624):

sys/arch/aarch64/aarch64/bus_space_asm_generic.S: revision 1.2

Fix mis-incrementing pointer size in bus_space_read_region_{4,8}
pointed out by jmcneill@. thanks.
 1.1.4.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file bus_space_asm_generic.S was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.4.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.5.18.1 12-Jan-2023  martin Pull up following revision(s) (requested by ryo in ticket #44):

sys/arch/aarch64/aarch64/bus_space_asm_generic.S: revision 1.6

fixed a bug that bus_space_read_region_{2,4,8}_swap() accesses wrong address.
 1.2 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file bus_space_notimpl.S was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.2 24-Jul-2021  jmcneill aarch64: Remove empty source file and references to it.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.46;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.46.1 01-Aug-2021  thorpej Sync with HEAD.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cctr_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.16 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.15 12-Aug-2020  skrll branches: 1.15.2;
Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.14 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.13 03-Aug-2020  ryo Implement MD ucas(9) (__HAVE_UCAS_FULL)
 1.12 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.11 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.10 30-Jun-2020  maxv Make copystr() a MI C function, part of libkern and shared on all
architectures.

Notes:

- On alpha and ia64 the function is kept but gets renamed locally to avoid
symbol collision. This is because on these two arches, I am not sure
whether the ASM callers do not rely on fixed registers, so I prefer to
keep the ASM body for now.
- On Vax, only the symbol is removed, because the body is used from other
functions.
- On RISC-V, this change fixes a bug: copystr() was just a wrapper around
strlcpy(), but strlcpy() makes the operation less safe (strlen on the
source beyond its size).
- The kASan, kCSan and kMSan wrappers are removed, because now that
copystr() is in C, the compiler transformations are applied to it,
without the need for manual wrappers.

Could test on amd64 only, but should be fine.
 1.9 14-Sep-2018  ryo change copystr() to asm so that we don't have to add __noasan.
Also copyinout.S is the right place for copystr().
 1.8 10-Sep-2018  ryo changed kcopy() to asm to avoid memcpy() being replaced with kasan_memcpy() when KASAN is defined.
 1.7 30-Jul-2018  ryo fix copy{in,out}str to return ENAMETOOLONG if the string is longer than len bytes.
 1.6 24-Jul-2018  ryo copy(9) had returned -1 if a bad address is encountered. fix to return EFAULT in that case.
 1.5 17-Jul-2018  christos centralize fp,lr definitions
 1.4 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 16-Aug-2017  nisimura branches: 1.2.2;
retire copyinout.S and fusu.S
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file copyinout.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.4 30-Sep-2018  pgoyette Sync with HEAD
 1.2.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.15.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.6 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.5 20-Nov-2019  pgoyette Move all non-emulation-specific coredump code into the coredump module,
and remove all #ifdef COREDUMP conditional compilation. Now, the
coredump module is completely separated from the emulation modules, and
they can all be independently loaded and unloaded.

Welcome to 9.99.18 !
 1.4 01-Apr-2018  ryo branches: 1.4.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.3 25-Feb-2018  skrll branches: 1.3.2;
KNF
 1.2 25-Feb-2018  skrll Use correct MID_ value
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file core_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.4.2.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.83 31-Jan-2025  jmcneill aarch64: Identify Cortex-A520, Cortex-A720 cores
 1.82 09-Dec-2024  jmcneill arm64: Identify Qualcomm Oryon CPUs
 1.81 07-Oct-2024  jakllsch CPU ID strings for Arm Cortex-A710, Neoverse V1, Neoverse N2, and Fujitsu A64FX
 1.80 27-Sep-2024  jakllsch refine previous
 1.79 27-Sep-2024  jakllsch Add Ampere 1 and 1A CPU IDs
 1.78 10-Aug-2024  riastradh aarch64: Count RNDRRS failure events and add dtrace probe.

PR port-arm/58572: aarch64 RNDRRS failures should be evcounted and
dtraced
 1.77 30-Jun-2024  jmcneill aarch64: print NUMA ID
 1.76 09-May-2024  pho branches: 1.76.2;
kern/58195: arm: Support drvctl -d and -r for cpufeaturebus

This is required for detaching and re-attaching the vmt(4) driver on aarch64.
 1.75 09-May-2024  pho port-arm/58194: Resurrect vmt(4) from bitrot

On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.

Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.
 1.74 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.73 03-Feb-2023  skrll Remove useless/harmful casts in debug messages. MPIDR AFF3 would not
be printed before.
 1.72 22-Dec-2022  ryo PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0
 1.71 22-Dec-2022  ryo Explicitly disable overflow interrupts before enabling the cycle counter.
 1.70 29-May-2022  ryo branches: 1.70.4;
fix build without options DDB
 1.69 03-Mar-2022  riastradh arm: Use device_set_private for cpuN.

For cpu at fdt, nix the fdt softc -- this was leaked and never used
for anything. The device's private storage is the cpu_info.
 1.68 12-Nov-2021  skrll Print a big warning about trying to run on early ThunderX parts
 1.67 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.66 30-Oct-2021  skrll G/C MD_CPU_HATCH. It's old evbarm (<= armv7)
 1.65 30-Oct-2021  skrll style. NFCI.
 1.64 17-Oct-2021  skrll Remove some newlines
 1.63 10-Oct-2021  skrll Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.
 1.62 04-Oct-2021  skrll Add a KASSERT
 1.61 30-Aug-2021  jmcneill Identify Apple M1 "Icestorm" and "Firestorm" CPU types.
 1.60 19-Jun-2021  jmcneill Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.
 1.59 09-Mar-2021  ryo branches: 1.59.4;
Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.58 11-Jan-2021  skrll Improve a comment
 1.57 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.56 10-Oct-2020  jmcneill branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.
 1.55 07-Oct-2020  jmcneill Only touch PMC registers if Performance Monitor Extensions are present.
 1.54 25-Jul-2020  riastradh Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.
 1.53 25-Jul-2020  riastradh Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.
 1.52 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
 1.51 01-Jul-2020  ryo Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
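
The three rules in revision 1.51 amount to a small decision function. The sketch below only models which steps are still needed; the function and step names are illustrative, and the bit positions (CTR_EL0.IDC bit 28, CTR_EL0.DIC bit 29, CLIDR_EL1.LoUIS [23:21], LoC [26:24], LoUU [29:27]) are taken from the Arm architecture manual rather than from this log.

```python
CTR_EL0_IDC = 1 << 28  # Dcache clean not required before Icache invalidation
CTR_EL0_DIC = 1 << 29  # Icache invalidation not required


def field(value, hi, lo):
    """Extract the bit field value[hi:lo]."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def icache_sync_steps(ctr_el0, clidr_el1):
    """Return the cache maintenance steps still required for an Icache sync."""
    louis = field(clidr_el1, 23, 21)
    loc = field(clidr_el1, 26, 24)
    louu = field(clidr_el1, 29, 27)

    # Dcache clean is skipped if IDC is set, or if the CLIDR levels say so.
    clidr_says_no_clean = loc == 0 or (louis == 0 and louu == 0)
    need_dclean = not (ctr_el0 & CTR_EL0_IDC) and not clidr_says_no_clean
    need_iinval = not (ctr_el0 & CTR_EL0_DIC)

    steps = []
    if need_dclean:
        steps.append("dcache clean")
    if need_iinval:
        steps.append("icache invalidate")
    return steps


# Hypothetical CLIDR_EL1 with LoUIS=1, LoC=2, LoUU=1.
clidr = (1 << 21) | (2 << 24) | (1 << 27)
print(icache_sync_steps(0, clidr))
print(icache_sync_steps(CTR_EL0_IDC | CTR_EL0_DIC, clidr))
```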
 1.50 29-Jun-2020  riastradh New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.
 1.49 29-Jun-2020  riastradh Implement AES in kernel using ARMv8.0-AES on aarch64.
 1.48 29-Jun-2020  riastradh Draft fpu_kern_enter/leave on aarch64.
 1.47 14-Jun-2020  riastradh Add some more id_aa64pfr0_el1 bits.
 1.46 30-May-2020  jmcneill sctlr_el1 and ctr_el0 are 64-bit registers
 1.45 11-May-2020  riastradh Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick
 1.44 10-May-2020  riastradh Print RNDR support in verbose CPU feature identification.
 1.43 05-Apr-2020  jmcneill Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.
 1.42 30-Mar-2020  jmcneill Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.
 1.41 15-Feb-2020  skrll Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}
 1.40 09-Feb-2020  skrll #if 0 / #endif -> a comment
 1.39 28-Jan-2020  maxv Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.
 1.38 27-Jan-2020  skrll NVIDIA's breakaway marketing dept have been in touch.
 1.37 27-Jan-2020  skrll Identify the Denver2 CPU in the Nvidia TX2
 1.36 25-Jan-2020  skrll Trailing whitespace
 1.35 20-Jan-2020  skrll KNF
 1.34 15-Jan-2020  mrg port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.33 12-Jan-2020  mrg provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
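
The selection rule in revision 1.33 (every CPU that is not in the fastest "capacity-dmips-mhz" set is flagged slow) can be sketched as follows. This is an illustration only: the function name and the capacity values are hypothetical, and the sketch sees all values up front, whereas the commit describes tracking the fastest set while CPUs attach.

```python
def mark_slow_cpus(capacities):
    """Map each CPU name to True if its capacity is below the fastest one."""
    fastest = max(capacities.values())
    return {cpu: cap < fastest for cpu, cap in capacities.items()}


# Made-up capacity-dmips-mhz values for a 4 little + 2 big system.
caps = {
    "cpu0": 485, "cpu1": 485, "cpu2": 485, "cpu3": 485,
    "cpu4": 1024, "cpu5": 1024,
}
slow = mark_slow_cpus(caps)
print(slow)
```

With these inputs only cpu4 and cpu5 stay unflagged, which is the set the scheduler would prefer for long-running processes.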
 1.32 09-Jan-2020  martin When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.
 1.31 28-Dec-2019  jmcneill branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.
 1.30 27-Dec-2019  mlelstv Fix build.
 1.29 27-Dec-2019  skrll Add a missing newline
 1.28 21-Dec-2019  ad Fix build break (ci->ci_dev is not available on every port).
 1.27 20-Dec-2019  ad Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.
 1.26 22-Nov-2019  mlelstv Make cache operations available early.
 1.25 20-Oct-2019  jmcneill Use separate cacheline aligned arrays for mbox and hatched as before.
 1.24 20-Oct-2019  jmcneill Invalidate dcache before polling AP hatched status
 1.23 19-Oct-2019  jmcneill Increase aarch64 MAXCPUS to 256.
 1.22 14-Oct-2019  jmcneill Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)
 1.21 15-Sep-2019  tnn report A72 errata #859971 workaround status during boot
 1.20 16-Jul-2019  jmcneill branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID
 1.19 16-Jul-2019  jmcneill Add Ampere eMAG 8180 cpuid
 1.18 19-Jun-2019  mrg add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.
 1.17 09-May-2019  mrg add cortex A-76 detection.
 1.16 21-Jan-2019  skrll Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC
 1.15 21-Dec-2018  ryo - add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)
 1.14 28-Nov-2018  ryo support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.
 1.13 20-Nov-2018  mrg rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.
 1.12 14-Oct-2018  skrll Use __nothing
 1.11 04-Oct-2018  ryo remove XXX delay to attach cpus in order
 1.10 03-Oct-2018  skrll Another space that hurts Jared's eyes.
 1.9 03-Oct-2018  skrll Fix some product names and details as suggested by jmcneill
 1.8 03-Oct-2018  skrll Identify some Cavium ThunderX CPUs
 1.7 10-Sep-2018  ryo cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.
 1.6 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!
 1.5 20-Aug-2018  jmcneill Use __SHIFTOUT to extract MPIDR affinity levels
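
A mask-based extractor in the style of __SHIFTOUT can be modelled in a few lines. This is a sketch, not the kernel macro: the Python helper names are illustrative, and the MPIDR_EL1 affinity masks (Aff0 [7:0], Aff1 [15:8], Aff2 [23:16], Aff3 [39:32]) come from the Arm architecture manual. It also shows why truncating MPIDR to 32 bits loses Aff3, the problem revision 1.73 of this file addresses.

```python
def lowest_set_bit(mask):
    """Isolate the lowest set bit of a mask."""
    return mask & -mask


def shiftout(x, mask):
    """Extract the field selected by mask, shifted down to bit 0."""
    return (x & mask) // lowest_set_bit(mask)


MPIDR_AFF0 = 0x00000000FF
MPIDR_AFF1 = 0x000000FF00
MPIDR_AFF2 = 0x0000FF0000
MPIDR_AFF3 = 0xFF00000000


def affinity(mpidr):
    """Return (Aff3, Aff2, Aff1, Aff0) for an MPIDR_EL1 value."""
    masks = (MPIDR_AFF3, MPIDR_AFF2, MPIDR_AFF1, MPIDR_AFF0)
    return tuple(shiftout(mpidr, m) for m in masks)


mpidr = 0x0000000100020301            # hypothetical value: Aff3=1, Aff2=2, Aff1=3, Aff0=1
print(affinity(mpidr))
print(affinity(mpidr & 0xFFFFFFFF))   # a 32-bit cast drops Aff3
```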
 1.4 31-Jul-2018  skrll Define and use VPRINTF
 1.3 17-Jul-2018  christos add default statements, use PRI?64 instead of ll?
 1.2 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.9 26-Jan-2019  pgoyette Sync with HEAD
 1.1.2.8 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.7 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.2.6 20-Oct-2018  pgoyette Sync with head
 1.1.2.5 30-Sep-2018  pgoyette Sync with HEAD
 1.1.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file cpu.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.20.2.3 09-Jan-2020  snj Pull up following revision(s) (requested by martin in ticket #614):

sys/arch/aarch64/aarch64/cpu.c: 1.32
sys/arch/arm/arm32/cpu.c: 1.138
sys/dev/fdt/fdtbus.c: 1.31

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.
 1.20.2.2 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #589):

sys/arch/arm/include/cputypes.h: revision 1.11
sys/arch/aarch64/aarch64/cpu.c: revision 1.31

Identify Arm Neoverse E1 and N1 CPUs.
 1.20.2.1 23-Oct-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #359):

sys/arch/aarch64/aarch64/locore.S: revision 1.42
sys/arch/aarch64/aarch64/locore.S: revision 1.43
sys/arch/aarch64/aarch64/locore.S: revision 1.44
sys/arch/arm/fdt/cpu_fdt.c: revision 1.28
sys/arch/aarch64/include/cpu.h: revision 1.14
sys/arch/aarch64/include/param.h: revision 1.12
sys/arch/arm/arm32/cpu.c: revision 1.133
sys/arch/arm/arm32/cpu.c: revision 1.134
sys/arch/arm/include/cpu.h: revision 1.101
sys/arch/arm/acpi/cpu_acpi.c: revision 1.7
sys/arch/aarch64/aarch64/cpu.c: revision 1.23
sys/arch/aarch64/aarch64/cpu.c: revision 1.24
sys/arch/aarch64/aarch64/cpu.c: revision 1.25

Increase aarch64 MAXCPUS to 256.

-

Invalidate dcache before polling AP hatched status

-

Avoid overlap between BP and last AP stack. AP stacks are now in order of
increasing address order.

Spotted by and idea from mlelstv.

-

Use separate cacheline aligned arrays for mbox and hatched as before.

-

cpu_hatched_p only for MULTIPROCESSOR
 1.31.2.3 29-Feb-2020  ad Sync with head.
 1.31.2.2 25-Jan-2020  ad Sync with head.
 1.31.2.1 17-Jan-2020  ad Sync with head.
 1.56.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.56.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.59.4.1 01-Aug-2021  thorpej Sync with HEAD.
 1.70.4.4 13-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #955):

sys/arch/aarch64/aarch64/cpu.c: revision 1.78
sys/arch/aarch64/include/cpu.h: revision 1.51

aarch64: Count RNDRRS failure events and add dtrace probe.

PR port-arm/58572: aarch64 RNDRRS failures should be evcounted and
dtraced
 1.70.4.3 03-Oct-2024  martin Pull up following revision(s) (requested by jakllsch in ticket #922):

sys/arch/aarch64/aarch64/cpu.c: revision 1.79
sys/arch/arm/include/cputypes.h: revision 1.17
usr.sbin/cpuctl/arch/aarch64.c: revision 1.24
sys/arch/aarch64/aarch64/cpu.c: revision 1.80

Add Ampere 1 and 1A CPU IDs

refine previous
add Ampere 1 and 1A
 1.70.4.2 20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #869):

usr.sbin/cpuctl/arch/aarch64.c: revision 1.22
sys/arch/aarch64/aarch64/cpu.c: revision 1.73

Remove useless/harmful casts in debug messages. MPIDR AFF3 would not
be printed before.

MPIDR is 64bits. Without this AFF3 would always be zero.
Spotted by Cyprien.
 1.70.4.1 23-Dec-2022  martin Pull up following revision(s) (requested by ryo in ticket #20):

sys/arch/arm/arm/cpufunc.c: revision 1.185
sys/dev/tprof/tprof.c: revision 1.22
sys/arch/arm/arm32/arm32_boot.c: revision 1.45
sys/dev/tprof/tprof_armv8.c: revision 1.19
sys/dev/tprof/tprof_armv7.c: revision 1.12
sys/arch/aarch64/aarch64/cpu.c: revision 1.71
sys/arch/aarch64/aarch64/cpu.c: revision 1.72

tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops

Explicitly disable overflow interrupts before enabling the cycle counter.

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0

Even if an overflow interrupt occurs for a counter outside tprof management,
the corresponding bit of the overflow status register must be cleared to prevent an interrupt storm.
 1.76.2.2 02-Aug-2025  perseant Sync with HEAD
 1.76.2.1 01-Jul-2024  perseant Sync with HEAD.
 1.2 11-Feb-2020  riastradh Delete aarch64 cpu_in_cksum.S draft.

This isn't actually used in the kernel; it is only used to cause the
in_cksum tests to fail.

If you want to revive it and make it work, you can pull it out of the
attic.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.30; 1.1.36;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.36.1 29-Feb-2020  ad Sync with head.
 1.1.30.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cpu_in_cksum.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.16 30-Dec-2024  jmcneill aarch64: Allow for alternate cpu_idle() implementations
 1.15 14-Apr-2024  skrll branches: 1.15.2;
kern/58149: aarch64: Cannot return from a signal handler if SP was misaligned when the signal arrived

Apply the kernel diff from the PR

1. sendsig_siginfo() previously assumed that user SP was always aligned to
16 bytes and could call signal handlers with SP misaligned. This is a
wrong assumption because aarch64 demands that SP is aligned *only while*
it's being used to access memory. Now it properly aligns it before
pushing anything on the stack.

2. cpu_mcontext_validate() used to check if _REG_SP was aligned and
considered the ucontext invalid otherwise. This meant if a signal was
sent to a process whose SP was misaligned, the signal handler would fail
to return because the ucontext passed from the kernel was an invalid
one. Now setcontext(2) doesn't complain about misaligned SP.
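
Point 1 can be illustrated with a small sketch. It is not the sendsig_siginfo() code: STACK_ALIGN, align_down() and push_frame() are hypothetical names, and the only assumption taken from the message is the 16-byte alignment requirement for SP when it is used to access memory.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_ALIGN	16u	/* hypothetical name for the 16-byte requirement */

/* Round an address down to the stack alignment. */
static inline uintptr_t
align_down(uintptr_t addr)
{
	return addr & ~(uintptr_t)(STACK_ALIGN - 1);
}

/*
 * Reserve len bytes below a user SP that may be misaligned.
 * The SP is aligned first, so nothing is ever stored through a
 * misaligned stack pointer; the result is aligned again because
 * len need not be a multiple of the alignment.
 */
static inline uintptr_t
push_frame(uintptr_t sp, size_t len)
{
	sp = align_down(sp);
	sp -= len;
	return align_down(sp);
}
```

For example, push_frame(0x7fff1009, 40) yields 0x7fff0fd0, which is 16-byte aligned even though the incoming SP was not.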
 1.14 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.13 28-Jul-2022  riastradh branches: 1.13.4;
aarch64: Refactor splhigh and restore in dosoftints.

No functional change intended. splhigh always returns ci->ci_cpl,
which should not be changing at this point. Makes the bracketing by
splhigh/splx clearer.
 1.12 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.11 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.10 21-May-2020  ryo Fix a problem where setcontext(2) sometimes fails on ARMv8.5-BTI CPUs.

getcontext(2) now always returns 0 for SPSR.BTYPE.
A non-zero SPSR.BTYPE cannot be set with setcontext(2).
 1.9 01-May-2020  tnn aarch64: handle _UC_SETSTACK and _UC_CLRSTACK like on arm32

ok ryo@
 1.8 23-Nov-2019  ad cpu_need_resched():

- Remove all code that should be MI, leaving the bare minimum under arch/.
- Make the required actions very explicit.
- Pass in LWP pointer for convenience.
- When a trap is required on another CPU, have the IPI set it locally.
- Expunge cpu_did_resched().
 1.7 21-Nov-2019  ad mi_userret(): take care of calling preempt(), set spc_curpriority directly,
and remove MD code that does the same.
 1.6 03-Aug-2018  ryo branches: 1.6.4;
don't set lwp->l_private if no _UC_TLSBASE flag.
atf lib/libc/sys/t_swapcontext Passed.
 1.5 17-Jul-2018  christos add missing cast
 1.4 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 14-Apr-2015  jmcneill branches: 1.2.16;
__HAVE_PREEEMPTION -> __HAVE_PREEMPTION
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 06-Jun-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cpu_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.16.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.2.16.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.16.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.6.4.1 02-May-2020  martin Pull up following revision(s) (requested by tnn in ticket #884):
sys/arch/aarch64/aarch64/sig_machdep.c: revision 1.5
sys/arch/aarch64/aarch64/cpu_machdep.c: revision 1.9
aarch64: handle _UC_SETSTACK and _UC_CLRSTACK like on arm32
ok ryo@
 1.13.4.1 18-Apr-2024  martin Pull up following revision(s) (requested by skrll in ticket #667):

sys/arch/aarch64/aarch64/sig_machdep.c: revision 1.9
sys/arch/aarch64/aarch64/cpu_machdep.c: revision 1.15

kern/58149: aarch64: Cannot return from a signal handler if SP was
misaligned when the signal arrived

Apply the kernel diff from the PR
1. sendsig_siginfo() previously assumed that user SP was always aligned to
16 bytes and could call signal handlers with SP misaligned. This is a
wrong assumption because aarch64 demands that SP is aligned *only while*
it's being used to access memory. Now it properly aligns it before
pushing anything on the stack.
2. cpu_mcontext_validate() used to check if _REG_SP was aligned and
considered the ucontext invalid otherwise. This meant if a signal was
sent to a process whose SP was misaligned, the signal handler would fail
to return because the ucontext passed from the kernel was an invalid
one. Now setcontext(2) doesn't complain about misaligned SP.
 1.15.2.1 02-Aug-2025  perseant Sync with HEAD
 1.36 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.35 10-Jun-2023  skrll KASSERT(kpreempt_disabled()) before accessing curcpu() to reflect why
preemption needs to be disabled more clearly.
 1.34 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.33 31-Jan-2022  ryo Add support for hardware updates to the Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
 1.32 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.31 31-Oct-2021  skrll Sprinkle some const
 1.30 23-Oct-2021  skrll Shrink a panic message to avoid a long line
 1.29 23-Oct-2021  skrll Remove unnecessary brackets in a conditional
 1.28 23-Sep-2021  skrll Print the cache information in similar formats on arm and aarch64, e.g.

arm before
[ 1.0000000] cpu0: 32KB/64B 2-way L1 PIPT Instruction cache
[ 1.0000000] cpu0: 32KB/64B 2-way write-back-locking-C L1 PIPT Data cache
[ 1.0000000] cpu0: 2304KB/64B 16-way write-through L2 PIPT Unified cache

arm after
[ 1.0000000] cpu0: L1 32KB/64B 2-way (256 set) PIPT Instruction cache
[ 1.0000000] cpu0: L1 32KB/64B 2-way (256 set) write-back-locking-C PIPT Data cache
[ 1.0000000] cpu0: L2 2304KB/64B 16-way (2304 set) write-through PIPT Unified cache

aarch64 before
[ 1.0000030] cpu1: L1 48KB/64B*256L*3W PIPT Instruction cache
[ 1.0000030] cpu1: L1 32KB/64B*256L*2W PIPT Data cache
[ 1.0000030] cpu1: L2 2048KB/64B*2048L*16W PIPT Unified cache

aarch64 after
[ 1.0000030] cpu1: L1 48KB/64B 3-way (256 set) PIPT Instruction cache
[ 1.0000030] cpu1: L1 32KB/64B 2-way (256 set) PIPT Data cache
[ 1.0000030] cpu1: L2 2048KB/64B 16-way (2048 set) PIPT Unified cache
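
The set counts in the new format follow from size = line size x ways x sets. A minimal sketch of that arithmetic, using a hypothetical helper name rather than anything from the kernel:

```c
#include <stdint.h>

/*
 * Number of sets for a cache of size_bytes with the given line size
 * (in bytes) and associativity: sets = size / (line * ways).
 */
static inline uint32_t
cache_sets(uint32_t size_bytes, uint32_t line_bytes, uint32_t ways)
{
	return size_bytes / (line_bytes * ways);
}
```

Applied to the lines above, a 48KB/64B 3-way cache has 256 sets, a 2048KB/64B 16-way cache has 2048 sets, and the arm 2304KB/64B 16-way L2 has 2304 sets.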
 1.27 11-Jan-2021  skrll Small simplification
 1.26 22-Oct-2020  skrll branches: 1.26.2;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.25 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.24 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.23 04-Jul-2020  rin Fix previous; add missing <uvm/uvm.h> include.
 1.22 04-Jul-2020  rin Fix uvmexp.ncolors for some big.LITTLE configuration; it is uncertain
which CPU is used as primary, and as a result, secondary CPUs can
require larger number of colors.

In order to solve this problem, update uvmexp.ncolors via
uvm_page_recolor(9) when secondary CPUs are attached, as done for
other ports like x86.

Pointed out by jmcneill@, and discussed on port-arm@:
http://mail-index.netbsd.org/port-arm/2020/07/03/msg006837.html

Tested and OK'd by ryo@.
 1.21 01-Jul-2020  ryo Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
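
The CTR_EL0 part of the rules above can be sketched as a small decision function. This is an illustration only: the struct and function names are hypothetical, the CLIDR_EL1 conditions are deliberately left out, and the bit positions (IDC = bit 28, DIC = bit 29) are taken from the Arm architecture description of CTR_EL0.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTR_EL0_IDC	(UINT64_C(1) << 28)	/* Dcache clean not required */
#define CTR_EL0_DIC	(UINT64_C(1) << 29)	/* Icache invalidate not required */

/* Hypothetical result type: which maintenance steps are still needed. */
struct icache_sync_plan {
	bool need_dcache_clean;
	bool need_icache_inval;
};

/* Decide the minimum work for an Icache sync from a CTR_EL0 value. */
static inline struct icache_sync_plan
plan_icache_sync(uint64_t ctr_el0)
{
	struct icache_sync_plan plan;

	plan.need_dcache_clean = (ctr_el0 & CTR_EL0_IDC) == 0;
	plan.need_icache_inval = (ctr_el0 & CTR_EL0_DIC) == 0;
	return plan;
}
```

A CPU reporting both flags needs neither step, while a CPU reporting neither needs the full clean-then-invalidate sequence.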
 1.20 25-May-2020  ryo cache information can be detected correctly on newer CPUs

- add VPIPT cache type
- adapt to 64-bit CCSIDR (ARMv8.3-CCIDX)
- CCSIDR:[WT,WB,PA,WA] are deprecated
- show number of cache lines when attaching cpu
 1.19 23-May-2020  ryo Not only the kernel thread keys, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
Userland programs are able to perform pointer authentication on ARMv8.3+PAC CPUs.

reviewed by maxv@, thanks.
 1.18 15-May-2020  ryo SCTLR_EnIA should be enabled in the caller(locore).

For some reason, gcc makes the aarch64_pac_init() function non-leaf, and it uses paciasp/autiasp.
 1.17 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.16 05-Apr-2020  jmcneill branches: 1.16.2;
Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.
 1.15 15-Jan-2020  mrg port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.14 12-Jan-2020  mrg provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
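
The "track the fastest set, mark the rest slow" rule can be sketched like this. It is a simplified two-pass illustration, not the attach-time code: struct cpu_cap and mark_slow_cpus() are hypothetical, and the capacity field stands in for the FDT "capacity-dmips-mhz" value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; capacity mirrors "capacity-dmips-mhz". */
struct cpu_cap {
	uint32_t capacity;
	bool is_slow;
};

/*
 * Find the highest capacity among all CPUs, then flag every CPU whose
 * capacity is below it as slow. CPUs in the fastest set stay unflagged.
 */
static void
mark_slow_cpus(struct cpu_cap *cpus, size_t ncpu)
{
	uint32_t best = 0;
	size_t i;

	for (i = 0; i < ncpu; i++) {
		if (cpus[i].capacity > best)
			best = cpus[i].capacity;
	}
	for (i = 0; i < ncpu; i++)
		cpus[i].is_slow = cpus[i].capacity < best;
}
```

On a system where all CPUs report the same capacity, no CPU is marked slow.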
 1.13 09-Jan-2020  ad - Many small tweaks to the SMT awareness in the scheduler. It does a much
better job now at keeping all physical CPUs busy, while using the extra
threads to help out. In particular, during preempt() if we're using SMT,
try to find a better CPU to run on and teleport curlwp there.

- Change the CPU topology stuff so it can work on asymmetric systems. This
mainly entails rearranging one of the CPU lists so it makes sense in all
configurations.

- Add a parameter to cpu_topology_set() to note that a CPU is "slow", for
where there are fast CPUs and slow CPUs, like with the Rockchip RK3399.
Extend the SMT awareness to try and handle that situation too (keep fast
CPUs busy, use slow CPUs as helpers).
 1.12 20-Dec-2019  ad branches: 1.12.2;
Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.
 1.11 05-Dec-2019  ryo fix build error by my previous commit
 1.10 05-Dec-2019  ryo MAX_CACHE_LEVEL instances of struct aarch64_cache_info must be statically allocated for cpu0.

avoid "cpu0: L2 512KB/64B 16-way write-back read-allocate write-allocate PIPT *UNK* cache" by r1.8
 1.9 02-Dec-2019  ad Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
 1.8 22-Nov-2019  mlelstv Make cache operations available early.
 1.7 01-Oct-2019  chs in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP
and remove code to handle failures that can no longer happen.
 1.6 12-Sep-2019  jmcneill Do not attempt to change coherency_unit at runtime. Instead, if the
required coherency unit is greater than COHERENCY_UNIT in a MULTIPROCESSOR
kernel, just panic instead.

This makes non-MULTIPROCESSOR kernels work again.
 1.5 21-Dec-2018  ryo branches: 1.5.4;
- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)
 1.4 29-Aug-2018  ryo Update coherency_unit if needed.

Pointed out by skrll@
 1.3 26-Aug-2018  ryo add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.2 17-Jul-2018  christos add default statement
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.3 21-Apr-2020  martin Sync with HEAD
 1.1.4.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.5 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file cpufunc.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.5.4.2 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1170):

sys/arch/aarch64/aarch64/cpufunc.c: revision 1.22 (patch)
sys/arch/aarch64/aarch64/cpufunc.c: revision 1.23 (patch)
sys/arch/aarch64/aarch64/pmap.c: revision 1.81

Set uvmexp.ncolors appropriately, which is required for some CPU
models with VIPT icache.

Otherwise, alias in virtual address results in inconsistent results,
at least for applications that rewrite text of other process, e.g.,
GDB for arm32.

Also, this hopefully fixes other unexpected failures due to alias.
Confirmed that there's no observable regression in performance;
difference in ``time make -j8'' for GENERIC64 kernel on BCM2837
with and without setting uvmexp.ncolors is within 0.1%.

Thanks to ryo@ for discussion.


Fix uvmexp.ncolors for some big.LITTLE configuration; it is uncertain
which CPU is used as primary, and as a result, secondary CPUs can
require larger number of colors.

In order to solve this problem, update uvmexp.ncolors via
uvm_page_recolor(9) when secondary CPUs are attached, as done for
other ports like x86.

Pointed out by jmcneill@, and discussed on port-arm@:
http://mail-index.netbsd.org/port-arm/2020/07/03/msg006837.html
Tested and OK'd by ryo@.

Fix previous; add missing <uvm/uvm.h> include.
 1.5.4.1 22-Sep-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #221):

sys/arch/aarch64/aarch64/cpufunc.c: revision 1.6

Do not attempt to change coherency_unit at runtime. Instead, if the
required coherency unit is greater than COHERENCY_UNIT in a MULTIPROCESSOR
kernel, just panic instead.

This makes non-MULTIPROCESSOR kernels work again.
 1.12.2.1 17-Jan-2020  ad Sync with head.
 1.16.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.26.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.8 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.7 19-Jul-2020  ryo branches: 1.7.2;
fix build error with LLVM.
 1.6 01-Jul-2020  ryo Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
 1.5 01-Jun-2020  ryo fixed to work correctly even if the line sizes of Icache and Dcache are different.

- MAX(IcacheShift,DcacheShift) is wrong and should be MIN(IcacheShift,DcacheShift).
Dcache and Icache are now done in independent loops instead of in the same loop.
- simplify the handling of cache_handle_range() macro arguments.
- cache_handle_range macro doesn't include "ret" anymore.
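
Why each cache needs its own step can be shown with a C sketch of a per-line range walk. The real code is an assembler macro; for_each_line() and its callback type are hypothetical stand-ins. Walking a range with a step larger than a cache's actual line size skips lines of that cache, so each cache gets its own loop with its own shift.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-line operation, e.g. a clean or an invalidate. */
typedef void (*line_op_t)(uintptr_t line_va, void *arg);

/*
 * Apply op to every cache line overlapping [va, va + len), where the
 * line size is 1 << shift bytes. Returns the number of lines visited.
 * op may be NULL to only count.
 */
static size_t
for_each_line(uintptr_t va, size_t len, unsigned shift, line_op_t op, void *arg)
{
	const uintptr_t line = (uintptr_t)1 << shift;
	const uintptr_t end = va + len;
	uintptr_t addr;
	size_t visited = 0;

	if (len == 0)
		return 0;
	for (addr = va & ~(line - 1); addr < end; addr += line) {
		if (op != NULL)
			op(addr, arg);
		visited++;
	}
	return visited;
}
```

For the range 0x1010..0x1110, a 64-byte line size visits five lines, while a 128-byte step visits only three addresses; using the larger step for a cache with 64-byte lines would leave two of its lines untouched.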
 1.4 12-Sep-2019  ryo isb is required after a tlbi op even with "no options MULTIPROCESSOR". since it should be harmless, dsb is also added.
fixed a problem where rockpro64 doesn't boot without MULTIPROCESSOR.
 1.3 21-Dec-2018  ryo branches: 1.3.4;
- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)
 1.2 23-Jul-2018  ryo * fix icache invalidations.
* "ic ivau" (aarch64_icache_sync_range) with VA generates permission fault in some situations, therefore use KSEG address for now.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.4 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file cpufunc_asm_armv8.S was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.3.4.1 22-Sep-2019  martin Pull up following revision(s) (requested by ryo in ticket #213):

sys/arch/aarch64/aarch64/cpufunc_asm_armv8.S: revision 1.4

isb is required after a tlbi op even with "no options MULTIPROCESSOR". since it should be harmless, dsb is also added.
fixed a problem where rockpro64 doesn't boot without MULTIPROCESSOR.
 1.7.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.41 01-Mar-2023  riastradh aarch64: Optimization: Omit needless membar when triggering softint.

When we are triggering a softint, it can't already hold any mutexes.
So any path to mutex_exit(mtx) must go via mutex_enter(mtx), which is
always done with atomic r/m/w, and we need not issue any explicit
barrier between ci->ci_curlwp = softlwp and a potential load of
mtx->mtx_owner in mutex_exit.

PR kern/57240

XXX pullup-9
XXX pullup-10
 1.40 23-Feb-2023  riastradh aarch64: Add missing barriers in cpu_switchto.

Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

XXX pullup-9
XXX pullup-10
 1.39 19-Sep-2022  ryo branches: 1.39.4;
Move cpu_Debugger() into a more suitable file, from cpuswitch.S to db_interface.c.
 1.38 07-Jun-2022  ryo On aarch64, ddb backtrace can be performed without framepointer by specifying
the /s modifier to the ddb trace command (trace/s, bt/s).
The default is trace with framepointer (same as before).

This allows backtracing even on kernels compiled with -fomit-frame-pointer.
 1.37 07-Jun-2022  ryo use stp if possible.
 1.36 03-Jun-2022  ryo optimize. reduce 2 instructions.
 1.35 31-May-2022  ryo make a frame pointer to show a backtrace correctly.
 1.34 06-May-2022  ryo Sprinkle isb after modifying system regs of pointer auth.
With options ARMV83_PAC, it now works on native Mac M1.

TODO: Multiple ISBs should be combined in one place.
 1.33 09-Mar-2021  ryo Add support for hardware breakpoints and watchpoints again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.32 26-Dec-2020  jmcneill Always issue isb after cpacr_el1 writes since it is a context-changing
operation.
 1.31 22-Oct-2020  skrll branches: 1.31.2;
Use the correct (more relaxed) membar_exit barrier in cpu_switchto_softint
 1.30 13-Oct-2020  skrll Use correct membar_exit barrier
 1.29 06-Oct-2020  skrll move #include "opt_compat_netbsd32.h" to where it's required
 1.28 30-Sep-2020  skrll Move el[01]_trap_exit into vectors.S where the callers exist
 1.27 26-Sep-2020  skrll Use 'lr' instead of 'x30' in an instruction for clarity
 1.26 26-Sep-2020  skrll Fix a comment
 1.25 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.24 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.23 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as being caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.22 23-Jul-2020  skrll Reduce the window of having interrupts disabled in cpu_switchto{,_softint}
and ensure astpending is checked with interrupts disabled.
 1.21 23-May-2020  ryo Not only the kernel thread keys, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
Userland programs are able to perform pointer authentication on ARMv8.3+PAC CPUs.

reviewed by maxv@, thanks.
 1.20 22-May-2020  ryo fix to do backtrace properly for running LWPs and cpu_lwp_fork().
when dumping pcb_tf, only the switchframe part is now displayed instead of the whole trapframe.
 1.19 15-May-2020  ryo use ldp if possible
 1.18 13-Apr-2020  maxv Meant to do a store here, not a load. Ie we want to replace the initial
weak key by the stronger one we just generated.

Rototilled this place too many times.
 1.17 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.16 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.15 08-Jan-2020  skrll branches: 1.15.4;
oldlwp is always non-NULL in cpu_switchto so remove the test for NULL.
 1.14 08-Jan-2020  ad Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.
 1.13 20-Dec-2019  ryo branches: 1.13.2;
Add a speculation barrier after the 'eret'.

Some aarch64 cpus speculatively execute instructions after 'eret',
and this potentiates side-channel attack.

from
https://github.com/torvalds/linux/commit/679db70801da9fda91d26caf13bf5b5ccc74e8e8
 1.12 15-Sep-2019  skrll Trailing whitespace
 1.11 27-Dec-2018  mrg branches: 1.11.4;
make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.10 13-Dec-2018  ryo add support PT_STEP
 1.9 12-Dec-2018  ryo - need to save/restore interrupt mask when entering/exiting to/from cpu_switchto_softint().
- when call dosoftints from cpu_idle, interrupts should be disabled.

on rare occasions, the lwp stack was exhausted under a high interrupt load.
reported by alnsn@. thanks.
 1.8 11-Dec-2018  ryo need to save/restore also x1. x1 is in-use as ipl.
 1.7 07-Dec-2018  ryo modifying curlwp->l_md_ktf, curlwp->l_md_cpacr, and curlwp should be protected by a critical section.
 1.6 08-Nov-2018  maxv Track the stack with kASan on aarch64. Same principle as on amd64. Illegal
accesses occurring there are now detected.

Originally written by me, but reworked by ryo@, thanks.
 1.5 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.4 17-Jul-2018  christos centralize fp,lr definitions
 1.3 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.2 09-Jul-2018  ryo remove unused code
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.4 21-Apr-2020  martin Sync with HEAD
 1.1.4.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.7 18-Jan-2019  pgoyette Synch with HEAD
 1.1.2.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.2.4 20-Oct-2018  pgoyette Sync with head
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file cpuswitch.S was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.11.4.2 31-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1676):

sys/arch/ia64/ia64/vm_machdep.c: revision 1.18
sys/arch/powerpc/powerpc/locore_subr.S: revision 1.67
sys/arch/aarch64/aarch64/locore.S: revision 1.91
sys/arch/mips/include/asm.h: revision 1.74
sys/arch/hppa/include/cpu.h: revision 1.13
sys/arch/arm/arm/armv6_start.S: revision 1.38
sys/arch/evbmips/ingenic/cpu_startup.S: revision 1.2
sys/arch/mips/mips/locore.S: revision 1.229
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.40
sys/arch/alpha/include/asm.h: revision 1.45
sys/arch/sparc64/sparc64/locore.s: revision 1.432
sys/arch/vax/vax/subr.S: revision 1.42
sys/arch/mips/mips/locore_mips3.S: revision 1.116
sys/arch/ia64/ia64/machdep.c: revision 1.44
sys/arch/arm/arm32/cpuswitch.S: revision 1.106
sys/arch/sparc/sparc/locore.s: revision 1.284
(all via patch)

aarch64: Add missing barriers in cpu_switchto.
Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

alpha: Add missing barriers in cpu_switchto.
Details in comments.

arm32: Add missing barriers in cpu_switchto.
Details in comments.

hppa: Add missing barriers in cpu_switchto.
Not sure hppa has ever had working MULTIPROCESSOR, so maybe no
pullups needed?

ia64: Add missing barriers in cpu_switchto.
(ia64 has never really worked, so no pullups needed, right?)

mips: Add missing barriers in cpu_switchto.
Details in comments.

powerpc: Add missing barriers in cpu_switchto.
Details in comments.

sparc: Add missing barriers in cpu_switchto.

sparc64: Add missing barriers in cpu_switchto.
Details in comments.

vax: Note where cpu_switchto needs barriers.

Not sure vax has ever had working MULTIPROCESSOR, though, and I'm not
even sure how to spell store-before-load barriers on VAX, so no
functional change for now.
 1.11.4.1 24-Dec-2019  martin Pull up following revision(s) (requested by ryo in ticket #574):

sys/arch/aarch64/include/asm.h: revision 1.5
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.13

Add a speculation barrier after the 'eret'.

Some aarch64 cpus speculatively execute instructions after 'eret',
and this potentiates side-channel attack.

from
https://github.com/torvalds/linux/commit/679db70801da9fda91d26caf13bf5b5ccc74e8e8
 1.13.2.1 17-Jan-2020  ad Sync with head.
 1.15.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.31.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.31.2.1 03-Jan-2021  thorpej Sync w/ HEAD.
 1.39.4.1 31-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #264):

sys/arch/ia64/ia64/vm_machdep.c: revision 1.18
sys/arch/powerpc/powerpc/locore_subr.S: revision 1.67
sys/arch/aarch64/aarch64/locore.S: revision 1.91
sys/arch/mips/include/asm.h: revision 1.74
sys/arch/hppa/include/cpu.h: revision 1.13
sys/arch/arm/arm/armv6_start.S: revision 1.38
sys/arch/evbmips/ingenic/cpu_startup.S: revision 1.2
sys/arch/mips/mips/locore.S: revision 1.229
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.40
sys/arch/alpha/include/asm.h: revision 1.45
sys/arch/sparc64/sparc64/locore.s: revision 1.432
sys/arch/vax/vax/subr.S: revision 1.42
sys/arch/mips/mips/locore_mips3.S: revision 1.116
sys/arch/riscv/riscv/cpu_switch.S: revision 1.3
sys/arch/ia64/ia64/machdep.c: revision 1.44
sys/arch/arm/arm32/cpuswitch.S: revision 1.106
sys/arch/sparc/sparc/locore.s: revision 1.284

aarch64: Add missing barriers in cpu_switchto.
Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

alpha: Add missing barriers in cpu_switchto.
Details in comments.

arm32: Add missing barriers in cpu_switchto.
Details in comments.

hppa: Add missing barriers in cpu_switchto.
Not sure hppa has ever had working MULTIPROCESSOR, so maybe no
pullups needed?

ia64: Add missing barriers in cpu_switchto.
(ia64 has never really worked, so no pullups needed, right?)

mips: Add missing barriers in cpu_switchto.
Details in comments.

powerpc: Add missing barriers in cpu_switchto.
Details in comments.

riscv: Add missing barriers in cpu_switchto.
Details in comments.

sparc: Add missing barriers in cpu_switchto.

sparc64: Add missing barriers in cpu_switchto.
Details in comments.

vax: Note where cpu_switchto needs barriers.

Not sure vax has ever had working MULTIPROCESSOR, though, and I'm not
even sure how to spell store-before-load barriers on VAX, so no
functional change for now.
 1.12 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.11 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.10 09-Jul-2020  ryo branches: 1.10.2;
fix build error of /usr/sbin/crash

pointed out by rjs@, thanks.
 1.9 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strrdisasm() instead of the caller.
correctly disassemble by processor state if defined DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT.
 1.8 08-Jul-2020  ryo don't read memory directly.
In particular, userland memory may be unmapped at the time of reading.
 1.7 28-Oct-2019  joerg Format string annotation for strdisasm_printf
 1.6 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.5 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.4 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.3 17-Jul-2018  ryo use panic() instead of some printf to show fault status.
useful for ddb "show panic" command.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.4 20-Oct-2018  pgoyette Sync with head
 1.1.28.3 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file db_disasm.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.24 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.23 02-Aug-2023  skrll Relax the TLB invalidation from full to by va for writing to kernel text
in db_write_text.
 1.22 02-Nov-2022  skrll Restore a '\n' I accidentally removed in 1.16
 1.21 23-Oct-2022  skrll KNF
 1.20 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.19 19-Sep-2022  ryo Move cpu_Debugger() into a more suitable file, from cpuswitch.S to db_interface.c.
 1.18 29-May-2022  ryo Use the PAR register to check for accessibility in db_(read|write)_bytes().

db_(read|write)_bytes() uses the TTBR[01] at that time, so it must check
if it is accessible in context at that time, not pmap_extract()
which uses the struct pmap of the process.

- It also checks if the address is writable.
- db_write_bytes() also requires ARMV81_PAN control.
 1.17 26-May-2022  ryo In ddb, fixed "trace/u" and user process memory read/write to work correctly.

In the softint context, curlwp points the kernel lwp, so to get the pmap
of a user process, we had to use curcpu()->ci_onproc->l_proc instead of
curproc (curlwp->l_proc). Advised by ad@.
 1.16 19-May-2021  skrll Make even more pmap agnostic
 1.15 19-May-2021  skrll Reduce characters to print in db_pte_print and unwrap some short lines.
 1.14 03-May-2021  skrll branches: 1.14.2;
Remove unnecessary brackets. Same binary before and after.
 1.13 30-Apr-2021  skrll Make the ddb for pmap / pte information pmap agnostic
 1.12 05-Feb-2021  joerg branches: 1.12.4;
Avoid duplicate definition of ddb_regs in crash(8).
 1.11 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.10 14-Sep-2020  ryo branches: 1.10.2;
sprinkle LE32TOH to fetch instructions on aarch64eb
 1.9 11-Aug-2020  skrll Improve a comment
 1.8 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.7 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.6 15-Sep-2018  ryo fixed to keep PROT_EXECUTE when writing the page/block.
this is required when the L2 block to which the target address belongs
and the L2 block to which this function itself belongs are the same.
 1.5 06-Aug-2018  ryo set kernel text/rodata readonly by default.
add function db_write_text() for setting ddb breakpoint.
 1.4 03-Jun-2018  christos branches: 1.4.2;
PR/53338: David Binderman: Widen shift to the LHS type.
 1.3 31-May-2018  ryo implement properly branch_taken() and inst_unconditional_flow_transfer().
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.4 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.2 25-Jun-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file db_interface.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.10.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.12.4.2 17-Jun-2021  thorpej Sync w/ HEAD.
 1.12.4.1 13-May-2021  thorpej Sync with HEAD.
 1.14.2.1 31-May-2021  cjep sync with head
 1.45 26-Oct-2022  riastradh ddb/db_active.h: New home for extern db_active.

This can be included unconditionally, and db_active can then be
queried unconditionally; if DDB is not in the kernel, then db_active
is a constant zero. Reduces need for #include opt_ddb.h, #ifdef DDB.
 1.44 29-May-2022  ryo - Display "cpu[<CPUINDEX>]" instead of "cpu[<CPUID>]".
- Also add cpu_info->ci_onproc to display.
 1.43 02-May-2022  skrll Only print the appropriate PAR fields for PAR.F={0,1}

Group the fields in the header.
 1.42 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.41 17-Oct-2021  ryo When booted with "boot netbsd -1" (disable multiprocessor boot),
"ddb>continue" didn't work when ddb was started by breakpoint trap.
 1.40 30-Apr-2021  skrll Make the ddb for pmap / pte information pmap agnostic
 1.39 11-Mar-2021  ryo branches: 1.39.4;
Numeric modifiers conflict with the syntax interpretation of ddb, so use 'b', 'w', 'l', 'q' instead.
Also, change load/store('l','s') to 'r','w' like the other arch.

>db{0}> machine watch/1 hostname
>Bad modifier

>db{0}> machine watch/s1 hostname
>add watchpoint 0 as ffffc00001087848
 1.38 11-Mar-2021  ryo - fixed a problem where hardware {break,watch}points other than #0 could not be cleared
- hardware {break,watch}point addresses are now strictly checked
 1.37 09-Mar-2021  ryo Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.36 09-Mar-2021  ryo "machine cpu" command shows pc of trapframe and the symbol
 1.35 09-Mar-2021  ryo match the macro name to the order of the arguments. NFC.
 1.34 23-Feb-2021  mrg introduce DDB_END_CMD and replace more than 20 copies of the same
list of NULLs and 0. idea from rillig@.

all touched ports built, several booted.
 1.33 05-Feb-2021  joerg Avoid duplicate definition of ddb_regs in crash(8).
 1.32 18-Jan-2021  rin Fix build as crash(8); Protect db_md_meminfo_cmd() by defined(_KERNEL).
 1.31 17-Jan-2021  mrg add a command to dump the bootconfig passed meminfo.
 1.30 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.29 03-Dec-2020  skrll Provide and use a sev() macro for the sev instruction.

While here use the correct barrier to ensure completion of memory accesses
before a couple of the sev() calls.
 1.28 22-Oct-2020  skrll branches: 1.28.2;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.27 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.26 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
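
The ordering requirement described in the last item can be sketched in portable C11 atomics. This is only an illustration of a store followed by a full fence, so that the store is ordered before any later load; it is not the assembly used in cpuswitch.S. The names `struct cpu_sketch`, `struct lwp_sketch`, and `set_curlwp()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct lwp and struct cpu_info. */
struct lwp_sketch {
	int id;
};

struct cpu_sketch {
	_Atomic(struct lwp_sketch *) ci_curlwp;
};

/*
 * Publish the new current LWP, then issue a full fence so the store
 * is ordered before any load the resuming LWP performs afterwards
 * (for example, the loads done while releasing a mutex).
 */
static void
set_curlwp(struct cpu_sketch *ci, struct lwp_sketch *l)
{
	assert(ci != NULL);
	atomic_store_explicit(&ci->ci_curlwp, l, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}
```

On AArch64 a compiler would typically lower the sequentially consistent fence to a `dmb ish`; whether that matches the instruction chosen in the commit is not stated in the log.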
 1.25 02-Jul-2020  jmcneill Add ddb "mach reset" command for Arm ports.
 1.24 22-May-2020  ryo fix to do backtrace properly for running LWPs and cpu_lwp_fork().
when dumping pcb_tf, only the switchframe part is now displayed instead of the whole trapframe.
 1.23 22-May-2020  ryo instead of reading memory directly, db_read_bytes() is used to avoid faults in ddb.
 1.22 13-May-2020  chs for "mach cpuinfo", print ci_biglock_count too.
 1.21 16-Apr-2020  ryo add the case of kdb_trap(-1) called from pic_ipi_ddb().
it depended on the update timing of 'db_recover'.
 1.20 29-Feb-2020  ryo branches: 1.20.4;
use macro
 1.19 07-Sep-2019  ryo prevent switching to CPUs that are not responding to IPI_DDB.
 1.18 07-Sep-2019  ryo add "machine cpuinfo/a" to show cpuinfo[] of all cpus
 1.17 11-Aug-2019  skrll Align output from db_md_cpuinfo_cmd
 1.16 20-Mar-2019  ryo - add reg_{s1e0r,s1e0w,s1e1r,s1e1w}_write() macro.
- show the result of AT insn at ddb "machine pte" command.
 1.15 19-Mar-2019  ryo - add ddb command "machine ttbr" to dump MMU tables.
- tidy up descriptions, usages and messages.
 1.14 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.13 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.12 13-Dec-2018  ryo add support PT_STEP
 1.11 28-Nov-2018  ryo Comment out implementation specific registers to avoid illegal instruction trap on ThunderX
 1.10 28-Nov-2018  ryo don't pass illegal cpu index to cpu_lookup(). it may cause KASSERT.
 1.9 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.8 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.7 14-Aug-2018  ryo no need machine local ddb command pmaphist any more.
 1.6 11-Aug-2018  ryo use DDB_EXPR_FMT. fix typo.
 1.5 17-Jul-2018  christos add missing casts; use PRI?64 instead of ll?
 1.4 09-Jul-2018  ryo fix compile error
 1.3 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.7 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.5 20-Oct-2018  pgoyette Sync with head
 1.1.28.4 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file db_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.4 21-Apr-2020  martin Sync with HEAD
 1.2.2.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.2.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.20.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.28.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.28.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.39.4.1 13-May-2021  thorpej Sync with HEAD.
 1.25 06-Sep-2025  skrll KNF. (sort #includes)
 1.24 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.23 22-Sep-2022  ryo oops, my previous commit is bad. revert previous.
<frame-address> is a frame pointer, not a trapframe, and it worked correctly. (e.g., trace $x29)
 1.22 22-Sep-2022  ryo Specifying the frame address "trace <frame-address>" was not working.
 1.21 22-Sep-2022  ryo If there was a "bl <func>" instruction at the end of a function block,
the stack analysis backtrace (bt/s) would fail because $lr would point
to the beginning of the next function.
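
The failure mode above can be illustrated with a small symbol lookup. The sketch assumes a table of function start addresses sorted in ascending order and resolves the caller from `lr - 4` (the address of the `bl` itself) rather than from `lr`. This is one common way to handle the case and not necessarily what the commit does; `struct func_sym`, `lookup_func()`, and `caller_of()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry: function start address and name. */
struct func_sym {
	uint64_t start;
	const char *name;
};

/* Return the name of the last function whose start is <= addr. */
static const char *
lookup_func(const struct func_sym *syms, size_t n, uint64_t addr)
{
	const char *name = NULL;

	for (size_t i = 0; i < n && syms[i].start <= addr; i++)
		name = syms[i].name;
	return name;
}

/*
 * lr holds the address after the "bl". If the "bl" is the last
 * instruction of a function, lr is the start of the next function,
 * so look up the address of the "bl" itself instead.
 */
static const char *
caller_of(const struct func_sym *syms, size_t n, uint64_t lr)
{
	assert(lr >= 4);
	return lookup_func(syms, n, lr - 4);
}
```

With `foo` at 0x1000 ending in a `bl` at 0x101c and `bar` starting at 0x1020, a lookup on `lr` (0x1020) resolves to `bar`, while `caller_of()` resolves to `foo`.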
 1.20 19-Sep-2022  ryo Fixed stack analyzing backtrace (bt/s) correctly for nested trapframes.
 1.19 07-Jun-2022  ryo Functionalize frame pointer backtrace.
 1.18 07-Jun-2022  ryo On aarch64, ddb backtrace can be performed without framepointer by specifying
the /s modifier to the ddb trace command (trace/s, bt/s).
The default is trace with framepointer (same as before).

This allows backtracing even on kernels compiled with -fomit-frame-pointer.
 1.17 02-Jun-2022  ryo tidy up backtrace from crash(8) on aarch64

- fix to dump trapframe when backtracing from crash(8).
- use db_read_bytes() when reading kernel memory.
 1.16 29-May-2022  ryo Display the trap type of trapframe when backtracing.
 1.15 29-May-2022  ryo Simplified termination conditions for ddb backtrace.

Exit backtrace when the user trapframe is invalid. (Mainly in kernel threads).
 1.14 27-Nov-2021  riastradh aarch64: Fix stack traces from jump-to-null.
 1.13 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.12 27-Jun-2020  rin branches: 1.12.2;
Fix build failure due to -Werror=stack-usage.

Use db_read_bytes() against particular member of structure in use,
by which we can avoid to have whole structure in the stack.

Now, stack usage shrinks: 4240 --> 1296
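
The idea of reading a single member instead of copying the whole structure can be sketched as follows. `read_bytes()` is a hypothetical stand-in for db_read_bytes() (here a plain memcpy, with the argument order chosen for this sketch), and `struct big_frame` is an invented layout used only to make the stack saving visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Invented structure, large enough that a stack copy would be costly. */
struct big_frame {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint8_t pad[4096];
};

/* Hypothetical stand-in for a safe debugger memory reader. */
static void
read_bytes(const void *addr, size_t size, void *buf)
{
	memcpy(buf, addr, size);
}

/* Read only the pc member: 8 bytes of stack instead of sizeof(struct). */
static uint64_t
read_frame_pc(const struct big_frame *remote)
{
	uint64_t pc;

	assert(remote != NULL);
	read_bytes((const char *)remote + offsetof(struct big_frame, pc),
	    sizeof(pc), &pc);
	return pc;
}
```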
 1.11 22-May-2020  ryo fix to do backtrace properly for running LWPs and cpu_lwp_fork().
when dumping pcb_tf, only the switchframe part is now displayed instead of the whole trapframe.
 1.10 13-May-2020  ryo - move aarch64 addressspace macros from pmap.h to cpufunc.h
- rename ptr_strip_pac() to aarch64_strip_pac()
 1.9 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
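
Stripping the authentication code from a saved lr, as the unwinder change above describes, can be sketched in plain C. The sketch assumes a 48-bit virtual address space and that bit 55 distinguishes kernel (upper half) from user (lower half) addresses; both are assumptions for illustration, and `strip_pac_sketch()` is a hypothetical name, not the kernel's helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses; the PAC lives in the bits above. */
#define VA_BITS	48

static_assert(VA_BITS > 0 && VA_BITS <= 55, "bit 55 must be outside the VA");

/*
 * Replace the PAC bits with the canonical sign extension:
 * all ones for upper-half (kernel) addresses, all zeros otherwise.
 */
static uint64_t
strip_pac_sketch(uint64_t ptr)
{
	const uint64_t va_mask = (UINT64_C(1) << VA_BITS) - 1;

	if (ptr & (UINT64_C(1) << 55))
		return ptr | ~va_mask;
	return ptr & va_mask;
}
```

A real implementation could instead use the `xpaci`/`xpaclri` instructions where the CPU supports them.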
 1.8 27-Jan-2019  pgoyette branches: 1.8.10;
Merge the [pgoyette-compat] branch
 1.7 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.6 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.5 15-Sep-2018  jakllsch aarch64/db_trace.c: annotate w/ __printflike; fix discovered problems
 1.4 30-Jul-2018  ryo don't depend on clang code to backtrace. keep trapframe as framepointer if DDB.
 1.3 17-Jul-2018  christos add missing casts
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.5 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.4 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file db_trace.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 21-Apr-2020  martin Sync with HEAD
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.8.10.1 20-Apr-2020  bouyer Sync with HEAD
 1.12.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.16 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.15 23-Feb-2021  ryo adjust tab. NFC
 1.14 23-Feb-2021  ryo fix wrong target register size of "ldrsh"

"ldrsh Xt, [Xn, Xm]" was being output as "ldrsh Wt, [Xn, Xm]"
 1.13 23-Feb-2021  ryo make more system registers disassemblable
 1.12 23-Feb-2021  ryo The immediate offset of "ldtrb", "ldtrh", "sttrb", and "sttrh" was always output as unsigned.
Correctly, it is 9bit signed.
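The signed-offset fix above comes down to sign-extending a 9-bit field before printing it. A minimal sketch, assuming the imm9 field sits in bits [20:12] of the instruction word as in the A64 unprivileged load/store encodings; sign_extend() and insn_imm9() are hypothetical names for illustration, not the functions used in disasm.c:

```c
#include <stdint.h>

/*
 * Sign-extend the low `bits` bits of `value` (1 <= bits <= 63).
 * The XOR/subtract trick flips the sign bit and then removes its weight,
 * which turns an unsigned field into its two's complement value.
 */
int64_t
sign_extend(uint64_t value, unsigned bits)
{
	const uint64_t sign = (uint64_t)1 << (bits - 1);

	value &= (sign << 1) - 1;	/* keep only the field itself */
	return (int64_t)((value ^ sign) - sign);
}

/* Extract imm9 (assumed to be bits [20:12]) as a signed offset. */
int64_t
insn_imm9(uint32_t insn)
{
	return sign_extend((insn >> 12) & 0x1ff, 9);
}
```

With this, a field value of 0x1ff prints as -1 rather than 511, which is the difference the commit describes.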
 1.11 23-Feb-2021  ryo The register operand size for "smnegl" and "smsubl" was wrong.
not "smsubl Xd, Xn, Xm, Xa", but "smsubl Xd, Wn, Wm, Xa".
 1.10 05-Sep-2020  jakllsch branches: 1.10.2;
AArch64 instructions are always LE: swap if we're BE
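Because A64 instruction words are stored little-endian regardless of data endianness, a disassembler has to normalize the word before decoding it. The commit swaps the fetched word on big-endian kernels; the sketch below uses a different but equivalent technique, assembling the word byte by byte so the result is the same on any host. fetch_insn_le() is a hypothetical helper, not a function from the NetBSD tree:

```c
#include <stdint.h>

/*
 * Build a 32-bit instruction word from four bytes stored in
 * little-endian order. The result does not depend on host endianness.
 */
uint32_t
fetch_insn_le(const uint8_t *p)
{
	return (uint32_t)p[0] |
	    ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[3] << 24);
}
```

For example, the bytes 1f 20 03 d5 in memory decode to 0xd503201f, the A64 "nop" encoding.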
 1.9 03-Aug-2020  ryo make more ARMv8.x system registers disassemblable
 1.8 26-May-2020  ryo disassemblable bti insns
 1.7 25-May-2020  ryo disassemblable pointer authentication insns
 1.6 04-Oct-2018  ryo disassemblable sha512 insns
 1.5 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.4 28-Jul-2018  ryo add support for disasm pmull,aes,sha insns. most SIMD insns are not yet.
 1.3 17-Jul-2018  ryo use panic() instead of some printf to show fault status.
useful for ddb "show panic" command.
 1.2 14-Jun-2018  ryo branches: 1.2.2;
Widen shift to the LHS type.
same as aarch64/db_interface.c r1.4, PR/53338.
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.7 20-Oct-2018  pgoyette Sync with head
 1.1.2.6 30-Sep-2018  pgoyette Sync with HEAD
 1.1.2.5 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.4 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.3 25-Jun-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file disasm.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.2 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file disasm.h was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.14 10-Jul-2023  rin Factor out some fdt(4) features from {,evb}arm into dev/fdt.

Now, FDT-based support to efirt, initrd, rndseed, and efirng can be
used from, e.g., riscv.

Mostly from Nick Hudson.

XXX
As Nick comments, there can be some optimizations for fdt_map_range().
efiboot may also be modified to load these objects into aligned PAs.
 1.13 03-May-2022  skrll Style. NFCI.
 1.12 27-Apr-2022  ryo since pmap_activate_efirt() rewrites TTBR0, it is necessary to pmap_activate() again after pmap_deactivate_efirt() to restore the original TTBR0.

- Fix to do pmap_{de,}activate() before/after pmap_{,de}activate_efirt().
- moved kpreempt_{disable,enable}() to the caller since everything between
arm_efirt_md_enter() and arm_efirt_md_exit() should be kpreempt disabled.

ok skrll@
 1.11 02-Apr-2022  skrll Update to support EFI runtime outside the kernel virtual address space
by creating an EFI RT pmap that can be activated / deactivated when
required.

Adds support for EFI RT to ARM_MMU_EXTENDED (ASID) 32-bit Arm machines.

On Arm64 the usage of pmapboot_enter is reduced and the mappings are
created much later in the boot process -- now in cpu_startup_hook.
Backward compatibility for KVA mapped RT from old bootaa64.efi is
maintained.

Adding support to other platforms should be easier as a result.
 1.10 21-Mar-2021  skrll Remove the unnecessary invalidation code in arm_efirt_md_map_range.

pmapboot_enter will panic if any overlapping mappings existed before and
a full TLB invalidate was done as part of turning the MMU on in locore.
 1.9 20-Mar-2021  skrll Don't mark EFI runtime pages LX_BLKPAG_OS_READ | LX_BLKPAG_OS_WRITE as
these bits are only used by the current pmap fault code and these are
wired pages which will never fault.
 1.8 22-Oct-2020  skrll branches: 1.8.2; 1.8.4;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.7 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.6 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.5 16-Dec-2019  jmcneill Enable FP access for EFI RT and improve error handling.
 1.4 12-Aug-2019  skrll Trailing whitespace
 1.3 31-Oct-2018  jmcneill branches: 1.3.2; 1.3.6; 1.3.8;
EFI runtime code section needs to be writable, otherwise we fail with a permission fault at shutdown on QEMU when writing to the RTC
 1.2 31-Oct-2018  jmcneill Setup mappings for EFI runtime mmio ranges.
 1.1 28-Oct-2018  jmcneill Add support for EFI runtime services on aarch64.
 1.3.8.1 17-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #558):

sys/arch/aarch64/aarch64/efi_machdep.c: revision 1.5
sys/arch/arm/arm/efi_runtime.h: revision 1.3
sys/arch/arm/arm/efi_runtime.c: revision 1.3

Enable FP access for EFI RT and improve error handling.
 1.3.6.4 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.3.6.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.6.2 10-Jun-2019  christos Sync with HEAD
 1.3.6.1 31-Oct-2018  christos file efi_machdep.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.3.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.3.2.1 31-Oct-2018  pgoyette file efi_machdep.c was added on branch pgoyette-compat on 2018-11-26 01:52:16 +0000
 1.8.4.1 03-Apr-2021  thorpej Sync with HEAD.
 1.8.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file exception.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.10 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.9 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.8 13-Oct-2020  rin branches: 1.8.2;
BE32 binaries are no longer supported for ARMv7 and later, and
therefore for aarch64eb.

Reject them with ENOEXEC, rather than causing illegal instruction
exceptions due to unexpected binary format.
 1.7 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
userland programs are able to perform pointer authentication on ARMv8.3+PAC cpu.

reviewed by maxv@, thanks.
 1.6 24-Nov-2019  rin part of PR port-arm/54702

Make sure that md_march32 and ep_machine_arch have same size.

XXX
pullup to netbsd-9
 1.5 24-Nov-2019  rin PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.4 28-Nov-2018  ryo don't exec 32bit binary on the cpu that has no aarch32.
 1.3 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.3 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.2 20-Oct-2018  pgoyette Sync with head
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file exec_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.8.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.26 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.25 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.24 11-May-2022  andvar fix various typos in comments.
 1.23 30-Apr-2022  skrll whitespace
 1.22 31-Jan-2022  ryo add support for hardware updates to Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
 1.21 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.20 15-Oct-2020  rin branches: 1.20.2;
For cpu_jump_onfault() in data_abort_handler(), stop returning
hard-coded EFAULT and use return value from uvm_fault() instead.

There are some paths that do not call uvm_fault():

(1) For fatalabort case, use EFAULT as before.
(2) When va range is invalid, use EFAULT instead of EINVAL.

These changes fix bytes_transfer_eof_* tests in
sys/lib/libc/sys/t_ptrace_wait*.

Note that without (2) above, some tests like
sys/lib/libc/sys/t_wait:write_error become newly failing.

I've confirmed that there's no new regression in full ATF run.

OK ryo
 1.19 09-Aug-2020  skrll Don't use %s in UVMHIST_PRINT. Remove an unnecessary #ifdef UVMHIST while
I'm here
 1.18 06-Aug-2020  ryo No need to recover from fault from within a hardware interrupt handler.
 1.17 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.16 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.15 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.14 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strrdisasm() instead of the caller.
correctly disassemble by processor state if defined DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT.
 1.13 13-May-2020  ryo - move aarch64 addressspace macros from pmap.h to cpufunc.h
- rename ptr_strip_pac() to aarch64_strip_pac()
 1.12 29-Feb-2020  ryo Fix pmap to work correctly with tagged addresses

- when fault, untag from address before passing to uvm/pmap functions
- pmap_extract() checks more strictly and consider the address tag
 1.11 09-Jan-2020  ryo fix behaviour of mmap()/mprotect() when passed only PROT_EXEC.

when mmap()/mprotect() with only PROT_EXEC, syscall will be successful,
but the page actually hadn't been mapped.
it should be mapped with PROT_READ|PROT_EXEC implicitly. (r-x)
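The rule described above, that an execute-only request is mapped as r-x, can be sketched as a small normalization step. This is only an illustration of the stated behaviour under the assumption that PROT_EXEC always implies PROT_READ; prot_effective() is a hypothetical helper, and the real change lives in fault.c and pmap.c:

```c
#include <sys/mman.h>

/*
 * Return the protection that is actually applied to a mapping:
 * a request that includes PROT_EXEC is given PROT_READ as well,
 * so PROT_EXEC alone becomes PROT_READ|PROT_EXEC (r-x).
 */
int
prot_effective(int prot)
{
	if (prot & PROT_EXEC)
		prot |= PROT_READ;
	return prot;
}
```

Requests without PROT_EXEC pass through unchanged, so PROT_NONE mappings stay inaccessible.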
 1.10 10-Jun-2019  ryo branches: 1.10.2; 1.10.4;
uvm_fault() fails when a cache maintenance instruction (e.g., "dc cvau",
which reports ESR.WnR=1 = write access) on a read-only page causes a data abort trap,
so the access type should be treated as read access in a data abort with
cache operation (ESR_ISS_DATAABORT_CM=1).

pointed out and tested by mrg@. thanks
 1.9 06-Apr-2019  thorpej Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
 1.8 26-Mar-2019  mlelstv Switch discriminates between fsc values and should check the masked fsc value,
not the whole register.
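The bug class fixed here is switching on the whole syndrome value when only the fault status code matters. A minimal sketch, assuming the fault status code occupies ISS bits [5:0] of the ESR and using the architectural codes for levels 1-3; the macro, enum, and function names are hypothetical and are not the identifiers used in fault.c:

```c
#include <stdint.h>

#define TOY_ESR_FSC_MASK	0x3fu	/* hypothetical name for ISS[5:0] */

enum fault_kind {
	FAULT_TRANSLATION,
	FAULT_ACCESS_FLAG,
	FAULT_PERMISSION,
	FAULT_OTHER
};

/* Classify a fault by its masked status code, ignoring other ESR bits. */
enum fault_kind
classify_fault(uint32_t esr)
{
	const uint32_t fsc = esr & TOY_ESR_FSC_MASK;

	switch (fsc) {
	case 0x04: case 0x05: case 0x06: case 0x07:
		return FAULT_TRANSLATION;	/* translation fault, level 0-3 */
	case 0x09: case 0x0a: case 0x0b:
		return FAULT_ACCESS_FLAG;	/* access flag fault, level 1-3 */
	case 0x0d: case 0x0e: case 0x0f:
		return FAULT_PERMISSION;	/* permission fault, level 1-3 */
	default:
		return FAULT_OTHER;
	}
}
```

Without the mask, a value such as 0x9200004f (write bit set, status code 0x0f) would never match a case label and would fall through to the default.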
 1.7 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.6 21-Jul-2018  ryo return correct signo and code in accordance with return value of uvm_fault.
 1.5 20-Jul-2018  ryo avoid double-fault caused by reading the instruction when panic
 1.4 19-Jul-2018  christos Implement TRAP_SIGDEBUG for aarch64...
ptraced programs die with:
data_abort_handler, 257: pid 199.1 (a.out): signal 11 (trap 0x82000006) @pc 0, addr 0x0, error=Instruction Abort (EL0)
 1.3 17-Jul-2018  ryo use panic() instead of some printf to show fault status.
useful for ddb "show panic" command.
 1.2 17-Jul-2018  christos fix uninitialized, add missing casts
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.4 20-Oct-2018  pgoyette Sync with head
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file fault.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.10.4.1 17-Jan-2020  ad Sync with head.
 1.10.2.1 21-Jan-2020  martin Pull up following revision(s) (requested by ryo in ticket #618):

sys/arch/aarch64/aarch64/fault.c: revision 1.11
sys/arch/aarch64/aarch64/pmap.c: revision 1.61

fix behaviour of mmap()/mprotect() when passed only PROT_EXEC.
when mmap()/mprotect() with only PROT_EXEC, syscall will be successful,
but the page actually hadn't been mapped.
it should be mapped with PROT_READ|PROT_EXEC implicitly. (r-x)
 1.20.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.13 20-Aug-2022  riastradh fpu_kern_enter/leave: Disable IPL assertions.

These don't work because mutex_enter/exit on a spin lock may raise an
IPL but not lower it, if another spin lock was already held. For
example,

mutex_enter(some_lock_at_IPL_VM);
printf("foo\n");
fpu_kern_enter();
...
fpu_kern_leave();
mutex_exit(some_lock_at_IPL_VM);

will trigger the panic, because printf takes a lock at IPL_HIGH where
the IPL will remain until the mutex_exit. (This was a nightmare to
track down before I remembered that detail of spin lock IPL
semantics...)
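The spin lock behaviour described above can be modelled in a few lines. This is a toy model, not kernel code: it assumes that taking a spin lock raises the IPL to at least the lock's level and that the saved IPL is restored only when the last held spin lock is released. All names and the numeric IPL values are hypothetical:

```c
enum { TOY_IPL_NONE = 0, TOY_IPL_VM = 1, TOY_IPL_HIGH = 2 };

struct toy_cpu {
	int ipl;		/* current interrupt priority level */
	int spin_count;		/* number of spin locks currently held */
	int saved_ipl;		/* IPL before the first spin lock was taken */
};

/* Raise the IPL if needed; remember the old IPL only for the first lock. */
void
toy_spin_enter(struct toy_cpu *ci, int lock_ipl)
{
	const int old = ci->ipl;

	if (lock_ipl > ci->ipl)
		ci->ipl = lock_ipl;
	if (ci->spin_count++ == 0)
		ci->saved_ipl = old;
}

/* Lower the IPL only when the last spin lock is dropped. */
void
toy_spin_exit(struct toy_cpu *ci)
{
	if (--ci->spin_count == 0)
		ci->ipl = ci->saved_ipl;
}

/* Replay the sequence from the commit message up to fpu_kern_enter(). */
int
toy_ipl_at_fpu_kern_enter(void)
{
	struct toy_cpu ci = { TOY_IPL_NONE, 0, TOY_IPL_NONE };

	toy_spin_enter(&ci, TOY_IPL_VM);	/* some_lock_at_IPL_VM */
	toy_spin_enter(&ci, TOY_IPL_HIGH);	/* lock taken inside printf */
	toy_spin_exit(&ci);			/* printf releases its lock */
	return ci.ipl;				/* what fpu_kern_enter() would see */
}
```

In this model the IPL is still at the HIGH level when fpu_kern_enter() would run, even though only the IPL_VM lock is held, which is why an assertion that the IPL is at or below IPL_VM fires.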
 1.12 01-Apr-2022  riastradh x86, arm: Allow fpu_kern_enter/leave while cold.

Normally these are forbidden above IPL_VM, so that FPU usage doesn't
block IPL_SCHED or IPL_HIGH interrupts. But while cold, e.g. during
builtin module initialization at boot, all interrupts are blocked
anyway so it's a moot point.

Also initialize x86 cpu_info_primary.ci_kfpu_spl to -1 so we don't
trip over an assertion about it while cold -- the assertion is meant
to detect reentrance into fpu_kern_enter/leave, which is prohibited.

Also initialize cpu0's ci_kfpu_spl.
 1.11 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.10 22-Oct-2020  skrll branches: 1.10.2;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.9 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.8 01-Aug-2020  riastradh Add kthread_fpu_enter/exit support to aarch64.
 1.7 13-Jul-2020  riastradh Use pcu_save_all_on_cpu, not pcu_save.

We don't care what curlwp is here; we care whose state is in the fpu
registers.
 1.6 13-Jul-2020  riastradh Limit aarch64 fpu_kern_enter/leave to IPL_VM or below.
 1.5 29-Jun-2020  riastradh Move aarch64/fpu.h to arm/fpu.h.
 1.4 29-Jun-2020  riastradh Draft fpu_kern_enter/leave on aarch64.
 1.3 07-Nov-2018  riastradh When hardware subnormal support is available, disable flush-to-zero.

Similarly, when hardware NaN propagation is available, disable
default-NaN substitution.

This enables IEEE 754 semantics on any hardware that supports it by
default. Programs that want flush-to-zero or default-NaN substitution
can enable them explicitly.
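As a rough illustration of the policy above, the sketch below clears the flush-to-zero and default-NaN control bits only when the corresponding hardware support is present. It assumes FPCR.FZ is bit 24 and FPCR.DN is bit 25; how the hardware support is detected is not shown, and the two flag parameters, the macro names, and fpcr_ieee754_default() are hypothetical, not taken from fpu.c:

```c
#include <stdint.h>

#define TOY_FPCR_FZ	(UINT64_C(1) << 24)	/* flush-to-zero mode */
#define TOY_FPCR_DN	(UINT64_C(1) << 25)	/* default-NaN mode */

/*
 * Compute an FPCR value with IEEE 754 semantics enabled wherever the
 * hardware allows it; all other control bits are left untouched.
 */
uint64_t
fpcr_ieee754_default(uint64_t fpcr, int hw_subnormals, int hw_nan_propagation)
{
	if (hw_subnormals)
		fpcr &= ~TOY_FPCR_FZ;	/* keep subnormal results */
	if (hw_nan_propagation)
		fpcr &= ~TOY_FPCR_DN;	/* propagate NaN payloads */
	return fpcr;
}
```

A program that wants flush-to-zero or default-NaN behaviour would set the same bits again in its own floating-point environment.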

ok ryo@
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file fpu.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.11 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.10 12-Aug-2020  skrll branches: 1.10.2;
Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.9 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.8 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.7 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.6 06-Apr-2019  thorpej Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
 1.5 17-Jul-2018  christos centralize fp,lr definitions
 1.4 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support for fdt. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 16-Aug-2017  nisimura branches: 1.2.2;
retire copyinout.S and fusu.S
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file fusu.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.41 31-Jan-2025  jmcneill fixme! CNTKCTL_EL1 init.

Fix a dumb mistake in previous - we don't want CNTKCTL.EL0VTEN to be set.
 1.40 30-Jan-2025  jmcneill Fix CNTKCTL_EL1 initialization.

Explicitly initialize all fields in CNTKCTL_EL1 as many of them reset
to an architecturally UNKNOWN value.
 1.39 16-Apr-2023  skrll branches: 1.39.6;
Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's fewer letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.38 25-Jun-2022  jmcneill Remove GIC_SPLFUNCS.
 1.37 30-Oct-2021  jmcneill Add __HAVE_PREEMPTION support to gic_splfuncs asm funcs.

"looks right to me" - thorpej
 1.36 30-Oct-2021  jmcneill Add CI_SPLX_SAVEDIPL and CI_HWPL
 1.35 30-Sep-2021  skrll Ensure TCR_EPD0 is set on entry to pmap_activate and ensure it is set as
early as possible for APs.
 1.34 18-Sep-2021  jmcneill gic_splx: performance optimizations

Avoid any kind of register access (DAIF, PMR, etc), barriers, and atomic
operations in the common case where no interrupt fires between spl being
raised and lowered.

This introduces a per-CPU return address (ci_splx_restart) used by the
vector handler to restart a sequence in splx that compares the new ipl
with the per-CPU hardware priority state stored in ci_hwpl.
 1.33 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.32 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.31 15-Sep-2020  ryo branches: 1.31.2;
fix typo
 1.30 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.29 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.28 03-Aug-2020  ryo Implement MD ucas(9) (__HAVE_UCAS_FULL)
 1.27 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.26 28-May-2020  ryo - AP{IB,DA,DB}Key are now also enabled when ARMV83_PAC.
- If no ARMV83_PAC, explicitly disable SCTLR_En{IA,IB,DA,DB}
 1.25 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
userland programs are able to perform pointer authentication on ARMv8.3+PAC cpu.

reviewed by maxv@, thanks.
 1.24 15-May-2020  ryo SCTLR_EnIA should be enabled in the caller(locore).

For some reason, gcc makes the aarch64_pac_init() function non-leaf, so it uses paciasp/autiasp.
 1.23 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.22 20-Feb-2020  skrll branches: 1.22.4;
G/C
 1.21 29-Jan-2020  skrll G/C some more
 1.20 29-Jan-2020  skrll G/C
 1.19 28-Jan-2020  maxv Jazelle and T32EE are not part of ARMv8, fix the bits to their real
meanings. No functional change.
 1.18 08-Jan-2020  ad Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.
 1.17 28-Dec-2019  jmcneill branches: 1.17.2;
Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.16 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.15 24-Nov-2019  skrll correct #include order
 1.14 07-Sep-2019  ryo add AARCH64_KSEG_MASK. pmap_page.S refers to it. (no functional change)
 1.13 13-Jul-2019  skrll branches: 1.13.2;
G/C
 1.12 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.11 13-Dec-2018  ryo add support PT_STEP
 1.10 11-Dec-2018  ryo fix build failure without options MULTIPROCESSOR
 1.9 20-Nov-2018  mrg rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and the aarch32 data needs to be published at all.) still trying to
work out how the same userland binary can work sanely here whether it
runs on a real arm32 or an aarch32 system.

ok ryo@.
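
As an illustration of the "MIDR extractor bits" mentioned above, here is a minimal sketch of field extraction from MIDR_EL1. The bit positions follow the Arm architecture's documented layout of the register; the macro and helper names are hypothetical and are not claimed to match the ones this commit added.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit ranges follow the architectural MIDR_EL1 layout. */
#define MIDR_IMPLEMENTER(m)	(((m) >> 24) & 0xffu)	/* bits [31:24] */
#define MIDR_VARIANT(m)		(((m) >> 20) & 0xfu)	/* bits [23:20] */
#define MIDR_ARCHITECTURE(m)	(((m) >> 16) & 0xfu)	/* bits [19:16] */
#define MIDR_PARTNUM(m)		(((m) >> 4) & 0xfffu)	/* bits [15:4] */
#define MIDR_REVISION(m)	((m) & 0xfu)		/* bits [3:0] */

/*
 * Hypothetical helper: two MIDR values describe the same CPU model when
 * implementer and part number match, regardless of variant and revision.
 */
static int
midr_same_model(uint32_t a, uint32_t b)
{
	return MIDR_IMPLEMENTER(a) == MIDR_IMPLEMENTER(b) &&
	    MIDR_PARTNUM(a) == MIDR_PARTNUM(b);
}
```

For example, the value 0x410fd034 (as commonly reported by a Cortex-A53 r0p4) splits into implementer 0x41, architecture 0xf, part number 0xd03 and revision 4.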
 1.8 04-Oct-2018  ryo * define LX_BLKPAG_{OS,ATTR}_* for OS dependent PTE attributes in pmap.h
* cleanup macros
 1.7 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.6 03-Aug-2018  ryo set kernel text/rodata readonly when not defined DDB.
set readonly segment on 2Mbytes aligned. (kernel image is mapped with 2Mbytes L2 block)
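
The alignment requirement above follows from permissions being applied per 2 MB L2 block: a read-only segment has to start and end on a block boundary so that no writable data shares a block with it. A minimal sketch of the arithmetic, assuming a 4 KB translation granule (where an L2 block is 2 MB); all names here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: 4 KB granule, so one L2 block maps 2 MB. */
#define L2_BLOCK_SIZE	(UINT64_C(2) * 1024 * 1024)

/* Round an address down to the start of its 2 MB block. */
static uint64_t
l2_round_down(uint64_t addr)
{
	return addr & ~(L2_BLOCK_SIZE - 1);
}

/* Round an address up to the next 2 MB block boundary. */
static uint64_t
l2_round_up(uint64_t addr)
{
	return (addr + L2_BLOCK_SIZE - 1) & ~(L2_BLOCK_SIZE - 1);
}

/* Number of 2 MB blocks needed to cover the range [start, end). */
static uint64_t
l2_blocks_covering(uint64_t start, uint64_t end)
{
	assert(start < end);
	return (l2_round_up(end) - l2_round_down(start)) / L2_BLOCK_SIZE;
}
```

A range that straddles a block boundary by even one byte needs two blocks, which is why the linker script has to pad the segment rather than the mapping code trimming it.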
 1.5 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.4 10-Jul-2018  ryo allow to read CNTVCT_EL0 and CNTFRQ_EL0 from EL0
 1.3 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.7 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.4 20-Oct-2018  pgoyette Sync with head
 1.1.28.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file genassym.cf was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.4 21-Apr-2020  martin Sync with HEAD
 1.2.2.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.2.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.13.2.1 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.17.2.2 29-Feb-2020  ad Sync with head.
 1.17.2.1 17-Jan-2020  ad Sync with head.
 1.22.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.31.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.39.6.1 02-Aug-2025  perseant Sync with HEAD
 1.13 30-Dec-2024  jmcneill aarch64: Allow for alternate cpu_idle() implementations
 1.12 29-May-2022  ryo branches: 1.12.10;
ESR_EL1 and FAR_EL1 are not required in interrupt trapframe and their values are meaningless.
To identify it as an interrupt trap frame, store -1 and 0.
 1.11 10-Oct-2021  skrll KNF
 1.10 30-Aug-2021  jmcneill Ensure that all memory accesses prior to executing WFI have been completed
by adding a DSB SY before stopping execution and entering a low power
state. From the ARM Cortex-A Series Programmer's Guide for ARMv8-A:

"ARM recommends the use of a Data Synchronization Barrier (DSB) instruction
before WFI or WFE, to ensure that pending memory transactions complete before
changing state."
 1.9 23-Feb-2021  ryo Just a few optimizations.

- in cpu_idle(), ci_intr_depth is always 0, so there is no need to fetch for increment or conditional branch.
- curcpu() is immutable in idle lwp, there is no need to consider KPREEMPT. Therefore, get curcpu() first and keep using it.
- add more comment.
 1.8 21-Feb-2021  jmcneill When waking from cpu_idle(), only call dosoftints if ci_intr_depth == 0
 1.7 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.6 12-Aug-2020  skrll branches: 1.6.2;
Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.5 27-Jan-2019  dholland restore accidentally-removed rcsid
 1.4 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.3 12-Dec-2018  ryo - need to save/restore interrupt mask when entering/exiting to/from cpu_switchto_softint().
- when call dosoftints from cpu_idle, interrupts should be disabled.

rarely, the lwp stack was exhausted under a high interrupt load.
reported by alnsn@. thanks.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file idle_machdep.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.6.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.12.10.1 02-Aug-2025  perseant Sync with HEAD
 1.9 16-Feb-2024  andvar Fix closing bracket for strdisasm() function.

Fixes KOBJ_MACHDEP_DEBUG enabled build for aarch64.
 1.8 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.7 28-Apr-2023  skrll Pass local symbols relocations in both passes and provide the kobj_reloc
implementation visibility of these relocations.

Currently all implementations resolve local symbol relocations in the first
pass and simply skip them in the second. The RISC-V implementation will
make use of this visibility.
 1.6 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.5 14-Sep-2020  ryo branches: 1.5.2;
swap insns for aarch64eb
 1.4 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strdisasm() instead of the caller.
correctly disassemble by processor state if defined DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT.
 1.3 01-Dec-2019  jmcneill Flush insn / data caches after loading modules
 1.2 19-Aug-2018  ryo branches: 1.2.2; 1.2.6; 1.2.8;
show correct relocation address when overflowed.
 1.1 15-Aug-2018  ryo MODULAR support
 1.2.8.1 08-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #513):

sys/arch/aarch64/aarch64/kobj_machdep.c: revision 1.3

Flush insn / data caches after loading modules
 1.2.6.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.6.2 10-Jun-2019  christos Sync with HEAD
 1.2.6.1 19-Aug-2018  christos file kobj_machdep.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.2.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.2.2.1 19-Aug-2018  pgoyette file kobj_machdep.c was added on branch pgoyette-compat on 2018-09-06 06:55:22 +0000
 1.5.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.1 25-Nov-2021  ryo add support COMPAT_LINUX32 for aarch64
 1.3 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.2 27-Sep-2021  ryo linux syscall should not break x1 register
 1.1 23-Sep-2021  ryo add support COMPAT_LINUX for aarch64
 1.4 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.3 13-Oct-2020  skrll branches: 1.3.2;
Use load-acquire exclusive and store-release exclusive (and remove the
barrier instructions) as suggested by riastradh a little while ago.
 1.2 13-Aug-2020  skrll Trailing whitespace
 1.1 12-Aug-2020  skrll Part III of ad's performance improvements for aarch64

- Assembly language stubs for mutex_enter() and mutex_exit().
 1.3.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.99 12-Aug-2025  skrll Remove unnecessary line continuations in printcpu (it's not a #define)
 1.98 12-Aug-2025  skrll More comment improvement
 1.97 08-Aug-2025  skrll Improve a comment
 1.96 21-Jul-2025  skrll Improve a comment.
 1.95 31-Jan-2025  jmcneill fixme! CNTKCTL_EL1 init.

Fix a dumb mistake in previous - we don't want CNTKCTL.EL0VTEN to be set.
 1.94 30-Jan-2025  jmcneill Fix CNTKCTL_EL1 initialization.

Explicitly initialize all fields in CNTKCTL_EL1 as many of them reset
to an architecturally UNKNOWN value.
 1.93 07-Feb-2024  msaitoh branches: 1.93.2;
Remove ryo@'s mail addresses.
 1.92 16-Apr-2023  skrll Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's less letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.91 23-Feb-2023  riastradh aarch64: Add missing barriers in cpu_switchto.

Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

XXX pullup-9
XXX pullup-10
 1.90 17-Feb-2023  skrll Improve an error message
 1.89 29-Oct-2022  skrll branches: 1.89.2;
Slightly better English in a comment.
 1.88 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.87 23-Aug-2022  ryo Bss clearing is now done at the beginning of start.S.

Some `__attribute__((__section__(".data")))' hack will no longer be needed.
 1.86 06-May-2022  ryo Sprinkle isb after modifying system regs of pointer auth.
With options ARMV83_PAC, it now works on native Mac M1.

TODO: Multiple ISBs should be combined in one place.
 1.85 31-Jan-2022  ryo add support Hardware updates to Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and has been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
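
The bit usage described in this entry can be sketched as a handful of predicates. The bit positions (valid = bit 0, AP[2] = bit 7, AF = bit 10, DBM = bit 51) are the architectural ones for stage 1 descriptors; the names, and the reading of "DBM set and AP[2] clear" as the dirty state, are assumptions made for illustration and are not taken from pmap.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for architectural descriptor bits. */
#define PTE_VALID	(UINT64_C(1) << 0)
#define PTE_AP_RO	(UINT64_C(1) << 7)	/* AP[2]: read-only */
#define PTE_AF		(UINT64_C(1) << 10)	/* Access flag */
#define PTE_DBM		(UINT64_C(1) << 51)	/* Dirty Bit Modifier */

/* A valid PTE is always treated as readable. */
static bool
pte_readable(uint64_t pte)
{
	return (pte & PTE_VALID) != 0;
}

/* DBM decides whether the mapping is writable. */
static bool
pte_writable(uint64_t pte)
{
	return pte_readable(pte) && (pte & PTE_DBM) != 0;
}

/* AF is purely a reference bit, set by hardware on first access. */
static bool
pte_referenced(uint64_t pte)
{
	return pte_readable(pte) && (pte & PTE_AF) != 0;
}

/* Assumed: hardware clears AP[2] on the first write to a DBM page. */
static bool
pte_modified(uint64_t pte)
{
	return pte_writable(pte) && (pte & PTE_AP_RO) == 0;
}
```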
 1.84 10-Dec-2021  andvar s/occured/occurred/ in comments, log messages and man pages.
 1.83 14-Nov-2021  riastradh arm: Fix CPU startup synchronization.

- Use load-acquire instead of (wrong) membar_consumer then load in
cpu_boot_secondary_processors and cpu_hatched_p.

=> (Could use load then membar_consumer instead but load-acquire is
shorter.)

- Issue dmb ish before setting or clearing the bit in
cpu_set_hatched and cpu_clr_mbox to effect a store-release.

=> (Could use membar_exit, which is semantically weaker than dmb ish
but on arm is just implemented as dmb ish.)

=> (Could use stlr except we don't have atomic_ops(9) to do that.)

This way, everything before cpu_set_hatched or cpu_clr_mbox is
guaranteed to happen before everything after
cpu_boot_secondary_processors, which was previously not guaranteed.
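
The ordering described above can be modelled in portable C11 atomics. This is only a sketch of the release/acquire pairing, not the kernel code: the function and variable names are hypothetical stand-ins, and the kernel uses dmb ish and load-acquire instructions rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model state: data an AP sets up before it reports in. */
static int ap_state;
/* One bit per CPU, in the spirit of the hatched bitmap. */
static atomic_uint hatched_mask;

/*
 * AP side: write the state, then set the bit with release semantics so
 * that the state is visible to anyone who observes the bit.
 */
static void
model_cpu_set_hatched(unsigned cpuindex, int state)
{
	ap_state = state;
	atomic_fetch_or_explicit(&hatched_mask, 1u << cpuindex,
	    memory_order_release);
}

/*
 * BP side: load the bitmap with acquire semantics; if the bit is seen,
 * everything the AP wrote before setting it is guaranteed to be visible.
 */
static bool
model_cpu_hatched_p(unsigned cpuindex)
{
	unsigned mask = atomic_load_explicit(&hatched_mask,
	    memory_order_acquire);

	return (mask & (1u << cpuindex)) != 0;
}
```

The barrier goes before the store on the writer's side and after the load on the reader's side; a membar_consumer issued before the load, as in the old code, orders nothing useful.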
 1.82 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.81 21-Oct-2021  skrll fix gimplish
 1.80 30-Sep-2021  skrll Ensure TCR_EPD0 is set on entry to pmap_activate and ensure it is set as
early as possible for APs.
 1.79 30-Aug-2021  jmcneill Add instruction barrier after write to mair_el1
 1.78 21-Mar-2021  skrll Fix a comment
 1.77 20-Mar-2021  skrll Make pmapboot_enter panic if anything goes wrong and any mappings overlap
rather than only doing it in locore.S
 1.76 09-Jan-2021  jmcneill branches: 1.76.2;
Avoid mismatched memory attributes for kernel and page table memory.

The initial page table code enters mappings first through an identity
mapped normal-NC mapping. Then later on, additional mappings are added
through a KVA-mapped normal-WB mapping. There is a warning about this
in the Armv8 ARM:

Bytes written without the Write-Back cacheable attribute within the
same Write-Back granule as bytes written with the Write-Back cacheable
attribute might have their values reverted to the old values as a result
of cache Write-Back.

Change the identity mapping attributes to match the KVA-mapping. This
fixes an issue where the kernel often doesn't start under ESXi-Arm Fling.
 1.75 26-Dec-2020  jmcneill Always issue isb after cpacr_el1 writes since it is a context-changing
operation.
 1.74 22-Oct-2020  ryo branches: 1.74.2;
Don't trap EL0 accesses to the DCC registers.
VMWare use "mrs xzr, mdccsr_el0" for guest side backdoor.
 1.73 15-Sep-2020  ryo fix typo
 1.72 15-Sep-2020  ryo fix aarch64eb MULTIPROCESSOR boot

- set endian of EL2,EL1 and EL0 at the beginning of start() and cpu_mpstart()
- drop_to_el1() keeps the endian setting
 1.71 16-Aug-2020  skrll Improve comments
 1.70 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.69 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
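
The PAN discipline described in this entry can be modelled in userland as a flag that is set whenever kernel code runs and cleared only around the copy routines. This is a toy model with hypothetical names: it returns an error where the real kernel would take a PAN fault and panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Models PSTATE.PAN: true each time the kernel is entered. */
static bool pstate_pan = true;

/*
 * Model of a kernel access to a userland page.  Returns -1 where the
 * real hardware would raise a PAN fault.
 */
static int
model_user_read(void *kdst, const void *usrc, size_t len)
{
	if (pstate_pan)
		return -1;
	memcpy(kdst, usrc, len);
	return 0;
}

/* Model of copyin: temporarily authorize access, then forbid it again. */
static int
model_copyin(const void *usrc, void *kdst, size_t len)
{
	int error;

	pstate_pan = false;
	error = model_user_read(kdst, usrc, len);
	pstate_pan = true;
	return error;
}
```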
 1.68 17-Jul-2020  ryo Add options PMAPBOOT_DEBUG to dump TTBR when pmapboot_enter().
Formerly DEBUG_MMU in locore.S, but there was a bit of confusion.
 1.67 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.66 12-Jul-2020  skrll More DEBUG
 1.65 12-Jul-2020  skrll KNF (whitespace)
 1.64 28-May-2020  ryo - make AP{IB,DA,DB}Key also enabled when ARMV83_PAC.
- If no ARMV83_PAC, clearly disable SCTLR_En{IA,IB,DA,DB}
 1.63 27-May-2020  ryo don't use x8 (caller-saved register) across functions

fix llvm+EARLYCONS kernel not booting. it only worked by luck with gcc.
 1.62 26-May-2020  ryo clang assembler evaluates #'\r' as #0x72. Grrr
 1.61 26-May-2020  ryo fix a BTI trap that occurred when an AP jumps to mp_vstart in an ARMV85_BTI+SMP environment.
 1.60 15-May-2020  ryo SCTLR_EnIA should be enabled in the caller(locore).

For some reason, gcc makes the aarch64_pac_init() function non-leaf, and it uses paciasp/autiasp.
 1.59 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, this is
somewhat undesirable behavior, because it allows valid kernel pointers to
be forged from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
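
The key handling described in this entry can be summarized with a small model: each lwp carries its own key, the key is loaded into APIAKey on kernel entry, and a fixed zero key is loaded before returning to userland. All types and function names below are hypothetical; the real switch is done in assembly on the EL0<->EL1 paths and writes the actual system register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct pac_key {		/* 128-bit key, as two 64-bit halves */
	uint64_t lo, hi;
};

struct model_lwp {		/* per-lwp (secret) I-A key */
	struct pac_key ia_key;
};

struct model_cpu {		/* stands in for the APIAKey register */
	struct pac_key apiakey;
};

/* For now the userland key is a fixed 128-bit zero value. */
static const struct pac_key user_key = { 0, 0 };

/* Stands in for the aarch64_pac_enabled check on each transition. */
static bool model_pac_enabled = true;

/* EL0 -> EL1: install the kernel-side key of the current lwp. */
static void
model_enter_kernel(struct model_cpu *cpu, const struct model_lwp *l)
{
	if (model_pac_enabled)
		cpu->apiakey = l->ia_key;
}

/* EL1 -> EL0: never let userland sign with the kernel's key. */
static void
model_return_to_user(struct model_cpu *cpu)
{
	if (model_pac_enabled)
		cpu->apiakey = user_key;
}
```

The model_pac_enabled test on every transition corresponds to the second known problem listed above, which boot-time hot-patching would remove.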
 1.58 20-Feb-2020  skrll branches: 1.58.4;
Use orr instead of mov (an alias for orr) to appease clang... *shrug*
 1.57 15-Feb-2020  tnn avoid nesting /*'s (-Wcomment)
 1.56 15-Feb-2020  skrll Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}
 1.55 09-Feb-2020  skrll Improve comments
 1.54 28-Jan-2020  maxv Jazelle and T32EE are not part of ARMv8, fix the bits to their real
meanings. No functional change.
 1.53 19-Jan-2020  skrll Replace the two copies of the ADDR macro with a centralised adrl macro.
The adrl name matches the one used by armasm.
 1.52 15-Jan-2020  mrg port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.51 12-Jan-2020  mrg provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
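
The selection rule described above (track the fastest "capacity-dmips-mhz" value and flag every other CPU as slow) can be sketched as follows. The structure and function are hypothetical, and the two-pass loop is a simplification: the commit tracks the maximum while CPUs attach rather than over a finished array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; is_slow mirrors the role of ci_is_slow. */
struct model_cpu_topo {
	uint32_t capacity_dmips_mhz;	/* value of the FDT property */
	bool is_slow;
};

static void
model_mark_slow_cpus(struct model_cpu_topo *cpus, size_t ncpu)
{
	uint32_t best = 0;
	size_t i;

	/* First pass: find the capacity of the fastest set. */
	for (i = 0; i < ncpu; i++) {
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;
	}

	/* Second pass: anything below the fastest set is slow. */
	for (i = 0; i < ncpu; i++)
		cpus[i].is_slow = cpus[i].capacity_dmips_mhz < best;
}
```

With four little cores and two big cores (the capacity values used to exercise this are illustrative), only the little cores end up flagged, so the scheduler can prefer the big ones for long-running work.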
 1.50 08-Jan-2020  ryo branches: 1.50.2;
fix panic when modload.

>panic: kernel diagnostic assertion "!pmap_extract(pmap_kernel(), loopva, NULL)" failed: file "../../../../uvm/uvm_km.c", line 674 loopva=0xffffffc001000000'

The space allocated by bootpage_alloc() is only used as a physical page
for pagetable pages, so there is no need to map it with KVA.
And kernend_extra should not have consumed any KVA space.
 1.49 28-Dec-2019  jmcneill Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.48 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.47 26-Dec-2019  skrll Whitespace
 1.46 08-Dec-2019  skrll Mark FDT as non-exec and create KVA=VA mapping of same size as identity
mapping, i.e. include BOOTPAGE_ALLOC_MAX
 1.45 22-Nov-2019  mlelstv Make cache operations available early.
 1.44 20-Oct-2019  jmcneill Use separate cacheline aligned arrays for mbox and hatched as before.
 1.43 20-Oct-2019  skrll Avoid overlap between BP and last AP stack. AP stacks are now in
increasing address order.

Spotted by and idea from mlelstv.
 1.42 19-Oct-2019  jmcneill Increase aarch64 MAXCPUS to 256.
 1.41 29-Sep-2019  skrll Typo in comment
 1.40 08-Sep-2019  jmcneill Map device memory for early console XN
 1.39 17-Jul-2019  skrll branches: 1.39.2;
Spell endianness correctly in comments
 1.38 15-Jul-2019  skrll Fix a comment
 1.37 15-Jul-2019  skrll Restore the comment against the line changed in the last commit
 1.36 15-Jul-2019  skrll Pass the VA of start (and not VM_MIN_KERNEL_ADDRESS) when mapping the
kernel at its KVA address. Previously the last 64 bytes of the .bss might
not be mapped if _end was within 64 bytes of a L2_SIZE boundary
 1.35 11-Jul-2019  skrll Typo in comment
 1.34 11-Jul-2019  skrll Remove unnecessary #include
 1.33 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.32 13-Dec-2018  ryo add support PT_STEP
 1.31 08-Nov-2018  maxv Track the stack with kASan on aarch64. Same principle as on amd64. Illegal
accesses occurring there are now detected.

Originally written by me, but reworked by ryo@, thanks.
 1.30 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.29 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.28 04-Oct-2018  ryo cleanup locore, and changed the way to map memories during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt various layout
of physical memory map. especially for 64bit physical memory layout.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing page permission for kernel image, and making KSEG mapping are done
at cpu_kernel_vm_init() instead of at locore.
- optimize PTE entries with PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.27 04-Oct-2018  ryo * define LX_BLKPAG_{OS,ATTR}_* for OS dependent PTE attributes in pmap.h
* cleanup macros
 1.26 01-Oct-2018  skrll Comment out printing L2CTLR_EL1 as it is implementation specific.

OK ryo
 1.25 10-Sep-2018  ryo cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.
 1.24 10-Sep-2018  skrll Don't use printx before setting up stack.
 1.23 10-Sep-2018  skrll Fix typos and DEBUG_MMU output. From Rin Okuyama.
 1.22 04-Sep-2018  skrll Adjust register usage a bit and unbreak DEBUG_MMU as a result.

The change moves to using callee-saved registers more so that any call
into C will have them preserved (if they're used or not). It's safe
to use stack as it's setup very early for BP/APs.

Discussed with ryo@
 1.21 30-Aug-2018  maxv Use ASM markers for functions, it makes the code easier to understand and
eliminates raw symbols. No functional change (tested on RPI3B+).
 1.20 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.19 24-Aug-2018  ryo set correctly TCR_EL1 for inner shareable when MULTIPROCESSOR
 1.18 10-Aug-2018  ryo treat kernel-exec attr and user-exec attr separately.
kernel cannot execute userland exec page, and user cannot execute kernel page.
 1.17 10-Aug-2018  maxv Enlighten a little.
 1.16 06-Aug-2018  ryo set kernel rodata/data non-executable.
set rodata section on 2Mbytes aligned. (kernel image is mapped with 2Mbytes L2 block)
 1.15 06-Aug-2018  ryo set kernel text/rodata readonly by default.
add function db_write_text() for setting ddb breakpoint.
 1.14 03-Aug-2018  ryo set kernel text/rodata readonly when not defined DDB.
set readonly segment on 2Mbytes aligned. (kernel image is mapped with 2Mbytes L2 block)
 1.13 17-Jul-2018  christos centralize fp,lr definitions
 1.12 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.11 17-Jul-2018  christos use c comments instead of #, consistently
 1.10 10-Jul-2018  ryo allow to read CNTVCT_EL0 and CNTFRQ_EL0 from EL0
 1.9 10-Jul-2018  ryo allow to execute wfi/wfe instruction on EL0. some userland program use them.
 1.8 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.7 21-Jun-2018  ryo branches: 1.7.2;
* make to work printf() and panic() even before consinit().
* tidy up output for VERBOSE_INIT_ARM.
 1.6 17-May-2018  ryo allow to execute cache operation (DC CVAU,DC CIVAC, DC CVAC, IC IVAU) from userland.
 1.5 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.4 25-Aug-2017  nisimura branches: 1.4.2;

- reorder faultbuf member.
- introduce trap() and interrupt(). now brk insn work.
-
 1.3 25-Aug-2017  nisimura make them better shape
 1.2 16-Aug-2017  nisimura add cpu_set_onfault glue
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6; 1.1.22;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.22.1 31-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1859):

sys/arch/ia64/ia64/vm_machdep.c: revision 1.18
sys/arch/powerpc/powerpc/locore_subr.S: revision 1.67
sys/arch/aarch64/aarch64/locore.S: revision 1.91
sys/arch/mips/include/asm.h: revision 1.74
sys/arch/hppa/include/cpu.h: revision 1.13
sys/arch/arm/arm/armv6_start.S: revision 1.38
(applied also to sys/arch/arm/cortex/a9_mpsubr.S,
sys/arch/arm/cortex/a9_mpsubr.S,
sys/arch/arm/cortex/cortex_init.S)
sys/arch/evbmips/ingenic/cpu_startup.S: revision 1.2
sys/arch/mips/mips/locore.S: revision 1.229
sys/arch/alpha/include/asm.h: revision 1.45
(applied to sys/arch/alpha/alpha/multiproc.s)
sys/arch/sparc64/sparc64/locore.s: revision 1.432
sys/arch/vax/vax/subr.S: revision 1.42
sys/arch/mips/mips/locore_mips3.S: revision 1.116
sys/arch/ia64/ia64/machdep.c: revision 1.44
sys/arch/arm/arm32/cpuswitch.S: revision 1.106
sys/arch/sparc/sparc/locore.s: revision 1.284
(all via patch)

aarch64: Add missing barriers in cpu_switchto.
Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

alpha: Add missing barriers in cpu_switchto.
Details in comments.

arm32: Add missing barriers in cpu_switchto.
Details in comments.

hppa: Add missing barriers in cpu_switchto.
Not sure hppa has ever had working MULTIPROCESSOR, so maybe no
pullups needed?

ia64: Add missing barriers in cpu_switchto.
(ia64 has never really worked, so no pullups needed, right?)

mips: Add missing barriers in cpu_switchto.
Details in comments.

powerpc: Add missing barriers in cpu_switchto.
Details in comments.

sparc: Add missing barriers in cpu_switchto.

sparc64: Add missing barriers in cpu_switchto.
Details in comments.

vax: Note where cpu_switchto needs barriers.

Not sure vax has ever had working MULTIPROCESSOR, though, and I'm not
even sure how to spell store-before-load barriers on VAX, so no
functional change for now.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file locore.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.9 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.4.2.8 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.4.2.7 20-Oct-2018  pgoyette Sync with head
 1.4.2.6 30-Sep-2018  pgoyette Sync with HEAD
 1.4.2.5 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.4.2.4 28-Jul-2018  pgoyette Sync with HEAD
 1.4.2.3 25-Jun-2018  pgoyette Sync with HEAD
 1.4.2.2 21-May-2018  pgoyette Sync with HEAD
 1.4.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.7.2.3 21-Apr-2020  martin Sync with HEAD
 1.7.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.7.2.1 10-Jun-2019  christos Sync with HEAD
 1.39.2.6 31-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1676):

sys/arch/ia64/ia64/vm_machdep.c: revision 1.18
sys/arch/powerpc/powerpc/locore_subr.S: revision 1.67
sys/arch/aarch64/aarch64/locore.S: revision 1.91
sys/arch/mips/include/asm.h: revision 1.74
sys/arch/hppa/include/cpu.h: revision 1.13
sys/arch/arm/arm/armv6_start.S: revision 1.38
sys/arch/evbmips/ingenic/cpu_startup.S: revision 1.2
sys/arch/mips/mips/locore.S: revision 1.229
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.40
sys/arch/alpha/include/asm.h: revision 1.45
sys/arch/sparc64/sparc64/locore.s: revision 1.432
sys/arch/vax/vax/subr.S: revision 1.42
sys/arch/mips/mips/locore_mips3.S: revision 1.116
sys/arch/ia64/ia64/machdep.c: revision 1.44
sys/arch/arm/arm32/cpuswitch.S: revision 1.106
sys/arch/sparc/sparc/locore.s: revision 1.284
(all via patch)

aarch64: Add missing barriers in cpu_switchto.
Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

alpha: Add missing barriers in cpu_switchto.
Details in comments.

arm32: Add missing barriers in cpu_switchto.
Details in comments.

hppa: Add missing barriers in cpu_switchto.
Not sure hppa has ever had working MULTIPROCESSOR, so maybe no
pullups needed?

ia64: Add missing barriers in cpu_switchto.
(ia64 has never really worked, so no pullups needed, right?)

mips: Add missing barriers in cpu_switchto.
Details in comments.

powerpc: Add missing barriers in cpu_switchto.
Details in comments.

sparc: Add missing barriers in cpu_switchto.

sparc64: Add missing barriers in cpu_switchto.
Details in comments.

vax: Note where cpu_switchto needs barriers.

Not sure vax has ever had working MULTIPROCESSOR, though, and I'm not
even sure how to spell store-before-load barriers on VAX, so no
functional change for now.
 1.39.2.5 21-Jan-2020  martin Pull up following revision(s) (requested by ryo in ticket #617):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.37
sys/arch/aarch64/aarch64/locore.S: revision 1.50

fix panic when modload.

panic: kernel diagnostic assertion "!pmap_extract(pmap_kernel(), loopva, NULL)" failed: file "../../../../uvm/uvm_km.c", line 674 loopva=0xffffffc001000000

The space allocated by bootpage_alloc() is only used as a physical page
for pagetable pages, so there is no need to map it with KVA.
And kernend_extra should not have consumed any KVA space.
 1.39.2.4 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.39.2.3 09-Dec-2019  martin Pull up following revision(s) (requested by skrll in ticket #532):

sys/arch/aarch64/aarch64/locore.S: revision 1.46

Mark FDT as non-exec and create KVA=VA mapping of same size as identity
mapping, i.e. include BOOTPAGE_ALLOC_MAX
 1.39.2.2 23-Oct-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #359):

sys/arch/aarch64/aarch64/locore.S: revision 1.42
sys/arch/aarch64/aarch64/locore.S: revision 1.43
sys/arch/aarch64/aarch64/locore.S: revision 1.44
sys/arch/arm/fdt/cpu_fdt.c: revision 1.28
sys/arch/aarch64/include/cpu.h: revision 1.14
sys/arch/aarch64/include/param.h: revision 1.12
sys/arch/arm/arm32/cpu.c: revision 1.133
sys/arch/arm/arm32/cpu.c: revision 1.134
sys/arch/arm/include/cpu.h: revision 1.101
sys/arch/arm/acpi/cpu_acpi.c: revision 1.7
sys/arch/aarch64/aarch64/cpu.c: revision 1.23
sys/arch/aarch64/aarch64/cpu.c: revision 1.24
sys/arch/aarch64/aarch64/cpu.c: revision 1.25

Increase aarch64 MAXCPUS to 256.

-

Invalidate dcache before polling AP hatched status

-

Avoid overlap between BP and last AP stack. AP stacks are now in order of
increasing address order.

Spotted by and idea from mlelstv.

-

Use separate cacheline aligned arrays for mbox and hatched as before.

-

cpu_hatched_p only for MULTIPROCESSOR
 1.39.2.1 22-Sep-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #226):

sys/arch/aarch64/aarch64/locore.S: revision 1.40

Map device memory for early console XN
 1.50.2.3 29-Feb-2020  ad Sync with head.
 1.50.2.2 25-Jan-2020  ad Sync with head.
 1.50.2.1 17-Jan-2020  ad Sync with head.
 1.58.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.74.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.74.2.1 03-Jan-2021  thorpej Sync w/ HEAD.
 1.76.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.89.2.1 31-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #264):

sys/arch/ia64/ia64/vm_machdep.c: revision 1.18
sys/arch/powerpc/powerpc/locore_subr.S: revision 1.67
sys/arch/aarch64/aarch64/locore.S: revision 1.91
sys/arch/mips/include/asm.h: revision 1.74
sys/arch/hppa/include/cpu.h: revision 1.13
sys/arch/arm/arm/armv6_start.S: revision 1.38
sys/arch/evbmips/ingenic/cpu_startup.S: revision 1.2
sys/arch/mips/mips/locore.S: revision 1.229
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.40
sys/arch/alpha/include/asm.h: revision 1.45
sys/arch/sparc64/sparc64/locore.s: revision 1.432
sys/arch/vax/vax/subr.S: revision 1.42
sys/arch/mips/mips/locore_mips3.S: revision 1.116
sys/arch/riscv/riscv/cpu_switch.S: revision 1.3
sys/arch/ia64/ia64/machdep.c: revision 1.44
sys/arch/arm/arm32/cpuswitch.S: revision 1.106
sys/arch/sparc/sparc/locore.s: revision 1.284

aarch64: Add missing barriers in cpu_switchto.
Details in comments.

Note: This is a conservative change that inserts a barrier where
there was a comment saying none is needed, which is probably correct.
The goal of this change is to systematically add barriers to be
confident in correctness; subsequent changes may remove some barriers,
as an optimization, with an explanation of why each barrier is not
needed.

PR kern/57240

alpha: Add missing barriers in cpu_switchto.
Details in comments.

arm32: Add missing barriers in cpu_switchto.
Details in comments.

hppa: Add missing barriers in cpu_switchto.
Not sure hppa has ever had working MULTIPROCESSOR, so maybe no
pullups needed?

ia64: Add missing barriers in cpu_switchto.
(ia64 has never really worked, so no pullups needed, right?)

mips: Add missing barriers in cpu_switchto.
Details in comments.

powerpc: Add missing barriers in cpu_switchto.
Details in comments.

riscv: Add missing barriers in cpu_switchto.
Details in comments.

sparc: Add missing barriers in cpu_switchto.

sparc64: Add missing barriers in cpu_switchto.
Details in comments.

vax: Note where cpu_switchto needs barriers.

Not sure vax has ever had working MULTIPROCESSOR, though, and I'm not
even sure how to spell store-before-load barriers on VAX, so no
functional change for now.
 1.93.2.1 02-Aug-2025  perseant Sync with HEAD
 1.11 11-Aug-2025  skrll gas has icc_sre_el2 now so remove an old hack
 1.10 30-Jan-2025  jmcneill Fix CNTHCTL_EL2 initialization when FEAT_ECV is present.

When FEAT_ECV is present, the ECV bit (and other ECV fields) reset to
an architecturally UNKNOWN value. As NetBSD does not use ECV today,
let's explicitly initialize all bits in CNTHCTL_EL2.
 1.9 30-Aug-2021  jmcneill branches: 1.9.10;
If we start in EL2 mode and the CPU supports EL2 host mode, don't bother
dropping to EL1 and just run the kernel in EL2 instead.
 1.8 26-Dec-2020  jmcneill Always issue isb after cpacr_el1 writes since it is a context-changing
operation.
 1.7 15-Sep-2020  ryo branches: 1.7.2;
fix typo
 1.6 15-Sep-2020  ryo fix aarch64eb MULTIPROCESSOR boot

- set endian of EL2,EL1 and EL0 at the beginning of start() and cpu_mpstart()
- drop_to_el1() keeps the endian setting
 1.5 05-Sep-2020  jakllsch aarch64: switch CPU to the kernel's byte order during boot
 1.4 29-Aug-2020  maxv Slightly clarify, and style.
 1.3 17-Jul-2018  christos deal with gas not having icc_sre_el2 (from jmcneill)
 1.2 09-Jul-2018  ryo keep stack pointer when changing from EL2 to EL1.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file locore_el2.S was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.7.2.1 03-Jan-2021  thorpej Sync w/ HEAD.
 1.9.10.1 02-Aug-2025  perseant Sync with HEAD
 1.25 18-Jun-2024  rin aarch64: cpu_getmcontext32: Fix sign compare for ras_lookup(9)

Now, compare with `(void *)-1` is done for x0, instead of w0.
No binary changes except for that.

Found by WARNS=5 build (as a module).
 1.24 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.23 14-Nov-2021  skrll Trailing whitespace
 1.22 06-Nov-2021  thorpej COMPAT_NETBSD32 is all about running the 32-bit flavor of native
binaries on a 64-bit platform[*], as such:
- Make the logic about which "sendsig" flavor to call MI (as it is in the
native 64-bit environment) and follow the same rules as the native 32-bit
environment.
- Make COMPAT_NETBSD32 x COMPAT_16 work the same as it would in the
native 32-bit environment by providing a netbsd32_sendsig_sigcontext_16_hook,
rather than overriding the entire sendsig logic with a netbsd32_sendsig_hook.
- In netbsd32___sigaction_sigtramp(), make sure the compat_netbsd32_16
module is loaded if the trampoline version specifies a sigcontext style
handler, otherwise return EINVAL so that libc can try again with siginfo
style.

[*] ...except for arm32, which uses it to mean "run 32-bit OABI binaries
from the 32-bit EABI environment". Doing it this way was arguably a mistake,
but we are stuck with it for now, so support it by providing a machine-
dependent override for netbsd32_sendsig() that also disables the corresponding
logic in netbsd32___sigaction_sigtramp().

Fixes PR kern/56487.
 1.21 01-Nov-2021  thorpej Use "stack_t" instead of "struct sigaltstack", as the former is the
newer standardized name. NFC.
 1.20 27-Oct-2021  thorpej Use the signal trampoline version constants from <sys/signal.h>.
 1.19 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.18 30-May-2021  rin Fix conversion between aarch64 and aarch32 fpreg's; in aarch32 mode,
d0-d31 are packed into v0-v15 (== q0-q15).

This fixes crashes in VFP-optimized codes running on COMPAT_NETBSD32.

OK ryo
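
The register packing described above can be sketched as follows. This is a minimal illustration, not the kernel's code: the struct names and field layout (`struct v128`, `struct fp64`, `struct fp32`) are hypothetical, and only the architectural rule is assumed, namely that AArch32 d(2n) and d(2n+1) are the low and high halves of q(n), which AArch64 sees as v(n).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration; not the kernel's struct fpreg. */
struct v128 { uint64_t lo, hi; };	/* one 128-bit V/Q register */
struct fp64 { struct v128 v[32]; };	/* AArch64 view: v0-v31 */
struct fp32 { uint64_t d[32]; };	/* AArch32 view: d0-d31 */

/* AArch64 -> AArch32: d(2n) is the low half of v(n), d(2n+1) the high half. */
static void
fp64_to_fp32(const struct fp64 *src, struct fp32 *dst)
{
	for (int n = 0; n < 16; n++) {
		dst->d[2 * n] = src->v[n].lo;
		dst->d[2 * n + 1] = src->v[n].hi;
	}
}

/* AArch32 -> AArch64: only v0-v15 are written; v16-v31 are left untouched. */
static void
fp32_to_fp64(const struct fp32 *src, struct fp64 *dst)
{
	for (int n = 0; n < 16; n++) {
		dst->v[n].lo = src->d[2 * n];
		dst->v[n].hi = src->d[2 * n + 1];
	}
}
```

A one-to-one copy of d(n) into v(n) would instead scatter the 32 doubles across 32 vector registers, which is not how the hardware aliases them.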
 1.17 11-Dec-2020  skrll branches: 1.17.4; 1.17.6;
s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.16 15-Oct-2020  rin branches: 1.16.2;
Call netbsd32_adjust_limits() in netbsd32_setregs() for sure,
as done for amd64 and sparc64.
 1.15 15-Oct-2020  rin For rev 1.14 and before, netbsd32_process_write_regs() returns EINVAL
if non-modifiable bits are set in CPSR.

Instead, mask out non-modifiable bits and make this function succeed
regardless of the value in CPSR. The new behavior matches that of arm:

https://nxr.netbsd.org/xref/src/sys/arch/arm/arm/process_machdep.c#187

This fixes lib/libc/sys/t_ptrace_wait*:access_regs6 tests, in which
register contents retrieved by PT_GETREGS are set back by PT_SETREGS.

No new regression is observed in full ATF run.

OK ryo
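
A minimal sketch of the "mask instead of reject" approach, under stated assumptions: the mask `USER_MODIFIABLE_PSR` and the helper `merge_user_psr()` are hypothetical names, and the mask value (the N, Z, C, V condition flags only) is chosen for illustration; the set of bits the kernel actually treats as modifiable may differ.

```c
#include <stdint.h>

/*
 * Hypothetical mask: AArch32 condition flags N, Z, C, V (bits 31..28).
 * The real set of user-modifiable CPSR bits may be larger.
 */
#define USER_MODIFIABLE_PSR	0xf0000000u

/*
 * Keep the current value of every protected bit and take only the
 * modifiable bits from the requested value, so the call cannot fail
 * merely because the caller wrote back what PT_GETREGS returned.
 */
static uint32_t
merge_user_psr(uint32_t cur, uint32_t requested)
{
	return (cur & ~USER_MODIFIABLE_PSR) |
	    (requested & USER_MODIFIABLE_PSR);
}
```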
 1.14 02-Jul-2020  rin Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.

OK ryo@
 1.13 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
when context switch.
userland programs are able to perform pointer authentication on ARMv8.3+PAC cpu.

reviewed by maxv@, thanks.
 1.12 23-Apr-2020  skrll Typo in comment
 1.11 23-Apr-2020  tnn fix inverted logic in NETBSD32 user signal stack handling (PR evbarm/55200)
 1.10 31-Jan-2020  maxv branches: 1.10.4;
D means E here (aarch32), so don't check it. A-I-F are checked below
already, so drop the whole line.
 1.9 24-Nov-2019  rin branches: 1.9.2;
PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.8 20-Nov-2019  pgoyette Move all non-emulation-specific coredump code into the coredump module,
and remove all #ifdef COREDUMP conditional compilation. Now, the
coredump module is completely separated from the emulation modules, and
they can all be independently loaded and unloaded.

Welcome to 9.99.18 !
 1.7 12-Jul-2019  skrll branches: 1.7.2;
Fix STACK_ALIGN "bytes" argument which actually should be a mask.

Spotted by Mark Millard (marklmi at yahoo.com)
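
To illustrate why the distinction matters, here is a small sketch assuming a mask-style align-down of the form `sp & ~mask`. The helper `align_down()` is hypothetical and stands in for the macro; it is not the kernel's definition.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the bits given in "mask".
 * For 16-byte alignment the caller must pass 15 (a mask), not 16 (a size).
 */
static uintptr_t
align_down(uintptr_t sp, uintptr_t mask)
{
	return sp & ~mask;
}
```

With a mask of 15, `align_down(0x1007, 15)` clears the low four bits and yields `0x1000`. Passing the byte count 16 instead clears only bit 4, so `align_down(0x1017, 16)` yields `0x1007`, which is still misaligned.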
 1.6 12-Apr-2019  ryo COMPAT_NETBSD32 to work on also thumbmode
 1.5 27-Jan-2019  alnsn Local variable p is __diagused.
 1.4 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.3 27-Nov-2018  maxv Fix widespread leak in the sendsig_siginfo() functions. sigframe_siginfo
has padding, so zero it out properly. While here I'm also zeroing out some
other things in several ports, for safety. Same problem in netbsd32, so
fix that too.

I can't compile-test on each architecture, but there should be no
breakage (tm).

Overall this fixes at least 14 info leaks. Prompted by the discovery by
KLEAK of a leak in amd64's sendsig_siginfo.
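
The class of leak fixed here can be sketched in plain C. The struct `frame` and the function `fill_frame()` below are hypothetical stand-ins for a signal frame and its setup code; the point is only that the whole object, padding included, is zeroed before individual fields are set and the object is copied out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical frame: on common ABIs the compiler inserts padding
 * between "tag" and "value" to align the long.
 */
struct frame {
	char tag;
	long value;
};

static void
fill_frame(struct frame *f, char tag, long value)
{
	/* Zero everything first so padding never carries stale stack data. */
	memset(f, 0, sizeof(*f));
	f->tag = tag;
	f->value = value;
}
```

Without the `memset()`, assigning the two fields leaves the padding bytes holding whatever was previously on the stack, and a later copy of the whole struct to userland would expose them.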
 1.2 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
ARM ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.1.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.5 20-Oct-2018  pgoyette Sync with head
 1.1.2.4 02-Oct-2018  pgoyette Use a hook callback to allow sparc fpu code to determine if a process
is running under sunos emulation (in which case, fpu cleanup uses a
different set of fpu_codes[]).
 1.1.2.3 01-Oct-2018  pgoyette Implement dummy netbsd32_compat_{13,16} routines. aarch64 doesn't have
compat going that far back, but the build infrastructure expects to see
these sources and *_{init,fini} symbols.
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file netbsd32_machdep.c was added on branch pgoyette-compat on 2018-04-07 04:12:10 +0000
 1.7.2.3 01-Jun-2021  martin Pull up following revision(s) (requested by rin in ticket #1277):

sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.18

Fix conversion between aarch64 and aarch32 fpreg's; in aarch32 mode,
d0-d31 are packed into v0-v15 (== q0-q15).

This fixes crashes in VFP-optimized codes running on COMPAT_NETBSD32.

OK ryo
 1.7.2.2 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1172):

sys/arch/aarch64/aarch64/trap.c: revision 1.30
sys/arch/aarch64/include/ptrace.h: revision 1.10
sys/arch/aarch64/include/netbsd32_machdep.h: revision 1.4 (patch)
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.14
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.15

Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.
OK ryo@

For rev 1.14 and before, netbsd32_process_write_regs() returns EINVAL
if non-modifiable bits are set in CPSR.
Instead, mask out non-modifiable bits and make this function succeed
regardless of the value in CPSR. The new behavior matches that of arm:
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm/process_machdep.c#187

This fixes lib/libc/sys/t_ptrace_wait*:access_regs6 tests, in which
register contents retrieved by PT_GETREGS are set back by PT_SETREGS.

No new regression is observed in full ATF run.

OK ryo
 1.7.2.1 02-May-2020  martin Pull up following revision(s) (requested by tnn in ticket #883):

sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.11

fix inverted logic in NETBSD32 user signal stack handling (PR evbarm/55200)
 1.9.2.1 29-Feb-2020  ad Sync with head.
 1.10.4.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.16.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.17.6.1 31-May-2021  cjep sync with head
 1.17.4.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.4 17-Jun-2024  pgoyette Include required headers
 1.3 14-Nov-2021  skrll branches: 1.3.4;
Trailing whitespace
 1.2 27-Jan-2019  pgoyette branches: 1.2.4;
Merge the [pgoyette-compat] branch
 1.1 01-Oct-2018  pgoyette branches: 1.1.2;
file netbsd32_machdep_13.c was initially added on branch pgoyette-compat.
 1.1.2.1 01-Oct-2018  pgoyette Implement dummy netbsd32_compat_{13,16} routines. aarch64 doesn't have
compat going that far back, but the build infrastructure expects to see
these sources and *_{init,fini} symbols.
 1.2.4.2 10-Jun-2019  christos Sync with HEAD
 1.2.4.1 27-Jan-2019  christos file netbsd32_machdep_13.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.3.4.1 22-Jun-2024  martin Pull up following revision(s) (requested by pgoyette in ticket #724):

sys/modules/compat_netbsd32_16/Makefile: revision 1.5
sys/arch/powerpc/powerpc/compat_16_machdep.c: revision 1.25
sys/arch/powerpc/powerpc/compat_16_machdep.c: revision 1.26
sys/modules/compat_16/Makefile: revision 1.3
sys/modules/compat_netbsd32_13/Makefile: revision 1.5
sys/modules/compat_16/Makefile: revision 1.4
sys/arch/sun2/sun2/genassym.cf: revision 1.17
sys/arch/sun2/sun2/enable.h: revision 1.5
sys/modules/compat_13/Makefile: revision 1.3
sys/modules/compat_13/Makefile: revision 1.4
sys/modules/compat_13/Makefile: revision 1.5
sys/arch/mips/mips/netbsd32_machdep_16.c: revision 1.8
sys/modules/Makefile.compat: revision 1.1
sys/arch/mips/mips/netbsd32_machdep_13.c: revision 1.4
share/mk/bsd.kmodule.mk: revision 1.86
sys/arch/aarch64/aarch64/netbsd32_machdep_16.c: revision 1.4
sys/arch/powerpc/powerpc/compat_13_machdep.c: revision 1.23
sys/arch/aarch64/aarch64/netbsd32_machdep_13.c: revision 1.4

Import AFLAGS to allow processing of assembler files in modules.
Prerequisite for kern/58346.

Introduce sys/modules/Makefile.compat and hook some compat_1[36]
machdep code into the modules. kern/58346

Ooops missed a source file!

Protect #include of kernel options files with #ifdef _KERNEL_OPT

XXX Add to existing 10.0 and 9.0 tickets for kern/58346

Include required headers

Add required include for compat_16 machdep code

fix the m68k compat_13 build.

include Makefile.assym to generate assym.h.
use -I. and -x assembler-with-cpp to actually use cpp and find assym.h.
also apply m68k assym.h fix here as well as compat_13.

powerpc64: Provide dummy stubs for compat1[36]
as done for amd64. We haven't had working userland for powerpc64,
and therefore compatible to 1.[36] is only useful for netbsd32.

Fix build failure for evbppc64 for PR kern/58346 (my bug!).
sun2/genassym.cf: Skip KERNBASE for _MODULE
as it is not a compile-time constant; see sun2/vmparam.h.

It should not be, and is not actually, used for modules.

PR kern/58346

sun2/enable.h: Fix -Wold-style-definition for WARNS=5 build as modules
Finally fix sun2 build for PR kern/58346
 1.4 17-Jun-2024  pgoyette Include required headers
 1.3 14-Nov-2021  skrll branches: 1.3.4;
Trailing whitespace
 1.2 27-Jan-2019  pgoyette branches: 1.2.4;
Merge the [pgoyette-compat] branch
 1.1 01-Oct-2018  pgoyette branches: 1.1.2;
file netbsd32_machdep_16.c was initially added on branch pgoyette-compat.
 1.1.2.1 01-Oct-2018  pgoyette Implement dummy netbsd32_compat_{13,16} routines. aarch64 doesn't have
compat going that far back, but the build infrastructure expects to see
these sources and *_{init,fini} symbols.
 1.2.4.2 10-Jun-2019  christos Sync with HEAD
 1.2.4.1 27-Jan-2019  christos file netbsd32_machdep_16.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.3.4.1 22-Jun-2024  martin Pull up following revision(s) (requested by pgoyette in ticket #724):

sys/modules/compat_netbsd32_16/Makefile: revision 1.5
sys/arch/powerpc/powerpc/compat_16_machdep.c: revision 1.25
sys/arch/powerpc/powerpc/compat_16_machdep.c: revision 1.26
sys/modules/compat_16/Makefile: revision 1.3
sys/modules/compat_netbsd32_13/Makefile: revision 1.5
sys/modules/compat_16/Makefile: revision 1.4
sys/arch/sun2/sun2/genassym.cf: revision 1.17
sys/arch/sun2/sun2/enable.h: revision 1.5
sys/modules/compat_13/Makefile: revision 1.3
sys/modules/compat_13/Makefile: revision 1.4
sys/modules/compat_13/Makefile: revision 1.5
sys/arch/mips/mips/netbsd32_machdep_16.c: revision 1.8
sys/modules/Makefile.compat: revision 1.1
sys/arch/mips/mips/netbsd32_machdep_13.c: revision 1.4
share/mk/bsd.kmodule.mk: revision 1.86
sys/arch/aarch64/aarch64/netbsd32_machdep_16.c: revision 1.4
sys/arch/powerpc/powerpc/compat_13_machdep.c: revision 1.23
sys/arch/aarch64/aarch64/netbsd32_machdep_13.c: revision 1.4

Import AFLAGS to allow processing of assembler files in modules.
Prerequisite for kern/58346.

Introduce sys/modules/Makefile.compat and hook some compat_1[36]
machdep code into the modules. kern/58346

Ooops missed a source file!

Protect #include of kernel options files with #ifdef _KERNEL_OPT

XXX Add to existing 10.0 and 9.0 tickets for kern/58346

Include required headers

Add required include for compat_16 machdep code

fix the m68k compat_13 build.

include Makefile.assym to generate assym.h.
use -I. and -x assembler-with-cpp to actually use cpp and find assym.h.
also apply m68k assym.h fix here as well as compat_13.

powerpc64: Provide dummy stubs for compat1[36]
as done for amd64. We haven't had working userland for powerpc64,
and therefore compatible to 1.[36] is only useful for netbsd32.

Fix build failure for evbppc64 for PR kern/58346 (my bug!).
sun2/genassym.cf: Skip KERNBASE for _MODULE
as it is not a compile-time constant; see sun2/vmparam.h.

It should not be, and is not actually, used for modules.

PR kern/58346

sun2/enable.h: Fix -Wold-style-definition for WARNS=5 build as modules
Finally fix sun2 build for PR kern/58346
 1.2 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.1 12-Oct-2018  ryo branches: 1.1.2; 1.1.6;
add initial support of COMPAT_NETBSD32 on AArch64.
ARM ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.1.6.2 10-Jun-2019  christos Sync with HEAD
 1.1.6.1 12-Oct-2018  christos file netbsd32_syscall.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.1.2.2 20-Oct-2018  pgoyette Sync with head
 1.1.2.1 12-Oct-2018  pgoyette file netbsd32_syscall.c was added on branch pgoyette-compat on 2018-10-20 06:58:23 +0000
 1.151 16-Feb-2024  andvar Replace obsolete pv_dump() call with pmap_db_mdpg_print().

It was rewritten on rev 1.107, but not replaced with new implementation in
PMAP_PV_DEBUG guarded block.
 1.150 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.149 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.148 16-Apr-2023  skrll Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's less letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.147 30-Oct-2022  riastradh aarch64/pmap: Fix criterion in previous.

Use the pte bit that says whether this is a PMAP_WIRED page, not the
bit that says whether this is a non-global page.

(Forgot to git commit --amend before exporting to CVS, sorry!)
 1.146 30-Oct-2022  riastradh aarch64/pmap(9): Teach pmap_protect about pmap_kenter_pa mappings.

Pages mapped with pmap_kenter_pa are necessarily unmanaged, so there
are no P->V records, and pmap_kenter_pa leaves pp->pp_pv.pv_va zero
with no modified/referenced state.

However, pmap_protect erroneously examined pp->pp_pv.pv_va to
ascertain the modified/referenced state -- and if the page was not
marked referenced, pmap_protect would clear the LX_BLKPAG_AF bit
(Access Flag), with the effect that subsequent uses of the page fault
and require a detour through pmap_fault_fixup.

This caused problems for the kernel module loader:

- When loading the text section, kobj_load first allocates kva with
uvm_km_alloc(UVM_KMF_WIRED|UVM_KMF_EXEC), which creates ptes with
pmap_kenter_pa. These ptes are writable, so we can copy the text
section into them, and have LX_BLKPAG_AF set so there will be no
fault when they are used by the kernel.

- But then kobj_affix makes the text section read/execute-only (and
nonwritable) with uvm_km_protect(VM_PROT_READ|VM_PROT_EXECUTE),
which updates the ptes with pmap_protect. This _should_ leave
LX_BLKPAG_AF set, but by inadvertently treating the page as managed
when it should be unmanaged, pmap_protect cleared it instead.

- Most of the time, clearing LX_BLKPAG_AF caused no problem, because
pmap_fault_fixup would silently resolve it. But if a hard
interrupt handler tried to use any page in the module's text (or
rodata, I suspect) that was not yet fixed up, the CPU would fault
and enter pmap_fault_fixup -- which would promptly crash (or hang)
by trying to take the pmap lock in interrupt context, which is
forbidden.

I observed this by loading dtrace.kmod early at boot and trying to
dtrace hard interrupt handlers.

With this change, pmap_protect now recognizes wired mappings (as
created by pmap_kenter_pa) before consulting pp->pp_pv.pv_va, and
preserves the LX_BLKPAG_AF bit in that case.

ok skrll
 1.145 29-Oct-2022  skrll fix a spello in a comment
 1.144 28-Oct-2022  skrll Remove some empty lines
 1.143 23-Oct-2022  skrll Use UVMHIST_CALLARGS in pmap_bootstrap
 1.142 23-Oct-2022  skrll Only define the EFI variable if EFI_RUNTIME
 1.141 20-Oct-2022  skrll KNF
 1.140 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.139 19-Aug-2022  ryo Fixed a bug that pte's __BIT(63,48) could be set when accessing addresses above 0x0001000000000000 in /dev/mem with mmap().
 1.138 19-Aug-2022  ryo When accessed in mmap by the device pager, pmap_enter() may be called with prot == PROT_WRITE.
 1.137 03-May-2022  skrll Sprinkle some KASSERT(kpreempt_disabled());
 1.136 27-Apr-2022  ryo since pmap_activate_efirt() rewrites TTBR0, it is necessary to pmap_activate() again after pmap_deactivate_efirt() to restore the original TTBR0.

- Fix to do pmap_{de,}activate() before/after pmap_{,de}activate_efirt().
- moved kpreempt_{disable,enable}() to the caller since everything between
arm_efirt_md_enter() and arm_efirt_md_exit() should be kpreempt disabled.

ok skrll@
 1.135 17-Apr-2022  skrll Add the missing kpreempt_enable to pmap_deactivate_efirt
 1.134 10-Apr-2022  skrll No need to flush icache for EFI RT mappings as bootaa64.efi flushed
the full icache for us. (Also this avoids traps)
 1.133 09-Apr-2022  riastradh sys: Use membar_release/acquire around reference drop.

This just goes through my recent reference count membar audit and
changes membar_exit to membar_release and membar_enter to
membar_acquire -- this should make everything cheaper on most CPUs
without hurting correctness, because membar_acquire is generally
cheaper than membar_enter.
 1.132 02-Apr-2022  skrll Update to support EFI runtime outside the kernel virtual address space
by creating an EFI RT pmap that can be activated / deactivated when
required.

Adds support for EFI RT to ARM_MMU_EXTENDED (ASID) 32-bit Arm machines.

On Arm64 the usage of pmapboot_enter is reduced and the mappings are
created much later in the boot process -- now in cpu_startup_hook.
Backward compatibility for KVA mapped RT from old bootaa64.efi is
maintained.

Adding support to other platforms should be easier as a result.
 1.131 19-Mar-2022  skrll Slight code re-organisation. NFCI.
 1.130 12-Mar-2022  riastradh sys: Membar audit around reference count releases.

If two threads are using an object that is freed when the reference
count goes to zero, we need to ensure that all memory operations
related to the object happen before freeing the object.

Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one
thread takes responsibility for freeing, but it's not enough to
ensure that the other thread's memory operations happen before the
freeing.

Consider:

    Thread A                        Thread B
    obj->foo = 42;                  obj->baz = 73;
    mumble(&obj->bar);              grumble(&obj->quux);
    /* membar_exit(); */            /* membar_exit(); */
    atomic_dec -- not last          atomic_dec -- last
                                    /* membar_enter(); */
                                    KASSERT(invariant(obj->foo,
                                        obj->bar));
                                    free_stuff(obj);

The memory barriers ensure that

obj->foo = 42;
mumble(&obj->bar);

in thread A happens before

KASSERT(invariant(obj->foo, obj->bar));
free_stuff(obj);

in thread B. Without them, this ordering is not guaranteed.

So in general it is necessary to do

    membar_exit();
    if (atomic_dec_uint_nv(&obj->refcnt) != 0)
        return;
    membar_enter();

to release a reference, for the `last one out hit the lights' style
of reference counting. (This is in contrast to the style where one
thread blocks new references and then waits under a lock for existing
ones to drain with a condvar -- no membar needed thanks to mutex(9).)

I searched for atomic_dec to find all these. Obviously we ought to
have a better abstraction for this because there's so much copypasta.
This is a stop-gap measure to fix actual bugs until we have that. It
would be nice if an abstraction could gracefully handle the different
styles of reference counting in use -- some years ago I drafted an
API for this, but making it cover everything got a little out of hand
(particularly with struct vnode::v_usecount) and I ended up setting
it aside to work on psref/localcount instead for better scalability.

I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I
only put it on things that look performance-critical on 5sec review.
We should really adopt membar_enter_preatomic/membar_exit_postatomic
or something (except they are applicable only to atomic r/m/w, not to
atomic_load/store_*, making the naming annoying) and get rid of all
the ifdefs.
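
The release pattern described above can be sketched in portable C11. This is only an illustration under stated assumptions, not the kernel code: it uses <stdatomic.h> fences in place of membar_exit()/membar_enter(), and struct obj, its fields and obj_release() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; foo follows the example in the log message. */
struct obj {
	atomic_uint refcnt;
	int foo;
	bool freed;		/* stands in for free_stuff(obj) */
};

/*
 * Drop one reference.  Returns true if the caller dropped the last
 * reference and therefore performed the teardown.
 */
static bool
obj_release(struct obj *o)
{
	/* Release: order our earlier stores before the decrement. */
	atomic_thread_fence(memory_order_release);

	/* fetch_sub returns the old value; 1 means we were the last. */
	if (atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_relaxed) != 1)
		return false;

	/* Acquire: observe every other holder's stores before teardown. */
	atomic_thread_fence(memory_order_acquire);
	o->freed = true;
	return true;
}
```

With two references held, the first obj_release() call returns false and leaves the object alone; the second returns true and runs the teardown.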
 1.129 05-Mar-2022  skrll Slight comment improvement.
 1.128 16-Feb-2022  andvar fix various typos, mainly in comments.
 1.127 31-Jan-2022  ryo add support for hardware updates to the Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
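
A rough sketch of how the bits described above can be read back from a descriptor. The bit positions are the architectural ones for a stage 1 page/block descriptor, but the SKETCH_* macro names and the helper functions are hypothetical and are not the definitions used in pmap.c. The dirty test is a simplification based on the architecture's rule that hardware clears AP[2] on the first write to a DBM page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for architectural descriptor bits. */
#define SKETCH_PTE_VALID	(UINT64_C(1) << 0)
#define SKETCH_PTE_AP_RO	(UINT64_C(1) << 7)	/* AP[2]: read-only */
#define SKETCH_PTE_AF		(UINT64_C(1) << 10)	/* Access flag */
#define SKETCH_PTE_DBM		(UINT64_C(1) << 51)	/* Dirty bit modifier */

/* AF acts purely as the "referenced" bit. */
static bool
pte_referenced(uint64_t pte)
{
	return (pte & SKETCH_PTE_VALID) != 0 && (pte & SKETCH_PTE_AF) != 0;
}

/* DBM marks a mapping that is allowed to become writable. */
static bool
pte_writable(uint64_t pte)
{
	return (pte & SKETCH_PTE_VALID) != 0 && (pte & SKETCH_PTE_DBM) != 0;
}

/* Simplified: a DBM page is dirty once hardware has cleared AP[2]. */
static bool
pte_dirty(uint64_t pte)
{
	return pte_writable(pte) && (pte & SKETCH_PTE_AP_RO) == 0;
}
```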
 1.126 31-Jan-2022  ryo Skip unnecessary PTE operations and TLB invalidation.
 1.125 16-Jan-2022  rillig aarch64/pmap: remove stray semicolon

No binary change.
 1.124 15-Jan-2022  skrll The translations that need invalidation are always last level so remove
the (dubious) logic around tracking the level. The "any level" TLB
invalidation maintenance operations are used, but this may change after
further testing.

before
======
1661.0u 420.2s 2:53.82 1197.3% 231+29k 10+33918io 102pf+0w
1646.8u 425.2s 2:52.96 1198.0% 232+29k 1+33937io 49pf+0w
1647.9u 425.7s 2:52.58 1201.6% 232+29k 0+33940io 32pf+0w

After
=====
1602.5u 420.8s 2:49.09 1196.6% 238+30k 24+33893io 54pf+0w
1600.7u 421.3s 2:51.53 1178.8% 238+30k 1+33914io 33pf+0w
1597.5u 424.3s 2:50.46 1186.1% 238+30k 0+33915io 17pf+0w

LGTM from ryo@
 1.123 14-Jan-2022  skrll Restore the previous pmap_remove_all behaviour as the new method meant
the n1sdp couldn't complete a build.

No noticeable change in kernel build performance.
 1.122 04-Jan-2022  skrll KNF
 1.121 10-Dec-2021  andvar s/occured/occurred/ in comments, log messages and man pages.
 1.120 07-Dec-2021  andvar fix various typos, mainly in comments.
 1.119 23-Oct-2021  skrll Fix non-UVMHIST build
 1.118 16-Oct-2021  ryo fix non-MULTIPROCESSOR build
 1.117 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
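
For readers without SciPy at hand, the statistic above can be reproduced by hand. The sketch below assumes the pooled-variance (Student) form, which is what stats.ttest_ind computes by default, and takes the wall-clock times converted to seconds exactly as in the call above. The function names are hypothetical, and the small Newton-iteration square root is only there to keep the example free of libm.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double
mean(const double *x, size_t n)
{
	double s = 0.0;
	for (size_t i = 0; i < n; i++)
		s += x[i];
	return s / (double)n;
}

/* Sum of squared deviations from the mean m. */
static double
sum_sq_dev(const double *x, size_t n, double m)
{
	double s = 0.0;
	for (size_t i = 0; i < n; i++)
		s += (x[i] - m) * (x[i] - m);
	return s;
}

/* Newton iteration for sqrt(v), v >= 0. */
static double
sketch_sqrt(double v)
{
	if (v <= 0.0)
		return 0.0;
	double r = v;
	for (int i = 0; i < 60; i++)
		r = 0.5 * (r + v / r);
	return r;
}

/* Two-sample t statistic with pooled variance. */
static double
pooled_t(const double *a, size_t na, const double *b, size_t nb)
{
	double ma = mean(a, na), mb = mean(b, nb);
	double sp2 = (sum_sq_dev(a, na, ma) + sum_sq_dev(b, nb, mb)) /
	    (double)(na + nb - 2);
	double se = sketch_sqrt(sp2 * (1.0 / (double)na + 1.0 / (double)nb));
	return (ma - mb) / se;
}
```

Feeding it the "before" samples {160.65, 158.49, 157.27} and the "after" samples {157.5, 158.54, 155.88} gives a statistic of about 1.192, matching the value reported above. The p-value needs the t distribution's CDF and is not computed here.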
 1.116 30-Sep-2021  skrll Ensure TCR_EPD0 is set on entry to pmap_activate and ensure it is set as
early as possible for APs.
 1.115 26-Sep-2021  skrll Use UVMHIST_CALLARGS
 1.114 26-Sep-2021  skrll '\n' is not required in KASSERTMSG either.
 1.113 26-Sep-2021  skrll "\n" is not required in KERNHIST
 1.112 15-Sep-2021  skrll Use __SHIFTIN. Same code before and after.
 1.111 12-Sep-2021  skrll pmap_page_remove: simplify and reduce the code size slightly.
 1.110 09-Sep-2021  skrll In pmap_icache_sync_range change

    for (...) {
        ...
        if (condition) {
            // do stuff
        }
    }

to

    for (...) {
        ...
        if (!condition)
            continue;
        // do stuff
    }

to save on indentation. Same code (modulo register usage) before and
after.
 1.109 09-Sep-2021  skrll KNF
 1.108 29-May-2021  skrll Deal with the pmap limitation of maxproc in a more complete way and
recognise CPUs with only 8bit ASIDs.
 1.107 30-Apr-2021  skrll branches: 1.107.2;
Make the ddb for pmap / pte information pmap agnostic
 1.106 29-Apr-2021  skrll Remove some unnecessary tlb invalidate in pmap_growkernel and ASAN shadow
map. Ensure the shadow map mappings are visible to the TLB walkers.
 1.105 21-Apr-2021  ryo branches: 1.105.2;
added more attributes of PTE displayed by "ddb>machine pte"
 1.104 17-Apr-2021  mrg remove KERNHIST_INIT_STATIC(). it straddles the line between usable
early in boot and broken early in boot by requiring a partly static
structure with another structure that must be present by the time
any uses are performed. theoretically platform code could allocate
a chunk while setting up memory and assign it here, giving a dynamic
sizing for the entry list, but the reality is that all users have
a statically allocated entry list as well.

the existing KERNHIST_LINK_STATIC() is used in conjunction with
KERNHIST_INITIALIZER() instead.

this stops a NULL pointer deref when the _LOG() macro is called
before the storage is linked in, which happens with GCC 10 on OCTEON
with UVMHIST enabled, crashing in very early kernel init.
 1.103 09-Mar-2021  ryo branches: 1.103.2;
fix build error without options DDB.

kvtopte() is referenced from arm/acpi/acpi_machdep.c
 1.102 13-Feb-2021  ryo No assignment is needed here.

the loop in pmap_page_remove() always removes the first pv,
and since the list is managed by _pmap_remove_pv(), pp->pp_pv.pv_next always points to the first.
 1.101 01-Feb-2021  ryo It is enough to make a page accessible instead of writable.
same fix as r1.76
 1.100 31-Jan-2021  ryo implement pmap_remove_all().

The size of struct pv_entry has increased, but speed of kernel build has improved by about 1%
exec and exit should have been improved.
 1.99 20-Dec-2020  skrll Improve the English in the previous comment fix.
 1.98 19-Dec-2020  skrll Tweak a comment
 1.97 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.96 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.95 07-Nov-2020  skrll In pmap_devmap_bootstrap only set pmap_devmap_bootstrap_done if there
is an entry and ALL of the entries have been done. The entry required
for EARLYCONS might not be the first/only one...
 1.94 01-Nov-2020  jmcneill branches: 1.94.2;
No need to disable translation table walks in pmap_activate().
 1.93 22-Oct-2020  skrll Use the isb macro - missed in previous commit
 1.92 22-Oct-2020  skrll Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.91 28-Sep-2020  skrll Only set pmap_devmap_bootstrap_done if something gets mapped.

Think acpi_platform_devmap
 1.90 19-Sep-2020  skrll Make __md_palloc pmap agnostic (think sys/uvm/pmap)
 1.89 14-Sep-2020  ryo PID_MAX is just an initial value (soft maximum). Don't use it for CTASSERT.
defined __HAVE_CPU_MAXPROC to use function cpu_maxproc().

pointed out by mrg@, thanks.
 1.88 06-Sep-2020  ryo Fix panic caused by modload. http://mail-index.netbsd.org/port-arm/2020/08/30/msg006960.html

The address space reserved for modules may not be mapped in L1-L3.
 1.87 14-Aug-2020  skrll Whitespace
 1.86 12-Aug-2020  skrll Part IV of ad's performance improvements for aarch64

- Implement pmap_growkernel(), and update kernel pmap's stats with atomics.

- Then, pmap_kenter_pa() and pmap_kremove() no longer need to allocate
memory nor take pm_lock, because they only modify L3 PTEs.

- Then, pm_lock and pp_lock can be adaptive mutexes at IPL_NONE which are
cheaper than spin mutexes.

- Take the pmap's lock in pmap_extract() if not the kernel's pmap, otherwise
pmap_extract() might see inconsistent state.
 1.85 09-Aug-2020  skrll Fix another UVMHIST so it doesn't use %s
 1.84 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.83 04-Jul-2020  rin Use tlen for temporary length variable instead of l, which is usually
used for struct lwp *.

No binary changes.
 1.82 02-Jul-2020  rin pmap_procwr(): sync icache even if p != curproc. This fixes applications
like GDB for arm32, that rewrite text of other process.

Thanks to ryo@ for discussion.
 1.81 02-Jul-2020  rin Set uvmexp.ncolors appropriately, which is required for some CPU
models with VIPT icache.

Otherwise, alias in virtual address results in inconsistent results,
at least for applications that rewrite text of other process, e.g.,
GDB for arm32.

Also, this hopefully fixes other unexpected failures due to alias.

Confirmed that there's no observable regression in performance;
difference in ``time make -j8'' for GENERIC64 kernel on BCM2837
with and without setting uvmexp.ncolors is within 0.1%.

Thanks to ryo@ for discussion.
 1.80 27-Jun-2020  rin Fix typo in name of evcnt(4) counter.
 1.79 24-Jun-2020  ryo Fix bug with incorrect range calculation when doing icache sync.
This is called by sysarch(ARM_SYNC_ICACHE) from aarch32 (compat_netbsd32) emul process.

pointed out by rin@, thanks.

XXX pullup-9
 1.78 14-Jun-2020  ad - Fix a lock order reversal in pmap_page_protect().

- Make sure pmap is always locked when updating stats; atomics no longer
needed to do that.

- Remove unneeded traversal of pv list in pmap_enter_pv().

- Shrink struct vm_page from 136 to 128 bytes (cache line sized) and struct
pv_entry from 48 to 32 bytes (power of 2 sized).

- Embed a pv_entry in each vm_page. This means PV entries don't need to
be allocated for private anonymous memory / COW pages / most UBC mappings.
Dynamic PV entries are then used only for stuff like shared libraries and
shared memory.

Proposed on port-arm@.
 1.77 10-Jun-2020  ad - Wired/resident stats shouldn't be covered by PMAPCOUNTERS.
- Rename need_update_pv -> need_enter_pv.

Ok ryo@
 1.76 01-Jun-2020  ryo no need to make the PTE writable to do icache_sync; making it accessible is enough.
 1.75 15-May-2020  skrll Use __diagused
 1.74 15-May-2020  tnn fix non-diag build
 1.73 14-May-2020  skrll Use MUTEX_NODEBUG for PV locks as is commonly done. OK ryo.
 1.72 13-May-2020  jmcneill Implement pmap_extract_coherency
 1.71 18-Apr-2020  skrll PMAP_DEBUG has been deleted on arm
 1.70 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.69 08-Apr-2020  ryo branches: 1.69.2;
use PMAP_PAGE_INIT() to initialize mutex in pmap_page.

VM_MDPAGE_INIT() in pmap_free_pdp() had initialized pp_flags,
so it unintentionally cleared PMAP_PAGE_FLAGS_PV_TRACKED.
use PMAP_PAGE_INIT to avoid using PMAP_PAGE_FLAGS_PV_TRACKED.

pointed out by tnn@, thanks
 1.68 14-Mar-2020  ad pmap_remove_all(): Return a boolean value to indicate the behaviour. If
true, all mappings have been removed, the pmap is totally cleared out, and
UVM can then avoid doing the work to call pmap_remove() for each map entry.
If false, either nothing has been done, or some helpful arch-specific voodoo
has taken place.
 1.67 02-Mar-2020  ryo oops, fix incorrect usage of daif_enable() in my previous commit.
 1.66 29-Feb-2020  ryo Fix pmap to work correctly with tagged addresses

- when fault, untag from address before passing to uvm/pmap functions
- pmap_extract() checks more strictly and consider the address tag
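
As a minimal sketch of what "untag" means here, assuming the usual AArch64 top-byte-ignore layout (tag in bits 63:56, bit 55 selecting the kernel or user half of the address space): the function name is hypothetical and this is not necessarily how the kernel implements it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Strip the tag byte from a virtual address by replicating bit 55
 * into bits 63:56, so kernel addresses get 0xff and user addresses
 * get 0x00 in the top byte.
 */
static uint64_t
sketch_untag_va(uint64_t va)
{
	const uint64_t tag_mask = UINT64_C(0xff) << 56;

	if (va & (UINT64_C(1) << 55))
		return va | tag_mask;	/* kernel half */
	return va & ~tag_mask;		/* user half */
}
```

For example, a user address tagged with 0x2a in the top byte comes back with that byte cleared, while an untagged address passes through unchanged.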
 1.65 29-Feb-2020  ryo use pmapboot_enter_range()
 1.64 10-Feb-2020  ryo use LIST(3) instead of TAILQ(3) to save one word in struct vm_page and struct pmap.

pointed out by riastradh@. thanks
 1.63 03-Feb-2020  ryo add support pmap_pv(9)

Patch originally from jmcneill@. thanks
 1.62 03-Feb-2020  ryo separate struct vm_page_md into vm_page_md and pmap_page
for preparation pmap_pv(9)
 1.61 09-Jan-2020  ryo fix behaviour mmap()/mprotect() when passed only PROT_EXEC.

when mmap()/mprotect() with only PROT_EXEC, syscall will be successful,
but the page actually hadn't been mapped.
it should be mapped with PROT_READ|PROT_EXEC implicitly. (r-x)
 1.60 30-Dec-2019  skrll branches: 1.60.2;
Update pmap_map_chunk to allow L[12] block mappings and L3 page mappings
 1.59 30-Dec-2019  skrll Remove unnecessary brackets and unwrap a conditional. Same code before
and after.
 1.58 28-Dec-2019  jmcneill Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.57 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.56 19-Dec-2019  skrll G/C kasan_shadow_map call in pmap_enter

pmap_growkernel calls kasan_shadow_map for KVA
 1.55 18-Dec-2019  ryo atomic_add_16() is not used in pmap.c anymore. no need decl here.
 1.54 18-Dec-2019  ryo space to tab
 1.53 14-Dec-2019  skrll Fix build... wire_count probably doesn't need atomics
 1.52 13-Dec-2019  skrll Fix KASAN support by calling kasan_shadow_map in pmap_growkernel
 1.51 10-Dec-2019  ad pg->phys_addr -> VM_PAGE_TO_PHYS(pg)
 1.50 14-Nov-2019  maxv Mark several kASan functions with __nothing, to avoid annoying #ifdefs.
Same as kCSan and kMSan.
 1.49 14-Nov-2019  maxv Don't include "opt_kasan.h" when there's already <sys/asan.h> included.
 1.48 29-Oct-2019  maya Define PMAP_NEED_PROCWR, providing strategically placed i-cache
synchronization where just-changed memory is about to be executed.

Fixes SIGILLs seen when running Mono 6 on QEMU Cortex-A57.

ok ryo
 1.47 22-Sep-2019  jmcneill Disable translation table walks using TTBR0 while changing its value and
when deactivating a pmap. Fixes stability issues on Ampere eMAG CPUs.
 1.46 20-Sep-2019  ryo ref/mod bit should be set according to 'flags' argument, not 'prot'. r1.44 was incomplete.
 1.45 13-Sep-2019  ryo In pmap_devmap_bootstrap(), cpu_earlydevice_va_p() must not return true until *all* devmap tables have been enabled.
console mapping may be present in the last table.
 1.44 07-Sep-2019  ryo - remove incorrect KASSERT. mmap(2) with prot=PROT_WRITE calls pmap_enter(..., PROT_WRITE) internally.
- fix to update page reference flags when only PROT_WRITE or PROT_EXECUTE specified
 1.43 15-Aug-2019  skrll Make pmap_db_pte_print more terse so it's quicker on serial consoles
 1.42 12-Aug-2019  skrll Use PMAP_DEV in DEVMAP_ENTRY rather than pmap_map_chunk. It's clearer and
means pmap_map_chunk can be made to map other memory types.
 1.41 17-May-2019  mrg branches: 1.41.2;
apply some __diagused.
 1.40 08-Apr-2019  ryo - free empty page tables pages if reach a certain usage.
- need to lock at removing an old pg (_pmap_remove_pv) in _pmap_enter()
 1.39 06-Apr-2019  ryo Fix race conditions about pmap_page_protect() and pmap_enter().

while handling same PTE by these functions in same time, there
is a critical path that the number of valid PTEs and wire_count
are inconsistent, and it caused KASSERT.
Need to hold a pv_lock while modifying them.
 1.38 20-Mar-2019  ryo sprinkle __printflike(), and use PRIxxx
 1.37 19-Mar-2019  ryo - add ddb command "machine ttbr" to dump MMU tables.
- tidy up descriptions, usages and messages.
 1.36 19-Mar-2019  ryo - free L1-L3 pages that have been emptied by pmap_remove().
- if no memory is available, pmap_enter will correctly return ENOMEM if PMAP_CANFAIL, or wait until memory becomes available if !PMAP_CANFAIL.

These changes improve stability when using huge virtual memory spaces with mmap.
 1.35 06-Feb-2019  ryo improve pmap_remove
- don't lock/unlock per page in pmap_remove()
- speedup pte lookup for continuous addresses
- bring out pool_cache_put(&_pmap_pv_pool, pv) from lock/unlock section
 1.34 21-Dec-2018  ryo - add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)
 1.33 01-Nov-2018  maxv Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.32 31-Oct-2018  ryo invalidate icache correctly.
l3pte_executable() should be used for only valid pte.
 1.31 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.30 14-Oct-2018  skrll Use __nothing
 1.29 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.28 12-Oct-2018  ryo - cleanup checking address ranges with IN_RANGE macro
- change PM_ADDR_CHECK macro to KASSERTMSG
- restore fast lookup cases with IN_RANGE macro for pmap_extract changed in my previous commit.
 1.27 12-Oct-2018  ryo rewrite pmap_pte_lookup() to share similar code.
 1.26 04-Oct-2018  ryo cleanup locore, and changed the way to map memories during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt various layout
of physical memory map. especially for 64bit physical memory layout.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing page permission for kernel image, and making KSEG mapping are done
at cpu_kernel_vm_init() instead of at locore.
- optimize PTE entries with PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.25 04-Oct-2018  ryo * define LX_BLKPAG_{OS,ATTR}_* for OS dependent PTE attributes in pmap.h
* cleanup macros
 1.24 17-Sep-2018  ryo delete debug printf and KASSERT.
 1.23 10-Sep-2018  maxv Replace KDASSERT by panic.
 1.22 10-Sep-2018  maxv Rename _pmap_alloc_pdp -> pmap_alloc_pdp, and make it public.
 1.21 10-Sep-2018  ryo cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.
 1.20 27-Aug-2018  ryo need to add VM_PROT_READ when pmap_kenter_pa(va, pa, VM_PROT_WRITE, 0) or pmap_kenter_pa(va, pa, VM_PROT_EXECUTE, 0).
VM_PROT_READ is treated as an access permission internally.
 1.19 11-Aug-2018  ryo change to minimum invalidation of TLB.
specifying not only va but also asid, and not invalidate L0-L2 entry using tlbi_*_ll() if needed.
 1.18 10-Aug-2018  ryo treat kernel-exec attr and user-exec attr separately.
kernel cannot execute userland exec page, and user cannot execute kernel page.
 1.17 06-Aug-2018  ryo set kernel text/rodata readonly by default.
add function db_write_text() for setting ddb breakpoint.
 1.16 31-Jul-2018  skrll Define and use VPRINTF
 1.15 27-Jul-2018  ryo changes of pmap.c r1.13 seems to be unstable.
In order to invalidate icache, not to invalidate all icache,
but temporary to make the page writable and invalidate target address only.
 1.14 24-Jul-2018  ryo don't call pool_cache_put with locking pmap. pool_cache_put call pmap_kenter_pa internally.
(pool_cache_put_paddr -> pool_cache_put_slow -> pool_get -> pmap_kenter_pa)
 1.13 23-Jul-2018  ryo * fix icache invalidations.
* "ic ivau" (aarch64_icache_sync_range) with VA generates permission fault in some situations, therefore use KSEG address for now.
 1.12 23-Jul-2018  ryo rather than using flags to resolve nested locks, reserve pool_cache before locking.
 1.11 21-Jul-2018  ryo * avoid deadlock. mutex_owned() works only for adaptive lock, therefore we cannot use it for spinlock...
* add more NULL check
* clear pte when pmap_enter() fails
 1.10 17-Jul-2018  ryo Use __debugused
 1.9 17-Jul-2018  christos Add missing casts, remove unused variables.
 1.8 09-Jul-2018  ryo need locks in pmap_kremove() and pmap_fault_fixup()
 1.7 20-May-2018  ryo branches: 1.7.2;
pmap_enter() must update modified/referenced flags by 'flags' not 'prot'.
 1.6 16-May-2018  ryo Fix memory leak. it was leaking one page every pmap_create().
pm->pm_vmlist must be initialized before calling _pmap_alloc_pdp().
 1.5 29-Apr-2018  ryo fix KASSERT panic. pv_entry may not exist in pvlist when pmap_remove() is called.
 1.4 29-Apr-2018  ryo delete unused code
 1.3 27-Apr-2018  ryo fix instability behavior of bufcache on aarch64.
* fix to return correct ref/mod when PMAP_WIRED.
* changed to keep wired flags in pte instead of pv_entry, and cleanup.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.9 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.8 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.7 20-Oct-2018  pgoyette Sync with head
 1.1.28.6 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.5 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.4 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.3 21-May-2018  pgoyette Sync with HEAD
 1.1.28.2 02-May-2018  pgoyette Synch with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pmap.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.7.2.3 21-Apr-2020  martin Sync with HEAD
 1.7.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.7.2.1 10-Jun-2019  christos Sync with HEAD
 1.41.2.8 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1171):

sys/arch/aarch64/aarch64/pmap.c: revision 1.82
sys/arch/aarch64/aarch64/pmap.c: revision 1.83

pmap_procwr(): sync icache even if p != curproc. This fixes applications
like GDB for arm32, that rewrite text of other process.

Thanks to ryo@ for discussion.

Use tlen for temporary length variable instead of l, which is usually
used for struct lwp *.
No binary changes.
 1.41.2.7 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1170):

sys/arch/aarch64/aarch64/cpufunc.c: revision 1.22 (patch)
sys/arch/aarch64/aarch64/cpufunc.c: revision 1.23 (patch)
sys/arch/aarch64/aarch64/pmap.c: revision 1.81

Set uvmexp.ncolors appropriately, which is required for some CPU
models with VIPT icache.

Otherwise, alias in virtual address results in inconsistent results,
at least for applications that rewrite text of other process, e.g.,
GDB for arm32.

Also, this hopefully fixes other unexpected failures due to alias.
Confirmed that there's no observable regression in performance;
difference in ``time make -j8'' for GENERIC64 kernel on BCM2837
with and without setting uvmexp.ncolors is within 0.1%.

Thanks to ryo@ for discussion.


Fix uvmexp.ncolors for some big.LITTLE configuration; it is uncertain
which CPU is used as primary, and as a result, secondary CPUs can
require larger number of colors.

In order to solve this problem, update uvmexp.ncolors via
uvm_page_recolor(9) when secondary CPUs are attached, as done for
other ports like x86.

Pointed out by jmcneill@, and discussed on port-arm@:
http://mail-index.netbsd.org/port-arm/2020/07/03/msg006837.html
Tested and OK'd by ryo@.

Fix previous; add missing <uvm/uvm.h> include.
 1.41.2.6 30-Jun-2020  martin Pull up following revision(s) (requested by ryo in ticket #976):

sys/arch/aarch64/aarch64/pmap.c: revision 1.79

Fix bug with incorrect range calculation when doing icache sync.

This is called by sysarch(ARM_SYNC_ICACHE) from aarch32 (compat_netbsd32) emul process.
pointed out by rin@, thanks.

XXX pullup-9
 1.41.2.5 21-Jan-2020  martin Pull up following revision(s) (requested by ryo in ticket #618):

sys/arch/aarch64/aarch64/fault.c: revision 1.11
sys/arch/aarch64/aarch64/pmap.c: revision 1.61

fix behaviour mmap()/mprotect() when passed only PROT_EXEC.
when mmap()/mprotect() with only PROT_EXEC, syscall will be successful,
but the page actually hadn't been mapped.
it should be mapped with PROT_READ|PROT_EXEC implicitly. (r-x)
 1.41.2.4 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.41.2.3 04-Nov-2019  martin Pull up following revision(s) (requested by maya in ticket #393):

sys/arch/aarch64/include/pmap.h: revision 1.26
sys/arch/aarch64/aarch64/pmap.c: revision 1.48

Define PMAP_NEED_PROCWR, providing strategically placed i-cache
synchronization where just-changed memory is about to be executed.

Fixes SIGILLs seen when running Mono 6 on QEMU Cortex-A57.

ok ryo
 1.41.2.2 23-Sep-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #229):

sys/arch/aarch64/aarch64/pmap.c: revision 1.47

Disable translation table walks using TTBR0 while changing its value and
when deactivating a pmap. Fixes stability issues on Ampere eMAG CPUs.
 1.41.2.1 22-Sep-2019  martin Pull up following revision(s) (requested by ryo in ticket #214):

sys/arch/aarch64/aarch64/pmap.c: revision 1.44
sys/arch/aarch64/aarch64/pmap.c: revision 1.46

- remove an incorrect KASSERT. mmap(2) with prot=PROT_WRITE calls pmap_enter(..., PROT_WRITE) internally.
- fix to update the page reference flags when only PROT_WRITE or PROT_EXECUTE is specified.
The ref/mod bits should be set according to the 'flags' argument, not 'prot'. r1.44 was incomplete.
 1.60.2.2 29-Feb-2020  ad Sync with head.
 1.60.2.1 17-Jan-2020  ad Sync with head.
 1.69.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.94.2.3 03-Apr-2021  thorpej Sync with HEAD.
 1.94.2.2 03-Jan-2021  thorpej Sync w/ HEAD.
 1.94.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.103.2.1 17-Apr-2021  thorpej Sync with HEAD.
 1.105.2.2 17-Jun-2021  thorpej Sync w/ HEAD.
 1.105.2.1 13-May-2021  thorpej Sync with HEAD.
 1.107.2.1 31-May-2021  cjep sync with head
 1.6 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.5 16-Apr-2023  skrll Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's fewer letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.4 12-Apr-2023  skrll Use CACHE_LINE_SIZE instead of magic number 128.
 1.3 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.2 21-Dec-2022  skrll Rename pmap_md_pdetab_destroy to pmap_md_pdetab_fini to match
pmap_md_pdetab_init.

Call pmap_md_pdetab_fini from pmap_segtab_destroy.
 1.1 03-Nov-2022  skrll Provide MI PMAP support on AARCH64
 1.6 11-Feb-2021  ryo include "opt_gprof.h" so that _PROF_PROLOGUE works properly in ENTRY() macro in *.S files
 1.5 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.4 26-Sep-2018  ryo branches: 1.4.12;
avoid hardcoding. don't depend on AARCH64_KSEG_START being 0xffff000000000000.
 1.3 15-Aug-2018  ryo fix typo in comment
 1.2 27-Aug-2017  ryo branches: 1.2.2; 1.2.4;
DCZID_EL0:BS[0:3] is log2 of the block size in *words*, or 4. Not 16.
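
The unit conversion behind r1.2 can be sketched as follows. This is a minimal illustration, not the code from pmap_page.S: the helper name `dczid_bs_to_bytes` and the macro `DCZID_BS_MASK` are hypothetical, and the only fact assumed is the architectural definition that the BS field is log2 of the block size in 4-byte words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: mask for the 4-bit BS field of DCZID_EL0. */
#define DCZID_BS_MASK	0xfu

/*
 * Convert a DCZID_EL0.BS field value to the DC ZVA block size in bytes.
 * BS is log2 of the size in words, and a word is 4 bytes, so the byte
 * count is 4 << BS (not 16 << BS).
 */
static inline uint32_t
dczid_bs_to_bytes(unsigned int bs)
{
	assert(bs <= DCZID_BS_MASK);	/* the field is only 4 bits wide */
	return (uint32_t)4 << bs;
}
```

With BS = 4 this gives 64 bytes, whereas treating the unit as 16 bytes would give 256.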
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pmap_page.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.4.1 10-Jun-2019  christos Sync with HEAD
 1.2.2.2 30-Sep-2018  pgoyette Sync with HEAD
 1.2.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.4.12.2 03-Apr-2021  thorpej Sync with HEAD.
 1.4.12.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.20 14-Dec-2024  skrll KNF
 1.19 07-Feb-2024  msaitoh branches: 1.19.2;
Remove ryo@'s mail addresses.
 1.18 03-Aug-2022  ryo fix build with options PMAPBOOT_DEBUG and options DDB
 1.17 30-Apr-2021  skrll Make the ddb for pmap / pte information pmap agnostic
 1.16 20-Mar-2021  skrll branches: 1.16.2;
Make pmapboot_enter panic if anything goes wrong and any mappings overlap
rather than only doing it in locore.S
 1.15 09-Jan-2021  jmcneill branches: 1.15.2;
Fix a potential issue in pmapboot_enter_range and pmapboot_enter where
if the va and size are not page aligned, there is a possibility of the
last page not being taken into consideration.
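
The kind of rounding problem described in r1.15 can be illustrated with a small sketch. It is not the pmapboot.c code: `page_span`, `struct page_range` and the 4 KiB `SKETCH_PAGE_SIZE` are assumptions made for the example. The point is that the end of the range has to be computed from `va + size` and then rounded up; rounding `size` alone can drop the last page when `va` is unaligned.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	UINT64_C(4096)	/* assumption: 4 KiB pages */

/* Hypothetical type: a page-aligned, half-open range [start, end). */
struct page_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the page-aligned range that covers [va, va + size).
 * The start is rounded down and the true end is rounded up, so a range
 * that straddles a page boundary keeps its last page.
 */
static inline struct page_range
page_span(uint64_t va, uint64_t size)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;
	struct page_range r;

	/* Non-empty range, and no wraparound while rounding up. */
	assert(size > 0 && va + size + mask > va);

	r.start = va & ~mask;
	r.end = (va + size + mask) & ~mask;
	return r;
}
```

For va = 0x1ff0 and size = 0x20 the result is [0x1000, 0x3000), two pages. Rounding va down and size up independently would yield [0x1000, 0x2000) and miss the second page.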
 1.14 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.13 04-Dec-2020  skrll Ensure translation table updates are visible to the hardware walker(s)
in pmapboot_enter.
 1.12 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.11 07-Nov-2020  skrll Fix the use of the contiguous bit by checking the output address as well.
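
A hedged sketch of the check r1.11 refers to: with a 4 KiB granule, the contiguous hint groups 16 adjacent L3 entries (64 KiB), and the hint is only valid when both the virtual address and the output (physical) address are aligned to that group size. The function and macro names below are hypothetical and do not come from pmapboot.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 4 KiB granule, 16 entries per contiguous group. */
#define SKETCH_L3_SIZE		UINT64_C(4096)
#define SKETCH_CONTIG_ENTRIES	UINT64_C(16)
#define SKETCH_CONTIG_SIZE	(SKETCH_L3_SIZE * SKETCH_CONTIG_ENTRIES)

/*
 * Decide whether a mapping of 'size' bytes from 'va' to 'pa' may start
 * with a contiguous-hint group.  Checking only 'va' is not enough: the
 * output address must share the same alignment.
 */
static inline bool
can_use_contig(uint64_t va, uint64_t pa, uint64_t size)
{
	const uint64_t mask = SKETCH_CONTIG_SIZE - 1;

	assert(size > 0);
	return ((va | pa) & mask) == 0 && size >= SKETCH_CONTIG_SIZE;
}
```

A mapping from va 0x10000 to pa 0x21000 fails the check even though the virtual address is 64 KiB aligned, because the output address is not.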
 1.10 17-Jul-2020  ryo branches: 1.10.2;
KNF. 80 cols, use tab.
 1.9 17-Jul-2020  ryo Add options PMAPBOOT_DEBUG to dump TTBR when pmapboot_enter().
Formerly DEBUG_MMU in locore.S, but there was a bit of confusion.
 1.8 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.7 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.6 29-Feb-2020  ryo branches: 1.6.4;
Fix pmap to work correctly with tagged addresses

- on a fault, strip the tag from the address before passing it to uvm/pmap functions
- pmap_extract() checks more strictly and considers the address tag
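
Untagging as mentioned in r1.6 can be sketched like this, assuming the architectural top-byte-ignore layout where bits 63:56 carry the tag and bit 55 selects the upper (kernel) or lower (user) address range. The name `sketch_untag_va` is hypothetical; the kernel's own macro may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TAG_MASK		UINT64_C(0xff00000000000000)	/* bits 63:56 */
#define SKETCH_VA_SELECT	(UINT64_C(1) << 55)		/* upper/lower range */

/*
 * Strip the tag byte from a virtual address by sign-extending bit 55
 * into bits 63:56, which restores the canonical form of the address.
 */
static inline uint64_t
sketch_untag_va(uint64_t va)
{
	uint64_t out;

	if (va & SKETCH_VA_SELECT)
		out = va | SKETCH_TAG_MASK;	/* kernel range: all ones */
	else
		out = va & ~SKETCH_TAG_MASK;	/* user range: all zeros */

	/* Postcondition: the top byte is now canonical. */
	assert((out >> 56) == 0 || (out >> 56) == 0xff);
	return out;
}
```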
 1.5 29-Feb-2020  ryo replace KSEG pages mapping code with generic function pmapboot_enter_range()
 1.4 18-Jul-2019  skrll branches: 1.4.2;
Simplify conditionals when clearing the CONTIG flag in pmapboot_enter and
update the comments to be a little clearer.
 1.3 29-Dec-2018  alnsn branches: 1.3.4;
pmapboot_pte_print() is only used when VERBOSE_INIT_ARM is defined.
 1.2 05-Oct-2018  ryo branches: 1.2.2;
fix build error without DDB
 1.1 04-Oct-2018  ryo cleanup locore, and changed the way memory is mapped during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt to various layouts
of the physical memory map, especially 64bit physical memory layouts.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing the page permissions of the kernel image and making the KSEG mapping are done
at cpu_kernel_vm_init() instead of in locore.
- optimize PTE entries with the PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.2.2.3 18-Jan-2019  pgoyette Synch with HEAD
 1.2.2.2 20-Oct-2018  pgoyette Sync with head
 1.2.2.1 05-Oct-2018  pgoyette file pmapboot.c was added on branch pgoyette-compat on 2018-10-20 06:58:23 +0000
 1.3.4.5 21-Apr-2020  martin Sync with HEAD
 1.3.4.4 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.3.4.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.4.2 10-Jun-2019  christos Sync with HEAD
 1.3.4.1 29-Dec-2018  christos file pmapboot.c was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.4.2.1 09-Nov-2020  martin Pull up following revision(s) (requested by skrll in ticket #1128):

sys/arch/aarch64/aarch64/pmapboot.c: revision 1.11

Fix the use of the contiguous bit by checking the output address as well.
 1.6.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.10.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.10.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.15.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.16.2.1 13-May-2021  thorpej Sync with HEAD.
 1.19.2.1 02-Aug-2025  perseant Sync with HEAD
 1.5 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.4 13-Dec-2018  ryo add support PT_STEP
 1.3 17-Jul-2018  christos add missing casts
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.3 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file process_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.6 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.5 27-May-2021  ryo fix build error with options ARMV85_BTI
 1.4 01-Oct-2020  skrll branches: 1.4.6; 1.4.8;
KNF (and some newlines)
 1.3 01-Oct-2020  ryo fix build error with LLVM
 1.2 30-Sep-2020  ryo add linux compatible /proc/cpuinfo
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file procfs_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.8.1 31-May-2021  cjep sync with head
 1.4.6.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.9 14-Apr-2024  skrll kern/58149: aarch64: Cannot return from a signal handler if SP was misaligned when the signal arrived

Apply the kernel diff from the PR

1. sendsig_siginfo() previously assumed that user SP was always aligned to
16 bytes and could call signal handlers with SP misaligned. This is a
wrong assumption because aarch64 demands that SP is aligned *only while*
it's being used to access memory. Now it properly aligns it before
pushing anything on the stack.

2. cpu_mcontext_validate() used to check if _REG_SP was aligned and
considered the ucontext invalid otherwise. This meant if a signal was
sent to a process whose SP was misaligned, the signal handler would fail
to return because the ucontext passed from the kernel was an invalid
one. Now setcontext(2) doesn't complain about misaligned SP.
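
Point 1 above amounts to aligning the user SP before carving out the signal frame. A minimal sketch under stated assumptions: `frame_address` and `SKETCH_STACK_ALIGN` are hypothetical names, and the frame size is passed in as a plain number instead of being derived from the real signal frame structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STACK_ALIGN	UINT64_C(16)	/* aarch64 SP alignment */

/*
 * Compute where a frame of 'framesize' bytes should be placed below a
 * user stack pointer that may be misaligned.  The pointer is aligned
 * down first, then the frame is reserved, then the result is aligned
 * again in case the frame size is not a multiple of 16.
 */
static inline uint64_t
frame_address(uint64_t sp, uint64_t framesize)
{
	const uint64_t mask = SKETCH_STACK_ALIGN - 1;

	assert(sp >= framesize + SKETCH_STACK_ALIGN);	/* no underflow */

	sp &= ~mask;		/* do not assume the incoming SP is aligned */
	sp -= framesize;
	sp &= ~mask;
	return sp;
}
```

For a misaligned sp of 0x1003 and an 0x18-byte frame this returns 0xfe0, which is 16-byte aligned and leaves the whole frame below the original sp.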
 1.8 01-Nov-2021  thorpej branches: 1.8.4;
Use "stack_t" instead of "struct sigaltstack", as the former is the
newer standardized name. NFC.
 1.7 27-Oct-2021  thorpej Use the signal trampoline version constants from <sys/signal.h>.
 1.6 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.5 01-May-2020  tnn aarch64: handle _UC_SETSTACK and _UC_CLRSTACK like on arm32

ok ryo@
 1.4 23-Apr-2020  skrll Typo in comment
 1.3 17-Jul-2018  christos branches: 1.3.4; 1.3.10;
add missing casts
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file sig_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.3.10.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.3.4.1 02-May-2020  martin Pull up following revision(s) (requested by tnn in ticket #884):
sys/arch/aarch64/aarch64/sig_machdep.c: revision 1.5
sys/arch/aarch64/aarch64/cpu_machdep.c: revision 1.9
aarch64: handle _UC_SETSTACK and _UC_CLRSTACK like on arm32
ok ryo@
 1.8.4.1 18-Apr-2024  martin Pull up following revision(s) (requested by skrll in ticket #667):

sys/arch/aarch64/aarch64/sig_machdep.c: revision 1.9
sys/arch/aarch64/aarch64/cpu_machdep.c: revision 1.15

kern/58149: aarch64: Cannot return from a signal handler if SP was
misaligned when the signal arrived

Apply the kernel diff from the PR
1. sendsig_siginfo() previously assumed that user SP was always aligned to
16 bytes and could call signal handlers with SP misaligned. This is a
wrong assumption because aarch64 demands that SP is aligned *only while*
it's being used to access memory. Now it properly aligns it before
pushing anything on the stack.
2. cpu_mcontext_validate() used to check if _REG_SP was aligned and
considered the ucontext invalid otherwise. This meant if a signal was
sent to a process whose SP was misaligned, the signal handler would fail
to return because the ucontext passed from the kernel was an invalid
one. Now setcontext(2) doesn't complain about misaligned SP.
 1.14 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.13 23-Aug-2022  ryo Bss clearing is now done at the beginning of start.S.

Some `__attribute__((__section__(".data")))' hack will no longer be needed.
 1.12 23-Aug-2022  ryo Align the loaded kernel image to 2Mbytes, if necessary.

It appears that there are bootloaders that cannot specify the load address or ignore it.
 1.11 15-Sep-2020  ryo fix typo
 1.10 15-Sep-2020  ryo fix aarch64eb MULTIPROCESSOR boot

- set endian of EL2,EL1 and EL0 at the beginning of start() and cpu_mpstart()
- drop_to_el1() keeps the endian setting
 1.9 05-Sep-2020  jakllsch aarch64: switch CPU to the kernel's byte order during boot
 1.8 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.7 19-Jan-2020  skrll Replace the two copies of the ADDR macro with a centralised adrl macro.
The adrl name matches the one used by armasm.
 1.6 19-Jan-2020  skrll Style. NFCI
 1.5 14-Dec-2019  skrll branches: 1.5.2;
revert previous - i was confused about boot files on rpi + aarch64
 1.4 14-Dec-2019  skrll Allow RPI firmware boots to work again
 1.3 04-Dec-2019  jmcneill Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.
 1.2 18-Oct-2018  skrll branches: 1.2.4; 1.2.6;
Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.1 14-Sep-2018  skrll branches: 1.1.2;
Move the aarch64 start stub from sys/arch/evbarm to sys/arch/aarch64.

Delete the unused/empty evbarm/fdt/genassym.cf while I'm here.
 1.1.2.3 20-Oct-2018  pgoyette Sync with head
 1.1.2.2 30-Sep-2018  pgoyette Sync with HEAD
 1.1.2.1 14-Sep-2018  pgoyette file start.S was added on branch pgoyette-compat on 2018-09-30 01:45:35 +0000
 1.2.6.1 09-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #525):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.1
distrib/sets/lists/modules/md.i386: revision 1.83
share/mk/bsd.own.mk: revision 1.1168
usr.bin/mkubootimage/mkubootimage.c: revision 1.25
sys/modules/dtrace/Makefile: revision 1.7
usr.bin/mkubootimage/mkubootimage.c: revision 1.26
sys/modules/dtrace/Makefile: revision 1.8
external/cddl/osnet/dist/lib/libdtrace/aarch64/dt_isadep.c: revision 1.2
distrib/sets/lists/modules/mi: revision 1.128
sys/arch/aarch64/include/frame.h: revision 1.3
sys/arch/evbarm/conf/mk.generic64: revision 1.4
external/cddl/osnet/dist/lib/libdtrace/common/dt_link.c: revision 1.12
sys/modules/cyclic/Makefile: revision 1.4
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.16
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.1
sys/arch/aarch64/aarch64/start.S: revision 1.3
sys/arch/aarch64/aarch64/trap.c: revision 1.22
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/dtrace_asm.S: revision 1.1
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.h: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/regset.h: revision 1.1
external/cddl/osnet/lib/libdtrace/Makefile: revision 1.26
distrib/sets/lists/modules/md.amd64: revision 1.82
usr.bin/mkubootimage/mkubootimage.1: revision 1.13
distrib/sets/lists/modules/ad.arm: revision 1.14

Add KDTRACE_HOOKS support.

Define lwp_trapframe() macro

dtrace: add support for aarch64

Add syscall_linux back for other arm architectures (accidentally removed
in previous)

Add -u flag for updating headers in place.

Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.

Update arm64 image header in place

Move dtrace_syscall_linux out of mi set list

Enable DTrace on aarch64

Fix signed/unsigned comparison
 1.2.4.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.4.2 10-Jun-2019  christos Sync with HEAD
 1.2.4.1 18-Oct-2018  christos file start.S was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.5.2.1 25-Jan-2020  ad Sync with head.
 1.4 06-Dec-2019  kamil Remove __HAVE_CPU_LWP_SETPRIVATE from aarch64

aarch64 specific cpu_lwp_setprivate() is redundant with its caller
lwp_setprivate() and there are no MD bits.
 1.3 17-Jul-2018  christos add missing casts
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file sys_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.13 05-Oct-2023  ad Arrange to update cached LWP credentials in userret() rather than during
syscall/trap entry, eliminating a test+branch on every syscall/trap.

This wasn't possible in the 3.99.x timeframe when l->l_cred came about
because there wasn't a reliable/timely way to force an ONPROC LWP running on
a remote CPU into the kernel (which is just about the only new thing in
this scheme).
 1.12 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.11 27-Sep-2021  ryo remove unused code.
The syscall for 32bit uses aarch32_syscall.c, so there is no need to make syscall.c support it.
 1.10 27-Sep-2021  ryo linux syscall should not break x1 register
 1.9 27-Sep-2021  ryo In order to prevent uninitialized values from being reflected in the registers after syscall, rval[] must be initialized.
 1.8 23-Sep-2021  ryo use lwp_trapframe() macro. NFC.
 1.7 23-Sep-2021  ryo add support COMPAT_LINUX for aarch64
 1.6 10-Apr-2019  ryo add missing userret() at the end of md_child_return().
this change makes some ATF tests pass.
 1.5 06-Apr-2019  kamil Centralized shared part of child_return() into MI part

Add a new function md_child_return() for MD specific bits only.

New child_return() is now part of MI and central code that handles
uniformly tracing code (KTR and ptrace(2)).

Synchronize value passed to ktrsysret() among ports to SYS_fork. This is
a traditional value and accessing p_lflag to check for PL_PPWAIT shall
use locking against proc_lock. Returning SYS_fork vs SYS_vfork still isn't
correct enough as there are more entry points to forking code. Instead of
making it too good, just settle with plain SYS_fork for all ports.
 1.4 01-Mar-2019  mrg no need to include opt_multiprocessor.h here.
 1.3 17-Jul-2018  christos add missing casts
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file syscall.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.53 21-Jul-2025  skrll KNF
 1.52 21-Jul-2025  skrll Don't #include <arm/cpufunc.h> twice
 1.51 18-Feb-2024  andvar branches: 1.51.2;
Change KDB to KGDB, including "sys/kgdb.h", which were likely meant to be defined.

Also comment out kgdb_machdep.c in files.aarch64, it doesn't exist yet.
 1.50 04-Oct-2023  ad Eliminate l->l_ncsw and l->l_nivcsw. From memory, I think they were added
before we had per-LWP struct rusage; the same is now tracked there.
 1.49 16-Jul-2023  riastradh aarch64: Omit needless xcfunc_t casts by using xcfunc_t correctly.

No functional change intended, except for avoiding possible undefined
behaviour that could have made demons come flying out your nose.
 1.48 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.47 30-Aug-2021  jmcneill Add FIQ support.
 1.46 14-Apr-2021  ryo Fix the problem "pcictl pci0 list" causes "panic: trap_el1h_error" on rockpro64.

The panic occurs in bus_space_barrier() in rk3399_pcie.c:rkpcie_conf_read().
We expected bus_space_peek_4() to trap and recover in the path
trap_el1h_sync() -> data_abort_handler(), but in fact the read is delayed
until bus_space_barrier(), and we get an SError interrupt (trap_el1h_error)
instead of a Synchronous Exception (trap_el1h_sync).

To catch this correctly, an implicit barrier has been added to bus_space_peek,
and the SError interrupt is now trapped so that it can be recovered from.
 1.45 09-Mar-2021  ryo branches: 1.45.2;
Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.44 22-Feb-2021  jmcneill KNF
 1.43 18-Feb-2021  jmcneill revert previous; user reports of panics under load
 1.42 15-Feb-2021  jmcneill interrupt: enable interrupts before running soft intr handlers. To avoid
stack usage going out of control, only do this at ci_intr_depth==0.
 1.41 11-Dec-2020  skrll s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.40 22-Oct-2020  skrll branches: 1.40.2;
Use the dmb/dsb/isb macros... if nothing else they're all now consistent
about the "memory" assembler constraint.

No binary change
 1.39 22-Oct-2020  skrll Simplify the cpufunc.h header, i.e. always use #include <arm/cpufunc.h>
 1.38 15-Oct-2020  rin Byte-swapping instructions for arm and thumb on aarch64eb;
instructions are stored in little-endian byte-order for BE8,
the only valid binary format for ILP32BE executables.

XXX
Apply similar fixes to armv7{,hf}eb.
 1.37 14-Sep-2020  ryo sprinkle LE32TOH to fetch instructions on aarch64eb
 1.36 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.35 01-Aug-2020  riastradh Add kthread_fpu_enter/exit support to aarch64.
 1.34 27-Jul-2020  ryo fix build error. need cast.
 1.33 26-Jul-2020  ryo add support swp,swpb instruction emulation
 1.32 26-Jul-2020  ryo - add support for conditional execution in A32 instruction emulation
- clearly separate the processing of ARM and THUMB emulation. do not confuse a Thumb 32-bit instruction with an ARM instruction.
- use far_el1 instead of tf_pc to return the correct fault address during instruction emulation
 1.31 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strrdisasm() instead of the caller.
correctly disassemble by processor state if defined DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT.
 1.30 02-Jul-2020  rin Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.

OK ryo@
 1.29 01-Jul-2020  ryo add workaround for Neoverse N1 erratum 1542419
 1.28 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
 1.27 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.26 20-Feb-2020  rin branches: 1.26.4;
When emulating obsoleted arm32 instructions, use ufetch(9) rather than
dereference tf_pc directly to retrieve an instruction.

Even if tf_pc is valid when processor decodes the instruction, someone
can unmap its page before tf_pc is read in the exception handler.

Now, SIGSEGV is delivered correctly to the process in this case, rather
than kernel panic.

Pointed out by maxv.
Discussed with ryo and skrll.
 1.25 31-Jan-2020  maxv BTI definitions.
 1.24 06-Jan-2020  skrll branches: 1.24.2;
fix new cpu_intr_p
 1.23 05-Jan-2020  ad Give aarch64 a preemption safe cpu_intr_p().
 1.22 03-Dec-2019  jmcneill Add KDTRACE_HOOKS support.
 1.21 24-Nov-2019  rin PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.20 21-Nov-2019  ad mi_userret(): take care of calling preempt(), set spc_curpriority directly,
and remove MD code that does the same.
 1.19 28-Sep-2019  skrll newline after break
 1.18 07-Aug-2019  jmcneill trap_el0_32sync: add missing break to ESR_EC_FP_TRAP_A32 case
 1.17 06-Apr-2019  thorpej branches: 1.17.4;
Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
 1.16 27-Jan-2019  dholland fix duplicated chunk from merge
 1.15 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.14 13-Dec-2018  ryo add support PT_STEP
 1.13 12-Dec-2018  ryo need space
 1.12 07-Dec-2018  ryo add simple stack overflow checker for debugging
 1.11 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.10 14-Sep-2018  ryo change copystr() to asm so that we don't have to add __noasan.
Also copyinout.S is the right place for copystr().
 1.9 10-Sep-2018  ryo changed kcopy() to asm to avoid memcpy() being replaced by kasan_memcpy() when KASAN is defined.
 1.8 28-Jul-2018  ryo Implement sigill_debug variable for debug (with DDB). if sigill_debug = 1, illegal instruction will be logged.

e.g.) [ 75914.9966392] TRAP: pid 1422 (ssh), uid 1074: Unknown Reason (Illegal Instruction): pc=0x0000faa29ae35088: pmull v0.1q, v0.1d, v0.1d
 1.7 19-Jul-2018  christos fix printf format.
 1.6 19-Jul-2018  christos Implement TRAP_SIGDEBUG for aarch64...
ptraced programs die with:
data_abort_handler, 257: pid 199.1 (a.out): signal 11 (trap 0x82000006) @pc 0, addr 0x0, error=Instruction Abort (EL0)
 1.5 17-Jul-2018  christos - add missing casts
- use PRI?64 instead of ll?
 1.4 01-Apr-2018  ryo branches: 1.4.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.3 25-Aug-2017  nisimura branches: 1.3.2;

- reorder faultbuf member.
- introduce trap() and interrupt(). now brk insn work.
 1.2 16-Aug-2017  nisimura reimplement copy/fetch/store(9). mostly copied from riscv
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file trap.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.3.2.5 20-Oct-2018  pgoyette Sync with head
 1.3.2.4 30-Sep-2018  pgoyette Sync with HEAD
 1.3.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.3.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.3.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.4.2.4 21-Apr-2020  martin Sync with HEAD
 1.4.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.4.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.4.2.1 10-Jun-2019  christos Sync with HEAD
 1.17.4.5 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1175):

sys/arch/aarch64/aarch64/trap.c: revision 1.28,1.31,1.32 (patch)

- add support for conditional execution in A32 instruction emulation
- clearly separate the processing of ARM and Thumb emulation; do not confuse Thumb 32-bit instructions with ARM instructions.
- use far_el1 instead of tf_pc to return the correct fault address during instruction emulation
 1.17.4.4 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1172):

sys/arch/aarch64/aarch64/trap.c: revision 1.30
sys/arch/aarch64/include/ptrace.h: revision 1.10
sys/arch/aarch64/include/netbsd32_machdep.h: revision 1.4 (patch)
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.14
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.15

Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.
OK ryo@

For rev 1.14 and before, netbsd32_process_write_regs() returns EINVAL
if non-modifiable bits are set in CPSR.
Instead, mask out non-modifiable bits and make this function succeed
regardless of the value in CPSR. The new behavior matches that of arm:
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm/process_machdep.c#187
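
The masking approach described above can be sketched as follows. This is an illustration only: the mask value (the N, Z, C and V flag bits) and the function name are assumptions, and the real set of user-modifiable CPSR bits in netbsd32_process_write_regs() may be wider.

```c
#include <stdint.h>

/*
 * Assumption for illustration: only the condition flags N, Z, C, V
 * (bits 31..28) may be changed by the debugger.
 */
#define EX_PSR_USER_MODIFIABLE	0xf0000000u

/*
 * Merge a requested CPSR into the current one. Non-modifiable bits are
 * silently taken from the current value instead of failing with EINVAL.
 */
uint32_t
ex_merge_cpsr(uint32_t current, uint32_t requested)
{
	return (current & ~EX_PSR_USER_MODIFIABLE) |
	    (requested & EX_PSR_USER_MODIFIABLE);
}
```

With this shape, writing back a register set that was just read is always accepted, which is the situation the access_regs6 tests exercise.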

This fixes lib/libc/sys/t_ptrace_wait*:access_regs6 tests, in which
register contents retrieved by PT_GETREGS are set back by PT_SETREGS.

No new regression is observed in full ATF run.

OK ryo
 1.17.4.3 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1169):

sys/arch/aarch64/aarch64/trap.c: revision 1.21
sys/arch/aarch64/aarch64/trap.c: revision 1.26

PR port-arm/54702
Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:
- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:
https://github.com/ryo/mcr_test/

We've confirmed:
- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.


When emulating obsoleted arm32 instructions, use ufetch(9) rather than
dereference tf_pc directly to retrieve an instruction.
Even if tf_pc is valid when processor decodes the instruction, someone
can unmap its page before tf_pc is read in the exception handler.
Now, SIGSEGV is delivered correctly to the process in this case, rather
than kernel panic.

Pointed out by maxv.

Discussed with ryo and skrll.
 1.17.4.2 09-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #525):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.1
distrib/sets/lists/modules/md.i386: revision 1.83
share/mk/bsd.own.mk: revision 1.1168
usr.bin/mkubootimage/mkubootimage.c: revision 1.25
sys/modules/dtrace/Makefile: revision 1.7
usr.bin/mkubootimage/mkubootimage.c: revision 1.26
sys/modules/dtrace/Makefile: revision 1.8
external/cddl/osnet/dist/lib/libdtrace/aarch64/dt_isadep.c: revision 1.2
distrib/sets/lists/modules/mi: revision 1.128
sys/arch/aarch64/include/frame.h: revision 1.3
sys/arch/evbarm/conf/mk.generic64: revision 1.4
external/cddl/osnet/dist/lib/libdtrace/common/dt_link.c: revision 1.12
sys/modules/cyclic/Makefile: revision 1.4
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.16
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.1
sys/arch/aarch64/aarch64/start.S: revision 1.3
sys/arch/aarch64/aarch64/trap.c: revision 1.22
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/dtrace_asm.S: revision 1.1
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.h: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/regset.h: revision 1.1
external/cddl/osnet/lib/libdtrace/Makefile: revision 1.26
distrib/sets/lists/modules/md.amd64: revision 1.82
usr.bin/mkubootimage/mkubootimage.1: revision 1.13
distrib/sets/lists/modules/ad.arm: revision 1.14

Add KDTRACE_HOOKS support.

Define lwp_trapframe() macro

dtrace: add support for aarch64

Add syscall_linux back for other arm architectures (accidentally removed
in previous)

Add -u flag for updating headers in place.

Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.

Update arm64 image header in place

Move dtrace_syscall_linux out of mi set list

Enable DTrace on aarch64

Fix signed/unsigned comparison
 1.17.4.1 07-Aug-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #27):

sys/arch/aarch64/aarch64/trap.c: revision 1.18

trap_el0_32sync: add missing break to ESR_EC_FP_TRAP_A32 case
 1.24.2.1 29-Feb-2020  ad Sync with head.
 1.26.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.40.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.40.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.45.2.1 17-Apr-2021  thorpej Sync with HEAD.
 1.51.2.1 02-Aug-2025  perseant Sync with HEAD
 1.29 26-Jun-2022  jmcneill build fix: remove includes of opt_gic.h
 1.28 25-Jun-2022  jmcneill Remove GIC_SPLFUNCS.
 1.27 29-May-2022  ryo ESR_EL1 and FAR_EL1 are not required in interrupt trapframe and their values are meaningless.
To identify it as an interrupt trap frame, store -1 and 0.
 1.26 06-May-2022  ryo Sprinkle isb after modifying system regs of pointer auth.
With options ARMV83_PAC, it now works on native Mac M1.

TODO: Multiple ISBs should be combined in one place.
 1.25 06-May-2022  ryo md_astpending is uint32_t
 1.24 18-Sep-2021  jmcneill gic_splx: performance optimizations

Avoid any kind of register access (DAIF, PMR, etc), barriers, and atomic
operations in the common case where no interrupt fires between spl being
raised and lowered.

This introduces a per-CPU return address (ci_splx_restart) used by the
vector handler to restart a sequence in splx that compares the new ipl
with the per-CPU hardware priority state stored in ci_hwpl.
 1.23 30-Aug-2021  jmcneill Add FIQ support.
 1.22 09-Mar-2021  ryo Add support for hardware breakpoints and watchpoints again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.21 15-Oct-2020  ryo branches: 1.21.2;
slightly optimized loop for trap_doast() calls
 1.20 06-Oct-2020  skrll move #include "opt_compat_netbsd32.h" to where it's required
 1.19 30-Sep-2020  skrll Move el[01]_trap_exit into vectors.S where the callers exist
 1.18 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.17 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
Userland programs are able to perform pointer authentication on ARMv8.3+PAC CPUs.

reviewed by maxv@, thanks.
 1.16 15-May-2020  ryo use ldp if possible
 1.15 16-Apr-2020  skrll Shave off 3 instructions per trap
 1.14 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.13 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.12 11-Apr-2020  maxv The vectors allow for up to 0x80 bytes of instructions, but we've reached
this limit already, so implement the handler functions outside, and jump
to them. This allows to add instructions in the future.

Sent to ryo@ and skrll@.
 1.11 12-Feb-2020  skrll branches: 1.11.4;
Adjust comments
 1.10 12-Feb-2020  riastradh Create a buffer space of 512 bytes before the trapframe.

dtrace fbt needs enough space to emulate an

stp x29, x30, [sp,#-FRAMESIZE]!

instruction in a function prologue. In the aarch64 instruction
encoding, FRAMESIZE can be as large as 512 bytes, so reserve this
much space when KDTRACE_HOOKS is enabled.
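
The 512-byte figure follows from the instruction encoding: the 64-bit STP pre-index form carries a signed 7-bit immediate scaled by 8. A small sketch of that arithmetic, with hypothetical names that do not come from the fbt sources:

```c
#include <stdbool.h>

/* imm7 is a signed 7-bit field, scaled by 8 for 64-bit register pairs. */
#define EX_STP64_IMM7_MIN	(-64)
#define EX_STP64_IMM7_MAX	63
#define EX_STP64_SCALE		8

/* Return true if a byte offset can be encoded in a 64-bit STP/LDP. */
bool
ex_stp64_offset_encodable(int offset)
{
	if (offset % EX_STP64_SCALE != 0)
		return false;

	int imm7 = offset / EX_STP64_SCALE;

	return imm7 >= EX_STP64_IMM7_MIN && imm7 <= EX_STP64_IMM7_MAX;
}

/* Largest stack decrement a single pre-indexed STP can perform. */
int
ex_stp64_max_predecrement(void)
{
	return -(EX_STP64_IMM7_MIN * EX_STP64_SCALE);
}
```

The encodable offsets therefore run from -512 to 504 in steps of 8, so a prologue can lower sp by at most 512 bytes in one emulated instruction.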
 1.9 12-Oct-2018  ryo branches: 1.9.4; 1.9.6;
add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.8 14-Sep-2018  ryo use ENTRY_NP to avoid added _PROF_PROLOGUE.
 1.7 30-Jul-2018  ryo don't depend on clang code to backtrace. keep trapframe as framepointer if DDB.
 1.6 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.5 01-Apr-2018  ryo branches: 1.5.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.4 23-Aug-2017  nisimura branches: 1.4.2;

- don't use ENTRY() for exception entries.
- correct section definition.
- designate long pointer ldr.
 1.3 22-Aug-2017  nisimura use lr for current x30. some comment snip
 1.2 22-Aug-2017  nisimura fill EL1 exception entry vector
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file vectors.S was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.5 20-Oct-2018  pgoyette Sync with head
 1.4.2.4 30-Sep-2018  pgoyette Sync with HEAD
 1.4.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.4.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.4.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.5.2.4 21-Apr-2020  martin Sync with HEAD
 1.5.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.5.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.5.2.1 10-Jun-2019  christos Sync with HEAD
 1.9.6.1 29-Feb-2020  ad Sync with head.
 1.9.4.1 12-Feb-2020  martin Pull up following revision(s) (requested by riastradh in ticket #701):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.2
external/cddl/osnet/dist/lib/libdtrace/common/dt_open.c: revision 1.17
external/cddl/osnet/dist/lib/libdtrace/common/dt_module.c: revision 1.18
sys/modules/cyclic/Makefile: revision 1.5
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.2
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.3
sys/arch/aarch64/aarch64/vectors.S: revision 1.10
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.2
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.3
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.4
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.5
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.6
sys/arch/aarch64/include/cpu.h: revision 1.20
external/cddl/osnet/dist/lib/libdtrace/common/dt_impl.h: revision 1.9

Create a buffer space of 512 bytes before the trapframe.

dtrace fbt needs enough space to emulate an

stp x29, x30, [sp,#-FRAMESIZE]!

instruction in a function prologue. In the aarch64 instruction
encoding, FRAMESIZE can be as large as 512 bytes, so reserve this
much space when KDTRACE_HOOKS is enabled.

Use db_write_bytes to overwrite kernel text.

Tidy up a bit. No functional change intended.

aarch64 fbt_invop doesn't actually use the argument, but it would
make more sense for it to be the return value and/or first argument
register. Certainly it's not `eax'!

Tidy up a bit: don't set things we won't use; assert nonzeroness.

Use /dev/ksyms, not /netbsd, for the running kernel's symbols.

Teach dtrace about el1_trap_exit frames on aarch64.

Implement dtrace_getarg and dtrace_getreg while here.

Count the number of artificial frames in aarch64 fbt probe correctly.

Change the address ranges that aarch64 considers toxic for dtrace.
`Toxic' means dtrace forbids D scripts from even attempting to read
or write at them.

Previously we considered [0, VM_MIN_KERNEL_ADDRESS) toxic, but
VM_MIN_KERNEL_ADDRESS is only the minimum address of the kernel map;
the direct-mapped region lies below it, and with PMAP_MAP_POOLPAGE we
allocate virtual pages for pool backing directly from physical pages
through the direct-mapped region. Also, this did not consider I/O
mappings to be toxic, which they probably should be.

Instead, treat:

[0, AARCH64_KSEG_START)
and
[VM_KERNEL_IO_ADDRESS, 0xfff...ff)

as toxic. (The upper bound for 0xfff...ff ought to be inclusive, not
exclusive, but I think we'll need another mechanism for expressing
that to dtrace!)

Switch from db_write_bytes to using direct-mapping.

This way there's no dependency on ddb.

Define the MULTIPROCESSOR cpu_number() for modules too.
Modules should work whether the main kernel is multiprocessor or not.
In particular, dtrace should not think cpu_number() is 0 while
cpu_index(curcpu()) and curcpu()->ci_index are nonzero, leading to
rather spectacularly bogus results...

cyclic.kmod needs -Wno-sign-compare for aarch64 CPU_INFO_FOREACH.
Provisional workaround; feel free to fix.
 1.11.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.21.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.15 20-Dec-2023  thorpej Remove unnecessary <sys/malloc.h>.
 1.14 25-Feb-2023  skrll Add a KASSERT
 1.13 29-May-2022  ryo Simplified termination conditions for ddb backtrace.

Exit backtrace when the user trapframe is invalid. (Mainly in kernel threads).
 1.12 30-Aug-2021  jmcneill Interrupts may not be enabled yet when cpu_lwp_fork is called during boot,
so remove incorrect KASSERT.
 1.11 28-Mar-2021  skrll fix a comment that has been c&p'ed around and not updated
 1.10 01-Mar-2021  jmcneill branches: 1.10.2;
cpu_lwp_fork: KASSERT -> KASSERTMSG to print the actual value of DAIF if
it is not 0 in cpu_lwp_fork
 1.9 15-Oct-2020  rin branches: 1.9.2;
Fix clone(2) for COMPAT_NETBSD32.

(1) Set r13 (sp for arm32 processes) appropriately when stack is
specified to fork1().

(2) For arm32 processes, align stack to 8-byte boundary, instead of
16-byte for native aarch64 processes, to match our 32-bit ABI:

https://nxr.netbsd.org/xref/src/sys/arch/arm/arm32/vm_machdep.c#150

Note that sp alignment checking is disabled in aarch32 mode, and
this works fine with AARCH64_EL0_STACK_ALIGNMENT_CHECK option.
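
The alignment rule in (2) amounts to rounding the requested stack pointer down to a different power of two depending on the process ABI. A minimal sketch, with hypothetical names and constants that are not taken from vm_machdep.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_STACK_ALIGN_AARCH64	16u	/* native 64-bit processes */
#define EX_STACK_ALIGN_ARM32	8u	/* 32-bit ABI processes */

/* Round sp down to the stack alignment required by the process ABI. */
uint64_t
ex_align_stack(uint64_t sp, bool is_arm32)
{
	uint64_t align = is_arm32 ? EX_STACK_ALIGN_ARM32 :
	    EX_STACK_ALIGN_AARCH64;

	/* The mask trick below only works for powers of two. */
	assert((align & (align - 1)) == 0);

	return sp & ~(align - 1);
}
```

For example, a stack top of 0x7fff1238 is already valid for an arm32 process but would be lowered to 0x7fff1230 for a native aarch64 process.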

OK ryo
 1.8 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
Userland programs are able to perform pointer authentication on ARMv8.3+PAC CPUs.

reviewed by maxv@, thanks.
 1.7 22-May-2020  ryo fix to do backtrace properly for running LWPs and cpu_lwp_fork().
when dump of pcb_tf, only the switchframe part is now displayed instead of the whole trapframe.
 1.6 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.5 27-Dec-2018  mrg branches: 1.5.4; 1.5.10;
make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.4 17-Jul-2018  christos add missing casts
 1.3 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.3 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file vm_machdep.c was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 21-Apr-2020  martin Sync with HEAD
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.5.10.1 20-Apr-2020  bouyer Sync with HEAD
 1.5.4.1 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1174):

sys/arch/aarch64/aarch64/vm_machdep.c: revision 1.9 (patch)

Fix clone(2) for COMPAT_NETBSD32.
(1) Set r13 (sp for arm32 processes) appropriately when stack is
specified to fork1().
(2) For arm32 processes, align stack to 8-byte boundary, instead of
16-byte for native aarch64 processes, to match our 32-bit ABI:
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm32/vm_machdep.c#150

Note that sp alignment checking is disabled in aarch32 mode, and
this works fine with AARCH64_EL0_STACK_ALIGNMENT_CHECK option.

OK ryo
 1.9.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.10.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.2 17-May-2025  andvar Fix RCS ID in few more files.
 1.1 09-May-2024  pho branches: 1.1.2;
port-arm/58194: Resurrect vmt(4) from bitrot

On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.

Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.
 1.1.2.1 02-Aug-2025  perseant Sync with HEAD
 1.25 28-Jul-2023  rin Simplify fix for PR toolchain/57146

Introduce ARCH_STRIP_SYMBOLS variable to centralize logic for debug
symbols from MD Makefile's to Makefile.kern.inc.
 1.24 26-Jul-2023  rin Fix kernel size inflation for arm and aarch64 (PR toolchain/57146)

For some conditions, SYSTEM_LD_TAIL is set for arm and aarch64.
Then, ctfmerge(1) in default SYSTEM_LD_TAIL is unintentionally
skipped, which results in the catastrophic kernel size inflation,
as reported in the PR.

Also, introduce and use OBJCOPY_STRIPFLAGS variable instead of
STRIPFLAGS, as strip(1) is replaced by objcopy(1) during MI
kernel build procedure.

XXX
For Makefile.{arm,aarch64}, weird logic is used to determine how
to handle debug symbols; MKDEBUG{,KERNEL} are taken into account
later in sys/conf/Makefile.kern.inc.
 1.23 27-May-2021  ryo branches: 1.23.12;
In gcc10, -msign-return-address is no longer supported.
Instead, (LLVM-compatible) -mbranch-protection option is supported.
 1.22 10-Feb-2021  ryo branches: 1.22.4; 1.22.6;
add support for kernel profiling on aarch64

- add MCOUNT_ENTER, MCOUNT_EXIT macro
- __mcount() function should be aligned
- add "-fno-optimize-sibling-calls" option when PROF. for accurate profiling, it is better to suppress the tail call.
 1.21 11-May-2020  ryo branches: 1.21.2;
"options ARMV83_PAC" is now supported for gcc as well.

- add "-msign-return-address=all" to CFLAGS for gcc when specified options ARMV83_PAC
- AARCH64REG_{READ,WRITE}_INLINE3 macro can now use the APIAKey registers in both gcc and llvm.
llvm requires asm(".arch armv8.3-a"), whereas gcc requires __attribute__((target("arch=armv8.3-a"))).
- use ".arch armv8.3-a" rather than ".arch armv8.3-a+pac" in *.S for llvm.
 1.20 13-Apr-2020  maxv Add KASAN instrumentation on on-stack VLAs, same as amd64.
 1.19 13-Apr-2020  maxv Add support for Branch Target Identification (BTI).

On the executable pages that have the GP (Guarded Page) bit, the semantic
of the "br" and "blr" instructions is changed: the CPU expects the first
instruction of the jump/call target to be "bti", and faults if it isn't.

We add the GP bit on the kernel .text pages (and incidentally the .rodata
pages, but we don't care). The compiler adds a "bti c" instruction at the
beginning of each C function. We modify the ENTRY() macros to manually add
"bti c" in the asm functions.

cpuswitch.S needs a specific change: with "br x27" the CPU expects "bti j",
which is bad because the functions begin with "bti c"; switch to "br x16",
for the CPU to accept "bti c".

BTI helps defend against JOP/COP. Tested on Qemu.
 1.18 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.17 04-Mar-2020  ryo branches: 1.17.2;
change kernel vm base address to use more than 256GB of memory. (up to 64TB)

also enlarge KSEG(direct map) region from 512GB to 64TB.
KASAN works ok.

Note: -fasan-shadow-offset=
KASAN_SHADOW_START - (CANONICAL_BASE >> 3) =
0xFFFF400000000000 - (0xFFFF000000000000 >> 3) =
0xDFFF600000000000
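
The arithmetic in the note can be checked with a few lines of C. This sketch assumes the usual ASan mapping of one shadow byte per 8 bytes of memory (a shift of 3, as in the formula above); the function names are hypothetical.

```c
#include <stdint.h>

/* One shadow byte covers 8 bytes of memory. */
#define EX_KASAN_SHADOW_SCALE_SHIFT	3

/* Offset to pass to -fasan-shadow-offset= for a given layout. */
uint64_t
ex_kasan_shadow_offset(uint64_t shadow_start, uint64_t canonical_base)
{
	return shadow_start - (canonical_base >> EX_KASAN_SHADOW_SCALE_SHIFT);
}

/* Shadow address the instrumented code computes for addr. */
uint64_t
ex_kasan_shadow_addr(uint64_t addr, uint64_t shadow_offset)
{
	return (addr >> EX_KASAN_SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With KASAN_SHADOW_START = 0xFFFF400000000000 and CANONICAL_BASE = 0xFFFF000000000000, the offset is 0xDFFF600000000000, and the shadow of CANONICAL_BASE itself lands exactly on KASAN_SHADOW_START.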
 1.16 04-Dec-2019  jmcneill Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.
 1.15 03-Mar-2019  maxv branches: 1.15.4;
Add KASAN use-after-scope detection in aarch64, tested by Ryo Shimizu,
thanks.
 1.14 08-Nov-2018  maxv Track the stack with kASan on aarch64. Same principle as on amd64. Illegal
accesses occurring there are now detected.

Originally written by me, but reworked by ryo@, thanks.
 1.13 01-Nov-2018  maxv Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.12 22-Sep-2018  rin - Determine KERN_AS automatically depending on whether OPT_MODULAR is
set or not, in the same way as libcompat.

- Specify OPT_MODULAR in the port Makefile instead of KERN_AS.

Now, KERN_AS=library is used for kernels without module(7) for all ports.

OK christos
 1.11 14-Sep-2018  skrll s/A64/ARM/

no functional change
 1.10 23-Jun-2018  jakllsch branches: 1.10.2;
locore.S is a MD_SFILES.

This keeps the dependency handling in the loop, so rebuilds after
changing options, say EARLYCONS, don't fail.
 1.9 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.8 10-Dec-2017  christos branches: 1.8.2;
- Allow multiple .BEGIN targets
- Make their protection consistent
 1.7 05-May-2016  rjs Fix config(1) errors and warnings.

Set up arm headers for the build.
 1.6 24-Aug-2015  uebayasi Define ${LINKSCRIPT} in one place.
 1.5 20-Aug-2015  uebayasi Use ${KERNLDSCRIPT}.
 1.4 15-Nov-2014  uebayasi branches: 1.4.2;
Use LINKSCRIPT.
 1.3 17-Aug-2014  joerg branches: 1.3.2;
Reorganize symbol table embedding. The existing option SYMTAB_SPACE is
replaced by the make option COPY_SYMTAB set to any value. The copy of
the symbol table is no longer put into a buffer in kern_ksyms.o, but a
small helper object. This object is built first with a dummy size, then
the kernel is linked to compute the real dimension of the symbol table
buffer. After that, the helper object is rebuilt and the kernel linked
again.
 1.2 14-Aug-2014  joerg Use wildcards for stripping/preserving the mapping symbols on ARM and
AArch64. LLVM creates unique symbols in each file of the form $a.n etc.
 1.1 10-Aug-2014  matt branches: 1.1.2;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.2.1 05-Jan-2016  snj Pull up following revision(s) (requested by martin in ticket #1058):
sys/arch/arm/conf/Makefile.arm: revision 1.43
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.2
share/mk/bsd.sys.mk: revision 1.243, 1.244
Use wildcards for stripping/preserving the mapping symbols on ARM and
AArch64. LLVM creates unique symbols in each file of the form $a.n etc.
--
Fix typo in OBJCOPYLIBFLAGS_EXTRA for aarch64eb.
 1.3.2.3 03-Dec-2017  jdolecek update from HEAD
 1.3.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.3.2.1 17-Aug-2014  tls file Makefile.aarch64 was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.2 29-May-2016  skrll Sync with HEAD
 1.4.2.1 22-Sep-2015  skrll Sync with HEAD
 1.8.2.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.8.2.3 30-Sep-2018  pgoyette Sync with HEAD
 1.8.2.2 25-Jun-2018  pgoyette Sync with HEAD
 1.8.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.10.2.3 21-Apr-2020  martin Sync with HEAD
 1.10.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.10.2.1 10-Jun-2019  christos Sync with HEAD
 1.15.4.1 09-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #525):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.1
distrib/sets/lists/modules/md.i386: revision 1.83
share/mk/bsd.own.mk: revision 1.1168
usr.bin/mkubootimage/mkubootimage.c: revision 1.25
sys/modules/dtrace/Makefile: revision 1.7
usr.bin/mkubootimage/mkubootimage.c: revision 1.26
sys/modules/dtrace/Makefile: revision 1.8
external/cddl/osnet/dist/lib/libdtrace/aarch64/dt_isadep.c: revision 1.2
distrib/sets/lists/modules/mi: revision 1.128
sys/arch/aarch64/include/frame.h: revision 1.3
sys/arch/evbarm/conf/mk.generic64: revision 1.4
external/cddl/osnet/dist/lib/libdtrace/common/dt_link.c: revision 1.12
sys/modules/cyclic/Makefile: revision 1.4
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.16
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.1
sys/arch/aarch64/aarch64/start.S: revision 1.3
sys/arch/aarch64/aarch64/trap.c: revision 1.22
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/dtrace_asm.S: revision 1.1
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.h: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/regset.h: revision 1.1
external/cddl/osnet/lib/libdtrace/Makefile: revision 1.26
distrib/sets/lists/modules/md.amd64: revision 1.82
usr.bin/mkubootimage/mkubootimage.1: revision 1.13
distrib/sets/lists/modules/ad.arm: revision 1.14

Add KDTRACE_HOOKS support.

Define lwp_trapframe() macro

dtrace: add support for aarch64

Add syscall_linux back for other arm architectures (accidentally removed
in previous)

Add -u flag for updating headers in place.

Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.

Update arm64 image header in place

Move dtrace_syscall_linux out of mi set list

Enable DTrace on aarch64

Fix signed/unsigned comparison
 1.17.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.21.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.22.6.1 31-May-2021  cjep sync with head
 1.22.4.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.23.12.1 11-Sep-2023  martin Pull up following revision(s) (requested by rin in ticket #363):

sys/arch/aarch64/conf/Makefile.aarch64: revision 1.24
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.25
sys/arch/shark/conf/Makefile.shark.inc: revision 1.28
sys/arch/alpha/conf/Makefile.alpha: revision 1.88
sys/arch/mips/conf/Makefile.mips: revision 1.73
sys/conf/Makefile.kern.inc: revision 1.298
sys/conf/Makefile.kern.inc: revision 1.299
sys/arch/cats/conf/Makefile.cats.inc: revision 1.37
sys/arch/arm/conf/Makefile.arm: revision 1.56
sys/arch/arm/conf/Makefile.arm: revision 1.57
sys/arch/riscv/conf/Makefile.riscv: revision 1.10

Always use arm-elf2aout; no a.out support both for binutils{,.old}

Fix kernel size inflation for arm and aarch64 (PR toolchain/57146)

For some conditions, SYSTEM_LD_TAIL is set for arm and aarch64.
Then, ctfmerge(1) in default SYSTEM_LD_TAIL is unintentionally
skipped, which results in the catastrophic kernel size inflation,
as reported in the PR.

Also, introduce and use OBJCOPY_STRIPFLAGS variable instead of
STRIPFLAGS, as strip(1) is replaced by objcopy(1) during MI
kernel build procedure.

For Makefile.{arm,aarch64}, weird logic is used to determine how
to handle debug symbols; MKDEBUG{,KERNEL} are taken into account
later in sys/conf/Makefile.kern.inc.

Use OBJCOPY_STRIPFLAGS instead of STRIPFLAGS.
Simplify fix for PR toolchain/57146

Introduce ARCH_STRIP_SYMBOLS variable to centralize logic for debug
symbols from MD Makefile's to Makefile.kern.inc.
 1.45 09-May-2024  pho port-arm/58194: Resurrect vmt(4) from bitrot

On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.

Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.
 1.44 18-Feb-2024  andvar Change KDB to KGDB, including "sys/kgdb.h", which were likely meant to be defined.

Also comment out kgdb_machdep.c in files.aarch64, it doesn't exist yet.
 1.43 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.42 05-Nov-2022  skrll G/C
 1.41 03-Nov-2022  skrll Provide MI PMAP support on AARCH64
 1.40 28-Oct-2022  skrll MI PMAP EFI_RUNTIME support
 1.39 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.38 25-Jun-2022  jmcneill Remove GIC_SPLFUNCS.
 1.37 31-Jan-2022  ryo add support for hardware updates to the Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
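
The PTE interpretation described above can be sketched as follows. The bit positions are the architectural ones for AArch64 stage 1 descriptors as far as I know them (valid = bit 0, AP[2] = bit 7, AF = bit 10, DBM = bit 51); the constant names and the describe() helper are hypothetical and are not the kernel's code. The "modified" rule (DBM set and AP[2] cleared by hardware) is the architectural dirty-state convention, assumed here rather than taken from the commit.

```python
# Hypothetical names; bit positions follow the Arm stage 1 descriptor layout.
PTE_VALID = 1 << 0
PTE_AP_RO = 1 << 7    # AP[2]: set means read-only
PTE_AF = 1 << 10      # Access flag, used purely as a reference bit
PTE_DBM = 1 << 51     # Dirty Bit Modifier: the page may become writable


def describe(pte):
    """Interpret a PTE the way the commit message describes."""
    valid = bool(pte & PTE_VALID)
    writable = valid and bool(pte & PTE_DBM)
    return {
        "readable": valid,                       # a valid PTE is always readable
        "writable": writable,                    # decided by DBM alone
        "referenced": valid and bool(pte & PTE_AF),
        # Assumption: hardware clears AP[2] on the first write to a DBM page.
        "modified": writable and not (pte & PTE_AP_RO),
    }


clean = PTE_VALID | PTE_AF | PTE_DBM | PTE_AP_RO
dirty = clean & ~PTE_AP_RO
print(describe(clean))
print(describe(dirty))
```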
 1.36 25-Nov-2021  ryo add support COMPAT_LINUX32 for aarch64
 1.35 30-Oct-2021  jmcneill Implement gic_splraise and the gic_splx fast path in asm (armv8).
 1.34 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
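
For readers without SciPy, the statistic quoted above can be reproduced with the standard library. This is a minimal sketch: it converts the elapsed times from the two runs to seconds and computes the pooled-variance (equal-variance) two-sample t statistic, which is what stats.ttest_ind does by default. The helper names are made up for this example, and the p-value is not recomputed.

```python
import math


def to_seconds(elapsed):
    """Convert a time(1)-style 'M:SS.ss' elapsed string to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    mean_a = sum(a) / na
    mean_b = sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))


before = [to_seconds(t) for t in ("2:40.65", "2:38.49", "2:37.27")]
after = [to_seconds(t) for t in ("2:37.50", "2:38.54", "2:35.88")]

t = pooled_t(before, after)
print(round(t, 4))  # 1.1924
```

With only three samples per side (4 degrees of freedom), a statistic of this size is far from conventional significance thresholds, which agrees with the p-value reported in the log.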
 1.33 23-Sep-2021  ryo add support COMPAT_LINUX for aarch64
 1.32 06-Aug-2021  jmcneill Arm: Add support for SMC Calling Convention

Arm DEN0028 defines a calling mechanism used with Secure Monitor Call (SMC)
and Hypervisor Call (HVC) instructions. To discover SMCCC, we must:

1) Find the PSCI conduit (either via ACPI FADT, or Device Tree)
2) Use PSCI_VERSION to determine whether PSCI_FEATURES is supported
3) Call PSCI_FEATURES with SMCCC_VERSION to determine the implementation
version.
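
The three discovery steps can be sketched in Python against a fake firmware. Everything here is illustrative: the function IDs and return encodings are the values I understand the Arm PSCI and SMCCC documents to define, conduit() stands in for the SMC or HVC instruction chosen in step 1, and returning None when SMCCC_VERSION cannot be queried is a simplification, not the kernel's behaviour.

```python
# Function IDs as I understand them from the PSCI and SMCCC documents.
PSCI_VERSION = 0x84000000
PSCI_FEATURES = 0x8400000A
SMCCC_VERSION = 0x80000000
PSCI_NOT_SUPPORTED = -1


def split_version(value):
    """Split a version word into (major, minor); bit 31 is ignored."""
    return (value >> 16) & 0x7FFF, value & 0xFFFF


def discover_smccc(conduit):
    """conduit(function_id, arg) models one call through the PSCI conduit."""
    psci_major, _ = split_version(conduit(PSCI_VERSION, 0))
    if psci_major < 1:
        return None  # step 2: PSCI_FEATURES only exists from PSCI 1.0 on
    if conduit(PSCI_FEATURES, SMCCC_VERSION) == PSCI_NOT_SUPPORTED:
        return None  # step 3: SMCCC_VERSION is not implemented
    return split_version(conduit(SMCCC_VERSION, 0))


def fake_firmware(function_id, arg):
    """Hypothetical firmware reporting PSCI 1.1 and SMCCC 1.2."""
    responses = {
        (PSCI_VERSION, 0): 0x00010001,
        (PSCI_FEATURES, SMCCC_VERSION): 0,
        (SMCCC_VERSION, 0): 0x00010002,
    }
    return responses.get((function_id, arg), PSCI_NOT_SUPPORTED)


print(discover_smccc(fake_firmware))  # (1, 2)
```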
 1.31 24-Jul-2021  jmcneill aarch64: Remove empty source file and references to it.
 1.30 21-Oct-2020  christos branches: 1.30.6;
make process_machdep.c included always since it provides register i/o used by
sys_process_getlwpstatus.c which is always included.
 1.29 20-Oct-2020  christos harmonize process_machdep.c inclusion.
 1.28 29-Sep-2020  jmcneill Collapse all CPU_CORTEXA<n> options into CPU_CORTEX and do runtime
detection instead of ifdefs where required.
 1.27 12-Aug-2020  skrll Part III of ad's performance improvements for aarch64

- Assembly language stubs for mutex_enter() and mutex_exit().
 1.26 25-Jul-2020  riastradh Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.
 1.25 17-Jul-2020  ryo Add options PMAPBOOT_DEBUG to dump TTBR when pmapboot_enter().
Formerly DEBUG_MMU in locore.S, but there was a bit of confusion.
 1.24 29-Jun-2020  riastradh New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.
 1.23 29-Jun-2020  riastradh Implement AES in kernel using ARMv8.0-AES on aarch64.
 1.22 18-Apr-2020  skrll PMAP_DEBUG has been deleted on arm
 1.21 13-Apr-2020  maxv Add KASAN-DMA support on aarch64, same as amd64. Discussed with skrll@.
 1.20 15-Feb-2020  skrll branches: 1.20.4;
Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}
 1.19 03-Feb-2020  ryo add support pmap_pv(9)

Patch originally from jmcneill@. thanks
 1.18 21-Jan-2020  skrll Small re-org. NFCI.
 1.17 15-Jan-2020  mrg port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.16 28-Dec-2019  jmcneill branches: 1.16.2;
Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.15 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.14 20-Nov-2019  pgoyette Move all non-emulation-specific coredump code into the coredump module,
and remove all #ifdef COREDUMP conditional compilation. Now, the
coredump module is completely separated from the emulation modules, and
they can all be independently loaded and unloaded.

Welcome to 9.99.18 !
 1.13 27-Jan-2019  pgoyette branches: 1.13.4;
Merge the [pgoyette-compat] branch
 1.12 05-Dec-2018  jmcneill Add needs-flag to tprof_armv8.c
 1.11 05-Dec-2018  jmcneill Split armv7/armv8 tprof backend config logic from the fdt bus glue.
 1.10 18-Nov-2018  skrll Add CPU_THUNDERX which sets COHERENCY_UNIT and CACHE_LINE_SIZE to 128
 1.9 28-Oct-2018  jmcneill Add support for EFI runtime services on aarch64.
 1.8 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.7 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
ARM ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.6 06-Oct-2018  skrll Whitespace
 1.5 04-Oct-2018  ryo cleanup locore, and changed the way to map memories during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt various layout
of physical memory map. especially for 64bit physical memory layout.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing page permission for kernel image, and making KSEG mapping are done
at cpu_kernel_vm_init() instead of at locore.
- optimize PTE entries with PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.4 21-Sep-2018  jakllsch catch up to files.arm's recent "opt_console.h" changes
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 16-Aug-2017  nisimura branches: 1.2.2;
retire copyinout.S and fusu.S
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file files.aarch64 was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.6 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.2.2.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.2.2.4 20-Oct-2018  pgoyette Sync with head
 1.2.2.3 02-Oct-2018  pgoyette Use a hook callback to allow sparc fpu code to determine if a process
is running under sunos emulation (in which case, fpu cleanup uses a
different set of fpu_codes[]).
 1.2.2.2 30-Sep-2018  pgoyette Sync with HEAD
 1.2.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.3 21-Apr-2020  martin Sync with HEAD
 1.3.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.13.4.1 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.16.2.3 29-Feb-2020  ad Sync with head.
 1.16.2.2 25-Jan-2020  ad Sync with head.
 1.16.2.1 17-Jan-2020  ad Sync with head.
 1.20.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.30.6.1 01-Aug-2021  thorpej Sync with HEAD.
 1.12 10-Feb-2021  ryo add support kernel profiling on aarch64

- add MCOUNT_ENTER, MCOUNT_EXIT macro
- __mcount() function should be aligned
- add "-fno-optimize-sibling-calls" option when PROF. for accurate profiling, it is better to suppress the tail call.
 1.11 01-Nov-2018  maxv branches: 1.11.12;
Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.10 07-Oct-2018  skrll Don't use a magic number for COHERENCY_UNIT use COHERENCY_UNIT
 1.9 10-Sep-2018  maxv reduce the battlefield
 1.8 30-Aug-2018  maxv style, no functional change
 1.7 06-Aug-2018  ryo set kernel rodata/data non-executable.
set rodata section on 2Mbytes aligned. (kernel image is mapped with 2Mbytes L2 block)
 1.6 03-Aug-2018  ryo set kernel text/rodata readonly when not defined DDB.
set readonly segment on 2Mbytes aligned. (kernel image is mapped with 2Mbytes L2 block)
 1.5 01-Apr-2018  ryo branches: 1.5.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.4 16-Aug-2017  nisimura branches: 1.4.2;
add more sense. now compilable
 1.3 24-Aug-2015  uebayasi Don't mention stab and DWARF sections, because these (poorly maintained)
lists only help to make them harder to read.

If those sections are found in inputs, they simply appear in outputs as
orphaned sections, sorted by section types and attributes.
 1.2 20-Aug-2015  uebayasi Indent with 2 spaces.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.2 28-Aug-2017  skrll Sync with HEAD
 1.1.6.1 22-Sep-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file kern.ldscript was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.4.2.4 20-Oct-2018  pgoyette Sync with head
 1.4.2.3 30-Sep-2018  pgoyette Sync with HEAD
 1.4.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.4.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.5.2.1 10-Jun-2019  christos Sync with HEAD
 1.11.12.1 03-Apr-2021  thorpej Sync with HEAD.
 1.7 29-Jun-2021  nia Remove uscanner(4) driver

This exists for compatibility with a Linux interface which was apparently
deprecated in Linux 2.6. There are various mailing list threads going
back to 2004 where the usefulness of this driver is discussed, but
the conclusion is that scanner software has all moved to using ugen(4)
instead, and enabling this driver will not help you scan things.
 1.6 04-Apr-2020  jdolecek branches: 1.6.8;
mark nsmb major obsolete
 1.5 29-Jan-2020  maya remove urio(4), a driver for the Rio500 MP3 player.

At this point it is highly unlikely this 1999 device still has users,
but it still comes up in the context of maxv's USB-fuzzing (and any device
could pretend to be a urio(4)), so it's best to get rid of it.

Renamed all major entries to obsolete, as was done in previous removals.

This still requires an update to sanitizers, but they're located in
"external", perhaps it should be first committed upstream?

Proposed on tech-kern a month ago.
 1.4 28-Jan-2019  dholland branches: 1.4.6;
Systematize handling of removed drivers.

- Every driver that was removed and whose number hasn't already been
reused is now listed with a commented-out "obsolete" line.
- The format of these has been systematized. Future format changes can
probably be safely done with a script.
- This does not include a few cases of assignments that only lasted a
couple days, or stuff from before major reorgs. Some of these may
be included nonetheless, because there was a lot of ground to cover
and therefore not a lot of time to dig into history in detail.

Note that the obsolete listings do not mean the major numbers can
never be reused; that's up to portmasters and/or core. It does mean
that they won't be reused by accident, however, which in some cases
(depending on the driver, how widely used it was, its family of device
nodes, their default permissions, etc.) can be quite dangerous.

Note that some of the things now explicitly listed as obsolete are
really ancient history. My scan went back as far as when the majors
files were added. (But not before that.)
 1.3 23-Sep-2018  maxv Remove ISDN from the kernel. It has remained unmaintained for a long time,
is of poor quality, and is now an obstacle to MP-ification. It was removed
ten years ago from FreeBSD for the same reason.

This retires a big user of the mbuf API, and will ease maintenance of the
kernel.
 1.2 23-Apr-2015  pgoyette branches: 1.2.16; 1.2.18;
Update device dependency information - the sysmon major device now depends on the sysmon module itself, not on the individual components.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 06-Jun-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file majors.aarch64 was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.18.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.18.1 10-Jun-2019  christos Sync with HEAD
 1.2.16.1 30-Sep-2018  pgoyette Sync with HEAD
 1.4.6.1 29-Feb-2020  ad Sync with head.
 1.6.8.1 01-Aug-2021  thorpej Sync with HEAD.
 1.3 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 07-May-2015  mrg branches: 1.2.16;
bump CHILD_MAX and OPEN_MAX defaults on several platforms, both to 1024.
 1.1 10-Aug-2014  matt branches: 1.1.2; 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 06-Jun-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file std.aarch64 was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1.2.1 09-May-2015  snj Pull up following revision(s) (requested by mrg in ticket #741):
sys/arch/aarch64/conf/std.aarch64: revision 1.2
sys/arch/amd64/conf/std.amd64: revision 1.10
sys/arch/evbarm/conf/std.evbarm: revision 1.4
sys/arch/evbarm64/conf/std.evbarm64: revision 1.2
sys/arch/i386/conf/std.i386: revision 1.34
sys/arch/sparc64/conf/std.sparc64: revision 1.19
bump CHILD_MAX and OPEN_MAX defaults on several platforms, both to 1024.
 1.2.16.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.5 30-Nov-2024  christos Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.4 10-May-2020  skrll branches: 1.4.26;
Provide a trap.h (currently empty)
 1.3 09-Dec-2018  alnsn branches: 1.3.4;
Install aarch64/sljit_machdep.h.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file Makefile was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.3.4.1 13-May-2020  martin Pull up following revision(s) (requested by skrll in ticket #900):

sys/arch/aarch64/include/Makefile: revision 1.4
sys/arch/aarch64/include/trap.h: revision 1.3
distrib/sets/lists/comp/ad.aarch64: revision 1.40

Provide a trap.h (currently empty)

Update for trap.h
 1.4.26.1 02-Aug-2025  perseant Sync with HEAD
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file ansi.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file aout_machdep.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.71 23-Aug-2025  skrll Add a #define for the RES1 bits (just bit 31) in MPIDR_EL1.
 1.70 12-Aug-2025  skrll Add MDCR_EL2 accessors and bit definitions.
 1.69 12-Aug-2025  skrll Add sp_el2 accessors.
 1.68 12-Aug-2025  skrll Remove the XXXNH I had against ESR_EC_LS64.

The FEAT_LS64 instructions are all A64.
 1.67 27-Feb-2025  andvar Fix various typos in comments.
 1.66 03-Jan-2024  andvar branches: 1.66.2;
ddress->address in comment.
 1.65 24-Sep-2023  skrll Add a bunch of system registers and their bit / bit field definitions.
Taken from ryo's nvmm branch with updates from me.
 1.64 06-May-2023  andvar s/Regiser/Register/ and s/regester/register/ in comments.
 1.63 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.62 01-Dec-2022  ryo PMCR.E should not be disabled from tprof.

PMCR.E controls not only performance event counters but also the cycle
counter operation, and the cycle counter may be used for cpu_counter.
Similarly, the 31st bit in PMINTENCLR and PMCNTENCLR controls the cycle
counter, not performance event counters, and should not be modified.
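
The point about bit 31 can be illustrated with a small mask helper. This is a sketch only: the constant and function names are invented for this example, and the code models the register value a profiler would write rather than touching any hardware.

```python
# Bit 31 of PMCNTENCLR/PMINTENCLR selects the cycle counter, as noted above.
CYCLE_COUNTER_BIT = 1 << 31
REG32_MASK = 0xFFFFFFFF


def sanitize_clear_mask(requested):
    """Drop the cycle-counter bit from a mask destined for a *CLR register."""
    return requested & REG32_MASK & ~CYCLE_COUNTER_BIT


def event_counter_mask(ncounters):
    """Mask covering event counters 0..ncounters-1 (hypothetical helper)."""
    return (1 << ncounters) - 1


# A profiler that naively asks to clear everything still leaves bit 31 alone.
print(hex(sanitize_clear_mask(0xFFFFFFFF)))
# Six event counters, as an arbitrary example.
print(hex(sanitize_clear_mask(event_counter_mask(6))))
```

Writing only the sanitized mask keeps the cycle counter running for any other consumer, such as cpu_counter.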
 1.61 02-May-2022  skrll Only print the appropriate PAR fields for PAR.F={0,1}

Group the fields in the header.
 1.60 05-Jan-2022  ryo fix ID_AA64ISAR0_EL1.ATOMIC field definition
 1.59 26-Oct-2021  ryo fix build with COPTS=-O0
 1.58 23-Oct-2021  skrll Typo in comment
 1.57 19-Jun-2021  jmcneill Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.
 1.56 19-Jun-2021  jmcneill CNTV_CTL_EL0 is a 64-bit register
 1.55 09-Mar-2021  ryo branches: 1.55.4;
fixed mask width of DBGWVR_MASK, and added definition of DBGBVR_MASK
 1.54 30-Sep-2020  ryo branches: 1.54.2;
add some fields of ID_AA64ISAR1_EL1 definition (ARMv8.6)
 1.53 15-Sep-2020  ryo fix typo
 1.52 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.51 01-Aug-2020  maxv The system registers we modify can have an impact on memory accesses, and
we don't want the compiler to randomly re-order the instructions, so add
barriers. Same as WRMSR on x86.
 1.50 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
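
How a system-wide minimum could be derived from per-CPU CTR_EL0 values is sketched below. The field layout (IminLine in bits [3:0], DminLine in bits [19:16], each the log2 of the line size in 4-byte words) is the architectural one as far as I know it; the function names and the sample register values are hypothetical, and the DIC/IDC bits mentioned above are not modelled.

```python
def line_sizes(ctr_el0):
    """Return (icache_line_bytes, dcache_line_bytes) encoded in a CTR_EL0 value."""
    imin_line = ctr_el0 & 0xF
    dmin_line = (ctr_el0 >> 16) & 0xF
    # Each field is log2 of the number of 4-byte words in the smallest line.
    return 4 << imin_line, 4 << dmin_line


def system_minimum(ctr_values):
    """Smallest I- and D-cache line size over all CPUs in the system."""
    sizes = [line_sizes(value) for value in ctr_values]
    return min(i for i, _ in sizes), min(d for _, d in sizes)


# Hypothetical mixed system: one CPU type with 64-byte lines, one with 128-byte lines.
cpu_a = (4 << 16) | 4
cpu_b = (5 << 16) | 5

print(system_minimum([cpu_a, cpu_b]))  # (64, 64)
```

Reporting the minimum is the conservative choice: userland code that steps through memory by the reported line size then never skips a line on any CPU.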
 1.49 14-Jun-2020  riastradh Add some more id_aa64pfr0_el1 bits.
 1.48 28-May-2020  skrll Add some new CTR_EL0 bits
 1.47 25-May-2020  ryo add ARMv8.1-8.5 definitions of TCR_EL1
 1.46 25-May-2020  ryo cache information can be detected correctly on newer CPUs

- add VPIPT cache type
- adapt to 64-bit CCSIDR (ARMv8.3-CCIDX)
- CCSIDR:[WT,WB,PA,WA] are deprecated
- show number of cache lines when attaching cpu
 1.45 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
Userland programs are now able to perform pointer authentication on ARMv8.3+PAC CPUs.

Reviewed by maxv@, thanks.
 1.44 21-May-2020  ryo fix typo
 1.43 13-May-2020  ryo - move aarch64 addressspace macros from pmap.h to cpufunc.h
- rename ptr_strip_pac() to aarch64_strip_pac()
 1.42 11-May-2020  ryo "options ARMV83_PAC" is now supported for gcc as well.

- add "-msign-return-address=all" to CFLAGS for gcc when options ARMV83_PAC is specified
- AARCH64REG_{READ,WRITE}_INLINE3 macro can now use the APIAKey registers in both gcc and llvm.
llvm requires asm(".arch armv8.3-a"), whereas gcc requires __attribute__((target("arch=armv8.3-a"))).
- use ".arch armv8.3-a" rather than ".arch armv8.3-a+pac" in *.S for llvm.
 1.41 10-May-2020  riastradh Fix ID_AA64ISAR0_EL1_RNDR field definition for RNDR support.

ARMv8.5 ARM, p. D13-3232
 1.40 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows forging valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
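
The stripping step mentioned for the DDB unwinder can be illustrated with a small model. This is a sketch under assumptions, not the kernel's routine: it assumes a 48-bit virtual address width (`VA_BITS` is a hypothetical constant) and treats every bit above it as authentication code to be replaced by copies of bit 55, which selects the upper or lower half of the address space.

```python
VA_BITS = 48                                # assumed virtual address width
LOW_MASK = (1 << VA_BITS) - 1               # bits that carry the real address
HIGH_ONES = ((1 << 64) - 1) & ~LOW_MASK     # bits above the address


def strip_pac(ptr):
    """Replace the bits above VA_BITS with copies of bit 55."""
    if ptr & (1 << 55):
        # Upper (kernel) half: canonical form has all high bits set.
        return ptr | HIGH_ONES
    # Lower (user) half: canonical form has all high bits clear.
    return ptr & LOW_MASK
```

Stripping is idempotent in this model: applying `strip_pac` to an already canonical pointer returns it unchanged.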
 1.39 30-Mar-2020  jmcneill branches: 1.39.2;
Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.
 1.38 06-Mar-2020  ryo fix missing paren
 1.37 06-Mar-2020  ryo add more definitions for ARMv8.1-ARMv8.4
 1.36 29-Feb-2020  ryo widen bit PAR_EL1.PAR_PA from [47:12] to [51:12] for ARMv8.2 (and later).

PAR_EL1:[51:48] is RES0 in ARMv8.1 and ARMv8.0.
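
The effect of widening the field can be checked with plain bit arithmetic. The helper and constant names below are hypothetical; only the bit ranges [47:12] and [51:12] come from the log entry.

```python
def bits(hi, lo):
    """Mask with bits [hi:lo] set."""
    return ((1 << (hi + 1)) - 1) & ~((1 << lo) - 1)


PAR_PA_OLD = bits(47, 12)   # field width before this change
PAR_PA_NEW = bits(51, 12)   # field width for ARMv8.2 and later


def par_pa(par, mask):
    """Extract the physical address bits from a PAR_EL1 value."""
    return par & mask
```

With the old mask, a physical address that uses bits [51:48] would be silently truncated; with the new mask it is preserved, and on ARMv8.0/8.1 those bits read as zero anyway.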
 1.35 31-Jan-2020  maxv BTI definitions.
 1.34 28-Jan-2020  maxv More SCTLR.
 1.33 28-Jan-2020  maxv Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.
 1.32 28-Jan-2020  maxv Jazelle and T32EE are not part of ARMv8, fix the bits to their real
meanings. No functional change.
 1.31 28-Jan-2020  maxv More definitions.
 1.30 28-Dec-2019  rjs branches: 1.30.2;
s/Memroy/Memory/ in comment.
 1.29 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.28 15-Sep-2019  tnn report A72 errata #859971 workaround status during boot
 1.27 11-Sep-2019  skrll Move the TCR and TTBR defines into armreg.h where they belong. NFCI.
 1.26 12-Aug-2019  jmcneill Add support for physical timers and sprinkle isb where needed.
 1.25 16-Jun-2019  skrll branches: 1.25.2;
Provide icc_pmr_read
 1.24 20-Mar-2019  ryo - add reg_{s1e0r,s1e0w,s1e1r,s1e1w}_write() macro.
- show the result of AT insn at ddb "machine pte" command.
 1.23 30-Jan-2019  jmcneill add gtmr_cntv_cval_write
 1.22 13-Dec-2018  ryo add support PT_STEP
 1.21 20-Nov-2018  mrg rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.
 1.20 07-Nov-2018  riastradh When hardware subnormal support is available, disable flush-to-zero.

Similarly, when hardware NaN propagation is available, disable
default-NaN substitution.

This enables IEEE 754 semantics on any hardware that supports it by
default. Programs that want flush-to-zero or default-NaN substitution
can enable them explicitly.

ok ryo@
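
A minimal sketch of the policy described in this entry, expressed as bit operations on an FPCR value. FZ (bit 24) and DN (bit 25) are the architectural flush-to-zero and default-NaN controls; the function name and its boolean parameters are hypothetical stand-ins for whatever hardware capability checks the kernel performs.

```python
FPCR_FZ = 1 << 24   # flush-to-zero mode
FPCR_DN = 1 << 25   # default-NaN mode


def default_fpcr(fpcr, hw_subnormals, hw_nan_propagation):
    """Return the FPCR value to start programs with."""
    if hw_subnormals:
        fpcr &= ~FPCR_FZ        # hardware handles subnormals: no flushing
    if hw_nan_propagation:
        fpcr &= ~FPCR_DN        # hardware propagates NaNs: no substitution
    return fpcr
```

A program that prefers the faster non-IEEE modes can still set either bit itself afterwards.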
 1.19 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.18 12-Aug-2018  skrll Provide and use cpu_mpidr_aff_read in psci_fdt_bootstrap
 1.17 12-Aug-2018  skrll Whitespace
 1.16 09-Aug-2018  jmcneill Restore ICC_SRE_EL2 registers lost in previous commit
 1.15 08-Aug-2018  jmcneill Add GICv3 system registers
 1.14 05-Aug-2018  skrll More whitespace
 1.13 01-Aug-2018  skrll Some whitespace improvements. NFC.
 1.12 17-Jul-2018  christos - use #define to define constants instead of static const variables so that
gcc can compile the code.
- fix position of inline, and use __inline
 1.11 15-Jul-2018  jmcneill Add more PMC registers
 1.10 14-May-2018  joerg branches: 1.10.2;
Workaround A-008585 errata in GTMR.

Register reads and writes may provide unstable results if the counter
hardware is active at the same time. This results in non-monotonic
counters seen by both the gtmr interrupt and time counter.

The loops are currently applied unconditionally; restricting them to
appropriate FDT markers can be done later.
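
The log does not show the loops themselves. One common shape for this kind of workaround is to re-read the register until two consecutive reads agree; the sketch below assumes that shape. `stable_read` and `max_tries` are hypothetical names, and `read_reg` stands in for the raw register access.

```python
def stable_read(read_reg, max_tries=200):
    """Re-read until two consecutive reads return the same value."""
    prev = read_reg()
    for _ in range(max_tries):
        cur = read_reg()
        if cur == prev:
            return cur          # two matching reads: treat as stable
        prev = cur
    raise RuntimeError("register never stabilised")
```

For example, with a fake register that glitches once, `stable_read` skips the bad sample and returns the first value seen twice in a row.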
 1.9 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.8 20-Mar-2018  ryo separate cputypes.h for CPU_ID_* from armreg.h,
and add some implementor IDs, CortexA55,73,75 IDs.

(preliminary changes for merging aarch64)
 1.7 06-Mar-2018  skrll Sprinkle __volatile on asm instructions
 1.6 06-Mar-2018  skrll Convert decimal to hex to make comparison to arm arm (slightly) easier.
 1.5 06-Mar-2018  skrll Another harmless typo
 1.4 06-Mar-2018  skrll Fix harmless typo
 1.3 20-Dec-2017  skrll branches: 1.3.2;
Trailing whitespace
 1.2 27-Apr-2015  skrll ARM spells the System Control Register SCTLR
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 06-Jun-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file armreg.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.9 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.3.2.8 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.3.2.7 20-Oct-2018  pgoyette Sync with head
 1.3.2.6 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.3.2.5 28-Jul-2018  pgoyette Sync with HEAD
 1.3.2.4 21-May-2018  pgoyette Sync with HEAD
 1.3.2.3 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.2 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.3.2.1 15-Mar-2018  pgoyette Synch with HEAD
 1.10.2.3 21-Apr-2020  martin Sync with HEAD
 1.10.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.10.2.1 10-Jun-2019  christos Sync with HEAD
 1.25.2.2 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.25.2.1 13-Aug-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #54):

sys/arch/aarch64/include/armreg.h: revision 1.26
sys/arch/arm/cortex/gtmr.c: revision 1.41
sys/arch/arm/include/armreg.h: revision 1.128
sys/arch/arm/cortex/gtmr_var.h: revision 1.12

Add support for physical timers and sprinkle isb where needed.
 1.30.2.1 29-Feb-2020  ad Sync with head.
 1.39.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.54.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.55.4.1 01-Aug-2021  thorpej Sync with HEAD.
 1.66.2.1 02-Aug-2025  perseant Sync with HEAD
 1.19 16-Apr-2023  skrll Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's fewer letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.18 29-Apr-2021  skrll Remove some unnecessary tlb invalidate in pmap_growkernel and ASAN shadow
map. Ensure the shadow map mappings are visible to the TLB walkers.
 1.17 21-Mar-2021  skrll branches: 1.17.2;
Adjust the kernel virtual address space so that KASAN will map the kernel
separately from managed kernel virtual memory and not map the unused space
between the two.
 1.16 11-Dec-2020  skrll branches: 1.16.2;
s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.15 26-Nov-2020  skrll Mark KASAN shadow pages as LX_BLKPAG_ATTR_NORMAL_WB. NFC as this is zero,
but someone might change it one day.
 1.14 10-Nov-2020  skrll AA64 is not MIPS.

Change all KSEG references to directmap
 1.13 20-Sep-2020  skrll branches: 1.13.2;
Use pmap_growkernel(VM_KERNEL_VM_BASE) rather than pmap_virtual_space to
work out what to map initially.

XXX could do better mapping the kernel and modules more accurately
 1.12 19-Sep-2020  skrll Make __md_palloc pmap agnostic (think sys/uvm/pmap)
 1.11 10-Sep-2020  maxv kasan: fix the copyright notices
 1.10 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.9 01-Aug-2020  maxv Use large pages for the KASAN shadow, same as amd64, discussed with ryo@.
 1.8 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.7 23-Jun-2020  maxv Rename __MD_CANONICAL_BASE -> __MD_KERNMEM_BASE for clarity.
 1.6 08-Apr-2019  ryo branches: 1.6.4;
- free empty page table pages when a certain usage is reached.
- need to take the lock when removing an old pg (_pmap_remove_pv) in _pmap_enter()
 1.5 19-Mar-2019  ryo - free L1-L3 pages that have been emptied by pmap_remove().
- if no memory is available, pmap_enter now correctly returns ENOMEM if PMAP_CANFAIL, or waits until memory becomes available if !PMAP_CANFAIL.

These changes improve stability when using huge virtual memory spaces with mmap.
 1.4 10-Nov-2018  ryo branches: 1.4.2;
add LX_BLKPAG_SH_IS pte attribute for MP
 1.3 08-Nov-2018  maxv Track the stack with kASan on aarch64. Same principle as on amd64. Illegal
accesses occurring there are now detected.

Originally written by me, but reworked by ryo@, thanks.
 1.2 02-Nov-2018  skrll Provide a kasan_md_unwind

OK maxv
 1.1 01-Nov-2018  maxv Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.4.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.4.2.1 10-Nov-2018  pgoyette file asan.h was added on branch pgoyette-compat on 2018-11-26 01:52:16 +0000
 1.6.4.2 10-Jun-2019  christos Sync with HEAD
 1.6.4.1 08-Apr-2019  christos file asan.h was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.13.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.13.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.16.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.17.2.1 13-May-2021  thorpej Sync with HEAD.
 1.9 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.8 11-May-2020  ryo "options ARMV83_PAC" is now supported for gcc as well.

- add "-msign-return-address=all" to CFLAGS for gcc when options ARMV83_PAC is specified
- AARCH64REG_{READ,WRITE}_INLINE3 macro can now use the APIAKey registers in both gcc and llvm.
llvm requires asm(".arch armv8.3-a"), whereas gcc requires __attribute__((target("arch=armv8.3-a"))).
- use ".arch armv8.3-a" rather than ".arch armv8.3-a+pac" in *.S for llvm.
 1.7 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows forging valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.6 19-Jan-2020  skrll branches: 1.6.4;
Replace the two copies of the ADDR macro with a centralised adrl macro.
The adrl name matches the one used by armasm.
 1.5 20-Dec-2019  ryo branches: 1.5.2;
Add a speculation barrier after the 'eret'.

Some aarch64 cpus speculatively execute instructions after 'eret',
and this potentiates side-channel attack.

from
https://github.com/torvalds/linux/commit/679db70801da9fda91d26caf13bf5b5ccc74e8e8
 1.4 05-Aug-2019  joerg Don't define register replacements when targeting 32bit ARM.
 1.3 17-Jul-2018  christos branches: 1.3.4;
centralize fp,lr definitions
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file asm.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.3 21-Apr-2020  martin Sync with HEAD
 1.2.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.3.4.1 24-Dec-2019  martin Pull up following revision(s) (requested by ryo in ticket #574):

sys/arch/aarch64/include/asm.h: revision 1.5
sys/arch/aarch64/aarch64/cpuswitch.S: revision 1.13

Add a speculation barrier after the 'eret'.

Some aarch64 cpus speculatively execute instructions after 'eret',
and this potentiates side-channel attack.

from
https://github.com/torvalds/linux/commit/679db70801da9fda91d26caf13bf5b5ccc74e8e8
 1.5.2.1 25-Jan-2020  ad Sync with head.
 1.6.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file bswap.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file bus_defs.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3 10-Aug-2019  skrll Really provide bus_funcs.h
 1.2 01-Apr-2018  ryo branches: 1.2.2; 1.2.6;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file bus_funcs.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.6.1 13-Aug-2019  martin Pull up following revision(s) (requested by skrll in ticket #53):

sys/arch/aarch64/include/bus_funcs.h: revision 1.3

Really provide bus_funcs.h
 1.2.2.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.4 17-Jan-2017  rin avoid conversion warnings
 1.3 29-Oct-2014  dennis branches: 1.3.2; 1.3.4; 1.3.6;
Correct 32 and 64 bit byte swap inlines
 1.2 11-Aug-2014  matt branches: 1.2.4;
Use %x/%w as appropriate.
 1.1 10-Aug-2014  matt Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.2.4.3 03-Dec-2017  jdolecek update from HEAD
 1.2.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.4.1 11-Aug-2014  tls file byte_swap.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.6.1 21-Apr-2017  bouyer Sync with HEAD
 1.3.4.1 20-Mar-2017  pgoyette Sync with HEAD
 1.3.2.1 05-Feb-2017  skrll Sync with HEAD
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cdefs.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.53 30-Dec-2024  jmcneill arm64: Enable support for low power idle CPU states on ACPI platforms.

The ACPI CPU driver parses the _LPI package on each CPU and builds a
table of supported low power states. A custom cpu_idle() implementation
is registered that uses the time previously spent idle to select an
entry method for low power on the next idle entry.

A boot option, "nolpi", can be used to ignore _LPI and use the normal
WFI idle method.

This decreases the battery discharge rate on my Snapdragon X1E laptop from
~17W to ~10W when idle.
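
The entry only says that the previous idle time is used to choose an entry method; the exact heuristic is not given. The sketch below is one plausible reading, offered purely as an assumption: pick the deepest state whose minimum residency fits within the last idle period. `pick_state`, the state names and the residency numbers are all hypothetical.

```python
def pick_state(states, last_idle_us):
    """Choose an idle state.

    states: list of (name, min_residency_us), ordered shallowest first.
    Returns the name of the deepest state whose minimum residency does
    not exceed the time spent idle last time.
    """
    choice = states[0]                  # shallowest state is the fallback
    for state in states:
        if state[1] <= last_idle_us:
            choice = state
    return choice[0]
```

Short idle periods therefore stay in the cheap state, and only long ones pay the entry and exit cost of a deeper state.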
 1.52 10-Dec-2024  jmcneill fix 32-bit arm builds
 1.51 10-Aug-2024  riastradh aarch64: Count RNDRRS failure events and add dtrace probe.

PR port-arm/58572: aarch64 RNDRRS failures should be evcounted and
dtraced
 1.50 09-May-2024  pho branches: 1.50.2;
port-arm/58194: Resurrect vmt(4) from bitrot

On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.

Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.
 1.49 25-Feb-2023  riastradh aarch64: curcpu() audit.

Sprinkle KASSERT (or KDASSERT in hot paths) for kpreempt_disabled()
when we use curcpu() and it's not immediately obvious that the caller
has preemption disabled but closer scrutiny suggests the caller has.

Note unsafe curcpu()s for syscall event counting. Not sure this is
worth changing.

Possible bugs fixed:

- cpu_irq and cpu_fiq could be preempted while trying to run softints
on this CPU.

- data_abort_handler might incorrectly think it was invoked in
interrupt context when it was only preempted and migrated to
another CPU.

- pmap_fault_fixup might report the wrong CPU logs.

(However, we don't currently run with kpreemption on aarch64, so
these are not yet real bugs fixed except if you patch it to build
with __HAVE_PREEMPTION.)
 1.48 03-Nov-2022  skrll branches: 1.48.2;
Provide MI PMAP support on AARCH64
 1.47 25-Jun-2022  jmcneill Remove GIC_SPLFUNCS.
 1.46 25-Jun-2022  jmcneill pic: Update ci_cpl in pic_set_priority callback.

Not all ICs need interrupts disabled to update the priority. DAIF accesses
are not cheap, so push the update of ci_cpl from pic_set_priority to the
IC's pic_set_priority callback, and let the IC driver determine whether
or not it needs interrupts disabled.
 1.45 02-Nov-2021  ryo In order to prevent _mcount() from being recursively called when built with COPTS=-O0,
sprinkle `__always_inline' to make _mcount() be generated as a single function.
 1.44 01-Nov-2021  skrll Fix a last minute rebase/merge botch so that the cpu_hatch commit actually
works.
 1.43 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.42 31-Oct-2021  skrll Annotate some cpu_info members
 1.41 26-Oct-2021  skrll Add a comment and adjust whitespace to match style in this file
 1.40 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
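
The t statistic quoted above can be reproduced without SciPy. The sketch below converts the elapsed times from the `time` output to seconds and computes the pooled-variance (equal variance) two-sample statistic, which is what `ttest_ind` uses by default; the helper names are hypothetical, and the p-value is not recomputed here.

```python
from math import sqrt
from statistics import mean, variance


def elapsed_seconds(text):
    """Convert an 'M:SS.ss' elapsed time to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def student_t(a, b):
    """Two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


before = [elapsed_seconds(t) for t in ("2:40.65", "2:38.49", "2:37.27")]
after = [elapsed_seconds(t) for t in ("2:37.50", "2:38.54", "2:35.88")]
t = student_t(before, after)
```

With three samples per side, a statistic of this size is consistent with the entry's conclusion that the difference is not significant.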
 1.39 18-Sep-2021  jmcneill gic_splx: performance optimizations

Avoid any kind of register access (DAIF, PMR, etc), barriers, and atomic
operations in the common case where no interrupt fires between spl being
raised and lowered.

This introduces a per-CPU return address (ci_splx_restart) used by the
vector handler to restart a sequence in splx that compares the new ipl
with the per-CPU hardware priority state stored in ci_hwpl.
 1.38 14-Aug-2021  ryo Improved the performance of kernel profiling on MULTIPROCESSOR, and made it possible to get profiling data for each CPU.

In the current implementation, locks are acquired at the entrance of the mcount
internal function, so the higher the number of cores, the more lock contention
occurs, making profiling in a MULTIPROCESSOR environment unusably
slow. Profiling buffers have been changed to be reserved for each CPU,
improving profiling performance in MP by several to several dozen times.

- Eliminated cpu_simple_lock in mcount internal function, using per-CPU buffers.
- Add ci_gmon member to struct cpu_info of each MP arch.
- Add kern.profiling.percpu node in sysctl tree.
- Add new -c <cpuid> option to kgmon(8) to specify the cpuid, like openbsd.
For compatibility, if the -c option is not specified, the entire system can be
operated as before, and the -p option will get the total profiling data for
all CPUs.
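
The per-CPU buffer idea can be sketched as follows. This is a toy model with hypothetical names (`Profiler`, `record`, `data`), not the gmon data structures: each CPU records into its own buffer, so the hot path needs no shared lock, and totals are produced only when the data is read.

```python
from collections import Counter


class Profiler:
    def __init__(self, ncpu):
        # One private buffer per CPU, so recording needs no shared lock.
        self.percpu = [Counter() for _ in range(ncpu)]

    def record(self, cpuid, frompc, selfpc):
        # Hot path: touches only the calling CPU's buffer.
        self.percpu[cpuid][(frompc, selfpc)] += 1

    def data(self, cpuid=None):
        # With a cpuid, return that CPU's data; otherwise the system total.
        if cpuid is not None:
            return Counter(self.percpu[cpuid])
        total = Counter()
        for buf in self.percpu:
            total.update(buf)
        return total
```

Calling `data()` without a CPU id mirrors the compatibility behaviour described above, where the total for all CPUs is returned.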
 1.37 08-Aug-2021  skrll Re-apply

Move 'struct pic_pending' from percpu to struct cpu_info. Saves a few
instructions in splx.

There is(/was) no need to use atomic operations on the percpu / cpu_info
members, so don't.

Finally, removing the use of percpu should help avoid problems with "late"
attaching cpus.
 1.36 29-May-2021  skrll Deal with the pmap limitation of maxproc in a more complete way and
recognise CPUs with only 8bit ASIDs.
 1.35 29-May-2021  skrll Sort includes. NFCI.
 1.34 27-Mar-2021  jmcneill branches: 1.34.2; 1.34.4;
Revert recent pic optimizations until I have more time to work on this.
 1.33 21-Feb-2021  jmcneill branches: 1.33.2;
Add cpu_dosoftints_ci(). Like cpu_dosoftints(), but takes a cpu_info ptr
so we can avoid the extra tpidr_el1 access if cpu_info is already known.
 1.32 21-Feb-2021  jmcneill Keep current hardware priority value in struct cpu_info and use it instead
of reading icc_pmr_el1 in gicv3_set_priority.
 1.31 20-Feb-2021  jmcneill Move 'struct pic_pending' from percpu to struct cpu_info. Saves a few
instructions in splx.
 1.30 07-Dec-2020  jmcneill ACPI Processor UID is 32-bits (ci_acpiid).
 1.29 21-Nov-2020  jmcneill Add a per-CPU event counter that counts every time an interrupt handler is
preempted by a higher priority interrupt.
 1.28 01-Oct-2020  ryo branches: 1.28.2;
fix build error with LLVM
 1.27 14-Sep-2020  ryo PID_MAX is just an initial value (soft maximum). Don't use it for CTASSERT.
defined __HAVE_CPU_MAXPROC to use function cpu_maxproc().

pointed out by mrg@, thanks.
 1.26 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.25 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
 1.24 01-Jul-2020  ryo Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are both 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
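
The three rules above reduce to a small decision function. This is a sketch of the stated conditions only, with a hypothetical function name; it says nothing about how the kernel actually selects or patches its sync routine.

```python
def icache_sync_plan(dic, idc, loc, louis, louu):
    """Return (need_dcache_clean, need_icache_invalidate).

    dic, idc: CTR_EL0.DIC and CTR_EL0.IDC as booleans.
    loc, louis, louu: CLIDR_EL1.LoC, LoUIS and LoUU as integers.
    """
    # DIC=1: Icache invalidation is not required.
    need_iinval = not dic
    # IDC=1, or the CLIDR_EL1 levels being zero: Dcache clean not required.
    levels_say_no_clean = loc == 0 or (louis == 0 and louu == 0)
    need_dclean = not idc and not levels_say_no_clean
    return need_dclean, need_iinval
```

On a CPU reporting both DIC and IDC, the plan is `(False, False)`, so instruction/data coherence needs no cache maintenance by address.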
 1.23 29-Jun-2020  riastradh Draft fpu_kern_enter/leave on aarch64.
 1.22 10-Mar-2020  christos protect curcpu/curlwp from _KMEMUSER
 1.21 15-Feb-2020  skrll Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}
 1.20 12-Feb-2020  riastradh Define the MULTIPROCESSOR cpu_number() for modules too.

Modules should work whether the main kernel is multiprocessor or not.
In particular, dtrace should not think cpu_number() is 0 while
cpu_index(curcpu()) and curcpu()->ci_index are nonzero, leading to
rather spectacularly bogus results...
 1.19 15-Jan-2020  mrg port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.18 12-Jan-2020  mrg provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
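
As a rough illustration of the "track the fastest set" step described above, the sketch below marks every CPU whose capacity is below the maximum as slow. It is a two-pass simplification with hypothetical type and function names. The commit itself does the tracking while CPUs attach and reports the result through cpu_topology_set().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; the value stands in for the FDT
 * "capacity-dmips-mhz" property. */
struct cpu_cap {
	uint32_t capacity_dmips_mhz;
	bool is_slow;
};

/* Mark every CPU that is not in the fastest set as slow. */
static void
mark_slow_cpus(struct cpu_cap *cpus, size_t n)
{
	uint32_t best = 0;

	/* First pass: find the highest advertised capacity. */
	for (size_t i = 0; i < n; i++) {
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;
	}

	/* Second pass: anything below the best capacity is slow. */
	for (size_t i = 0; i < n; i++)
		cpus[i].is_slow = cpus[i].capacity_dmips_mhz < best;
}
```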
 1.17 05-Jan-2020  ad branches: 1.17.2;
Give aarch64 a preemption safe cpu_intr_p().
 1.16 02-Dec-2019  ad + ci_onproc
 1.15 21-Nov-2019  ad mi_userret(): take care of calling preempt(), set spc_curpriority directly,
and remove MD code that does the same.
 1.14 19-Oct-2019  jmcneill Increase aarch64 MAXCPUS to 256.
 1.13 21-Dec-2018  ryo branches: 1.13.4;
- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)
 1.12 24-Nov-2018  skrll Provide a LWP_PC for Taylor
 1.11 20-Nov-2018  mrg rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.
 1.10 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.9 12-Oct-2018  jmcneill Add ACPI Processor Unique ID (ci_acpiid) to struct cpu_info, required by
ACPI subsystem.
 1.8 10-Sep-2018  ryo cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.
 1.7 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.6 08-Aug-2018  jmcneill Add fields for per-cpu GICv3 state
 1.5 23-Jul-2018  ryo rather than using flags to resolve nested locks, reserve pool_cache before locking.
 1.4 21-Jul-2018  ryo * avoid deadlock. mutex_owned() works only for adaptive lock, therefore we cannot use it for spinlock...
* add more NULL check
* clear pte when pmap_enter() fails
 1.3 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.7 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.6 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.5 20-Oct-2018  pgoyette Sync with head
 1.1.28.4 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cpu.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.13.4.2 12-Feb-2020  martin Pull up following revision(s) (requested by riastradh in ticket #701):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.2
external/cddl/osnet/dist/lib/libdtrace/common/dt_open.c: revision 1.17
external/cddl/osnet/dist/lib/libdtrace/common/dt_module.c: revision 1.18
sys/modules/cyclic/Makefile: revision 1.5
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.2
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.3
sys/arch/aarch64/aarch64/vectors.S: revision 1.10
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.2
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.3
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.4
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.5
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.6
sys/arch/aarch64/include/cpu.h: revision 1.20
external/cddl/osnet/dist/lib/libdtrace/common/dt_impl.h: revision 1.9

Create a buffer space of 512 bytes before the trapframe.

dtrace fbt needs enough space to emulate an

stp x29, x30, [sp,#-FRAMESIZE]!

instruction in a function prologue. In the aarch64 instruction
encoding, FRAMESIZE can be as large as 512 bytes, so reserve this
much space when KDTRACE_HOOKS is enabled.
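
The 512-byte figure follows from the instruction encoding: the pre-indexed 64-bit STP carries a signed 7-bit immediate scaled by 8, so the largest downward adjustment is 64 * 8 = 512 bytes. The decoder below is a minimal sketch with hypothetical helper names. It only recognises the 64-bit pre-index form and is not the matching logic used by fbt.

```c
#include <stdbool.h>
#include <stdint.h>

/* STP (64-bit, pre-index): bits [31:22] are 1010100110. */
#define STP_PRE64_MASK   0xffc00000u
#define STP_PRE64_VALUE  0xa9800000u

/* True if insn is "stp Xt, Xt2, [Xn|SP, #imm]!". */
static bool
is_stp_pre64(uint32_t insn)
{
	return (insn & STP_PRE64_MASK) == STP_PRE64_VALUE;
}

/* Byte offset: signed 7-bit immediate in bits [21:15], scaled by 8. */
static int
stp_pre64_offset(uint32_t insn)
{
	int imm7 = (int)((insn >> 15) & 0x7f);

	if (imm7 & 0x40)	/* sign-extend from 7 bits */
		imm7 -= 0x80;
	return imm7 * 8;
}
```

For example, 0xa9bf7bfd is the common prologue `stp x29, x30, [sp, #-16]!`.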

Use db_write_bytes to overwrite kernel text.

Tidy up a bit. No functional change intended.

aarch64 fbt_invop doesn't actually use the argument, but it would
make more sense for it to be the return value and/or first argument
register. Certainly it's not `eax'!

Tidy up a bit: don't set things we won't use; assert nonzeroness.

Use /dev/ksyms, not /netbsd, for the running kernel's symbols.

Teach dtrace about el1_trap_exit frames on aarch64.

Implement dtrace_getarg and dtrace_getreg while here.

Count the number of artificial frames in aarch64 fbt probe correctly.

Change the address ranges that aarch64 considers toxic for dtrace.
`Toxic' means dtrace forbids D scripts from even attempting to read
or write at them.

Previously we considered [0, VM_MIN_KERNEL_ADDRESS) toxic, but
VM_MIN_KERNEL_ADDRESS is only the minimum address of the kernel map;
the direct-mapped region lies below it, and with PMAP_MAP_POOLPAGE we
allocate virtual pages for pool backing directly from physical pages
through the direct-mapped region. Also, this did not consider I/O
mappings to be toxic, which they probably should be.

Instead, treat:

[0, AARCH64_KSEG_START)
and
[VM_KERNEL_IO_ADDRESS, 0xfff...ff)

as toxic. (The upper bound for 0xfff...ff ought to be inclusive, not
exclusive, but I think we'll need another mechanism for expressing
that to dtrace!)
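
A minimal sketch of the toxic-range check, including the exclusive upper bound the note above complains about. The two address constants are placeholders chosen for illustration, not the real values of AARCH64_KSEG_START or VM_KERNEL_IO_ADDRESS, and the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the kernel's constants. */
#define EX_KSEG_START         0xffff000000000000ULL
#define EX_KERNEL_IO_ADDRESS  0xfffffffff0000000ULL

/* Half-open range [base, limit): limit itself is not covered. */
struct toxic_range {
	uint64_t base;
	uint64_t limit;
};

static const struct toxic_range toxic[] = {
	{ 0, EX_KSEG_START },
	{ EX_KERNEL_IO_ADDRESS, UINT64_MAX },
};

/* True if addr falls inside any toxic range. */
static bool
addr_is_toxic(uint64_t addr)
{
	for (size_t i = 0; i < sizeof(toxic) / sizeof(toxic[0]); i++) {
		if (addr >= toxic[i].base && addr < toxic[i].limit)
			return true;
	}
	return false;
}
```

With half-open ranges the very last address, 0xfff...ff, cannot be expressed as toxic, which is the gap the message points out.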

Switch from db_write_bytes to using direct-mapping.

This way there's no dependency on ddb.

Define the MULTIPROCESSOR cpu_number() for modules too.
Modules should work whether the main kernel is multiprocessor or not.
In particular, dtrace should not think cpu_number() is 0 while
cpu_index(curcpu()) and curcpu()->ci_index are nonzero, leading to
rather spectacularly bogus results...

cyclic.kmod needs -Wno-sign-compare for aarch64 CPU_INFO_FOREACH.
Provisional workaround; feel free to fix.
 1.13.4.1 23-Oct-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #359):

sys/arch/aarch64/aarch64/locore.S: revision 1.42
sys/arch/aarch64/aarch64/locore.S: revision 1.43
sys/arch/aarch64/aarch64/locore.S: revision 1.44
sys/arch/arm/fdt/cpu_fdt.c: revision 1.28
sys/arch/aarch64/include/cpu.h: revision 1.14
sys/arch/aarch64/include/param.h: revision 1.12
sys/arch/arm/arm32/cpu.c: revision 1.133
sys/arch/arm/arm32/cpu.c: revision 1.134
sys/arch/arm/include/cpu.h: revision 1.101
sys/arch/arm/acpi/cpu_acpi.c: revision 1.7
sys/arch/aarch64/aarch64/cpu.c: revision 1.23
sys/arch/aarch64/aarch64/cpu.c: revision 1.24
sys/arch/aarch64/aarch64/cpu.c: revision 1.25

Increase aarch64 MAXCPUS to 256.

-

Invalidate dcache before polling AP hatched status

-

Avoid overlap between BP and last AP stack. AP stacks are now in order of
increasing address order.

Spotted by and idea from mlelstv.

-

Use separate cacheline aligned arrays for mbox and hatched as before.

-

cpu_hatched_p only for MULTIPROCESSOR
 1.17.2.2 29-Feb-2020  ad Sync with head.
 1.17.2.1 17-Jan-2020  ad Sync with head.
 1.28.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.28.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.33.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.34.4.1 31-May-2021  cjep sync with head
 1.34.2.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.48.2.1 13-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #955):

sys/arch/aarch64/aarch64/cpu.c: revision 1.78
sys/arch/aarch64/include/cpu.h: revision 1.51

aarch64: Count RNDRRS failure events and add dtrace probe.

PR port-arm/58572: aarch64 RNDRRS failures should be evcounted and
dtraced
 1.50.2.1 02-Aug-2025  perseant Sync with HEAD
 1.2 22-Feb-2021  ryo PR/56002: aarch64 has a true 64bit CPU cycle counter, we will use it.

This fix solves PR/56002 on aarch64, but this problem can occur on
all other architectures where cpu_counter() is 32bit.
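
As a generic illustration of why a 32-bit cycle counter is lossy (this is not a reproduction of the specific failure in PR/56002), the sketch below compares an interval measured with a full 64-bit count against the same interval seen through a 32-bit truncation. All function names are hypothetical.

```c
#include <stdint.h>

/* What a 32-bit cpu_counter() would report for a given 64-bit count. */
static uint32_t
truncate32(uint64_t cycles)
{
	return (uint32_t)cycles;
}

/* Elapsed cycles from a full 64-bit counter. */
static uint64_t
elapsed64(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Elapsed cycles from a 32-bit counter: correct only modulo 2^32. */
static uint32_t
elapsed32(uint32_t start, uint32_t end)
{
	return end - start;
}
```

Any interval longer than 2^32 cycles, which is only a few seconds at GHz clock rates, is silently shortened by the 32-bit version.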
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.42;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.42.1 03-Apr-2021  thorpej Sync with HEAD.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file cpu_counter.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.28 30-Dec-2024  jmcneill aarch64: Allow for alternate cpu_idle() implementations
 1.27 07-Feb-2024  msaitoh branches: 1.27.2;
Remove ryo@'s mail addresses.
 1.26 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.25 10-Sep-2022  rillig fix misspellings of 'available' and nearby typos
 1.24 20-Jul-2022  riastradh aarch64: Make cpufunc.h includable without sys/cpu.h first.
 1.23 31-Jan-2022  ryo add support Hardware updates to Access flag and Dirty state (FEAT_HAFDBS)

- The DBM bit of the PTE is now used to determine if it is writable, and
the AF bit is treated entirely as a reference bit. A valid PTE is always
treated as readable. There can be no valid PTE that is not readable.
- LX_BLKPAG_OS_{READ,WRITE} are used only for debugging purposes,
and have been superseded by LX_BLKPAG_AF and LX_BLKPAG_DBM.
- Improve comment

The need for reference/modify emulation has been eliminated,
and access/permission faults have been reduced, however,
there has been little change in overall performance.
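
The first bullet above can be restated as three predicates over a PTE. The sketch uses the architectural bit positions (valid is bit 0, AF is bit 10, DBM is bit 51) under hypothetical EX_ names. It mirrors the wording of the message rather than the pmap code, which uses its own LX_BLKPAG_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural descriptor bits, under illustrative names. */
#define EX_PTE_VALID  (1ULL << 0)
#define EX_PTE_AF     (1ULL << 10)	/* Access flag */
#define EX_PTE_DBM    (1ULL << 51)	/* Dirty Bit Modifier */

/* A valid PTE is always treated as readable. */
static bool
pte_readable(uint64_t pte)
{
	return (pte & EX_PTE_VALID) != 0;
}

/* Writability is determined by the DBM bit. */
static bool
pte_writable(uint64_t pte)
{
	return pte_readable(pte) && (pte & EX_PTE_DBM) != 0;
}

/* The AF bit acts purely as a reference bit. */
static bool
pte_referenced(uint64_t pte)
{
	return pte_readable(pte) && (pte & EX_PTE_AF) != 0;
}
```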
 1.22 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.21 23-Oct-2021  skrll Whitespace
 1.20 27-May-2021  ryo fix build error with options ARMV85_BTI
 1.19 04-Dec-2020  skrll branches: 1.19.4; 1.19.6;
Remove unnecessary casts
 1.18 03-Aug-2020  ryo branches: 1.18.2;
Implement MD ucas(9) (__HAVE_UCAS_FULL)
 1.17 02-Aug-2020  maxv Add support for Privileged Access Never (ARMv8.1-PAN).

PAN provides the same functionality as SMAP on x86: it forbids kernel
access to userland pages when PSTATE.PAN=1, and allows such accesses when
PSTATE.PAN=0.

We clear SCTLR_SPAN, to guarantee that PAN=1 each time the kernel is
entered. We catch PAN faults and panic right away without further
processing. In copyin, copyout, etc, we temporarily authorize access to
userland pages.

PAN is a very useful exploit mitigation. Reviewed by ryo@, thanks. Tested
on Qemu. Enabled by default.
 1.16 01-Jul-2020  ryo Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
 1.15 25-May-2020  ryo cache information can be detected correctly on newer CPUs

- add VPIPT cache type
- adapt to 64-bit CCSIDR (ARMv8.3-CCIDX)
- CCSIDR:[WT,WB,PA,WA] are deprecated
- show number of cache lines when attaching cpu
 1.14 15-May-2020  ryo SCTLR_EnIA should be enabled in the caller(locore).

For some reason, gcc makes the aarch64_pac_init() function non-leaf, and it uses paciasp/autiasp.
 1.13 13-May-2020  ryo - move aarch64 addressspace macros from pmap.h to cpufunc.h
- rename ptr_strip_pac() to aarch64_strip_pac()
 1.12 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows forging valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.11 15-Jan-2020  mrg branches: 1.11.4;
port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.
 1.10 12-Jan-2020  mrg provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
 1.9 19-Dec-2019  ryo branches: 1.9.2;
aarch64_cache_info[] is not global
 1.8 22-Nov-2019  mlelstv Make cache operations available early.
 1.7 13-Sep-2019  ryo In pmap_devmap_bootstrap(), cpu_earlydevice_va_p() must not return true until *all* devmap tables have been enabled.
console mapping may be present in the last table.
 1.6 07-Sep-2019  ryo add checking status of MMU and devmap to make _platform_early_putchar() available at all times.
 1.5 21-Dec-2018  ryo - add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)
 1.4 15-Dec-2018  alnsn Add missing include for device_t declaration.
 1.3 26-Aug-2018  ryo add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!
 1.2 23-Jul-2018  ryo * fix icache invalidations.
* "ic ivau" (aarch64_icache_sync_range) with VA generates permission fault in some situations, therefore use KSEG address for now.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.4 21-Apr-2020  martin Sync with HEAD
 1.1.4.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.5 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file cpufunc.h was added on branch pgoyette-compat on 2018-04-07 04:12:11 +0000
 1.9.2.1 17-Jan-2020  ad Sync with head.
 1.11.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.18.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.19.6.1 31-May-2021  cjep sync with head
 1.19.4.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.27.2.1 02-Aug-2025  perseant Sync with HEAD
 1.16 31-Oct-2021  skrll Fix crash(8) build
 1.15 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.14 30-Apr-2021  skrll Make the ddb for pmap / pte information pmap agnostic
 1.13 11-Mar-2021  ryo branches: 1.13.4;
- fixed a problem where hardware {break,watch}points other than #0 could not be cleared
- hardware {break,watch}point addresses are now strictly checked
 1.12 09-Mar-2021  ryo Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().
 1.11 14-Sep-2020  ryo branches: 1.11.2;
sprinkle LE32TOH to fetch instructions on aarch64eb
 1.10 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strrdisasm() instead of the caller.
correctly disassemble by processor state if defined DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT.
 1.9 22-May-2020  ryo fix to do backtrace properly for running LWPs and cpu_lwp_fork().
when dump of pcb_tf, only the switchframe part is now displayed instead of the whole trapframe.
 1.8 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.7 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.6 17-Jul-2018  ryo use panic() instead of some printf to show fault status.
useful for ddb "show panic" command.
 1.5 28-Apr-2018  ryo branches: 1.5.2;
Oops, my previous commit is totally wrong. recast mask/pattern list.
pointed out by David Binderman in PR/53224, thanks.
 1.4 27-Apr-2018  ryo remove suspicious compare, and cleanup complex conditionals.
pointed out PR/53159 by dcb314, thanks.
 1.3 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 11-Jan-2016  skrll branches: 1.2.16;
PR port-arm/50641: src/sys/arch/aarch64/include/db_machdep.h:67: possible bad if test ?

Fix the bl instruction test.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.1 19-Mar-2016  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file db_machdep.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.16.5 20-Oct-2018  pgoyette Sync with head
 1.2.16.4 30-Sep-2018  pgoyette Sync with HEAD
 1.2.16.3 28-Jul-2018  pgoyette Sync with HEAD
 1.2.16.2 02-May-2018  pgoyette Synch with HEAD
 1.2.16.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.5.2.1 10-Jun-2019  christos Sync with HEAD
 1.11.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.13.4.1 13-May-2021  thorpej Sync with HEAD.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file disklabel.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5 30-May-2022  jkoshy Use the ABI value for 'R_AARCH64_TLSLD_LDST128_DTPREL_LO12_NC'.
 1.4 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.3 15-Aug-2018  ryo fix relocation type. 277 is R_AARCH64_ADD_ABS_LO12_NC
 1.2 06-Nov-2017  christos branches: 1.2.2; 1.2.4;
Cleanup and clarify the ELFSIZE mess:

We now have 2 variables automatically set in elf_machdep.h:

ARCH_ELFSIZE: the size for userland binaries
KERN_ELFSIZE: the size for the kernel binaries

DB_ELFSIZE has been deleted and KERN_ELFSIZE should always have the
same values DB_ELFSIZE used to have.

In sys/exec_elf.h, if ELFSIZE is not set, it is set to KERN_ELFSIZE
for the kernel and ARCH_ELFSIZE for userland. These defaults should
eliminate the need for most manual ELFSIZE setting.
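
The defaulting rule described above can be sketched with the preprocessor. The two values are deliberately different placeholders so that the selection is visible; they are not what elf_machdep.h sets on aarch64. The accessor function is hypothetical.

```c
/* Placeholder values for illustration; a real elf_machdep.h sets these. */
#define ARCH_ELFSIZE 32		/* size for userland binaries */
#define KERN_ELFSIZE 64		/* size for kernel binaries */

/* Default ELFSIZE only when the includer has not set it already. */
#ifndef ELFSIZE
#ifdef _KERNEL
#define ELFSIZE KERN_ELFSIZE
#else
#define ELFSIZE ARCH_ELFSIZE
#endif
#endif

/* Hypothetical accessor so the chosen value can be inspected. */
static int
elfsize(void)
{
	return ELFSIZE;
}
```

Compiled without _KERNEL and without a manual ELFSIZE, this picks the userland size. Defining ELFSIZE before the block overrides both defaults.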
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file elf_machdep.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.4.1 10-Jun-2019  christos Sync with HEAD
 1.2.2.2 20-Oct-2018  pgoyette Sync with head
 1.2.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file endian.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file endian_machdep.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file fenv.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file float.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 29-Jun-2020  riastradh Move aarch64/fpu.h to arm/fpu.h.
 1.1 29-Jun-2020  riastradh Draft fpu_kern_enter/leave on aarch64.
 1.5 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.4 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.3 03-Dec-2019  jmcneill Define lwp_trapframe() macro
 1.2 01-Apr-2018  ryo branches: 1.2.2; 1.2.6;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file frame.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.6.1 09-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #525):

external/cddl/osnet/dev/dtrace/aarch64/dtrace_isa.c: revision 1.1
distrib/sets/lists/modules/md.i386: revision 1.83
share/mk/bsd.own.mk: revision 1.1168
usr.bin/mkubootimage/mkubootimage.c: revision 1.25
sys/modules/dtrace/Makefile: revision 1.7
usr.bin/mkubootimage/mkubootimage.c: revision 1.26
sys/modules/dtrace/Makefile: revision 1.8
external/cddl/osnet/dist/lib/libdtrace/aarch64/dt_isadep.c: revision 1.2
distrib/sets/lists/modules/mi: revision 1.128
sys/arch/aarch64/include/frame.h: revision 1.3
sys/arch/evbarm/conf/mk.generic64: revision 1.4
external/cddl/osnet/dist/lib/libdtrace/common/dt_link.c: revision 1.12
sys/modules/cyclic/Makefile: revision 1.4
sys/arch/aarch64/conf/Makefile.aarch64: revision 1.16
external/cddl/osnet/dev/dtrace/aarch64/dtrace_subr.c: revision 1.1
sys/arch/aarch64/aarch64/start.S: revision 1.3
sys/arch/aarch64/aarch64/trap.c: revision 1.22
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.c: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/dtrace_asm.S: revision 1.1
external/cddl/osnet/dev/fbt/aarch64/fbt_isa.h: revision 1.1
external/cddl/osnet/dev/dtrace/aarch64/regset.h: revision 1.1
external/cddl/osnet/lib/libdtrace/Makefile: revision 1.26
distrib/sets/lists/modules/md.amd64: revision 1.82
usr.bin/mkubootimage/mkubootimage.1: revision 1.13
distrib/sets/lists/modules/ad.arm: revision 1.14

Add KDTRACE_HOOKS support.

Define lwp_trapframe() macro

dtrace: add support for aarch64

Add syscall_linux back for other arm architectures (accidentally removed
in previous)

Add -u flag for updating headers in place.

Fix alignment of .text section by changing load address to
0xffffffc000000000 and adding 64 bytes of padding before the entry point.

Update arm64 image header in place

Move dtrace_syscall_linux out of mi set list

Enable DTrace on aarch64

Fix signed/unsigned comparison
 1.2.2.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3 30-Aug-2021  jmcneill Add definition for HCR_E2H bit
 1.2 29-Aug-2020  maxv Slightly clarify, and style.
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file hypervisor.h was added on branch pgoyette-compat on 2018-04-07 04:12:11 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file ieee.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file ieeefp.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file int_const.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3 13-Aug-2014  matt branches: 1.3.2;
Back out last change.
 1.2 13-Aug-2014  justin Add formatting for aarch64 as using arm ones errors for ll on 64 bit types
 1.1 10-Aug-2014  matt Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.3.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.3.2.1 13-Aug-2014  tls file int_fmtio.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file int_limits.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file int_mwgwtypes.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file int_types.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file intr.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 09-Nov-2018  mrg implement dumpsys() and friends for arm64.

this is almost a direct copy of the arm code, which is simple
as the basic structures describing physical memory are the same
between arm and arm64. the main change i made was to use
the direct map instead of a virtual dump page that is remapped
to whatever physical page is being dumped.

i also changed the existing cpu_kcore_hdr_t to include the
missing number of ram segments.

note that this is not a complete solution for crash dumps yet,
as the libkvm code needs some work. i'm fairly positive that
this side is correct, as i can see the data i expect to see,
but libkvm's _kvm_kvtop() function returns garbage so far.

there is no "minidump" support here yet, ala amd64, but we
probably want it eventually.


ok skrll@.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28; 1.1.30;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.30.1 10-Jun-2019  christos Sync with HEAD
 1.1.28.1 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file kcore.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file limits.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5 24-Jul-2022  riastradh aarch64/lock.h: Need <sys/param.h> for _HARDKERNEL.

Add include guard while here.

XXX Why does this aarch64-specific file have #ifdef __aarch64__?
 1.4 26-Sep-2021  jmcneill Use the yield instruction as SPINLOCK_BACKOFF_HOOK for aarch64.
 1.3 26-Jun-2015  matt Use <sys/common_lock.h> for !__arm__
 1.2 13-Aug-2014  matt branches: 1.2.2; 1.2.4;
Use __ATOMIC_RELAXED in __cpu_simple_lock_init
 1.1 10-Aug-2014  matt Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.2.4.1 22-Sep-2015  skrll Sync with HEAD
 1.2.2.3 03-Dec-2017  jdolecek update from HEAD
 1.2.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.2.1 13-Aug-2014  tls file lock.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.9 01-Mar-2021  jmcneill Add DISABLE_INTERRUPT_SAVE(), like DISABLE_INTERRUPT() but also returns
the previous state.

Use DISABLE_INTERRUPT_SAVE()/ENABLE_INTERRUPT() in pic_splfuncs instead
of cpsid()/cpsie(). The difference here is the caller no longer specifies
which bits to disable and enable; on arm32 we continue to use I32_bit and
on aarch64 we now consistently toggle both IRQ and FIQ state.
 1.8 20-Feb-2021  jmcneill daif_disable: since we read bits before setting them, if the current state
matches the desired state we can skip the daif write
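A minimal sketch of the read-then-conditionally-write idea from revision 1.8, assuming a plain variable stands in for the DAIF register. The names `fake_daif`, `fake_daif_writes` and `sketch_daif_disable` are hypothetical and are not the kernel's identifiers; the real code goes through system register accessors rather than a C variable.

```c
#include <stdint.h>

/* Illustrative mask values for the IRQ and FIQ bits. */
#define SKETCH_DAIF_I	UINT64_C(0x80)
#define SKETCH_DAIF_F	UINT64_C(0x40)

/* Hypothetical stand-in for the DAIF register, plus a write counter. */
static uint64_t fake_daif;
static unsigned fake_daif_writes;

static uint64_t
fake_daif_read(void)
{
	return fake_daif;
}

static void
fake_daif_set(uint64_t bits)
{
	fake_daif |= bits;
	fake_daif_writes++;
}

/*
 * Mask the requested bits and return the previous state.  Because the
 * old state is read first anyway, the write is skipped when every
 * requested bit is already set.
 */
uint64_t
sketch_daif_disable(uint64_t bits)
{
	uint64_t old = fake_daif_read();

	if ((old & bits) != bits)
		fake_daif_set(bits);
	return old;
}
```

Calling `sketch_daif_disable(SKETCH_DAIF_I | SKETCH_DAIF_F)` twice in a row performs only one write in this model; the second call sees that both bits are already set and returns immediately.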
 1.7 07-Feb-2021  jmcneill Use reg_daif{set,clr}_write directly instead of daif_{en,dis}able for
ENABLE_INTERRUPT() and DISABLE_INTERRUPT() macros, to avoid an unnecessary
reg_daif_read().
 1.6 30-Oct-2020  skrll branches: 1.6.2;
Retire arm_[di]sb in favour of the isb() and dsb(sy) macro invocations.
 1.5 09-Jul-2018  jmcneill Include aarch64/machdep.h for arm32 compat.
 1.4 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 16-Mar-2017  chs branches: 1.2.12;
allow pcu_save() and pcu_discard() to be called on other threads,
ptrace needs to use it that way.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6; 1.1.10; 1.1.14;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.14.1 21-Apr-2017  bouyer Sync with HEAD
 1.1.10.1 20-Mar-2017  pgoyette Sync with HEAD
 1.1.6.1 28-Aug-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file locore.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.12.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.12.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.6.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.1 30-Nov-2024  christos branches: 1.1.4;
Create a new header lwp_private.h to contain _lwp_getprivate_fast,
_lwp_gettcb_fast, _lwp_settcb and remove them from mcontext.h, so that:
1. we don't need special hacks to hide them
2. we can include <lwp.h> where needed to get the necessary prototypes
without redefining them locally.
 1.1.4.2 02-Aug-2025  perseant Sync with HEAD
 1.1.4.1 30-Nov-2024  perseant file lwp_private.h was added on branch perseant-exfatfs on 2025-08-02 05:55:20 +0000
 1.19 07-Feb-2024  msaitoh Remove ryo@'s mail addresses.
 1.18 30-Aug-2021  jmcneill Add FIQ support.
 1.17 06-Sep-2020  ryo Fix panic caused by modload. http://mail-index.netbsd.org/port-arm/2020/08/30/msg006960.html

The address space reserved for modules may not be mapped in L1-L3.
 1.16 06-Aug-2020  ryo revert the changes of http://mail-index.netbsd.org/source-changes/2020/08/03/msg120183.html

This change is overengineered.
bus_space_{peek,poke}_N does not have to be reentrant nor available for interrupt context.

requested by skrll@
 1.15 03-Aug-2020  ryo Fix a problem in which a fault that occurred in an interrupt handler during copyin/copyout was erroneously detected as having been caused by copyin.

- keep idepth in faultbuf and compare it to avoid unnecessary fault recovery
- make cpu_set_onfault() nestable to use bus_space_{peek,poke}()
in hardware interrupt handlers during copyin & copyout.
 1.14 08-Jul-2020  ryo Determination of A64,A32,T32 for disasm is now done in strrdisasm() instead of the caller.
Disassemble correctly according to processor state when DEBUG_DUMP_ON_USERFAULT or DEBUG_DDB_ON_USERFAULT is defined.
 1.13 01-Jul-2020  ryo - On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
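A minimal sketch of how a system-wide minimum cache line size could be derived from per-CPU CTR_EL0 values, as revision 1.13 describes. It assumes the architectural field layout (IminLine in bits 3:0, DminLine in bits 19:16, each encoding log2 of the line size in 4-byte words). The function names are hypothetical, and the DIC/IDC handling mentioned in the log is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CTR_IMINLINE_MASK	UINT64_C(0xf)
#define SKETCH_CTR_DMINLINE_SHIFT	16
#define SKETCH_CTR_DMINLINE_MASK	(UINT64_C(0xf) << SKETCH_CTR_DMINLINE_SHIFT)

static unsigned
sketch_iminline(uint64_t ctr)
{
	return (unsigned)(ctr & SKETCH_CTR_IMINLINE_MASK);
}

static unsigned
sketch_dminline(uint64_t ctr)
{
	return (unsigned)((ctr & SKETCH_CTR_DMINLINE_MASK) >>
	    SKETCH_CTR_DMINLINE_SHIFT);
}

/* Convert a line-size field (log2 of 4-byte words) to bytes. */
static unsigned
sketch_line_bytes(unsigned field)
{
	return 4u << field;
}

/*
 * Build the value a trapped "mrs Xt, ctr_el0" could return: the first
 * CPU's value with both line-size fields replaced by the smallest
 * value seen on any CPU.
 */
uint64_t
sketch_ctr_system_min(const uint64_t *ctr, size_t ncpu)
{
	assert(ncpu > 0);

	unsigned imin = sketch_iminline(ctr[0]);
	unsigned dmin = sketch_dminline(ctr[0]);

	for (size_t i = 1; i < ncpu; i++) {
		if (sketch_iminline(ctr[i]) < imin)
			imin = sketch_iminline(ctr[i]);
		if (sketch_dminline(ctr[i]) < dmin)
			dmin = sketch_dminline(ctr[i]);
	}

	uint64_t v = ctr[0] &
	    ~(SKETCH_CTR_IMINLINE_MASK | SKETCH_CTR_DMINLINE_MASK);
	return v | imin | ((uint64_t)dmin << SKETCH_CTR_DMINLINE_SHIFT);
}
```

Reporting the minimum is the conservative choice: userland code that steps through a buffer by the reported line size will then not skip lines on the CPUs whose lines are smaller.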
 1.12 29-Jun-2020  riastradh Draft fpu_kern_enter/leave on aarch64.
 1.11 23-May-2020  ryo In addition to the kernel thread keys, the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec and switched
on context switch.
Userland programs can now perform pointer authentication on ARMv8.3+PAC CPUs.

reviewed by maxv@, thanks.
 1.10 15-Feb-2020  skrll Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}
 1.9 18-Dec-2019  riastradh branches: 1.9.2;
New function cpu_startup_hook on arm.

Called at end of cpu_startup. Can be defined in, e.g., evbarm to do
additional stuff after cpu_startup. Defined as a weak alias to a
function that does nothing, so optional.

ok jmcneill
 1.8 16-Jul-2019  skrll branches: 1.8.2;
Add vaddr_t initarm(void *);

Missed in previous commit.
 1.7 06-Apr-2019  thorpej Overhaul the API used to fetch and store individual memory cells in
userspace. The old fetch(9) and store(9) APIs (fubyte(), fuword(),
subyte(), suword(), etc.) are retired and replaced with new ufetch(9)
and ustore(9) APIs that can return proper error codes, etc. and are
implemented consistently across all platforms. The interrupt-safe
variants are no longer supported (and several of the existing attempts
at fuswintr(), etc. were buggy and not actually interrupt-safe).

Also augment the ucas(9) API, making it consistently available on
all platforms, supporting uniprocessor and multiprocessor systems, even
those that do not have CAS or LL/SC primitives.

Welcome to NetBSD 8.99.37.
 1.6 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.5 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.4 05-Aug-2018  skrll Refactor code to split aarch{32,64} kernel page tables and VM setup. This
will help re-build the kernel page tables on aarch64 with correct section
mappings.
 1.3 19-Jul-2018  christos Implement TRAP_SIGDEBUG for aarch64...
ptraced programs die with:
data_abort_handler, 257: pid 199.1 (a.out): signal 11 (trap 0x82000006) @pc 0, addr 0x0, error=Instruction Abort (EL0)
 1.2 09-Jul-2018  ryo add MULTIPROCESSOR support
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.6 20-Oct-2018  pgoyette Sync with head
 1.1.2.5 30-Sep-2018  pgoyette Sync with HEAD
 1.1.2.4 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.3 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file machdep.h was added on branch pgoyette-compat on 2018-04-07 04:12:11 +0000
 1.8.2.1 12-Feb-2020  martin Pull up following revision(s) (requested by riastradh in ticket #705):

sys/arch/aarch64/aarch64/aarch64_machdep.c: revision 1.35
sys/stand/efiboot/efifdt.c: revision 1.20
sys/stand/efiboot/efifdt.h: revision 1.7
sys/arch/aarch64/include/machdep.h: revision 1.9
sys/stand/efiboot/efiboot.h: revision 1.11
sys/arch/arm/arm32/arm32_machdep.c: revision 1.129
sys/arch/arm/include/arm32/machdep.h: revision 1.30
sys/stand/efiboot/exec.c: revision 1.12
sys/arch/evbarm/fdt/fdt_machdep.c: revision 1.65
sys/stand/efiboot/version: revision 1.14
sys/stand/efiboot/boot.c: revision 1.19

New function cpu_startup_hook on arm.

Called at end of cpu_startup. Can be defined in, e.g., evbarm to do
additional stuff after cpu_startup. Defined as a weak alias to a
function that does nothing, so optional.
ok jmcneill

Implement rndseed support in efiboot and fdt arm.

The EFI environment variable `rndseed' specifies the path to the
random seed. It is loaded only for fdt platforms at the moment.
Since the rndseed (an rndsave_t object as defined in <sys/rndio.h>)
is 536 bytes long (for hysterical raisins), and to avoid having to
erase parts of the fdt tree, we load it into a physical page whose
address is passed in the fdt tree, rather than passing the content of
the file as an fdt node directly; the kernel then reserves the page
from uvm, and maps it into kva to call rnd_seed.

For now, the only kernel that does use efiboot with fdt is evbarm,
which knows to handle the rndseed. Any new kernels that use efiboot
with fdt must do the same; otherwise uvm may hand out the page with
the secret key on it for a normal page allocation in the kernel --
which should be OK if there are no kernel memory disclosure bugs, but
would lead to worse consequences than simply loading the seed late in
userland with /etc/rc.d/random_seed otherwise.

ok jmcneill
 1.9.2.1 29-Feb-2020  ad Sync with head.
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file math.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.3 27-Feb-2018  skrll branches: 1.3.2;
Remove c&p comment that doesn't apply
 1.2 15-Feb-2018  kamil Introduce _UC_MACHINE_FP() as a macro

_UC_MACHINE_FP() is a helper macro to extract from mcontext a frame pointer.

Don't rely on this interface as a compiler might strip frame pointer or
optimize it making this interface unreliable.


For hppa assume a small frame context, for larger frames FP might be located
in a different register (4 instead of 3).

For ia64 there is no strict frame pointer, and registers might rotate.
Reuse 79 following:

./gcc/config/ia64/ia64.h:#define HARD_FRAME_POINTER_REGNUM LOC_REG (79)

Once ia64 will mature, this should be revisited.

A macro can encapsulate a real function for extracting Frame Pointer on
more complex CPUs / ABIs.


For the remaining CPUs, reuse standard register as defined in appropriate ABI.

The direct users of this macro are LLVM and GCC with Sanitizers.

Proposed on tech-userlevel@.

Sponsored by <The NetBSD Foundation>
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.22;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.22.3 21-Mar-2018  martin Pull up the following, requested by kamil in ticket #552:

external/gpl3/gcc{.old}/dist/libsanitizer/asan/asan_linux.cc 1.4
sys/arch/aarch64/include/mcontext.h 1.2
sys/arch/alpha/include/mcontext.h 1.9
sys/arch/amd64/include/mcontext.h 1.19
sys/arch/arm/include/mcontext.h 1.19
sys/arch/hppa/include/mcontext.h 1.9
sys/arch/i386/include/mcontext.h 1.14
sys/arch/ia64/include/mcontext.h 1.6
sys/arch/m68k/include/mcontext.h 1.10
sys/arch/mips/include/mcontext.h 1.22
sys/arch/or1k/include/mcontext.h 1.2
sys/arch/powerpc/include/mcontext.h 1.18
sys/arch/riscv/include/mcontext.h 1.5
sys/arch/sh3/include/mcontext.h 1.11
sys/arch/sparc/include/mcontext.h 1.14-1.17
sys/arch/sparc64/include/mcontext.h 1.10
sys/arch/vax/include/mcontext.h 1.9
tests/lib/libc/sys/Makefile 1.50
tests/lib/libc/sys/t_ucontext.c 1.2-1.5
sys/arch/hppa/include/mcontext.h 1.10
sys/arch/ia64/include/mcontext.h 1.7

- Introduce _UC_MACHINE_FP(). _UC_MACHINE_FP() is a helper
macro to extract from mcontext a frame pointer.
- Add new tests in lib/libc/sys/t_ucontext:
* ucontext_sp (testing _UC_MACHINE_SP)
* ucontext_fp (testing _UC_MACHINE_FP)
* ucontext_pc (testing _UC_MACHINE_PC)
* ucontext_intrv (testing _UC_MACHINE_INTRV)

Add a dummy implementation of _UC_MACHINE_INTRV() for ia64.

Implement _UC_MACHINE_INTRV() for hppa.

Make the t_ucontext.c test more portable.

We now have _UC_MACHINE_FP.
 1.1.22.2 26-Feb-2018  snj revert ticket 552, which broke the build
 1.1.22.1 25-Feb-2018  snj Pull up following revision(s) (requested by kamil in ticket #552):
sys/arch/aarch64/include/mcontext.h: 1.2
sys/arch/alpha/include/mcontext.h: 1.9
sys/arch/amd64/include/mcontext.h: 1.19
sys/arch/arm/include/mcontext.h: 1.19
sys/arch/hppa/include/mcontext.h: 1.9
sys/arch/i386/include/mcontext.h: 1.14
sys/arch/ia64/include/mcontext.h: 1.6
sys/arch/m68k/include/mcontext.h: 1.10
sys/arch/mips/include/mcontext.h: 1.22
sys/arch/or1k/include/mcontext.h: 1.2
sys/arch/powerpc/include/mcontext.h: 1.18
sys/arch/riscv/include/mcontext.h: 1.5
sys/arch/sh3/include/mcontext.h: 1.11
sys/arch/sparc/include/mcontext.h: 1.14-1.17
sys/arch/sparc64/include/mcontext.h: 1.10
sys/arch/vax/include/mcontext.h: 1.9
tests/lib/libc/sys/Makefile: 1.50
tests/lib/libc/sys/t_ucontext.c: 1.2
Introduce _UC_MACHINE_FP() as a macro
_UC_MACHINE_FP() is a helper macro to extract from mcontext a frame pointer.
Don't rely on this interface as a compiler might strip frame pointer or
optimize it making this interface unreliable.
For hppa assume a small frame context, for larger frames FP might be located
in a different register (4 instead of 3).
For ia64 there is no strict frame pointer, and registers might rotate.
Reuse 79 following:
./gcc/config/ia64/ia64.h:#define HARD_FRAME_POINTER_REGNUM LOC_REG (79)
Once ia64 will mature, this should be revisited.
A macro can encapsulate a real function for extracting Frame Pointer on
more complex CPUs / ABIs.
For the remaining CPUs, reuse standard register as defined in appropriate ABI.
The direct users of this macro are LLVM and GCC with Sanitizers.
Proposed on tech-userlevel@.
Sponsored by <The NetBSD Foundation>
--
Improve _UC_MACHINE_FP() for SPARC/SPARC64
Introduce a static inline function _uc_machine_fp() that contains improved
calculation of a frame pointer.
Algorithm:
uptr *stk_ptr;
# if defined (__arch64__)
stk_ptr = (uptr *) (*sp + 2047);
# else
stk_ptr = (uptr *) *sp;
# endif
*bp = stk_ptr[15];
Noted by <mrg>
--
Make _UC_MACHINE_FP() compile again and fix it so that it does not add
the offset twice.
--
fix _UC_MACHINE32_FP() -- use 32 bit pointer value so that [15] is
the right offset. do this by using __greg32_t, which is only in
the sparc64 version, and these are only useful there, so move them.
--
Add new tests in lib/libc/sys/t_ucontext
New tests:
- ucontext_sp
- ucontext_fp
- ucontext_pc
- ucontext_intrv
They test respectively:
- _UC_MACHINE_SP
- _UC_MACHINE_FP
- _UC_MACHINE_PC
- _UC_MACHINE_INTRV
These tests attempt to access and print the values from ucontext, without
interpreting the values.
This is a follow up of the _UC_MACHINE_FP() introduction.
These tests use PRIxREGISTER, and require to be built with -D_KERNTYPES.
Sponsored by <The NetBSD Foundation>
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file mcontext.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.2 12-Aug-2020  skrll Part III of ad's performance improvements for aarch64

- Assembly language stubs for mutex_enter() and mutex_exit().
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file mutex.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4 02-Jul-2020  rin Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.

OK ryo@
 1.3 24-Nov-2019  rin PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.2 12-Oct-2018  ryo branches: 1.2.4;
add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.1 01-Apr-2018  ryo branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.1.4.1 10-Jun-2019  christos Sync with HEAD
 1.1.2.3 20-Oct-2018  pgoyette Sync with head
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file netbsd32_machdep.h was added on branch pgoyette-compat on 2018-04-07 04:12:11 +0000
 1.2.4.1 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1172):

sys/arch/aarch64/aarch64/trap.c: revision 1.30
sys/arch/aarch64/include/ptrace.h: revision 1.10
sys/arch/aarch64/include/netbsd32_machdep.h: revision 1.4 (patch)
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.14
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.15

Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.
OK ryo@

For rev 1.14 and before, netbsd32_process_write_regs() returns EINVAL
if non-modifiable bits are set in CPSR.
Instead, mask out non-modifiable bits and make this function succeed
regardless of value in CPSR. New behavior matches that of arm:
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm/process_machdep.c#187

This fixes lib/libc/sys/t_ptrace_wait*:access_regs6 tests, in which
register contents retrieved by PT_GETREGS are set back by PT_SETREGS.

No new regression is observed in full ATF run.

OK ryo
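A minimal sketch contrasting the two behaviours described in this pull-up: rejecting a CPSR value that has non-modifiable bits set versus silently masking them. The mask `SKETCH_CPSR_MODIFIABLE` and both function names are hypothetical; the actual set of modifiable bits in the kernel is not taken from the log and is assumed here to be just the N, Z, C and V flags.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: treat only the N, Z, C and V condition flags as writable. */
#define SKETCH_CPSR_MODIFIABLE	UINT32_C(0xf0000000)

/* Old behaviour: refuse any value that touches a non-modifiable bit. */
int
sketch_write_cpsr_strict(uint32_t *cpsr, uint32_t newval)
{
	if ((newval & ~SKETCH_CPSR_MODIFIABLE) != 0)
		return EINVAL;
	*cpsr = (*cpsr & ~SKETCH_CPSR_MODIFIABLE) | newval;
	return 0;
}

/* New behaviour: drop the non-modifiable bits and always succeed. */
int
sketch_write_cpsr_masked(uint32_t *cpsr, uint32_t newval)
{
	*cpsr = (*cpsr & ~SKETCH_CPSR_MODIFIABLE) |
	    (newval & SKETCH_CPSR_MODIFIABLE);
	return 0;
}
```

The masked variant is what makes a read-then-write-back round trip work: a value obtained from the live register naturally contains mode bits, so the strict variant rejects it even though nothing was meant to change.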
 1.16 31-May-2021  simonb Include "opt_param.h" (ifdef _KERNEL_OPT) everywhere that MSGBUFSIZE is
referenced since some sources include <machine/param.h>.
 1.15 24-Jan-2021  jmcneill branches: 1.15.4;
Use 32K as the default NFSv3 read and write data sizes on aarch64, matching
i386 and amd64.
 1.14 01-Feb-2020  skrll branches: 1.14.6;
G/C
 1.13 24-Nov-2019  rin branches: 1.13.2;
PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.12 19-Oct-2019  jmcneill Increase aarch64 MAXCPUS to 256.
 1.11 19-Jan-2019  skrll branches: 1.11.4;
Increase MSGBUFSIZE
 1.10 07-Jan-2019  jdolecek move DEV_BSIZE, DEV_BSHIFT out of MD param.h, they are same on all ports

also move BLKDEV_IOSIZE, MAXPHYS, but allow override since some ports
have different value (powerpc uses NBPG for BLKDEV_IOSIZE, sun2/sun3
have lower MAXPHYS)
 1.9 04-Jan-2019  rin ALIGNBYTES32 should be (8 - 1), not (4 - 1) for EABI:
https://nxr.netbsd.org/xref/src/sys/arch/arm/include/cdefs.h#56

Now, sshd for earmv7hf works without problems.
Also fix other users of cmsg(3) API hopefully.
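A minimal sketch of why the alignment constant in revision 1.9 matters, assuming the usual round-up form `(n + ALIGNBYTES) & ~ALIGNBYTES`. The names are hypothetical, and the 12-byte header used in the usage note is an illustrative assumption (three 32-bit fields), not a value taken from the log.

```c
#include <stddef.h>

#define SKETCH_ALIGNBYTES32_OLD	(4 - 1)	/* value before the fix */
#define SKETCH_ALIGNBYTES32_NEW	(8 - 1)	/* value after the fix */

/* Round n up to the next multiple of (alignbytes + 1). */
size_t
sketch_align(size_t n, size_t alignbytes)
{
	return (n + alignbytes) & ~alignbytes;
}
```

With a 12-byte header, `sketch_align(12, SKETCH_ALIGNBYTES32_OLD)` gives 12 while `sketch_align(12, SKETCH_ALIGNBYTES32_NEW)` gives 16. If the kernel and a 32-bit program disagree on that offset, they look for the data following the header in different places, which is consistent with the control message problems mentioned in the log.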
 1.8 06-Dec-2018  skrll Expose CACHE_LINE_SIZE (and COHERENCY_UNIT) so that fstat can work
 1.7 18-Nov-2018  skrll Add CPU_THUNDERX which sets COHERENCY_UNIT and CACHE_LINE_SIZE to 128
 1.6 15-Nov-2018  riastradh Respect the __HIDE_DELAY kludge like on other ports.
 1.5 14-Nov-2018  jakllsch Switch to NKMEMPAGES_MAX_UNLIMITED.

This aligns aarch64 with our other modern 64-bit ports. Significantly
improves file caching utilization on aarch64 systems with copious RAM.
 1.4 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.3 28-Apr-2018  jmcneill branches: 1.3.2;
Increase default MSGBUFSIZE to match arm32 defaults
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.7 26-Jan-2019  pgoyette Sync with HEAD
 1.1.28.6 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.5 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.28.4 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.3 20-Oct-2018  pgoyette Sync with head
 1.1.28.2 02-May-2018  pgoyette Synch with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file param.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.3.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.11.4.1 23-Oct-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #359):

sys/arch/aarch64/aarch64/locore.S: revision 1.42
sys/arch/aarch64/aarch64/locore.S: revision 1.43
sys/arch/aarch64/aarch64/locore.S: revision 1.44
sys/arch/arm/fdt/cpu_fdt.c: revision 1.28
sys/arch/aarch64/include/cpu.h: revision 1.14
sys/arch/aarch64/include/param.h: revision 1.12
sys/arch/arm/arm32/cpu.c: revision 1.133
sys/arch/arm/arm32/cpu.c: revision 1.134
sys/arch/arm/include/cpu.h: revision 1.101
sys/arch/arm/acpi/cpu_acpi.c: revision 1.7
sys/arch/aarch64/aarch64/cpu.c: revision 1.23
sys/arch/aarch64/aarch64/cpu.c: revision 1.24
sys/arch/aarch64/aarch64/cpu.c: revision 1.25

Increase aarch64 MAXCPUS to 256.

-

Invalidate dcache before polling AP hatched status

-

Avoid overlap between BP and last AP stack. AP stacks are now in order of
increasing address order.

Spotted by and idea from mlelstv.

-

Use separate cacheline aligned arrays for mbox and hatched as before.

-

cpu_hatched_p only for MULTIPROCESSOR
 1.13.2.1 29-Feb-2020  ad Sync with head.
 1.14.6.1 03-Apr-2021  thorpej Sync with HEAD.
 1.15.4.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.2 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28; 1.1.30;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.30.1 10-Jun-2019  christos Sync with HEAD
 1.1.28.1 18-Jan-2019  pgoyette Synch with HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pcb.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.59 02-Aug-2023  skrll No need to define cpu_{,set}_tlb_info here - just use the
sys/uvm/pmap/pmap_tlb.h versions.
 1.58 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.57 03-Nov-2022  skrll Provide MI PMAP support on AARCH64
 1.56 29-Oct-2022  skrll KNF + remove unnecessary brackets
 1.55 23-Oct-2022  skrll KNF.
 1.54 23-Oct-2022  skrll Line continuation alignment whitespace. NFC.
 1.53 15-Oct-2022  jmcneill Use "non-posted" instead of "strongly ordered" to describe nGnRnE mappings

Rename the following defines:
- _ARM_BUS_SPACE_MAP_STRONGLY_ORDERED to BUS_SPACE_MAP_NONPOSTED
- PMAP_DEV_SO to PMAP_DEV_NP
- LX_BLKPAG_ATTR_DEVICE_MEM_SO to LX_BLKPAG_ATTR_DEVICE_MEM_NP
Rename the following option:
- AARCH64_DEVICE_MEM_STRONGLY_ORDERED to AARCH64_DEVICE_MEM_NONPOSTED
 1.52 02-Apr-2022  skrll Update to support EFI runtime outside the kernel virtual address space
by creating an EFI RT pmap that can be activated / deactivated when
required.

Adds support for EFI RT to ARM_MMU_EXTENDED (ASID) 32-bit Arm machines.

On Arm64 the usage of pmapboot_enter is reduced and the mappings are
created much later in the boot process -- now in cpu_startup_hook.
Backward compatibility for KVA mapped RT from old bootaa64.efi is
maintained.

Adding support to other platforms should be easier as a result.
 1.51 15-Jan-2022  skrll Remove unnecessary brackets
 1.50 14-Jan-2022  skrll Restore the previous pmap_remove_all behaviour as the new method meant
the n1sdp couldn't complete a build.

No noticeable change in kernel build performance.
 1.49 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
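
The statistic quoted above can be reproduced without SciPy. What follows is a minimal sketch, assuming the default behaviour of `stats.ttest_ind` (an equal-variance, two-sample Student's t-test). The names `student_t`, `sample_mean`, `sum_sq_dev` and `sqrt_newton` are made up for this illustration, and the inputs are the elapsed times from the log converted to seconds.

```c
#include <stddef.h>

/* Square root by Newton iteration, so the sketch does not need libm. */
static double
sqrt_newton(double x)
{
	double r = x > 1.0 ? x : 1.0;

	if (x <= 0.0)
		return 0.0;
	for (int i = 0; i < 64; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* Arithmetic mean of n samples. */
static double
sample_mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Sum of squared deviations from the mean m. */
static double
sum_sq_dev(const double *v, size_t n, double m)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += (v[i] - m) * (v[i] - m);
	return sum;
}

/* Pooled-variance (Student) two-sample t statistic. */
double
student_t(const double *a, size_t na, const double *b, size_t nb)
{
	double ma = sample_mean(a, na);
	double mb = sample_mean(b, nb);
	double pooled = (sum_sq_dev(a, na, ma) + sum_sq_dev(b, nb, mb)) /
	    (double)(na + nb - 2);
	double se = sqrt_newton(pooled * (1.0 / (double)na + 1.0 / (double)nb));

	return (ma - mb) / se;
}
```

Called with the three "before" and three "after" timings, `student_t` returns a value of about 1.19, in line with the statistic shown in the log. The p-value is not computed here.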
 1.48 19-May-2021  skrll Make even more pmap agnostic
 1.47 30-Apr-2021  skrll branches: 1.47.2;
Make the ddb for pmap / pte information pmap agnostic
 1.46 20-Mar-2021  skrll branches: 1.46.2;
Make pmapboot_enter panic if anything goes wrong and any mappings overlap
rather than only doing it in locore.S
 1.45 31-Jan-2021  skrll branches: 1.45.2;
Improve a comment
 1.44 31-Jan-2021  ryo implement pmap_remove_all().

The size of struct pv_entry has increased, but speed of kernel build has improved by about 1%.
exec and exit should also have improved.
 1.43 19-Sep-2020  skrll branches: 1.43.2;
Make __md_palloc pmap agnostic (think sys/uvm/pmap)
 1.42 12-Aug-2020  skrll Part IV of ad's performance improvements for aarch64

- Implement pmap_growkernel(), and update kernel pmap's stats with atomics.

- Then, pmap_kenter_pa() and pmap_kremove() no longer need to allocate
memory nor take pm_lock, because they only modify L3 PTEs.

- Then, pm_lock and pp_lock can be adaptive mutexes at IPL_NONE which are
cheaper than spin mutexes.

- Take the pmap's lock in pmap_extract() if not the kernel's pmap, otherwise
pmap_extract() might see inconsistent state.
 1.41 16-Jul-2020  skrll pmapboot_enter simplification
- bootpage_alloc in asm becomes pmapboot_pagealloc in C
- PMAPBOOT_ENTER_NOBLOCK is removed as it's not used
- PMAPBOOT_ENTER_NOOVERWRITE is removed as it's now always on
- physpage_allocator argument is removed as it's always
pmapboot_pagealloc
- Support for EARLYCONS without CONSADDR is removed so that the identity
map for CONSADDR is always known.

For the assembly files:
2 files changed, 40 insertions(+), 89 deletions(-)

LGTM ryo
 1.40 14-Jun-2020  ad - Fix a lock order reversal in pmap_page_protect().

- Make sure pmap is always locked when updating stats; atomics no longer
needed to do that.

- Remove unneeded traversal of pv list in pmap_enter_pv().

- Shrink struct vm_page from 136 to 128 bytes (cache line sized) and struct
pv_entry from 48 to 32 bytes (power of 2 sized).

- Embed a pv_entry in each vm_page. This means PV entries don't need to
be allocated for private anonymous memory / COW pages / most UBC mappings.
Dynamic PV entries are then used only for stuff like shared libraries and
shared memory.

Proposed on port-arm@.
 1.39 14-May-2020  skrll Use MUTEX_NODEBUG for PV locks as is commonly done. OK ryo.
 1.38 13-May-2020  ryo - move aarch64 addressspace macros from pmap.h to cpufunc.h
- rename ptr_strip_pac() to aarch64_strip_pac()
 1.37 08-Apr-2020  ryo use PMAP_PAGE_INIT() to initialize mutex in pmap_page.

VM_MDPAGE_INIT() in pmap_free_pdp() had initialized pp_flags,
so it unintentionally cleared PMAP_PAGE_FLAGS_PV_TRACKED.
use PMAP_PAGE_INIT to avoid using PMAP_PAGE_FLAGS_PV_TRACKED.

pointed out by tnn@, thanks
 1.36 29-Feb-2020  ryo add helper function aarch64_addresspace() and aarch64_untag_address() to check address space, and eliminate address tag
 1.35 29-Feb-2020  ryo replace KSEG pages mapping code with generic function pmapboot_enter_range()
 1.34 10-Feb-2020  ryo use LIST(3) instead of TAILQ(3) to save one word in struct vm_page and struct pmap.

pointed out by riastradh@. thanks
 1.33 03-Feb-2020  ryo add support pmap_pv(9)

Patch originally from jmcneill@. thanks
 1.32 03-Feb-2020  ryo separate struct vm_page_md into vm_page_md and pmap_page
for preparation pmap_pv(9)
 1.31 26-Jan-2020  skrll Typo in comment
 1.30 06-Jan-2020  skrll branches: 1.30.2;
Fix DEVMAP build losage by reducing diffs between arm and aarch64

*sigh*
 1.29 30-Dec-2019  skrll Drop DEVMAP_{TRUNK_ADDR,ROUND_SIZE} to 4KB pages now that pmap_map_chunk
allows this.
 1.28 28-Dec-2019  jmcneill Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.27 27-Dec-2019  jmcneill Enable early write acknowledge for device memory mappings.
 1.26 29-Oct-2019  maya Define PMAP_NEED_PROCWR, providing strategically placed i-cache
synchronization where just-changed memory is about to be executed.

Fixes SIGILLs seen when running Mono 6 on QEMU Cortex-A57.

ok ryo
 1.25 12-Aug-2019  skrll Use PMAP_DEV in DEVMAP_ENTRY rather than pmap_map_chunk. It's clearer and
means pmap_map_chunk can be made to map other memory types.
 1.24 08-Apr-2019  ryo branches: 1.24.4;
- free empty page table pages once usage reaches a certain level.
- take the lock when removing an old pg (_pmap_remove_pv) in _pmap_enter()
 1.23 19-Mar-2019  ryo - add ddb command "machine ttbr" to dump MMU tables.
- tidy up descriptions, usages and messages.
 1.22 19-Mar-2019  ryo - free L1-L3 pages that have been emptied by pmap_remove().
- if memory is exhausted, pmap_enter now correctly returns ENOMEM if PMAP_CANFAIL, or waits until memory becomes available if !PMAP_CANFAIL.

These changes improve stability when using huge virtual memory spaces with mmap.
 1.21 06-Feb-2019  ryo improve pmap_remove
- don't lock/unlock per page in pmap_remove()
- speedup pte lookup for continuous addresses
- bring out pool_cache_put(&_pmap_pv_pool, pv) from lock/unlock section
 1.20 04-Jan-2019  jdolecek re-apply rev. 1.18, now tested by Jonathan Kollasch and Ryo Shimizu - no
problems observed, and about 2x speedup for cached read

Implement PMAP_DIRECT / pmap_direct_process() in support of experimental
UBC optimization

PR kern/53124
 1.19 21-Nov-2018  jdolecek revert PMAP_DIRECT until tested; requested by mrg@
 1.18 20-Nov-2018  jdolecek Implement PMAP_DIRECT / pmap_direct_process() in support of experimental
UBC optimizations (compile-tested only for now)

PR kern/53124
 1.17 01-Nov-2018  maxv Add kASan support for aarch64. Stack tracking needs more investigation
and will come in a separate commit.

Reviewed by ryo@ jmcneill@ skrll@.
 1.16 18-Oct-2018  skrll Provide generic start code that assumes the MMU is off and caches are
disabled as per the linux booting protocol for ARMv6 and ARMv7 boards.
u-boot image type should be changed to 'linux' for correct behaviour.

The new start code builds a minimal "bootstrap" L1PT with cached access
disabled and uses the same table for all processors. AP startup is
performed in fewer steps and more code is written in C.

The bootstrap tables and stack are placed into an (orphaned) section
"_init_memory" which is given to uvm when it is no longer used.

Various kernels have been converted to use this code and tested. Some
boards were provided by TNF. Thanks!

The GENERIC kernel now boots on boards using the TEGRA, SUNXI and EXYNOS
kernels. The GENERIC kernel will also work on RPI2 using u-boot.

Thanks to martin@ and aymeric@ for testing on parallella and nanosoc
respectively
 1.15 13-Oct-2018  ryo - define PMAP_{MAP,UNMAP}_POOLPAGE for performance
- define __HAVE_MM_MD_KERNACC and add mm_md_kernacc()
 1.14 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.13 12-Oct-2018  ryo rewrite pmap_pte_lookup() to share similar code.
 1.12 04-Oct-2018  ryo cleanup locore, and changed the way to map memories during boot.
- add functions bootpage_enter() and bootpage_alloc() to adapt various layout
of physical memory map. especially for 64bit physical memory layout.
pmapboot_alloc() allocates pagetable pages from _end[].
- changed to map only the required amount for PA=VA identity mapping
(kernel image, UART device, and FDT blob) with L2_BLOCK(2Mbyte).
- changing page permission for kernel image, and making KSEG mapping are done
at cpu_kernel_vm_init() instead of at locore.
- optimize PTE entries with PTE Contiguous bit. it is enabled on devmap only for now.

reviewed by skrll@, thanks.
 1.11 04-Oct-2018  ryo * define LX_BLKPAG_{OS,ATTR}_* for OS dependent PTE attributes in pmap.h
* cleanup macros
 1.10 15-Sep-2018  jakllsch make kernel-groveling crash(8) work on aarch64
 1.9 10-Sep-2018  maxv Rename _pmap_alloc_pdp -> pmap_alloc_pdp, and make it public.
 1.8 10-Aug-2018  ryo treat kernel-exec attr and user-exec attr separately.
kernel cannot execute userland exec page, and user cannot execute kernel page.
 1.7 06-Aug-2018  ryo set kernel text/rodata readonly by default.
add function db_write_text() for setting ddb breakpoint.
 1.6 27-Jul-2018  ryo The changes in pmap.c r1.13 seem to be unstable.
To invalidate the icache, do not invalidate all of it;
instead temporarily make the page writable and invalidate only the target address.
 1.5 08-Jun-2018  jmcneill branches: 1.5.2;
Provide bs_mmap implementations for bcm283x based boards.

PR: port-arm/53283
Submitted by: Nick Hudson
 1.4 27-Apr-2018  ryo fix instability behavior of bufcache on aarch64.
* fix to return correct ref/mod when PMAP_WIRED.
* changed to keep wired flags in pte instead of pv_entry, and cleanup.
 1.3 09-Apr-2018  jmcneill Fix encoding of MMAP flags for generic_bs_mmap
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.10 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.9 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.1.28.8 20-Oct-2018  pgoyette Sync with head
 1.1.28.7 30-Sep-2018  pgoyette Sync with HEAD
 1.1.28.6 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.28.5 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.4 25-Jun-2018  pgoyette Sync with HEAD
 1.1.28.3 02-May-2018  pgoyette Synch with HEAD
 1.1.28.2 16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pmap.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5.2.3 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.5.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.5.2.1 10-Jun-2019  christos Sync with HEAD
 1.24.4.2 29-Dec-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #586):

sys/arch/arm/nvidia/tegra_pcie.c: revision 1.27
sys/arch/aarch64/aarch64/pmap.c: revision 1.57
sys/arch/aarch64/aarch64/locore.S: revision 1.48
sys/arch/aarch64/include/armreg.h: revision 1.29
sys/arch/aarch64/aarch64/pmap.c: revision 1.58
sys/arch/aarch64/aarch64/locore.S: revision 1.49
sys/arch/arm/acpi/acpipchb.c: revision 1.14
sys/arch/aarch64/aarch64/genassym.cf: revision 1.16
sys/arch/arm/acpi/acpi_machdep.c: revision 1.13
sys/arch/aarch64/include/pmap.h: revision 1.27
sys/arch/aarch64/aarch64/genassym.cf: revision 1.17
sys/arch/aarch64/include/pmap.h: revision 1.28
sys/arch/arm/fdt/pcihost_fdtvar.h: revision 1.3
sys/arch/arm/include/bus_defs.h: revision 1.14
sys/arch/aarch64/aarch64/bus_space.c: revision 1.9
sys/arch/arm/fdt/pcihost_fdt.c: revision 1.12
sys/arch/aarch64/conf/files.aarch64: revision 1.15
sys/arch/aarch64/conf/files.aarch64: revision 1.16
sys/arch/arm/rockchip/rk3399_pcie.c: revision 1.9

Enable early write acknowledge for device memory mappings.

Do not use Early Write Acknowledge for PCIe I/O and config space.
 1.24.4.1 04-Nov-2019  martin Pull up following revision(s) (requested by maya in ticket #393):

sys/arch/aarch64/include/pmap.h: revision 1.26
sys/arch/aarch64/aarch64/pmap.c: revision 1.48

Define PMAP_NEED_PROCWR, providing strategically placed i-cache
synchronization where just-changed memory is about to be executed.

Fixes SIGILLs seen when running Mono 6 on QEMU Cortex-A57.

ok ryo
 1.30.2.1 29-Feb-2020  ad Sync with head.
 1.43.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.45.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.46.2.2 17-Jun-2021  thorpej Sync w/ HEAD.
 1.46.2.1 13-May-2021  thorpej Sync with HEAD.
 1.47.2.1 31-May-2021  cjep sync with head
 1.10 09-Oct-2025  skrll There is no need to dsb(ishst) after every pte_pde_cas - the necessary
dsb will be performed later when a pte is added.
 1.9 26-Jul-2025  martin Allow building w/o options EFI_RUNTIME
 1.8 26-Jul-2023  skrll branches: 1.8.6;
blank line audit
 1.7 26-Jul-2023  skrll G/C pmap_md_kernel_*
 1.6 26-Jul-2023  skrll Reduce #ifdefs
 1.5 26-Jul-2023  skrll Wrap long lines in a comment block.
 1.4 26-Jul-2023  skrll spaces to tabs.
 1.3 20-Apr-2023  skrll Provide a shared pmap_devmap implementation and convert all pmap_devmap
arrays to use DEVMAP_ENTRY{,_END}
 1.2 21-Dec-2022  skrll Rename pmap_md_pdetab_destroy to pmap_md_pdetab_fini to match
pmap_md_pdetab_init.

Call pmap_md_pdetab_fini from pmap_segtab_destroy.
 1.1 03-Nov-2022  skrll Provide MI PMAP support on AARCH64
 1.8.6.1 02-Aug-2025  perseant Sync with HEAD
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pmc.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.8 12-Aug-2020  skrll Part II of ad's aarch64 performance improvements (cpu_switch.S bugs are
all mine)

- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).

- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem, it just means some
time may be wasted).

- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.

- Add some cache line padding to struct cpu_info, to match x86.

- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
 1.7 23-May-2020  ryo Not only the kernel thread, but also the userland PAC keys
(APIA,APIB,APDA,APDB,APGA) are now randomly initialized at exec, and switched
on context switch.
userland programs are able to perform pointer authentication on ARMv8.3+PAC cpu.

reviewed by maxv@, thanks.
 1.6 12-Apr-2020  maxv Add support for Pointer Authentication (PAC).

We use the "pac-ret" option, to sign the return instruction pointer on
function entry, and authenticate it on function exit. This acts as a
mitigation against ROP.

The authentication uses a per-lwp (secret) I-A key stored in the 128bit
APIAKey register and part of the lwp context. During lwp creation, the
kernel generates a random key, and during context switches, it installs
the key of the target lwp on the CPU.

Userland cannot read the APIAKey register directly. However, it can sign
its pointers with it, because the register is architecturally shared
between userland and the kernel. Although part of the CPU design, it is
a bit of an undesired behavior, because it allows to forge valid kernel
pointers from userland. To avoid that, we don't share the key with
userland, and rather switch it in EL0<->EL1 transitions. This means that
when userland executes, a different key is loaded in APIAKey than the one
the kernel uses. For now the userland key is a fixed 128bit zero value.

The DDB stack unwinder is changed to strip the authentication code from
the pointers in lr.

Two problems are known:

* Currently the idlelwps' keys are not really secret. This is because
the RNG is not yet available when we spawn these lwps. Not overly
important, but would be nice to fix with UEFI RNG.
* The key switching in EL0<->EL1 transitions is not the most optimized
code on the planet. Instead of checking aarch64_pac_enabled, it would
be better to hot-patch the code at boot time, but there currently is
no hot-patch support on aarch64.

Tested on Qemu.
 1.5 24-Nov-2019  rin branches: 1.5.6;
part of PR port-arm/54702

Having md_march32 unconditionally in struct mdproc, in order to
make libkvm happy.

XXX
pullup to netbsd-9
 1.4 24-Nov-2019  rin PR port-arm/54702

Add support for earmv6hf binaries on COMPAT_NETBSD32 for aarch64:

- Emulate ARMv6 instructions with cache operations register (c7), that
are deprecated since ARMv7, and disabled on ARMv8 with LP64 kernel.

- ep_machine_arch (default: earmv7hf) is copied from executables, as we
do for mips64. "uname -p" reports earmv6hf if compiled for earmv6hf;
configure scripts etc can determine the appropriate architecture.

Many thanks to ryo@ for helping me to add support of Thumb-mode,
as well as providing exhaustive test cases:

https://github.com/ryo/mcr_test/

We've confirmed:

- Emulation works in Thumb-mode.
- T32 16-bit length illegal instruction results in SIGILL, even if
it is located nearby a boundary b/w mapped and unmapped pages.
- T32 32-bit instruction results in SIGSEGV if it is located across
a boundary b/w mapped and unmapped pages.

XXX
pullup to netbsd-9
 1.3 27-Dec-2018  mrg make savecore for arm64 basically work.

- move MD lwp "md_ktf" member into struct pcb. the pcb is used by
the gdb "bsd-kvm" target code to find the stack of each thread
and needs to be available in a well known location.
- implement aarch64_nbsd_supply_pcb() in GDB. makes basic gdb work
on a crash dump.
- remove '#if L_MD_KTF + 8 == L_MD_CPACR' conditional code, as there
is no more L_MD_KTF.

with this gdb has minimal working functionality with "target kvm",
and crash can at least "ps" on a crash dump.

ok skrll.
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 18-Jan-2019  pgoyette Synch with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file proc.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.3 21-Apr-2020  martin Sync with HEAD
 1.2.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.5.6.1 20-Apr-2020  bouyer Sync with HEAD
 1.4 10-Feb-2021  ryo Oh...the name of the mcount call was different between gcc and llvm.
gcc calls it "_mcount", llvm calls it "__mcount".

Change the main name of mcount to "mcount()", and created "_mcount" and "__mcount" entries
to work regardless of which compiler the object was created with.
 1.3 10-Feb-2021  ryo add support kernel profiling on aarch64

- add MCOUNT_ENTER, MCOUNT_EXIT macro
- __mcount() function should be aligned
- add "-fno-optimize-sibling-calls" option when PROF. for accurate profiling, it is better to suppress the tail call.
 1.2 23-Apr-2020  jakllsch branches: 1.2.2;
Fix userland gprof profiling on aarch64.

Adjusts _PROF_PROLOGUE to match OpenBSD; reworks our MCOUNT to retrieve
frompc placed on stack by the prologue, and to streamline sp manipulation
when preserving argument registers.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.40;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.40.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file profile.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file psl.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.14 19-Aug-2022  ryo Fixed a bug that pte's __BIT(63,48) could be set when accessing addresses above 0x0001000000000000 in /dev/mem with mmap().
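
The class of bug described in 1.14 can be illustrated with a small sketch. It assumes a 4KB granule with 48-bit output addresses, so that the output-address field of a page descriptor occupies bits 47:12. `PTE_OA_MASK` and `make_pte` are hypothetical names chosen for this illustration; they are not the identifiers used in pte.h or pmap.c, and this is not necessarily how the actual fix was written.

```c
#include <stdint.h>

/* Assumed layout: output address in bits 47:12 (4KB granule, 48-bit PA). */
#define PTE_OA_MASK	UINT64_C(0x0000fffffffff000)

/*
 * Build a descriptor from a physical address and attribute bits.
 * Masking the address first keeps stray high bits (63:48) out of the
 * upper-attribute field of the descriptor.
 */
uint64_t
make_pte(uint64_t pa, uint64_t attr)
{
	return (pa & PTE_OA_MASK) | attr;
}
```

Without the mask, an offset at or above 0x0001000000000000 passed in through mmap() would end up directly in bits 63:48 of the entry.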
 1.13 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
 1.12 29-Feb-2020  ryo Fix pmap to work correctly with tagged addresses

- when fault, untag from address before passing to uvm/pmap functions
- pmap_extract() checks more strictly and consider the address tag
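
Untagging, as mentioned above, can be sketched as follows. The sketch assumes top-byte-ignore style tags held in bits 63:56, with bit 55 selecting the upper (kernel) or lower (user) half of the address space. `untag_address`, `ADDR_TAG_MASK` and `ADDR_BIT55` are hypothetical names for this illustration and are not copied from the NetBSD sources.

```c
#include <stdint.h>

#define ADDR_TAG_MASK	UINT64_C(0xff00000000000000)	/* assumed tag: bits 63:56 */
#define ADDR_BIT55	(UINT64_C(1) << 55)		/* selects upper/lower half */

/*
 * Replace the tag byte with the sign extension of bit 55, giving the
 * canonical form of the address that generic VM code expects.
 */
uint64_t
untag_address(uint64_t va)
{
	if (va & ADDR_BIT55)
		return va | ADDR_TAG_MASK;	/* upper half: tag becomes all ones */
	return va & ~ADDR_TAG_MASK;		/* lower half: tag becomes all zeros */
}
```

A fault handler would apply such a helper to the faulting address before handing it to uvm or pmap functions, which compare addresses without any knowledge of tags.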
 1.11 31-Jan-2020  maxv BTI definitions.
 1.10 11-Sep-2019  skrll branches: 1.10.2;
Define PRIxPTE
 1.9 11-Sep-2019  skrll Move the TCR and TTBR defines into armreg.h where they belong. NFCI.
 1.8 11-Sep-2019  jmcneill - Fix TCR_TG0 field definitions to match Armv8 ARM
- Rename TCR_IPS_64TB to TCR_IPS_16TB, add TCR_IPS_4PB
- Whitespace fixes
 1.7 15-Aug-2019  skrll Indent the field value defines. NFCI.
 1.6 13-Aug-2019  skrll Add DBM
 1.5 04-Oct-2018  ryo * define LX_BLKPAG_{OS,ATTR}_* for OS dependent PTE attributes in pmap.h
* cleanup macros
 1.4 17-Jul-2018  ryo fix build with aarch64 gcc/gas
 1.3 01-Apr-2018  ryo branches: 1.3.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 16-Jan-2017  maya branches: 1.2.12;
Correct definitions for TCR.

Values from ARM Cortex A-53 MPCore Processor Technical Reference Manual
4.3.48. Translation Control Register, EL1
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6; 1.1.10; 1.1.14;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.14.1 21-Apr-2017  bouyer Sync with HEAD
 1.1.10.1 20-Mar-2017  pgoyette Sync with HEAD
 1.1.6.1 05-Feb-2017  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file pte.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.12.3 20-Oct-2018  pgoyette Sync with head
 1.2.12.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.12.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.3.2.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.3.2.1 10-Jun-2019  christos Sync with HEAD
 1.10.2.1 29-Feb-2020  ad Sync with head.
 1.12 07-Sep-2020  ryo Oops. revert my previous commit. AArch64 instructions are always LE.
 1.11 06-Sep-2020  ryo need swap for aarch64be
 1.10 02-Jul-2020  rin Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.

OK ryo@
 1.9 18-Jun-2019  kamil branches: 1.9.2;
Introduce PTRACE_REG_FP() a helper macro to retrieve the frame pointer

The macro is dummy for ia64 (the FP register is unknown and can change
freely) and sparc/sparc64 (not stored in struct reg).
 1.8 13-Dec-2018  ryo add support PT_STEP
 1.7 21-Jul-2018  ryo don't depend endian.
 1.6 20-Jul-2018  christos flip the byte order
 1.5 12-Apr-2017  kamil branches: 1.5.10; 1.5.12;
Add new macro PTRACE_BREAKPOINT_ASM in <sys/ptrace.h> MD part

This macro ships with a MD-specific assembly instruction triggering
a software breakpoint.

Missing instruction for powerpc targets.

This code is used in ATF tests (lib/libc/sys/t_ptrace_wait).

Original patch by Nick Hudson, thanks!
 1.4 25-Sep-2015  christos branches: 1.4.2; 1.4.4;
For processors that have memory breakpoints, add macros for them to help
libproc
 1.3 15-Sep-2015  christos Provide access to pc/sp/syscall-return registers like we have for mcontext
 1.2 11-Aug-2014  matt branches: 1.2.2; 1.2.4;
#include <arm/ptrace.h> instead of <arm/asm.h>
(oops)
 1.1 10-Aug-2014  matt Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.2.4.3 28-Aug-2017  skrll Sync with HEAD
 1.2.4.2 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.2.4.1 22-Sep-2015  skrll Sync with HEAD
 1.2.2.3 03-Dec-2017  jdolecek update from HEAD
 1.2.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.2.1 11-Aug-2014  tls file ptrace.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.4.2.1 26-Apr-2017  pgoyette Sync with HEAD
 1.5.12.2 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.5.12.1 10-Jun-2019  christos Sync with HEAD
 1.5.10.2 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.5.10.1 28-Jul-2018  pgoyette Sync with HEAD
 1.9.2.1 01-Jan-2021  martin Pull up following revision(s) (requested by rin in ticket #1172):

sys/arch/aarch64/aarch64/trap.c: revision 1.30
sys/arch/aarch64/include/ptrace.h: revision 1.10
sys/arch/aarch64/include/netbsd32_machdep.h: revision 1.4 (patch)
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.14
sys/arch/aarch64/aarch64/netbsd32_machdep.c: revision 1.15

Add support of ptrace(2) for COMPAT_NETBSD32.

Now, GDB for arm32 is usable for debugging 32bit applications.
OK ryo@

For rev 1.14 and before, netbsd32_process_write_regs() returns EINVAL
if non-modifiable bits are set in CPSR.
Instead, mask out non-modifiable bits and make this function succeed
regardless of value in CPSR. New behavior matches that of arm:
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm/process_machdep.c#187

This fixes lib/libc/sys/t_ptrace_wait*:access_regs6 tests, in which
register contents retrieved by PT_GETREGS are set back by PT_SETREGS.

No new regression is observed in full ATF run.

OK ryo
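
The "mask instead of reject" behaviour described in this pull-up can be sketched as below. The mask value is an assumption for illustration only: it treats just the N, Z, C and V condition flags (bits 31:28) as modifiable, while the real set of modifiable CPSR bits is whatever the kernel sources define. `CPSR_USER_MODIFIABLE` and `merge_cpsr` are hypothetical names.

```c
#include <stdint.h>

/* Assumed for illustration: only the NZCV flags may be changed. */
#define CPSR_USER_MODIFIABLE	UINT32_C(0xf0000000)

/*
 * Take the modifiable bits from the requested value and keep every
 * other bit from the current register, rather than failing with
 * EINVAL when the request touches a non-modifiable bit.
 */
uint32_t
merge_cpsr(uint32_t current, uint32_t requested)
{
	return (current & ~CPSR_USER_MODIFIABLE) |
	    (requested & CPSR_USER_MODIFIABLE);
}
```

With this approach, writing back a register set that was just read with PT_GETREGS always succeeds, which is the pattern the access_regs6 tests exercise.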
 1.3 17-Jul-2018  kamil Use __uint128_t conditionally in aarch64 reg.h

Check whether __uint128_t is available checking __SIZEOF_INT128__ in
preprocessor.
Move __aligned attribute to the whole structure.

No functional change for current NetBSD/aarch64 users of GCC and Clang.

This change allows to use the aarch64 target with rumpkernel on Linux
aarch64 hosts, in a toolchain configuration with 128-bit variables.

OK from <martin> and <christos>
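
The technique described in 1.3 can be sketched as follows: the 128-bit member is only declared when the compiler advertises `__SIZEOF_INT128__`, and the alignment attribute is placed on the enclosing structure. The type names `example_fpelem` and `example_fpreg` are hypothetical and the layout is an assumption for illustration, not a copy of reg.h.

```c
#include <stdint.h>

/* One 128-bit SIMD/FP register, always addressable as two 64-bit halves. */
union example_fpelem {
	uint64_t u64[2];
#ifdef __SIZEOF_INT128__
	__uint128_t u128[1];	/* only when the compiler provides the type */
#endif
};

/* The alignment is requested for the whole structure, not per member. */
struct example_fpreg {
	union example_fpelem fp_reg[32];
	uint32_t fpcr;
	uint32_t fpsr;
} __attribute__((__aligned__(16)));
```

Either way the union is 16 bytes, so the structure layout stays the same whether or not the 128-bit type is available on the host toolchain.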
 1.2 01-Apr-2018  ryo branches: 1.2.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file reg.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2.2.1 10-Jun-2019  christos Sync with HEAD
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file rwlock.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.2 10-May-2020  skrll Don't futz with tpidr_el0 in {set,long}jmp as it breaks TLS as seen in
qemu
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.34;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.34.1 13-May-2020  martin Pull up following revision(s) (requested by skrll in ticket #901):

sys/arch/aarch64/include/setjmp.h: revision 1.2
lib/libc/arch/aarch64/genassym.cf: revision 1.2
lib/libc/arch/aarch64/gen/setjmp.S: revision 1.3
lib/libc/arch/aarch64/gen/_setjmp.S: revision 1.4

Don't futz with tpidr_el0 in {set,long}jmp as it breaks TLS as seen in
qemu
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file setjmp.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4 05-Nov-2021  thorpej Normally, to support COMPAT_NETBSD32 we need to define
__HAVE_STRUCT_SIGCONTEXT in order to support the old
"sigcontext" style of handlers for 32-bit binaries.
However, we only support 32-bit EABI binaries on AArch64,
and by happy accident (due to a libc bug introduced in
2006), 32-bit NetBSD EABI binaries never used "sigcontext"
style handlers. So, we don't need to carry any of this
baggage forward.

This addresses the AArch64 case of PR kern/56487.
 1.3 27-Oct-2021  thorpej - In sendsig() and sigaction1(), don't hard-code signal trampoline
versions. Instead, use the version constants from <sys/signal.h>
and automatically (and correctly) handle cases where multiple versions
of a particular trampoline flavor exist. Conditionalize support
for sigcontext trampolines on __HAVE_STRUCT_SIGCONTEXT.
- aarch64 and amd64 don't use sigcontext natively, but do need to
support it for 32-bit compatibility; define __HAVE_STRUCT_SIGCONTEXT
conditionally on _KERNEL.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file signal.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5 05-May-2024  riastradh aarch64/sljit_machdep.h: Make this work in compat32 context.

Should fix clang build of compat32 eabi libsljit:

dependall ===> compat/arm/eabi/../../../lib/../external/bsd/sljit/lib
In file included from /home/source/ab/HEAD-llvm/src/sys/external/bsd/sljit/dist/sljit_src/sljitLir.c:1678:
/home/source/ab/HEAD-llvm/src/sys/external/bsd/sljit/dist/sljit_src/sljitNativeARM_64.c:142:54: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
FAIL_IF(push_inst(compiler, MOVK | RD(dst) | (((imm >> 32) & 0xffff) << 5) | (2 << 21)));
^ ~~
 1.4 02-Apr-2024  riastradh bsd.own.mk: Enable MKSLJIT on aarch64.

Make sure there's only one copy of the conditional, in bsd.own.mk;
just make sys/modules/Makefile conditional on MKSLJIT so we don't
have to keep these in sync.

As a workaround for PR 58106, tweak the conditional definition of
SLJIT_CACHE_FLUSH to use cpu_icache_sync_range only in _HARDKERNEL,
and use __builtin___clear_cache in userland and in rump kernels.

PR 58103: bpfjit.kmod is not built on aarch64
 1.3 11-Dec-2020  skrll branches: 1.3.18;
s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge
 1.2 02-Dec-2018  alnsn branches: 1.2.4; 1.2.6; 1.2.14;
Switch to __builtin___clear_cache() in userspace.

aarch64_sync_icache() doesn't exist because there is no libarm equivalent
on aarch64.
 1.1 26-Aug-2018  rjs branches: 1.1.2;
Add SLJIT to aarch64.
 1.1.2.3 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.1.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.1.2.1 26-Aug-2018  pgoyette file sljit_machdep.h was added on branch pgoyette-compat on 2018-09-06 06:55:23 +0000
 1.2.14.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.2.6.1 18-Apr-2024  martin Pull up following revision(s) (requested by riastradh in ticket #655):

sys/modules/Makefile: revision 1.285
share/mk/bsd.own.mk: revision 1.1365
share/mk/bsd.own.mk: revision 1.1366
sys/arch/aarch64/include/sljit_machdep.h: revision 1.4
sys/external/bsd/sljit/dist/sljit_src/sljitNativeARM_64.c: revision 1.5
(all via patch)

sljit: Pacify -Wsign-compare.

If these sizes are negative, we're probably in trouble anyway, so
assert nonnegative here.
Needed to resolve PR 58103.

bsd.own.mk: Enable MKSLJIT on aarch64.

Make sure there's only one copy of the conditional, in bsd.own.mk;
just make sys/modules/Makefile conditional on MKSLJIT so we don't
have to keep these in sync.

As a workaround for PR 58106, tweak the conditional definition of
SLJIT_CACHE_FLUSH to use cpu_icache_sync_range only in _HARDKERNEL,
and use __builtin___clear_cache in userland and in rump kernels.

PR 58103: bpfjit.kmod is not built on aarch64
bsd.own.mk: No need for MKSLJIT to be set differently from others.
- Use ?=, not =, so mk.conf setting wins.
- Write out per-architecture tabular settings, not a conditional.
- Add comments for the architectures that look like they should have
sljit but don't. (XXX Missing comments about powerpc and mips --
not sure why, is this because modules don't yet work on those
architectures, or what?)

Tidying for PR 58103: bpfjit.kmod is not built on aarch64.
 1.2.4.2 10-Jun-2019  christos Sync with HEAD
 1.2.4.1 02-Dec-2018  christos file sljit_machdep.h was added on branch phil-wifi on 2019-06-10 22:05:43 +0000
 1.3.18.2 11-May-2024  martin Additionally pull up the following, to fix 32bit compat compilation after
ticket #655, requested by riastradh:

sys/arch/aarch64/include/sljit_machdep.h 1.5

aarch64/sljit_machdep.h: Make this work in compat32 context.
 1.3.18.1 18-Apr-2024  martin Pull up following revision(s) (requested by riastradh in ticket #655):

sys/modules/Makefile: revision 1.285
share/mk/bsd.own.mk: revision 1.1365
share/mk/bsd.own.mk: revision 1.1366
sys/arch/aarch64/include/sljit_machdep.h: revision 1.4
sys/external/bsd/sljit/dist/sljit_src/sljitNativeARM_64.c: revision 1.5

sljit: Pacify -Wsign-compare.

If these sizes are negative, we're probably in trouble anyway, so
assert nonnegative here.
Needed to resolve PR 58103.

bsd.own.mk: Enable MKSLJIT on aarch64.

Make sure there's only one copy of the conditional, in bsd.own.mk;
just make sys/modules/Makefile conditional on MKSLJIT so we don't
have to keep these in sync.

As a workaround for PR 58106, tweak the conditional definition of
SLJIT_CACHE_FLUSH to use cpu_icache_sync_range only in _HARDKERNEL,
and use __builtin___clear_cache in userland and in rump kernels.

PR 58103: bpfjit.kmod is not built on aarch64
bsd.own.mk: No need for MKSLJIT to be set differently from others.
- Use ?=, not =, so mk.conf setting wins.
- Write out per-architecture tabular settings, not a conditional.
- Add comments for the architectures that look like they should have
sljit but don't. (XXX Missing comments about powerpc and mips --
not sure why, is this because modules don't yet work on those
architectures, or what?)

Tidying for PR 58103: bpfjit.kmod is not built on aarch64.
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file sysarch.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3 10-May-2020  skrll branches: 1.3.2;
Provide a trap.h (currently empty)
 1.2 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.28;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.28.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file trap.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.3.2.2 13-May-2020  martin Pull up following revision(s) (requested by skrll in ticket #900):

sys/arch/aarch64/include/Makefile: revision 1.4
sys/arch/aarch64/include/trap.h: revision 1.3
distrib/sets/lists/comp/ad.aarch64: revision 1.40

Provide a trap.h (currently empty)

Update for trap.h
 1.3.2.1 10-May-2020  martin file trap.h was added on branch netbsd-9 on 2020-05-13 12:31:11 +0000
 1.21 03-Nov-2022  skrll Provide MI PMAP support on AARCH64
 1.20 10-Oct-2021  skrll Use sys/uvm/pmap/pmap_tlb.c on Aarch64 in the same way that some Arm, MIPS,
and some PPC kernels do. This removes the limitation of 256 processes on
CPUs with 8bit ASID field, e.g. Apple M1.

Additionally the following changes have been made

- removed a couple of unnecessary aarch64_tlbi_all calls
- removed any invalidation after freeing page tables due to
_pmap_sweep_pdp. This was never necessary afaict.
- all kernel mappings are marked global and userland mapping not-global.

Performance testing hasn't shown a significant difference. The data here
is from building a kernel on an lx2k system with nvme.

before
1489.6u 400.4s 2:40.65 1176.5% 228+224k 0+32289io 57pf+0w
1482.6u 403.2s 2:38.49 1189.9% 228+222k 0+32274io 46pf+0w
1485.4u 402.2s 2:37.27 1200.2% 228+222k 0+32275io 12pf+0w

after
1493.9u 404.6s 2:37.50 1205.4% 227+221k 0+32265io 48pf+0w
1485.0u 408.0s 2:38.54 1194.0% 227+222k 0+32272io 36pf+0w
1484.3u 407.0s 2:35.88 1213.3% 228+224k 0+32268io 14pf+0w

>>> stats.ttest_ind([160.65,158.49,157.27], [157.5,158.54,155.88])
Ttest_indResult(statistic=1.1923622711296888, pvalue=0.2990182944606766)
>>>
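
The t statistic quoted above can be reproduced without SciPy. The following is a minimal sketch, assuming the three "before" and three "after" samples are the elapsed wall-clock fields of the timing lines, and that the default pooled-variance (equal variance) form of stats.ttest_ind was used. The helper names elapsed_seconds and pooled_t are illustrative and do not come from the commit. The p-value is not computed here, since that needs the Student t distribution.

```python
from math import sqrt
from statistics import mean, variance


def elapsed_seconds(field):
    """Convert an elapsed-time field such as '2:40.65' into seconds."""
    minutes, seconds = field.split(":")
    return int(minutes) * 60 + float(seconds)


def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Elapsed fields copied from the before/after timing lines in the log.
before = [elapsed_seconds(t) for t in ("2:40.65", "2:38.49", "2:37.27")]
after = [elapsed_seconds(t) for t in ("2:37.50", "2:38.54", "2:35.88")]

t_stat = pooled_t(before, after)
print(f"t = {t_stat:.6f}")
```

With only three runs per side, a statistic of this size is consistent with the commit's reading that the timing difference is not significant.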
 1.19 30-Sep-2021  skrll Make tlb_asid_t unsigned int as pmap_tlb.c expects tlb_asid_t to be able to
hold ASID_MAX + 1.
 1.18 24-Mar-2021  simonb s/depreciated/deprecated/g
 1.17 23-Jan-2021  jmcneill branches: 1.17.2;
Add __HAVE_BUS_SPACE_8
 1.16 14-Sep-2020  ryo branches: 1.16.2;
PID_MAX is just an initial value (soft maximum). Don't use it for CTASSERT.
Define __HAVE_CPU_MAXPROC to use the function cpu_maxproc().

pointed out by mrg@, thanks.
 1.15 03-Aug-2020  ryo Implement MD ucas(9) (__HAVE_UCAS_FULL)
 1.14 14-Feb-2020  skrll sort __HAVE_* defines. NFCI
 1.13 06-Dec-2019  kamil branches: 1.13.2;
Remove __HAVE_CPU_LWP_SETPRIVATE from aarch64

aarch64 specific cpu_lwp_setprivate() is redundant with its caller
lwp_setprivate() and there are no MD bits.
 1.12 13-Oct-2018  ryo - define PMAP_{MAP,UNMAP}_POOLPAGE for performance
- define __HAVE_MM_MD_KERNACC and add mm_md_kernacc()
 1.11 17-Jul-2018  joerg Be consistent and explicitly size register32_t too.
 1.10 17-Jul-2018  christos match declaration types for registers from reg.h
 1.9 12-Jul-2018  maxv Remove the kernel PMC code. Sent yesterday on tech-kern@.

This change:

* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.

* Removes the PMC code of ARM XSCALE.

* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.

* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.

* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.

* Removes the pmc_evid_t and pmc_ctr_t types.

* Removes all the associated man pages. The sets are marked as obsolete.
 1.8 28-Apr-2018  jmcneill branches: 1.8.2;
Define __HAVE_OLD_DISKLABEL for compatibility with the arm32 port.
 1.7 27-Apr-2018  ryo define __HAVE_ATOMIC64_OPS
pointed out by nonaka@, thanks
 1.6 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.5 28-Feb-2016  joerg branches: 1.5.16;
Reorder using register_t to the point where it is defined.
 1.4 23-Jan-2016  christos expose the kernel types for standalone code.
 1.3 23-Jan-2016  christos Hide {p,v}{addr,size}_t and register_t (and a couple more types that
are machine-specific) from userland unless _KERNEL/_KMEMUSER and a
new _KERNTYPES variable is defined. The _KERNTYPES should be fixed
for many subsystems that should not be using it (rump)...
 1.2 27-Aug-2015  pooka Fix PTHREAD_FOO_INITIALIZER for C++ by not using volatile in the relevant
pthread types in C++ builds, attempt 2.

The problem with attempt 1 was making assumptions of what the MD
__cpu_simple_lock_t (declared volatile) looks like. To get a same type
except non-volatile, we change the MD type to __cpu_simple_lock_nv_t
and typedef __cpu_simple_lock_t as a volatile __cpu_simple_lock_nv_t.
IMO, __cpu_simple_lock_t should not be volatile at all, but changing it
now is too risky.

Fixes at least Rumprun w/ gcc 5.1/5.2. Furthermore, the mpd application
(and possibly others) will no longer require NetBSD-specific patches.

Tested: build.sh for i386, Rumprun for x86_64 w/ gcc 5.2.

Based on the patch from Christos in lib/49989.
 1.1 10-Aug-2014  matt branches: 1.1.4; 1.1.6;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.6.2 19-Mar-2016  skrll Sync with HEAD
 1.1.6.1 22-Sep-2015  skrll Sync with HEAD
 1.1.4.3 03-Dec-2017  jdolecek update from HEAD
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file types.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.5.16.4 20-Oct-2018  pgoyette Sync with head
 1.5.16.3 28-Jul-2018  pgoyette Sync with HEAD
 1.5.16.2 02-May-2018  pgoyette Synch with HEAD
 1.5.16.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.8.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.8.2.1 10-Jun-2019  christos Sync with HEAD
 1.13.2.1 29-Feb-2020  ad Sync with head.
 1.16.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.17.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.1 01-Apr-2018  ryo branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.1.2.2 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.1.2.1 01-Apr-2018  pgoyette file userret.h was added on branch pgoyette-compat on 2018-04-07 04:12:11 +0000
 1.21 30-Jun-2024  jmcneill aarch64: Bump VM_PHYSSEG_MAX to match DRAM_BANKS / FDT_MEMORY_RANGES.

On aarch64 there is a single free list, so VM_PHYSSEG_MAX needs to be
the same as FDT_MEMORY_RANGES (which needs to be the same as DRAM_BANKS).
Future cleanup should be done to fold these into a single define.
 1.20 16-Apr-2023  skrll branches: 1.20.6;
Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V

It's less letters, matches other similar variables and will help with
sharing code between the two architectures.

NFCI.
 1.19 02-Apr-2022  skrll branches: 1.19.4;
Update to support EFI runtime outside the kernel virtual address space
by creating an EFI RT pmap that can be activated / deactivated when
required.

Adds support for EFI RT to ARM_MMU_EXTENDED (ASID) 32-bit Arm machines.

On Arm64 the usage of pmapboot_enter is reduced and the mappings are
created much later in the boot process -- now in cpu_startup_hook.
Backward compatibility for KVA mapped RT from old bootaa64.efi is
maintained.

Adding support to other platforms should be easier as a result.
 1.18 21-Mar-2021  skrll Adjust the kernel virtual address space so that KASAN will map the kernel
separately from managed kernel virtual memory and not map the unused space
between the two.
 1.17 10-Nov-2020  skrll branches: 1.17.2;
AA64 is not MIPS.

Change all KSEG references to directmap
 1.16 06-Oct-2020  christos branches: 1.16.2;
GC unused MAXTSIZ32
 1.15 23-Sep-2020  skrll Readability of a comment
 1.14 19-Sep-2020  skrll Define VM_KERNEL_VM_{BASE,SIZE} for aarch64 and remove an #ifdef in
fdt/platform.h

NFCI
 1.13 16-Sep-2020  skrll G/C AARCH64_KMEMORY_BASE
 1.12 08-Jul-2020  skrll Fix a comment
 1.11 04-Mar-2020  ryo change kernel vm base address to use more than 256GB of memory. (up to 64TB)

also enlarge KSEG(direct map) region from 512GB to 64TB.
KASAN works ok.

Note: -fasan-shadow-offset=
KASAN_SHADOW_START - (CANONICAL_BASE >> 3) =
0xFFFF400000000000 - (0xFFFF000000000000 >> 3) =
0xDFFF600000000000
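
The offset arithmetic in the note above can be checked with a short sketch. It assumes 64-bit wraparound and the usual ASan mapping shadow = (addr >> 3) + offset. The two constants are the ones given in the note; the function names shadow_offset and shadow_address are illustrative only.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned arithmetic

# Constants as given in the commit note.
KASAN_SHADOW_START = 0xFFFF400000000000
CANONICAL_BASE = 0xFFFF000000000000
SHADOW_SCALE_SHIFT = 3  # one shadow byte covers 8 bytes of memory


def shadow_offset(shadow_start, canonical_base, shift=SHADOW_SCALE_SHIFT):
    """Value to pass as -fasan-shadow-offset for this layout."""
    return (shadow_start - (canonical_base >> shift)) & MASK64


def shadow_address(addr, offset, shift=SHADOW_SCALE_SHIFT):
    """Shadow byte address for addr (illustrative helper)."""
    return ((addr >> shift) + offset) & MASK64


offset = shadow_offset(KASAN_SHADOW_START, CANONICAL_BASE)
print(hex(offset))
print(hex(shadow_address(CANONICAL_BASE, offset)))
```

Under these assumptions the lowest canonical kernel address maps to the first byte of the shadow region, which is why the offset is derived by subtracting the shifted base from KASAN_SHADOW_START.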
 1.10 22-Jan-2020  ad Bump UBC defaults on sparc64 & aarch64, which already have a large pager_map.
 1.9 21-Jan-2020  jmcneill Switch aarch64 to use a single freelist.
 1.8 28-Oct-2018  jmcneill branches: 1.8.6;
Document the VA range reserved for EFI runtime services.
 1.7 12-Oct-2018  ryo add initial support of COMPAT_NETBSD32 on AArch64.
arm ELF32 EABI binaries can be executed in AArch32 state on AArch64. A32 THUMB mode is not supported yet.
 1.6 14-Sep-2018  ryo define VM_KERNEL_IO_SIZE for clarity
 1.5 07-Sep-2018  jmcneill Increase VM_PHYSSEG_MAX to 64
 1.4 12-May-2018  jmcneill branches: 1.4.2;
Increase PAGER_MAP_DEFAULT_SIZE to 512MB (from 16MB)
 1.3 01-Apr-2018  ryo Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)
 1.2 11-Aug-2014  matt branches: 1.2.2; 1.2.20;
Add some definitions for building RUMP libraries with MKCOMPAT.
 1.1 10-Aug-2014  matt Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.2.20.5 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.2.20.4 20-Oct-2018  pgoyette Sync with head
 1.2.20.3 30-Sep-2018  pgoyette Sync with HEAD
 1.2.20.2 21-May-2018  pgoyette Sync with HEAD
 1.2.20.1 07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.2.2.3 03-Dec-2017  jdolecek update from HEAD
 1.2.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.2.1 11-Aug-2014  tls file vmparam.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
 1.4.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.4.2.1 10-Jun-2019  christos Sync with HEAD
 1.8.6.1 25-Jan-2020  ad Sync with head.
 1.16.2.2 03-Apr-2021  thorpej Sync with HEAD.
 1.16.2.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.17.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.19.4.1 03-Jul-2024  martin Pull up following revision(s) (requested by jmcneill in ticket #735):

sys/dev/pci/pci_resource.c: revision 1.5
sys/arch/arm/pci/pci_msi_machdep.c: revision 1.10
sys/arch/aarch64/include/vmparam.h: revision 1.21
sys/dev/acpi/acpi_resource.c: revision 1.43

pci_resource: Make unexpected bus numbers in bridges non-fatal.

Firmware bugs happen. Log a warning and continue instead of panicking.
acpi: Ignore producer/consumer bit for fixed memory resources.

The requirement to honour the producer/consumer bit in fixed memory
resource descriptors was dropped at some point in a revision to the ACPI
2.0 specification because too many firmware implementations got it wrong.

aarch64: Bump VM_PHYSSEG_MAX to match DRAM_BANKS / FDT_MEMORY_RANGES.

On aarch64 there is a single free list, so VM_PHYSSEG_MAX needs to be
the same as FDT_MEMORY_RANGES (which needs to be the same as DRAM_BANKS).

Future cleanup should be done to fold these into a single define.

arm: pci: Fix ITS ID lookup for MSIs.
pci_get_frameid expects a BDF requestor ID as input, not a Device ID.

Fixes MSI/MSI-X support on Ampere Altra systems.
 1.20.6.1 01-Jul-2024  perseant Sync with HEAD.
 1.1 10-Aug-2014  matt branches: 1.1.4;
Preliminary files for AARCH64 (64-bit ARM) support.
Enough for a distribution build.
 1.1.4.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.4.1 10-Aug-2014  tls file wchar_limits.h was added on branch tls-maxphys on 2014-08-20 00:02:39 +0000
