History log of /src/sys/arch/arm/vfp
Revision  Date  Author  Comments
 1.7 17-Oct-2021  skrll Trailing whitespace
 1.6 31-Dec-2012  matt branches: 1.6.2; 1.6.6;
Switch to using vfp_kernel_{acquire,release} so that softints don't cause
the VFP to become disabled.
 1.5 26-Dec-2012  matt Add not-yet-enabled code to use vfp_kernel_{acquire,release}
 1.4 11-Dec-2012  matt Use RET, not bx lr.
Due to evbarm/conf/INTEGRATOR conditional use of pld.
 1.3 11-Dec-2012  matt These contain to just contain bzero_page_vfp and bcopy_page_vfp
 1.2 10-Dec-2012  matt Make sure we can deal with VA != PA but still we need to have all of PA mapped.
 1.1 10-Dec-2012  matt Add code to use VFP (or Neon) instructions to zero or copy a page via
pmap_zero_page and pmap_copy_page. (Not hooked into vfp_init yet).
Requires FPU_VFP
 1.6.6.2 25-Feb-2013  tls resync with head
 1.6.6.1 31-Dec-2012  tls file pmap_vfp.S was added on branch tls-maxphys on 2013-02-25 00:28:32 +0000
 1.6.2.3 23-Jan-2013  yamt sync with head
 1.6.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.6.2.1 31-Dec-2012  yamt file pmap_vfp.S was added on branch yamt-pagecache on 2013-01-16 05:32:50 +0000
 1.78 20-Aug-2022  riastradh fpu_kern_enter/leave: Disable IPL assertions.

These don't work because mutex_enter/exit on a spin lock may raise an
IPL but not lower it, if another spin lock was already held. For
example,

mutex_enter(some_lock_at_IPL_VM);
printf("foo\n");
fpu_kern_enter();
...
fpu_kern_leave();
mutex_exit(some_lock_at_IPL_VM);

will trigger the panic, because printf takes a lock at IPL_HIGH where
the IPL will remain until the mutex_exit. (This was a nightmare to
track down before I remembered that detail of spin lock IPL
semantics...)
 1.77 01-Apr-2022  riastradh x86, arm: Allow fpu_kern_enter/leave while cold.

Normally these are forbidden above IPL_VM, so that FPU usage doesn't
block IPL_SCHED or IPL_HIGH interrupts. But while cold, e.g. during
builtin module initialization at boot, all interrupts are blocked
anyway so it's a moot point.

Also initialize x86 cpu_info_primary.ci_kfpu_spl to -1 so we don't
trip over an assertion about it while cold -- the assertion is meant
to detect reentrance into fpu_kern_enter/leave, which is prohibited.

Also initialize cpu0's ci_kfpu_spl.
 1.76 31-Oct-2021  skrll Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3
 1.75 17-Oct-2021  skrll Trailing whitespace
 1.74 01-Jun-2021  rin PR port-arm/55790

Fix KASSERT failure with floating-point exception in userland.

Consider the case in which curlwp owns enabled FPU in vfp_handler().
If FPE is raised, we must skip pcu_load(9) rather than just falling
through. Otherwise, KASSERT fires in vfp_state_load(), since curlwp
already owns enabled FPU.

No regression for ATF is introduced.
 1.73 01-Jun-2021  rin PR port-arm/55790

Style fix for clarity, in preparation of main fix.

Replace condition ``curcpu()->ci_pcu_curlwp[PCU_FPU] == curlwp'' with
``curlwp->l_pcu_cpu[PCU_FPU] == curcpu()''. And add KASSERT to check
the two conditions are equivalent, as done for MI pcu code:

https://nxr.netbsd.org/xref/src/sys/kern/subr_pcu.c#323

No functional changes.
 1.72 30-Oct-2020  skrll branches: 1.72.6;
Retire arm_[di]sb in favour of the isb() and dsb(sy) macro invocations.
 1.71 01-Aug-2020  riastradh Add kthread_fpu_enter/exit support to arm.
 1.70 27-Jul-2020  riastradh Enable ChaCha NEON code on armv7 too.

The 4-blocks-at-a-time assembly helper is disabled for now; adapting
it to armv7 is going to be a little annoying with only 16 128-bit
vector registers.

(Should also do a fifth block in the integer registers for 320 bytes
at a time.)
 1.69 25-Jul-2020  riastradh Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.
 1.68 13-Jul-2020  riastradh Use pcu_save_all_on_cpu, not pcu_save.

We don't care what curlwp is here; we care whose state is in the fpu
registers.
 1.67 13-Jul-2020  riastradh Limit arm32 fpu_kern_enter/leave to IPL_VM or below.
 1.66 29-Jun-2020  riastradh New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.
 1.65 29-Jun-2020  riastradh Implement fpu_kern_enter/leave for arm32.
 1.64 29-Oct-2019  joerg Explicitly annotate FPU requirements for LLVM MC.

When using GCC, these annotations change the global state, but there is
no push/pop functionality for .fpu to avoid this problem. The state is
local to each inline assembler block with LLVM MC.
 1.63 07-Sep-2019  tnn Cortex A12 is marketed as A17 but has a distinct part number

observed on Rockchip RK3288
 1.62 06-Apr-2019  skrll Install the undefined instruction handlers only once, i.e. when attaching
on the BP.
 1.61 17-Mar-2019  skrll Trailing whitespace
 1.60 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.59 15-Aug-2018  skrll Sprinkle #include "opt_cputypes.h"
 1.58 15-Aug-2018  skrll Add __KERNEL_RCSID
 1.57 08-Apr-2018  bouyer branches: 1.57.2;
Remove the call to vfp_fpscr_handler() from vfp_handler(). It actually never
avoids a full FPU switch, and costs a function call and a few tests.

Discussed on port-arm@ in October 2017:
http://mail-index.netbsd.org/port-arm/2017/10/16/msg004411.html
 1.56 02-Mar-2018  christos branches: 1.56.2;
Add more vfp directives for gcc-6
 1.55 16-Oct-2017  bouyer We KASSERT((fregs->vfp_fpexc & VFP_FPEXC_EN) == 0) just before, so
enabled is always false. remove.
 1.54 16-Oct-2017  bouyer In the REENABLE case, make sure the fpexc copy in the pcb also has
VFP_FPEXC_EN set. Otherwise we could trap on every context switch even if
the CPU already has the VFP state.
 1.53 26-May-2017  jmcneill branches: 1.53.2;
Recognize Cortex-A57 FPU, GIC, and Generic Timer.
 1.52 22-Mar-2017  chs in vfp_state_load(), fix backwards logic for fpinst vs. fpinst2.
 1.51 16-Mar-2017  chs allow pcu_save() and pcu_discard() to be called on other threads,
ptrace needs to use it that way.
 1.50 03-Mar-2016  skrll branches: 1.50.2; 1.50.4;
Get the RPI3 working (in aarch32 mode) by recognising Cortex A53 CPUs.
While I'm here add some A57/A72 info as well.

My RPI3 works with FB console - the uart needs some help with its clocks.
 1.49 12-Nov-2015  jmcneill change some register dumps from aprint_verbose to aprint_debug
 1.48 28-Apr-2015  jmcneill isb after writing cpacr, from Andrew Turner
 1.47 23-Mar-2015  matt Fix some inverted return values. Don't return SIGILL if there is an active
FPU exception.
 1.46 20-Mar-2015  matt Remove extra )
 1.45 20-Mar-2015  matt Not only check to see if we own the VFP but that the VFP is enabled.
 1.44 17-Mar-2015  matt Don't try to catch undefined VFP instructions if we own the FPU.
Let them raise SIGILL.
 1.43 17-Mar-2015  matt If we own the FPU, don't take anymore undefined faults. Instead generate
SIGILLs since we obviously don't understand the instruction.
 1.42 09-Feb-2015  slp Add VFP IDs for QEMU's emulated Cortex-A15.
 1.41 18-Jul-2014  matt branches: 1.41.2; 1.41.4;
fix typo reported in PR/48948
 1.40 15-Jun-2014  matt Cleanup a bit of the init logic.
 1.39 16-May-2014  rmind pcu(9):
- Remove PCU_KERNEL (hi matt!) and significantly simplify the code.
This experimental feature was tried on ARM and did not meet expectations.
It may be revived one day, but it should be done in a much simpler way.
- Add a message structure for the xcall function, pass the LWP owner and thus
optimise a race condition: if an LWP is discarding its state on a remote CPU,
but another LWP already did it - do not cause an unnecessary re-faulting.
- Reduce the variety of flags for PCU operations (only PCU_VALID and
PCU_REENABLE are used now), pass them only to the pcu_state_load().
- Rename pcu_used_p() to pcu_valid_p(); hopefully it is less confusing.
- pcu_save_all_on_cpu: SPL ought to be used here.
- Update and improve the pcu(9) man page; it needs wizd(8) though.
 1.38 06-Apr-2014  matt propogation -> propagation
 1.37 28-Mar-2014  matt branches: 1.37.2;
Various MP changes.
 1.36 18-Mar-2014  matt Enable VFP on MV88SV58XX
 1.35 04-Mar-2014  matt Add a different version vfp_fpscr_changable if FPU_VFP was not defined.
If no FPU was found, reinit vfp_fpscr_changeable/default to values appropriate
for softfloat.
 1.34 03-Mar-2014  matt Query the media and vfp feature registers to determine what our default
mode should be and what bits in the fpscr can be changed.
Print what features are supported:
vfp0 at cpu0: NEON MPE (VFP 3.0+), rounding, NaN propogation, denormals
 1.33 25-Jan-2014  skrll Improve PCU/VFP handling to the point that the atf tests don't trigger
KASSERTs on the Raspberry PI and its arm1176jzf-s.

XXX Need to emulate bounce instructions to get correct exception codes,
XXX etc.
 1.32 24-Jan-2014  skrll Be consistent about setting fpscr for Runfast. No functional change.
 1.31 23-Jan-2014  skrll Fix typo in #define name
 1.30 21-Jan-2014  skrll Typo in comment
 1.29 27-Dec-2013  matt Switch to using FP instructions instead of cp10/11 instructions.
 1.28 14-Dec-2013  matt If we can't enable VFP/VFP2 via the CPACCESS register, bail since there
isn't a VFP.
 1.27 18-Nov-2013  matt Before checking for an exception, make sure we own the VFP.
 1.26 23-Aug-2013  matt Deal with lack of VFP.
 1.25 23-Aug-2013  matt Reap LWP_VFPUSED and use PCU internal tracking.
Add bool vfp_used_p(void);
 1.24 22-Aug-2013  drochner -extend the pcu(9) API by a function which saves all context on the
current CPU, and use it if a CPU is taken offline
-add a bool argument to pcu_discard which tells whether the internal
"LWP has used the coprocessor" flag should be set or reset. The flag
is reported by pcu_used_p(). If set, future accesses should use the
state stored in the PCB. If reset, it should be reset to default.
The former case is useful for setmcontext().
With that, it should not be necessary anymore to manage the "FPU used"
state by an additional MD variable.

approved by matt
 1.23 18-Aug-2013  matt Move parts of cpu.h that are not needed by MI code in <arm/locore.h>
Don't include <machine/cpu.h> or <machine/frame.h>, use <arm/locore.h>
Use <arm/asm.h> instead of <machine/arm.h>
 1.22 03-Aug-2013  matt Add VFP_FPSCR_RN (even though it's 0) just to be explicit.
 1.21 02-Aug-2013  matt Use armreg inlines.
Add exception -> trapsignal code.
 1.20 20-Jun-2013  matt branches: 1.20.2;
Add support for the Cortex-A15 Neon/VFP unit
 1.19 05-Feb-2013  matt Use the mrc form of the vmrs rX, mvfrX instruction to shut up gas.
 1.18 31-Jan-2013  matt Add support for machdep neon_present and id_mvfr sysctls
 1.17 28-Jan-2013  matt Add a machdep.fpu_present sysctl for ld.elf_so to use in ld.so.conf to load
libc_vfp.so.
 1.16 28-Jan-2013  matt Disable bzero_page_vfp and bcopy_page_vfp since it really isn't any faster
than memcpy.
 1.15 31-Dec-2012  matt Always re-enable the VFP when loading for a kernel LWP.
 1.14 31-Dec-2012  matt print the PC of the VFP kernel fault in the panic message.
 1.13 26-Dec-2012  matt Add support for PCU_KERNEL and vfp_kernel_acquire/vfp_kernel_release.
Add an undefined handler to catch NEON instructions.
 1.12 11-Dec-2012  matt Add code to patch pmap_{copy,zero}_page_generic to change calls to
b{copy,zero}_page to b{copy,zero}_page_vfp
 1.11 10-Dec-2012  matt move inlines into FPU_VFP
 1.10 08-Dec-2012  matt On Cortex, make sure to load/save the upper 16 64-bit FP registers.
When creating a mcontext_t, make sure _UC_ARM_VFP is set.
 1.9 05-Dec-2012  matt For armv7 (cortex), disable access to the upper 16 FP registers (restrict
the register space to 16 64-bit FP registers).
 1.8 05-Dec-2012  matt ARMFPE hasn't compiled since NetBSD 4. Remove it.
Complete support for FPU_VFP.
fpregs now contains vfpreg.
XXX vfpreg only has space for 16 64-bit FP registers though VFPv3 and later
have 32 64-bit FP registers.
 1.7 22-Sep-2012  matt Only use CPACR register for ARM11 and CORTEX cores.
Add VFP ids for other CORTEX CPUs.
 1.6 22-Sep-2012  matt Before testing for VFP, make sure CP10 is enabled. (And CP11 for Neon too).
 1.5 16-Aug-2012  matt branches: 1.5.2;
Add include of <arm/pcb.h>
 1.4 12-Aug-2012  matt Rework VFP support to use PCU.
Add emulation of instruction which save/restore the VFP FPSCR.
Add a sysarch hook to VFP FPSCR manipulation.

[The emulation will be used by libc to store/fetch exception modes and
rounding mode on a per-thread basis.]
 1.3 21-Nov-2009  rmind branches: 1.3.12; 1.3.20;
Use lwp_getpcb() on ARM (and acorn26/32), clean from struct user usage.
 1.2 18-Mar-2009  cegger Ansify function definitions w/o arguments. Generated with sed.
 1.1 15-Mar-2008  rearnsha branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.12; 1.1.20; 1.1.26;
VFP support.
 1.1.26.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.20.1 28-Apr-2009  skrll Sync with HEAD.
 1.1.12.2 11-Mar-2010  yamt sync with head
 1.1.12.1 04-May-2009  yamt sync with head.
 1.1.8.2 24-Mar-2008  keiichi sync with head.
 1.1.8.1 15-Mar-2008  keiichi file vfp_init.c was added on branch keiichi-mipv6 on 2008-03-24 07:14:54 +0000
 1.1.6.2 23-Mar-2008  matt sync with HEAD
 1.1.6.1 15-Mar-2008  matt file vfp_init.c was added on branch matt-armv6 on 2008-03-23 02:03:56 +0000
 1.1.4.2 21-Mar-2008  chris Sync with head.
 1.1.4.1 15-Mar-2008  chris file vfp_init.c was added on branch chris-arm-intr-rework on 2008-03-21 13:34:41 +0000
 1.1.2.2 17-Mar-2008  yamt sync with head.
 1.1.2.1 15-Mar-2008  yamt file vfp_init.c was added on branch yamt-lazymbuf on 2008-03-17 09:14:15 +0000
 1.3.20.1 28-Nov-2012  matt Merge improved arm support (especially Cortex) from HEAD
including OMAP and BCM53xx support.
 1.3.12.4 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.3.12.3 23-Jan-2013  yamt sync with head
 1.3.12.2 16-Jan-2013  yamt sync with (a bit old) head
 1.3.12.1 30-Oct-2012  yamt sync with head
 1.5.2.5 03-Dec-2017  jdolecek update from HEAD
 1.5.2.4 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.5.2.3 23-Jun-2013  tls resync from head
 1.5.2.2 25-Feb-2013  tls resync with head
 1.5.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.20.2.2 18-May-2014  rmind sync with head
 1.20.2.1 28-Aug-2013  rmind sync with head
 1.37.2.1 10-Aug-2014  tls Rebase.
 1.41.4.5 28-Aug-2017  skrll Sync with HEAD
 1.41.4.4 19-Mar-2016  skrll Sync with HEAD
 1.41.4.3 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.41.4.2 06-Jun-2015  skrll Sync with HEAD
 1.41.4.1 06-Apr-2015  skrll Sync with HEAD
 1.41.2.3 26-Jul-2017  snj Pull up following revision(s) (requested by jmcneill in ticket #1435):
sys/arch/arm/arm32/cpu.c: 1.113 via patch
sys/arch/arm/broadcom/bcm2835_bsc.c: 1.6 via patch
sys/arch/arm/broadcom/bcm2835_plcom.c: 1.4 via patch
sys/arch/arm/cortex/gtmr.c: 1.18 via patch
sys/arch/arm/include/armreg.h: 1.110 via patch
sys/arch/arm/include/vfpreg.h: 1.15 via patch
sys/arch/arm/vfp/vfp_init.c: 1.50 via patch
sys/arch/evbarm/rpi/rpi_machdep.c: 1.59, 1.70-1.72 via patch
sys/arch/evbarm/rpi/vcprop.h: 1.16
Get the RPI3 working (in aarch32 mode) by recognising Cortex A53 CPUs.
While I'm here add some A57/A72 info as well.
My RPI3 works with FB console - the uart needs some help with its clocks.
--
Do invalidate the cache as RPI2 build with Clang can't fetch the memory
config otherwise.
--
Use the VC property mailbox to request the UART clock rate and use it
appropriately
Newer firmwares use 48MHz
--
Disable BSC0 on Raspberry Pi 3 and Zero W boards.
--
Interrupts are enabled before the timer is configured. Ensure that the
timer is disabled when attaching so it doesn't go crazy between the time
interrupts are enabled and clocks are initialized. My RPI3 makes it
multi-user now.
--
Enable UART0 (PL011) on GPIO header for Raspberry Pi 3 / Zero W
 1.41.2.2 26-Mar-2015  snj Pull up following revision(s) (requested by skrll in ticket #643):
sys/arch/arm/vfp/vfp_init.c: revision 1.47
Fix some inverted return values. Don't return SIGILL if there is an active
FPU exception.
 1.41.2.1 21-Mar-2015  snj Pull up following revision(s) (requested by martin in ticket #621):
sys/arch/arm/vfp/vfp_init.c: revisions 1.43-1.46
If we own the FPU, don't take anymore undefined faults. Instead generate
SIGILLs since we obviously don't understand the instruction.
--
Don't try to catch undefined VFP instructions if we own the FPU.
Let them raise SIGILL.
--
Not only check to see if we own the VFP but that the VFP is enabled.
--
Remove extra )
 1.50.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.50.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.50.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.53.2.1 24-Oct-2017  snj branches: 1.53.2.1.2;
Pull up following revision(s) (requested by bouyer in ticket #326):
sys/arch/arm/vfp/vfp_init.c: revision 1.54-1.55
sys/kern/subr_pcu.c: revision 1.21
PR port-arm/52603:
There is a race here, as seen on arm with FPU:
LWP L is running but not on CPU, has its FPU state on CPU2 which
has not been released yet, so fpexc still has VFP_FPEXC_EN set in the PCB copy.
LWP L is scheduled on CPU1, CPU1 calls cpu_switchto() for L in mi_switch().
cpu_switchto() will set VFP_FPEXC_EN in the FPU's fpexc register per the
PCB fpexc copy.
Before CPU1 calls pcu_switchpoint() for L, CPU2 calls
pcu_do_op(PCU_CMD_SAVE | PCU_CMD_RELEASE) for L because it still holds its
FPU state and wants to load another lwp. This causes VFP_FPEXC_EN to
be cleared in the PCB copy, but not in CPU1's register. L's l_pcu_cpu is
set to NULL.
When CPU1 calls pcu_switchpoint() for L it sees l_pcu_cpu is NULL, and doesn't
call the release callback.
Now CPU1 has its FPU enabled but with the wrong FPU state.
Fix by releasing the PCU even if l_pcu_cpu is NULL.
--
In the REENABLE case, make sure the fpexc copy in the pcb also has
VFP_FPEXC_EN set. Otherwise we could trap on every context switch even if
the CPU already has the VFP state.
--
We KASSERT((fregs->vfp_fpexc & VFP_FPEXC_EN) == 0) just before, so
enabled is always false. remove.
 1.53.2.1.2.1 13-Dec-2017  matt Make sure the VFP is disabled after disabling it.
 1.56.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.56.2.1 16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.57.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.57.2.1 10-Jun-2019  christos Sync with HEAD
 1.72.6.1 17-Jun-2021  thorpej Sync w/ HEAD.
