History log of /src/sys/arch/x86/include/fpu.h
Revision  Date  Author  Comments
 1.23  24-Oct-2020  mgorny Issue 64-bit versions of *XSAVE* for 64-bit amd64 programs

When calling FXSAVE, XSAVE, FXRSTOR, ... for 64-bit programs on amd64
use the 64-suffixed variant in order to include the complete FIP/FDP
registers in the x87 area.

The difference between the two variants is that the FXSAVE64 (new)
variant represents FIP/FDP as 64-bit fields (union fp_addr.fa_64),
while the legacy FXSAVE variant uses split fields: 32-bit offset,
16-bit segment and 16-bit reserved field (union fp_addr.fa_32).
The latter implies that the actual addresses are truncated to 32 bits
which is insufficient in modern programs.

The change is applied only to 64-bit programs on amd64. Plain i386
and compat32 continue using plain FXSAVE. Similarly, NVMM is not
changed as I am not familiar with that code.

This is a potentially breaking change. However, I don't think it likely
to actually break anything because the data provided by the old variant
were not meaningful (because of the truncated pointer).
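
The two layouts described above can be sketched as follows. This is an illustrative stand-in, not the kernel's actual definition: the union name fp_addr_sketch, the member names inside fa_32, and the helper legacy_visible_offset() are assumptions made for the example; only fa_64 and fa_32 are named in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for union fp_addr. Only fa_64 and fa_32 come from
 * the commit message; the member names inside fa_32 are hypothetical.
 */
union fp_addr_sketch {
	uint64_t fa_64;			/* FXSAVE64: full 64-bit FIP/FDP */
	struct {
		uint32_t fa_off;	/* legacy FXSAVE: 32-bit offset */
		uint16_t fa_seg;	/* 16-bit segment */
		uint16_t fa_rsvd;	/* 16-bit reserved field */
	} fa_32;
};

/* Both views occupy the same 8 bytes of the x87 area. */
static_assert(sizeof(union fp_addr_sketch) == 8, "FIP/FDP slot is 8 bytes");

/*
 * Hypothetical helper: the part of a 64-bit address that fits in the
 * 32-bit offset field of the legacy layout. The upper 32 bits are lost.
 */
static uint32_t
legacy_visible_offset(uint64_t addr)
{
	return (uint32_t)addr;
}
```

For an address such as 0x00007f7fdeadbeef, the legacy offset field can hold only 0xdeadbeef, which is why the old data was of little use to 64-bit programs.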
 1.22  15-Oct-2020  mgorny Revert "Merge convert_xmm_s87.c into fpu.c"

I am going to add ATF tests for these two functions, and having them
in a separate file will make it more convenient to build and run them
in userspace.
 1.21  14-Jun-2020  riastradh Use static constant rather than stack memset buffer for zero fpregs.
 1.20  27-Nov-2019  maxv Add a small API for in-kernel FPU operations.

fpu_kern_enter();
/* do FPU stuff */
fpu_kern_leave();
 1.19  12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.18  04-Oct-2019  maxv Rename fpu_eagerswitch to fpu_switch, and add fpu_xstate_reload to
simplify.
 1.17  26-Jun-2019  mgorny Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
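
The size-carrying 'struct iovec' convention can be illustrated with a small userspace sketch. Everything here is hypothetical: struct regs_v1, struct regs_v2 and copy_out_regs() are invented for the example and are not the kernel's 'struct xstate' or its ptrace code. The sketch only shows why a caller built against an older, shorter structure keeps working when fields are appended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical "old" layout known to an existing caller. */
struct regs_v1 {
	uint64_t rfbm;
	uint8_t  ymm[16];
};

/* Hypothetical "new" layout: same prefix, one field appended. */
struct regs_v2 {
	uint64_t rfbm;
	uint8_t  ymm[16];
	uint8_t  zmm[32];
};

/*
 * Copy at most iov_len bytes of the current structure into the caller's
 * buffer. A caller that passes sizeof(struct regs_v1) gets a partial read
 * of the common prefix instead of a buffer overrun.
 */
static size_t
copy_out_regs(const struct regs_v2 *kern, const struct iovec *iov)
{
	size_t n;

	assert(iov->iov_base != NULL);
	n = iov->iov_len < sizeof(*kern) ? iov->iov_len : sizeof(*kern);
	memcpy(iov->iov_base, kern, n);
	return n;
}
```

The design relies on new fields only ever being appended, so that the prefix seen by older callers keeps its meaning.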
 1.16  19-May-2019  maxv Rename

fpu_save_area_clear -> fpu_clear
fpu_save_area_reset -> fpu_sigreset

Clearer, and reduces a future diff. No real functional change.
 1.15  19-May-2019  maxv Misc changes in the x86 FPU code. Reduces a future diff. No real functional
change.
 1.14  20-Jan-2019  maxv Improvements in NVMM

* Handle the FPU differently, limit the states via the given mask rather
than via XCR0. Align to 64 bytes. Provide an initial gXCR0, to be sure
that XCR0_X87 is set. Reset XSTATE_BV when the state is modified by
the virtualizer, to force a reload from memory.

* Hide RDTSCP.

* Zero-extend RBX/RCX/RDX when handling the NVMM CPUID signature.

* Take ECX and not RCX on MSR instructions.
 1.13  05-Oct-2018  maxv export x86_fpu_mxcsr_mask, fpu_area_save and fpu_area_restore
 1.12  22-Jun-2018  maxv branches: 1.12.2;
Revert jdolecek's changes related to FXSAVE. They didn't make sense and
were trying to hide a real bug: for some reason the stack is misaligned,
which causes FXSAVE to fault in fpuinit_mxcsr_mask. As seen on
current-users@ yesterday, rdi % 16 = 8. The same was seen several months
ago as well.

The rest of the changes in XSAVE are wrong too, but I'll let him fix these
ones.
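
FXSAVE requires its memory operand to be 16-byte aligned, which is why rdi % 16 = 8 faults. A minimal sketch of that check follows; FXSAVE_ALIGN, is_fxsave_aligned() and fxsave_align_down() are hypothetical names for illustration and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define FXSAVE_ALIGN	16u	/* FXSAVE operand must be 16-byte aligned */

static_assert((FXSAVE_ALIGN & (FXSAVE_ALIGN - 1)) == 0,
    "alignment must be a power of two");

/* Nonzero if addr is usable as an FXSAVE operand. */
static int
is_fxsave_aligned(uintptr_t addr)
{
	return (addr % FXSAVE_ALIGN) == 0;
}

/* Round addr down to the nearest 16-byte boundary. */
static uintptr_t
fxsave_align_down(uintptr_t addr)
{
	return addr & ~(uintptr_t)(FXSAVE_ALIGN - 1);
}
```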
 1.11  20-Jun-2018  jdolecek as a stop-gap, make fpuinit_mxcsr_mask() for native independent of
XSAVE as it should be, only xen case checks the flag now; need to
investigate further why exactly the fault happens for the xen
no-xsave case

pointed out by maxv
 1.10  19-Jun-2018  jdolecek fix FPU initialization on Xen to allow e.g. AVX when supported by hardware;
only use XSAVE when the CPUID OSXSAVE bit is set, as this seems to be
a reliable indication

tested with Xen 4.2.6 DOM0/DOMU on Intel CPU, without and with no-xsave flag,
so should work also on those AMD CPUs, which have XSAVE disabled by default;
also tested with Xen DOM0 4.8.3

fixes PR kern/50332 by Torbjorn Granlund; sorry it took three years to address

XXX pullup netbsd-8
 1.9  14-Jun-2018  maxv Add some code to support eager fpu switch, INTEL-SA-00145. We restore the
FPU state of the lwp right away during context switches. This guarantees
that when the CPU executes in userland, the FPU doesn't contain secrets.

Maybe we also need to clear the FPU in setregs(), not sure about this one.

Can be enabled/disabled via:

machdep.fpu_eager = {0/1}

Not yet turned on automatically on affected CPUs (Intel Family 6).

More generally it would be good to turn it on automatically when XSAVEOPT
is supported, because in this case there is probably a non-negligible
performance gain; but we need to fix PR/52966.
 1.8  23-May-2018  maxv Merge convert_xmm_s87.c into fpu.c. It contains only two functions, that
are used only in fpu.c.
 1.7  03-Nov-2017  maxv branches: 1.7.2;
Fix MXCSR_MASK, it needs to be detected dynamically, otherwise when masking
MXCSR we are losing some features (eg DAZ).
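
The effect of a hard-coded mask can be sketched as below. The helper names effective_mxcsr_mask() and sanitize_mxcsr() are hypothetical; the constants reflect the architectural rule that a saved MXCSR_MASK field of 0 means the default 0xFFBF, which excludes DAZ (bit 6). This is an illustration of the masking logic only, not the kernel's detection code.

```c
#include <assert.h>
#include <stdint.h>

#define MXCSR_DAZ		0x0040u	/* denormals-are-zero, bit 6 */
#define MXCSR_MASK_DEFAULT	0xffbfu	/* used when the saved mask reads 0 */

/* The default mask does not allow DAZ to be set. */
static_assert((MXCSR_MASK_DEFAULT & MXCSR_DAZ) == 0,
    "default mask excludes DAZ");

/*
 * Pick the mask to apply: the value the CPU reported in its save area,
 * or the architectural default if that field is 0.
 */
static uint32_t
effective_mxcsr_mask(uint32_t saved_mask)
{
	return saved_mask != 0 ? saved_mask : MXCSR_MASK_DEFAULT;
}

/* Clear any MXCSR bits the CPU does not support. */
static uint32_t
sanitize_mxcsr(uint32_t mxcsr, uint32_t mask)
{
	return mxcsr & mask;
}
```

With the fixed default mask, a user-supplied MXCSR with DAZ set has that bit silently cleared even on CPUs that support it; with the detected mask the bit survives.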
 1.6  25-Feb-2014  dsl branches: 1.6.4; 1.6.6; 1.6.10; 1.6.28;
Add support for saving the AVX-256 ymm registers during FPU context switches.
Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all CPUs; I'm not brave enough to enable XSAVEOPT.
 1.5  23-Feb-2014  dsl Add fpu_set_default_cw() and use it in the emulations to set the default
x87 control word.
This means that nothing outside fpu.c cares about the internals of the
fpu save area.
New kernel modules won't load with the old kernel - but that won't matter.
 1.4  15-Feb-2014  dsl Load and save the fpu registers (for copies to/from userspace) using
helper functions in arch/x86/x86/fpu.c
They (hopefully) ensure that we write to the entire buffer and don't load
values that might cause faults in kernel.
Also zero out the 'pad' field of the i386 mcontext fp area that I think
once contained the registers of any Weitek fpu.
Dunno why it wasn't part of the union.
Some of these copies could be removed if the code directly copied the save
area to/from userspace addresses.
 1.3  15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.2  12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.1  11-Feb-2014  dsl Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h
into sys/arch/x86 in preparation for using the same code for i386.
 1.6.28.1  23-Jun-2018  martin Pull up the following, via patch, requested by maxv in ticket #897:

sys/arch/amd64/amd64/locore.S 1.166 (patch)
sys/arch/i386/i386/locore.S 1.157 (patch)
sys/arch/x86/include/cpu.h 1.92 (patch)
sys/arch/x86/include/fpu.h 1.9 (patch)
sys/arch/x86/x86/fpu.c 1.33-1.39 (patch)
sys/arch/x86/x86/identcpu.c 1.72 (patch)
sys/arch/x86/x86/vm_machdep.c 1.34 (patch)
sys/arch/x86/x86/x86_machdep.c 1.116,1.117 (patch)

Support eager fpu switch, to work around INTEL-SA-00145.
Provide a sysctl machdep.fpu_eager, which gets automatically
initialized to 1 on affected CPUs.
 1.6.10.3  03-Dec-2017  jdolecek update from HEAD
 1.6.10.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.6.10.1  25-Feb-2014  tls file fpu.h was added on branch tls-maxphys on 2014-08-20 00:03:29 +0000
 1.6.6.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.6.6.1  25-Feb-2014  yamt file fpu.h was added on branch yamt-pagecache on 2014-05-22 11:40:13 +0000
 1.6.4.2  18-May-2014  rmind sync with head
 1.6.4.1  25-Feb-2014  rmind file fpu.h was added on branch rmind-smpnet on 2014-05-18 17:45:30 +0000
 1.7.2.3  26-Jan-2019  pgoyette Sync with HEAD
 1.7.2.2  20-Oct-2018  pgoyette Sync with head
 1.7.2.1  25-Jun-2018  pgoyette Sync with HEAD
 1.12.2.3  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.12.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.12.2.1  10-Jun-2019  christos Sync with HEAD
