
History log of /src/sys/arch/x86/include/fpu.h
Revision	Date	Author	Comments
1.23	24-Oct-2020	mgorny	Issue 64-bit versions of XSAVE for 64-bit amd64 programs When calling FXSAVE, XSAVE, FXRSTOR, ... for 64-bit programs on amd64 use the 64-suffixed variant in order to include the complete FIP/FDP registers in the x87 area. The difference between the two variants is that the FXSAVE64 (new) variant represents FIP/FDP as 64-bit fields (union fp_addr.fa_64), while the legacy FXSAVE variant uses split fields: 32-bit offset, 16-bit segment and 16-bit reserved field (union fp_addr.fa_32). The latter implies that the actual addresses are truncated to 32 bits which is insufficient in modern programs. The change is applied only to 64-bit programs on amd64. Plain i386 and compat32 continue using plain FXSAVE. Similarly, NVMM is not changed as I am not familiar with that code. This is a potentially breaking change. However, I don't think it likely to actually break anything because the data provided by the old variant were not meaningful (because of the truncated pointer).
1.22	15-Oct-2020	mgorny	Revert "Merge convert_xmm_s87.c into fpu.c" I am going to add ATF tests for these two functions, and having them in a separate file will make it more convenient to build and run them in userspace.
1.21	14-Jun-2020	riastradh	Use static constant rather than stack memset buffer for zero fpregs.
1.20	27-Nov-2019	maxv	Add a small API for in-kernel FPU operations. fpu_kern_enter(); /* do FPU stuff */ fpu_kern_leave();
1.19	12-Oct-2019	maxv	Rewrite the FPU code on x86. This greatly simplifies the logic and removes the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on port-amd64 a week ago. Bump the kernel version to 9.99.16.
1.18	04-Oct-2019	maxv	Rename fpu_eagerswitch to fpu_switch, and add fpu_xstate_reload to simplify.
1.17	26-Jun-2019	mgorny	Implement PT_GETXSTATE and PT_SETXSTATE Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE, that provide access to the extended (and extensible) set of FPU registers on amd64 and i386. At the moment, this covers AVX (YMM) and AVX-512 (ZMM, opmask) registers. It can be easily extended to cover further register types without breaking backwards compatibility. PT_GETXSTATE issues the XSAVE instruction with all kernel-supported extended components enabled. The data is copied into 'struct xstate' (which -- unlike the XSAVE area itself -- has stable format and offsets). PT_SETXSTATE issues the XRSTOR instruction to restore the register values from user-provided 'struct xstate'. The function replaces only the specific XSAVE components that are listed in 'xs_rfbm' field, making it possible to issue partial updates. Both syscalls take a 'struct iovec' pointer rather than a direct argument. This requires the caller to explicitly specify the buffer size. As a result, existing code will continue to work correctly when the structure is extended (performing partial reads/updates).
1.16	19-May-2019	maxv	Rename fpu_save_area_clear -> fpu_clear fpu_save_area_reset -> fpu_sigreset Clearer, and reduces a future diff. No real functional change.
1.15	19-May-2019	maxv	Misc changes in the x86 FPU code. Reduces a future diff. No real functional change.
1.14	20-Jan-2019	maxv	Improvements in NVMM * Handle the FPU differently, limit the states via the given mask rather than via XCR0. Align to 64 bytes. Provide an initial gXCR0, to be sure that XCR0_X87 is set. Reset XSTATE_BV when the state is modified by the virtualizer, to force a reload from memory. * Hide RDTSCP. * Zero-extend RBX/RCX/RDX when handling the NVMM CPUID signature. * Take ECX and not RCX on MSR instructions.
1.13	05-Oct-2018	maxv	export x86_fpu_mxcsr_mask, fpu_area_save and fpu_area_restore
1.12	22-Jun-2018	maxv	branches: 1.12.2; Revert jdolecek's changes related to FXSAVE. They just didn't make any sense and were trying to hide a real bug, which is, that there is for some reason a wrong stack alignment that causes FXSAVE to fault in fpuinit_mxcsr_mask. As seen in current-users@ yesterday, rdi % 16 = 8. And as seen several months ago, as well. The rest of the changes in XSAVE are wrong too, but I'll let him fix these ones.
1.11	20-Jun-2018	jdolecek	as a stop-gap, make fpuinit_mxcsr_mask() for native independent of XSAVE as it should be, only xen case checks the flag now; need to investigate further why exactly the fault happens for the xen no-xsave case pointed out by maxv
1.10	19-Jun-2018	jdolecek	fix FPU initialization on Xen to allow e.g. AVX when supported by hardware; only use XSAVE when the CPUID OSXSAVE bit is set, as this seems to be a reliable indication tested with Xen 4.2.6 DOM0/DOMU on Intel CPU, without and with no-xsave flag, so should work also on those AMD CPUs, which have XSAVE disabled by default; also tested with Xen DOM0 4.8.3 fixes PR kern/50332 by Torbjorn Granlund; sorry it took three years to address XXX pullup netbsd-8
1.9	14-Jun-2018	maxv	Add some code to support eager fpu switch, INTEL-SA-00145. We restore the FPU state of the lwp right away during context switches. This guarantees that when the CPU executes in userland, the FPU doesn't contain secrets. Maybe we also need to clear the FPU in setregs(), not sure about this one. Can be enabled/disabled via: machdep.fpu_eager = {0/1} Not yet turned on automatically on affected CPUs (Intel Family 6). More generally it would be good to turn it on automatically when XSAVEOPT is supported, because in this case there is probably a non-negligible performance gain; but we need to fix PR/52966.
1.8	23-May-2018	maxv	Merge convert_xmm_s87.c into fpu.c. It contains only two functions, that are used only in fpu.c.
1.7	03-Nov-2017	maxv	branches: 1.7.2; Fix MXCSR_MASK, it needs to be detected dynamically, otherwise when masking MXCSR we are losing some features (eg DAZ).
1.6	25-Feb-2014	dsl	branches: 1.6.4; 1.6.6; 1.6.10; 1.6.28; Add support for saving the AVX-256 ymm registers during FPU context switches. Add support for the forthcoming AVX-512 registers. Code compiled with -mavx seems to work, but I've not tested context switches with live ymm registers. There is a small cost on fork/exec (a larger area is copied/zeroed), but I don't think the ymm registers are read/written unless they have been used. The code uses XSAVE on all cpus, I'm not brave enough to enable XSAVEOPT.
1.5	23-Feb-2014	dsl	Add fpu_set_default_cw() and use it in the emulations to set the default x87 control word. This means that nothing outside fpu.c cares about the internals of the fpu save area. New kernel modules won't load with the old kernel - but that won't matter.
1.4	15-Feb-2014	dsl	Load and save the fpu registers (for copies to/from userspace) using helper functions in arch/x86/x86/fpu.c They (hopefully) ensure that we write to the entire buffer and don't load values that might cause faults in kernel. Also zero out the 'pad' field of the i386 mcontext fp area that I think once contained the registers of any Weitek fpu. Dunno why it wasn't part of the union. Some of these copies could be removed if the code directly copied the save area to/from userspace addresses.
1.3	15-Feb-2014	dsl	Remove all references to MDL_USEDFPU and deferred fpu initialisation. The cost of zeroing the save area on exec is minimal. This stops the FP registers of a random process being used the first time an lwp uses the fpu. sendsig_siginfo() and get_mcontext() now unconditionally copy the FP registers. I'll remove the double-copy for signal handlers soon. get_mcontext() might have been leaking kernel memory to userspace - and may still do so if i386_use_fxsave is false (short copies).
1.2	12-Feb-2014	dsl	Change i386 to use x86/fpu.c instead of i386/isa/npx.c This changes the trap10 and trap13 code to call directly into fpu.c, removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c Not all of the code that appeared to handle fpu traps was ever called! Most of the changes just replace the include of machine/npx.h with x86/fpu.h (or remove it entirely).
1.1	11-Feb-2014	dsl	Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h into sys/arch/x86 in preparation for using the same code for i386.
1.6.28.1	23-Jun-2018	martin	Pull up the following, via patch, requested by maxv in ticket #897: sys/arch/amd64/amd64/locore.S 1.166 (patch) sys/arch/i386/i386/locore.S 1.157 (patch) sys/arch/x86/include/cpu.h 1.92 (patch) sys/arch/x86/include/fpu.h 1.9 (patch) sys/arch/x86/x86/fpu.c 1.33-1.39 (patch) sys/arch/x86/x86/identcpu.c 1.72 (patch) sys/arch/x86/x86/vm_machdep.c 1.34 (patch) sys/arch/x86/x86/x86_machdep.c 1.116,1.117 (patch) Support eager fpu switch, to work around INTEL-SA-00145. Provide a sysctl machdep.fpu_eager, which gets automatically initialized to 1 on affected CPUs.
1.6.10.3	03-Dec-2017	jdolecek	update from HEAD
1.6.10.2	20-Aug-2014	tls	Rebase to HEAD as of a few days ago.
1.6.10.1	25-Feb-2014	tls	file fpu.h was added on branch tls-maxphys on 2014-08-20 00:03:29 +0000
1.6.6.2	22-May-2014	yamt	sync with head. for a reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
1.6.6.1	25-Feb-2014	yamt	file fpu.h was added on branch yamt-pagecache on 2014-05-22 11:40:13 +0000
1.6.4.2	18-May-2014	rmind	sync with head
1.6.4.1	25-Feb-2014	rmind	file fpu.h was added on branch rmind-smpnet on 2014-05-18 17:45:30 +0000
1.7.2.3	26-Jan-2019	pgoyette	Sync with HEAD
1.7.2.2	20-Oct-2018	pgoyette	Sync with head
1.7.2.1	25-Jun-2018	pgoyette	Sync with HEAD
1.12.2.3	13-Apr-2020	martin	Mostly merge changes from HEAD up to 20200411
1.12.2.2	08-Apr-2020	martin	Merge changes from current as of 20200406
1.12.2.1	10-Jun-2019	christos	Sync with HEAD
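
The layout difference described in revision 1.23 can be illustrated with a minimal C sketch. The union name fp_addr_sketch, the inner member names fa_off, fa_seg and fa_rsvd, and both helper functions are assumptions made for illustration; only the fa_64 and fa_32 view names come from the log entry, and this is not the kernel's actual definition.

```c
#include <stdint.h>

/*
 * Illustrative stand-in for the union described in revision 1.23.
 * Member names inside fa_32 are assumed, not copied from the kernel.
 */
union fp_addr_sketch {
	uint64_t fa_64;			/* 64-bit variant: full FIP/FDP */
	struct {
		uint32_t fa_off;	/* legacy variant: 32-bit offset */
		uint16_t fa_seg;	/* 16-bit segment */
		uint16_t fa_rsvd;	/* 16-bit reserved field */
	} fa_32;
};

/* Both views occupy the same 8 bytes of the save area. */
_Static_assert(sizeof(union fp_addr_sketch) == 8, "views must overlap exactly");

/* Hypothetical helper: record a pointer the way the legacy layout can hold it. */
static union fp_addr_sketch
fp_addr_store_legacy(uint64_t addr, uint16_t seg)
{
	union fp_addr_sketch a;

	a.fa_32.fa_off = (uint32_t)addr;	/* upper 32 bits are dropped here */
	a.fa_32.fa_seg = seg;
	a.fa_32.fa_rsvd = 0;
	return a;
}

/* Hypothetical helper: nonzero if the legacy layout would lose part of addr. */
static int
fp_addr_is_truncated(uint64_t addr)
{
	return (uint64_t)(uint32_t)addr != addr;
}
```

An address such as 0x00007f7fdeadbeef does not fit in the 32-bit offset, so the legacy view keeps only 0xdeadbeef, while the fa_64 view holds the value whole.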
