History log of /src/sys/arch/x86/include/cpu_extended_state.h
Revision  Date  Author  Comments
 1.19  24-Apr-2025  riastradh amd64: Allocate FPU save state outside pcb if it's too large.

We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
state.

We only do this for user threads, and only on machines where it's
necessary, to avoid incurring much overhead. There is still a tiny
bit of overhead when saving and restoring the FPU state by using a
pointer indirection instead of arithmetic indirection for access to
struct pcb::pcb_savefpu, but this is probably a drop in the bucket
compared to the memory traffic incurred by the FPU state save/restore
anyway.

For now, these paths are mostly disabled on i386. We could enable
them but it will require either rewriting cpu_uarea_alloc/free for
i386, or adopting a guard page like amd64 does, which might be costly
and so should be undertaken only with some thought and care. And
since Intel AMX instructions only work in 64-bit mode, it's not
likely to be useful on i386.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu

These changes, as a side effect, may fix:

PR kern/57258: kthread_fpu_enter/exit problem

by making sure to allocate an FPU save space that is large enough to
guarantee fpu_kern_enter/leave work safely, instead of just using a
union savefpu object on the stack (which, at 576 bytes, may be too
small on some machines, particularly with AVX512 requiring ~2.5K).
(But we'll have to do some extra work with kthread_fpu_enter/exit_md
-- if we try doing them again on x86 -- to actually allocate the
separate pcb on these machines!)
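
The trade-off described in the 1.19 message (embed the save area when it fits, otherwise reach it through a pointer) can be sketched in userland C. This is only an illustration, not the kernel code: `struct toy_pcb`, `toy_pcb_init`, `toy_pcb_fini`, and `toy_save_is_external` are hypothetical names, the 576-byte embedded size is borrowed from the `union savefpu` figure quoted above, and `aligned_alloc` stands in for whatever allocator the kernel actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes: 576 is the union savefpu size quoted in the log. */
#define TOY_EMBEDDED_SAVE_SIZE	576
#define TOY_XSAVE_ALIGN		64	/* XSAVE areas must be 64-byte aligned */

/* Toy stand-in for a pcb; not the real struct pcb layout. */
struct toy_pcb {
	void *savefpu;	/* points at embedded[] or at a separate block */
	_Alignas(TOY_XSAVE_ALIGN) unsigned char embedded[TOY_EMBEDDED_SAVE_SIZE];
};

/* True if the save area lives outside the toy pcb. */
static bool
toy_save_is_external(const struct toy_pcb *pcb)
{
	return pcb->savefpu != (const void *)pcb->embedded;
}

/*
 * Pick the save area for a given save size (for example the value
 * reported by CPUID[EAX=0x0d, ECX=0].ECX).  Returns 0 on success,
 * -1 if the separate allocation fails.
 */
static int
toy_pcb_init(struct toy_pcb *pcb, size_t save_size)
{
	size_t rounded;

	assert(save_size > 0);
	if (save_size <= sizeof(pcb->embedded)) {
		/* Common case: no extra allocation, only the indirection. */
		pcb->savefpu = pcb->embedded;
		return 0;
	}
	/* aligned_alloc wants a size that is a multiple of the alignment. */
	rounded = (save_size + TOY_XSAVE_ALIGN - 1) &
	    ~(size_t)(TOY_XSAVE_ALIGN - 1);
	pcb->savefpu = aligned_alloc(TOY_XSAVE_ALIGN, rounded);
	return pcb->savefpu == NULL ? -1 : 0;
}

/* Release the separate block, if there is one. */
static void
toy_pcb_fini(struct toy_pcb *pcb)
{
	if (toy_save_is_external(pcb))
		free(pcb->savefpu);
	pcb->savefpu = NULL;
}
```

With these toy sizes, a 512-byte save area stays inside the structure, while the 11008-byte case mentioned above is allocated separately; callers always go through `pcb->savefpu` either way.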
 1.18  25-Feb-2023  riastradh branches: 1.18.6;
x86: Mitigate MXCSR Configuration Dependent Timing in kernel FPU use.

In fpu_kern_enter, make sure all the MXCSR exception status bits are
set when we start using the FPU, so that instructions which exhibit
MCDT are unaffected by it.

While here, zero all the other FPU registers in fpu_kern_enter.

In principle we could skip this step on future CPUs that fix the MCDT
bug, but there's probably not much benefit -- workloads that do a lot
of crypto in the kernel are probably better off using
kthread_fpu_enter or WQ_FPU to skip the fpu_kern_enter/leave cycles
in the first place.

For details, see:
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/mxcsr-configuration-dependent-timing.html
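
As a rough illustration of "all the MXCSR exception status bits are set", the bit arithmetic looks like the following. This is a sketch, not the code in fpu.c: the macro and function names are hypothetical, and it only computes the value; loading it into the register (for example with `ldmxcsr`) is left out. The six exception status flags occupy bits 0 through 5 of MXCSR, and 0x1f80 is the architectural reset value with all exceptions masked.

```c
#include <assert.h>
#include <stdint.h>

/* MXCSR exception status flags IE, DE, ZE, OE, UE, PE (bits 0-5). */
#define TOY_MXCSR_STATUS_BITS	0x003fu
/* Architectural reset value: all exceptions masked, no flags set. */
#define TOY_MXCSR_RESET		0x1f80u

/*
 * Return the given MXCSR value with every exception status flag set,
 * leaving the mask and rounding-control bits untouched.
 */
static uint32_t
toy_mxcsr_set_status_bits(uint32_t mxcsr)
{
	/* Bits 16-31 of MXCSR are reserved and must be zero. */
	assert((mxcsr & 0xffff0000u) == 0);
	return mxcsr | TOY_MXCSR_STATUS_BITS;
}
```

Starting from the reset value, `toy_mxcsr_set_status_bits(TOY_MXCSR_RESET)` yields 0x1fbf.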
 1.17  26-Jun-2019  mgorny branches: 1.17.28;
Implement PT_GETXSTATE and PT_SETXSTATE

Introduce two new ptrace() requests: PT_GETXSTATE and PT_SETXSTATE,
that provide access to the extended (and extensible) set of FPU
registers on amd64 and i386. At the moment, this covers AVX (YMM)
and AVX-512 (ZMM, opmask) registers. It can be easily extended
to cover further register types without breaking backwards
compatibility.

PT_GETXSTATE issues the XSAVE instruction with all kernel-supported
extended components enabled. The data is copied into 'struct xstate'
(which -- unlike the XSAVE area itself -- has stable format
and offsets).

PT_SETXSTATE issues the XRSTOR instruction to restore the register
values from user-provided 'struct xstate'. The function replaces only
the specific XSAVE components that are listed in 'xs_rfbm' field,
making it possible to issue partial updates.

Both syscalls take a 'struct iovec' pointer rather than a direct
argument. This requires the caller to explicitly specify the buffer
size. As a result, existing code will continue to work correctly
when the structure is extended (performing partial reads/updates).
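
The forward-compatibility argument for the `struct iovec` interface can be sketched as a length-bounded copy. This is an illustrative model only: `toy_xstate_v1`, `toy_xstate_v2`, and `toy_getxstate` are hypothetical, the field layout is invented apart from the `xs_rfbm` name, and the kernel's real handling is not shown in this log. The assumption is that a newer structure only appends fields, so an older caller passing a shorter buffer receives a valid prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical "old" layout known to an existing debugger. */
struct toy_xstate_v1 {
	uint64_t xs_rfbm;	/* bitmap of components present */
	uint64_t ymm_hi[4];	/* toy payload */
};

/* Hypothetical "new" layout: same prefix, fields appended at the end. */
struct toy_xstate_v2 {
	uint64_t xs_rfbm;
	uint64_t ymm_hi[4];
	uint64_t extra[8];	/* added later */
};

/*
 * Copy out as much of the current state as the caller's buffer holds.
 * Returns the number of bytes copied.
 */
static size_t
toy_getxstate(const struct toy_xstate_v2 *state, const struct iovec *iov)
{
	size_t n = iov->iov_len < sizeof(*state) ? iov->iov_len : sizeof(*state);

	memcpy(iov->iov_base, state, n);
	return n;
}
```

A caller built against the old layout sets `iov_len` to `sizeof(struct toy_xstate_v1)` and gets back exactly that many bytes, with the shared fields intact.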
 1.16  23-May-2018  maxv branches: 1.16.2;
Clean up the FPU headers.
 1.15  08-Nov-2017  maxv branches: 1.15.2;
remove vestige
 1.14  31-Oct-2017  maxv Remove outdated comment.
 1.13  31-Oct-2017  maxv Don't embed our own values in the reserved fields of the XSAVE area, it
really is a bad idea. Move them into the PCB.
 1.12  31-Oct-2017  maxv Add xsh_xcomp_bv and fx_zero, and use uint8_t instead.
 1.11  10-Aug-2017  maxv Remove the svr4/ibcs2 fpu flags.
 1.10  18-Aug-2016  maxv KNF and simplify.
 1.9  25-Feb-2014  dsl branches: 1.9.4; 1.9.6; 1.9.10; 1.9.12;
Add support for saving the AVX-256 ymm registers during FPU context switches.
Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all CPUs; I'm not brave enough to enable XSAVEOPT.
 1.8  18-Feb-2014  dsl It seems that firefox includes machine/fpu.h on amd64.
Add the file back so that the firefox source doesn't have to depend
on the version of netbsd it is being compiled for.
(The i386 version doesn't play the same games in its SIGFPE handler.)
 1.7  15-Feb-2014  dsl Remove all references to MDL_USEDFPU and deferred fpu initialisation.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
 1.6  13-Feb-2014  dsl Check the argument types for the fpu asm functions.
 1.5  12-Feb-2014  dsl Change i386 to use x86/fpu.c instead of i386/isa/npx.c
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
 1.4  09-Feb-2014  dsl Add compatibility for some userspace code (eg firefox) that seems to look
inside the ucontext structure passed to signal handlers to modify the
xmm registers.
This should make the code compile - I'm not at all sure it works as expected,
the interactions between FP and signal handlers aren't at all clear.
AFAICT the FP state is saved on the user stack when the handler is called,
however the FP trap code may already have done odd things to the FPU....
 1.3  08-Feb-2014  dsl Add bit defs for more of the x87 status register.
 1.2  07-Feb-2014  dsl Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
 1.1  07-Feb-2014  dsl Move all the hardware register layout for the x86 cpus into a header
that can also be used by amd64.
Add in skeleton definitions for XSAVE and AVX.
Update some comments to match reality.
 1.9.12.2  28-Aug-2017  skrll Sync with HEAD
 1.9.12.1  05-Oct-2016  skrll Sync with HEAD
 1.9.10.3  03-Dec-2017  jdolecek update from HEAD
 1.9.10.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.9.10.1  25-Feb-2014  tls file cpu_extended_state.h was added on branch tls-maxphys on 2014-08-20 00:03:29 +0000
 1.9.6.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.9.6.1  25-Feb-2014  yamt file cpu_extended_state.h was added on branch yamt-pagecache on 2014-05-22 11:40:13 +0000
 1.9.4.2  18-May-2014  rmind sync with head
 1.9.4.1  25-Feb-2014  rmind file cpu_extended_state.h was added on branch rmind-smpnet on 2014-05-18 17:45:30 +0000
 1.15.2.1  25-Jun-2018  pgoyette Sync with HEAD
 1.16.2.1  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.17.28.1  25-Jul-2023  martin Pull up following revision(s) (requested by riastradh in ticket #244):

sys/arch/x86/x86/fpu.c: revision 1.80
sys/arch/x86/include/cpu_extended_state.h: revision 1.18

x86: Mitigate MXCSR Configuration Dependent Timing in kernel FPU use.

In fpu_kern_enter, make sure all the MXCSR exception status bits are
set when we start using the FPU, so that instructions which exhibit
MCDT are unaffected by it.

While here, zero all the other FPU registers in fpu_kern_enter.
In principle we could skip this step on future CPUs that fix the MCDT
bug, but there's probably not much benefit -- workloads that do a lot
of crypto in the kernel are probably better off using
kthread_fpu_enter or WQ_FPU to skip the fpu_kern_enter/leave cycles
in the first place.

For details, see:
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/mxcsr-configuration-dependent-timing.html
 1.18.6.1  02-Aug-2025  perseant Sync with HEAD