History log of /src/sys/arch/amd64/include/pcb.h |
Revision | | Date | Author | Comments |
1.35 |
| 28-Apr-2025 |
riastradh | xen: Stop-gap FPU PCB fix; disable Intel AMX for now.
Since the custom cpu_uarea_alloc/free are disabled under XENPV, nothing would initialize struct pcb::pcb_savefpu to point either to struct pcb::pcb_savefpusmall, or to a separately allocated large area on machines with Intel AMX TILECFG/TILEDATA requiring it. So the memset in fpu_lwp_fork would crash on null pointer dereference:
[ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e
[ 1.0000030] fatal page fault in supervisor mode
[ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38
[ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0
kernel: page fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi)
memset() at netbsd:memset+0x2c
lwp_create() at netbsd:lwp_create+0x2f1
fork1() at netbsd:fork1+0x42c
main() at netbsd:main+0x44f
In order to support Intel AMX TILECFG/TILEDATA, or any other CPU extensions that increase the XSAVE area beyond what fits in a single page after struct pcb, we would need to enable the custom cpu_uarea_alloc/free. Currently that would imply allocating stack guard pages (`redzone') under XENPV; if there's some reason the stack guard pages don't work, we could also push #ifdef XENPV conditionals into cpu_uarea_alloc/free to cover the guard pages -- to be considered.
PR kern/59371: Xen domU uvm_fault since FPU state allocation patch
PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu
|
1.34 |
| 24-Apr-2025 |
kre | offsetof() needs <stddef.h> (<sys/stddef.h>)
Include <sys/stddef.h> when offsetof() is to be used.
First step in fixing x86 builds.
|
1.33 |
| 24-Apr-2025 |
riastradh | amd64: Allocate FPU save state outside pcb if it's too large.
We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte state.
We only do this for user threads, and only on machines where it's necessary, to avoid incurring much overhead. There is still a tiny bit of overhead when saving and restoring the FPU state by using a pointer indirection instead of arithmetic indirection for access to struct pcb::pcb_savefpu, but this is probably a drop in the bucket compared to the memory traffic incurred by the FPU state save/restore anyway.
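A minimal sketch of the pointer indirection described above, under stated assumptions: struct sketch_pcb, sketch_pcb_init, sketch_pcb_fini and the 576-byte inline threshold are hypothetical stand-ins, and aligned_alloc stands in for whatever allocator the kernel actually uses. It only illustrates the idea that pcb_savefpu always points at the live save area, either the inline small area or a separate larger one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline size for this sketch; not the kernel's threshold. */
#define SKETCH_SMALL_FPU_SIZE 576

/* Simplified stand-in, not the real struct pcb. */
struct sketch_pcb {
	void *pcb_savefpu;	/* always points at the live save area */
	_Alignas(64) unsigned char pcb_savefpusmall[SKETCH_SMALL_FPU_SIZE];
};

/* Point pcb_savefpu at the inline area, or at a separate allocation. */
static int
sketch_pcb_init(struct sketch_pcb *pcb, size_t fpu_save_size)
{
	assert(pcb != NULL);
	if (fpu_save_size <= sizeof(pcb->pcb_savefpusmall)) {
		/* Common case: no extra allocation needed. */
		pcb->pcb_savefpu = pcb->pcb_savefpusmall;
	} else {
		/* aligned_alloc wants a size that is a multiple of the alignment. */
		size_t rounded = (fpu_save_size + 63) & ~(size_t)63;
		pcb->pcb_savefpu = aligned_alloc(64, rounded);
		if (pcb->pcb_savefpu == NULL)
			return -1;
	}
	/* Safe only because pcb_savefpu was initialized just above. */
	memset(pcb->pcb_savefpu, 0, fpu_save_size);
	return 0;
}

/* Release the separate area, if one was allocated. */
static void
sketch_pcb_fini(struct sketch_pcb *pcb)
{
	if (pcb->pcb_savefpu != (void *)pcb->pcb_savefpusmall)
		free(pcb->pcb_savefpu);
	pcb->pcb_savefpu = NULL;
}
```

In this sketch every access goes through the pointer, so a path that skips the initialization step leaves it null and the memset faults.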
For now, these paths are mostly disabled on i386. We could enable them but it will require either rewriting cpu_uarea_alloc/free for i386, or adopting a guard page like amd64 does, which might be costly and so should be undertaken only with some thought and care. And since Intel AMX instructions only work in 64-bit mode, it's not likely to be useful on i386.
PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu
These changes, as a side effect, may fix:
PR kern/57258: kthread_fpu_enter/exit problem
by making sure to allocate an FPU save space that is large enough to guarantee fpu_kern_enter/leave work safely, instead of just using a union savefpu object on the stack (which, at 576 bytes, may be too small on some machines, particularly with AVX512 requiring ~2.5K). (But we'll have to do some extra work with kthread_fpu_enter/exit_md -- if we try doing them again on x86 -- to actually allocate the separate pcb on these machines!)
|
1.32 |
| 17-Mar-2020 |
maxv | Add a redzone between the pcb and the stack. Sent to port-amd64@.
|
1.31 |
| 12-Oct-2019 |
christos | disable CTASSERT for lint
|
1.30 |
| 12-Oct-2019 |
maxv | Rewrite the FPU code on x86. This greatly simplifies the logic and removes the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on port-amd64 a week ago.
Bump the kernel version to 9.99.16.
|
1.29 |
| 26-Jul-2018 |
maxv | Rework dbregs, to switch the registers during context switches, and not on each user->kernel transition via userret. Reloads of DR6/DR7 are expensive on both native and xen.
|
1.28 |
| 31-Dec-2017 |
maxv | branches: 1.28.2; 1.28.4; gc unused
|
1.27 |
| 31-Oct-2017 |
maxv | Don't embed our own values in the reserved fields of the XSAVE area, it really is a bad idea. Move them into the PCB.
|
1.26 |
| 23-Feb-2017 |
kamil | Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64
This interface is modeled after the FreeBSD API and its usage.
This replaces the previous watchpoint API. The previous one was introduced only recently in NetBSD-current, so all traces of it are removed without any backward compatibility.
Design choices for Debug Register accessors:
- exec() (TRAP_EXEC event) must remove debug registers from LWP
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the debug registers; to achieve this, the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring function is designed to be used
- debug register traps must generate SIGTRAP with si_code TRAP_DBREG
- debugger is responsible to retrieve debug register state to distinguish the exact debug register trap (DR6 is Status Register on x86)
- kernel must not remove debug register traps after triggering a trap event; a debugger is responsible to detach this trap with appropriate PT_SETDBREGS call (DR7 is Control Register on x86)
- debug registers must not be exposed in mcontext
- userland must not be allowed to set a trap on the kernel
Implementation notes on i386 and amd64:
- the initial state of debug register is retrieved on boot and this value is stored in a local copy (initdbregs), this value is used to initialize dbreg context after PT_GETDBREGS
- struct dbregs is stored in pcb as a pointer and by default not initialized
- reserved registers (DR4-DR5, DR9-DR15) are ignored
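To make the roles of DR7 (control) and DR6 (status) concrete, here is a small sketch of how a tracer might compose a DR7 value for one hardware watchpoint and decode which slot fired from DR6. The bit positions follow the x86 architecture definition; the enum and function names are hypothetical, invented for this sketch, and are neither NetBSD nor FreeBSD macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the encodings are the architectural RWn/LENn values. */
enum sketch_dr7_rw {
	SKETCH_RW_EXEC = 0,	/* break on instruction execution */
	SKETCH_RW_WRITE = 1,	/* break on data writes */
	SKETCH_RW_IO = 2,	/* break on I/O access (needs CR4.DE) */
	SKETCH_RW_RDWR = 3	/* break on data reads or writes */
};

enum sketch_dr7_len {
	SKETCH_LEN_1 = 0,
	SKETCH_LEN_2 = 1,
	SKETCH_LEN_8 = 2,
	SKETCH_LEN_4 = 3
};

/* Return dr7 with slot idx (0..3) locally enabled for the given access. */
static uint64_t
sketch_dr7_set(uint64_t dr7, unsigned idx, enum sketch_dr7_rw rw,
    enum sketch_dr7_len len)
{
	assert(idx < 4);
	unsigned ctl_shift = 16 + idx * 4;

	dr7 &= ~((uint64_t)0xf << ctl_shift);		/* clear RWn and LENn */
	dr7 |= (uint64_t)rw << ctl_shift;		/* RWn: bits 16+4n, 17+4n */
	dr7 |= (uint64_t)len << (ctl_shift + 2);	/* LENn: bits 18+4n, 19+4n */
	dr7 |= (uint64_t)1 << (idx * 2);		/* Ln: local enable, bit 2n */
	return dr7;
}

/* Return a bitmask (bits 0..3) of the slots DR6 reports as triggered. */
static unsigned
sketch_dr6_triggered(uint64_t dr6)
{
	return (unsigned)(dr6 & 0xf);
}
```

A tracer would place the watched address in the matching DR0-DR3 slot and pass the whole register set to the tracee with PT_SETDBREGS; that ptrace call is omitted here so the sketch stays self-contained.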
Further ideas:
- restrict this interface with securelevel
Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).
This commit enables 390 debug register ATF tests in kernel/arch/x86. All tests are passing.
This commit does not cover netbsd32 compat code. Currently another interface, PT_GET_SIGINFO/PT_SET_SIGINFO, is required in the netbsd32 compat code in order to validate PT_GETDBREGS/PT_SETDBREGS reliably.
This implementation does not cover FreeBSD specific defines in their <x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1 etc. These values tend to be reinvented by each tracer on its own. GNU Debugger (GDB) works with NetBSD debug registers after adding this patch:
--- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)
 
 #ifdef HAVE_PT_GETDBREGS
 
+#ifndef DBREG_DRX
+#define DBREG_DRX(d,x) ((d)->dr[(x)])
+#endif
+
 static unsigned long
 amd64bsd_dr_get (ptid_t ptid, int regnum)
 {
Another reason to stop introducing unpopular defines covering machine-specific register macros is that these values vary across generations of the same CPU family.
GDB demo:
(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8 printf("traceme=%d\n", traceme);
(Currently the GDB interface is not reliable due to NetBSD support bugs)
Sponsored by <The NetBSD Foundation>
|
1.25 |
| 20-Feb-2014 |
dsl | branches: 1.25.6; 1.25.10; 1.25.14; Move the amd64 and i386 pcb to the bottom of the uarea, and move the kernel stack to the top. Change the pcb layouts so that fpu save area is at the end and is 64byte aligned ready for xsave (saving the ymm registers). Welcome to 6.99.32
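The layout constraint above (FPU save area last in the pcb, 64-byte aligned) can be checked at compile time. The following is only an illustrative sketch: struct layout_pcb, its fields and the 576-byte area are hypothetical, not the actual pcb definition.

```c
#include <assert.h>	/* static_assert */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Hypothetical, much-reduced pcb; field names are for illustration only. */
struct layout_pcb {
	uint64_t pcb_rsp;
	uint64_t pcb_rbp;
	uint32_t pcb_flags;
	/* Save area goes last and is 64-byte aligned, as xsave requires. */
	_Alignas(64) unsigned char pcb_savefpu[576];
};

/* The save area must start on a 64-byte boundary within the struct. */
static_assert(offsetof(struct layout_pcb, pcb_savefpu) % 64 == 0,
    "pcb_savefpu must be 64-byte aligned");

/* Nothing may follow the save area, so it can grow without moving fields. */
static_assert(offsetof(struct layout_pcb, pcb_savefpu) +
    sizeof(((struct layout_pcb *)0)->pcb_savefpu) == sizeof(struct layout_pcb),
    "pcb_savefpu must be the last member");

/* Runtime accessor for the same offset, handy for debugging output. */
static size_t
layout_savefpu_offset(void)
{
	return offsetof(struct layout_pcb, pcb_savefpu);
}
```

Note that the member alignment only holds at run time if the enclosing object itself is placed on a 64-byte boundary.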
|
1.24 |
| 11-Feb-2014 |
dsl | Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h into sys/arch/x86 in preparation for using the same code for i386.
|
1.23 |
| 07-Feb-2014 |
dsl | Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu definitions match those of i386. Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave data - it matches the hardware definition and won't change.
|
1.22 |
| 19-Jan-2014 |
dsl | Remove the unused 'struct md_coredump'.
|
1.21 |
| 11-Dec-2013 |
dsl | Remove the fields that were used to save the i387 fp state on interrupt. They were written but never read. Possibly they should be saved for 32 bit processes, but that might be a relic from real i387 where the fpu was actually asynchronous.
|
1.20 |
| 01-Dec-2013 |
christos | revert fpu/pcu changes until we figure out what's wrong; they cause random freezes
|
1.19 |
| 23-Oct-2013 |
drochner | Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86. This reduces the amount of MD code enormously, and makes it easier to implement support for newer CPU features which require more fpu state, or for fpu usage by the kernel. For access to FPU state across CPUs, an xcall kthread is used now rather than a dedicated IPI. No user visible changes intended.
|
1.18 |
| 31-Dec-2012 |
dsl | branches: 1.18.2; Move the two fields used to save some i387 state on the last fpu trap into their own sub-structure of the pcb (from 'struct savefpu'). They only (seem) to be used in some code that generates core dumps for 32bit processes (code that might be broken as well!). 'struct savefpu' is now identical to 'struct fxsave64'. One (or both) needs extending to support AVX - might need to be dynamically sized. Removed all the __aligned(16) except for the one in struct pcb itself. Only the copy used for the fsave instruction need be aligned.
|
1.17 |
| 07-Jul-2010 |
chs | branches: 1.17.8; 1.17.18; add the guts of TLS support on amd64. based on joerg's patch, reworked by me to support 32-bit processes as well. we now keep %fs and %gs loaded with the user values while in the kernel, which means we don't need to reload them when returning to user mode.
|
1.16 |
| 27-Oct-2009 |
rmind | branches: 1.16.2; 1.16.4; Make pcb_ldt_sel, in amd64, an unused field. Unlike in i386, it was missed during clean-up of LDT handling.
|
1.15 |
| 26-Oct-2008 |
mrg | branches: 1.15.8; put the contents of these header files around #ifdef __x86_64__, and #include the <i386/foo.h> in the #else clause, making these files largely bit-size independent.
|
1.14 |
| 30-Apr-2008 |
ad | branches: 1.14.6; lcr0() was changed to take a u_long. pcb_cr0 was a 32-bit signed quantity. It was being sign extended in cpu_hatch() (CR0_PG is always set), causing systems to crash and reboot before going multiuser.
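The sign-extension hazard is easy to reproduce in isolation. The sketch below uses hypothetical helper names and fixed-width types in place of the kernel's u_long and pcb_cr0; it shows only the integer conversion, not the real lcr0() call. It also assumes the usual two's-complement wraparound when converting to int32_t.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CR0_PG 0x80000000u	/* paging enable, bit 31 of CR0 */

/* Widening from a signed 32-bit field copies bit 31 into bits 32..63. */
static uint64_t
sketch_widen_signed(int32_t cr0)
{
	return (uint64_t)(int64_t)cr0;
}

/* Widening from an unsigned 32-bit field zero-fills the upper half. */
static uint64_t
sketch_widen_unsigned(uint32_t cr0)
{
	return (uint64_t)cr0;
}

/* Self-check: with CR0_PG set, only the signed path corrupts the value. */
static void
sketch_cr0_selfcheck(void)
{
	uint32_t cr0 = SKETCH_CR0_PG | 0x11;

	assert(sketch_widen_unsigned(cr0) == 0x0000000080000011ull);
	assert(sketch_widen_signed((int32_t)cr0) == 0xffffffff80000011ull);
}
```

Because CR0_PG occupies bit 31, any value with paging enabled is negative when stored as a signed 32-bit quantity, so the widened value has the upper 32 bits set.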
|
1.13 |
| 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
1.12 |
| 16-Apr-2008 |
cegger | branches: 1.12.2; 1.12.4; use POSIX integer types
|
1.11 |
| 05-Jan-2008 |
yamt | branches: 1.11.6;
- make amd64 use per-cpu tss.
- fix iopl syscall for amd64+xen.
|
1.10 |
| 27-Nov-2007 |
christos | branches: 1.10.6; Shuffle things around so that pcb_savefpu goes back to being aligned on a 16-byte boundary. Noted by Arto Huusko.
|
1.9 |
| 26-Nov-2007 |
christos | make cr2 64 bits. Requested by fvdl.
|
1.8 |
| 24-Nov-2007 |
christos | preserve cr2 on pcb for the benefit of linux emulation.
|
1.7 |
| 17-Oct-2007 |
garbled | branches: 1.7.2; Merge the ppcoea-renovation branch to HEAD.
This branch was a major cleanup and rototill of many of the various OEA cpu based PPC ports that focused on sharing as much code as possible between the various ports to eliminate near-identical copies of files in every tree. Additionally there is a new PIC system that unifies the interface to interrupt code for all different OEA ppc arches. The work for this branch was done by a variety of people, too long to list here.
TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated.
amigappc was not attempted.
NOTES: pmppc was removed as an arch, and moved to a evbppc target.
|
1.6 |
| 17-May-2007 |
yamt | branches: 1.6.8; 1.6.10; merge yamt-idlelwp branch. asked by core@. some ports still need work.
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling. (cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
|
1.5 |
| 04-Mar-2007 |
christos | branches: 1.5.2; 1.5.4; 1.5.10; Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
|
1.4 |
| 11-Dec-2005 |
christos | branches: 1.4.26; merge ktrace-lwp.
|
1.3 |
| 15-May-2005 |
fvdl | branches: 1.3.2; Optionally include saving and restoring the 64bit %gs and %fs base register values in the PCB. Do this in pmap_activate for now (XXX not a good place for it, but a convenient one).
|
1.2 |
| 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
|
1.1 |
| 26-Apr-2003 |
fvdl | branches: 1.1.2; Rename the x86_64 port to amd64, as this is the actual name used for the processor family now. x86_64 is kept as the MACHINE_ARCH value, since it's already widely used (by e.g. the toolchain, etc), and by other operating systems.
|
1.1.2.4 |
| 10-Nov-2005 |
skrll | Sync with HEAD. Here we go again...
|
1.1.2.3 |
| 21-Sep-2004 |
skrll | Fix the sync with head I botched.
|
1.1.2.2 |
| 18-Sep-2004 |
skrll | Sync with HEAD.
|
1.1.2.1 |
| 03-Aug-2004 |
skrll | Sync with HEAD
|
1.3.2.3 |
| 21-Jan-2008 |
yamt | sync with head
|
1.3.2.2 |
| 07-Dec-2007 |
yamt | sync with head
|
1.3.2.1 |
| 03-Sep-2007 |
yamt | sync with head.
|
1.4.26.2 |
| 12-Mar-2007 |
rmind | Sync with HEAD.
|
1.4.26.1 |
| 03-Mar-2007 |
yamt | adapt amd64.
XXX changes in identcpu.c are the minimum for MONITOR.
XXX identcpu.c should be shared with i386.
|
1.5.10.1 |
| 22-May-2007 |
matt | Update to HEAD.
|
1.5.4.1 |
| 11-Jul-2007 |
mjf | Sync with head.
|
1.5.2.2 |
| 03-Dec-2007 |
ad | Sync with HEAD.
|
1.5.2.1 |
| 27-May-2007 |
ad | Sync with head.
|
1.6.10.2 |
| 09-Jan-2008 |
matt | sync with HEAD
|
1.6.10.1 |
| 06-Nov-2007 |
matt | sync with HEAD
|
1.6.8.1 |
| 27-Nov-2007 |
joerg | Sync with HEAD. amd64 Xen support needs testing.
|
1.7.2.2 |
| 18-Feb-2008 |
mjf | Sync with HEAD.
|
1.7.2.1 |
| 08-Dec-2007 |
mjf | Sync with HEAD.
|
1.10.6.1 |
| 08-Jan-2008 |
bouyer | Sync with HEAD
|
1.11.6.2 |
| 17-Jan-2009 |
mjf | Sync with HEAD.
|
1.11.6.1 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.12.4.4 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.12.4.3 |
| 11-Mar-2010 |
yamt | sync with head
|
1.12.4.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.12.4.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.12.2.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.14.6.1 |
| 13-Dec-2008 |
haad | Update haad-dm branch to haad-dm-base2.
|
1.15.8.2 |
| 24-Oct-2010 |
jym | Sync with HEAD
|
1.15.8.1 |
| 01-Nov-2009 |
jym | Sync with HEAD.
|
1.16.4.1 |
| 05-Mar-2011 |
rmind | sync with head
|
1.16.2.1 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.17.18.3 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.17.18.2 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.17.18.1 |
| 25-Feb-2013 |
tls | resync with head
|
1.17.8.2 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.17.8.1 |
| 23-Jan-2013 |
yamt | sync with head
|
1.18.2.1 |
| 18-May-2014 |
rmind | sync with head
|
1.25.14.1 |
| 21-Apr-2017 |
bouyer | Sync with HEAD
|
1.25.10.1 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|
1.25.6.1 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.28.4.3 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.28.4.2 |
| 08-Apr-2020 |
martin | Merge changes from current as of 20200406
|
1.28.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.28.2.1 |
| 28-Jul-2018 |
pgoyette | Sync with HEAD
|