History log of /src/sys/arch/amd64/include/pcb.h
Revision	Date	Author	Comments
1.35	28-Apr-2025	riastradh	xen: Stop-gap FPU PCB fix; disable Intel AMX for now. Since the custom cpu_uarea_alloc/free are disabled under XENPV, nothing would initialize struct pcb::pcb_savefpu to point either to struct pcb::pcb_savefpusmall, or to a separately allocated large area on machines with Intel AMX TILECFG/TILEDATA requiring it. So the memset in fpu_lwp_fork would crash on null pointer dereference: [ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e [ 1.0000030] fatal page fault in supervisor mode [ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38 [ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0 kernel: page fault trap, code=0 Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi) memset() at netbsd:memset+0x2c lwp_create() at netbsd:lwp_create+0x2f1 fork1() at netbsd:fork1+0x42c main() at netbsd:main+0x44f In order to support Intel AMX TILECFG/TILEDATA, or any other CPU extensions that increase the XSAVE area beyond what fits in a single page after struct pcb, we would need to enable the custom cpu_uarea_alloc/free. Currently that would imply allocating stack guard pages (`redzone') under XENPV; if there's some reason the stack guard pages don't work, we could also push #ifdef XENPV conditionals into cpu_uarea_alloc/free to cover the guard pages -- to be considered. PR kern/59371: Xen domU uvm_fault since FPU state allocation patch PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu
1.34	24-Apr-2025	kre	offsetof() needs <stddef.h> (<sys/stddef.h>) Include <sys/stddef.h> when offsetof() is to be used. First step in fixing x86 builds.
1.33	24-Apr-2025	riastradh	amd64: Allocate FPU save state outside pcb if it's too large. We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte state. We only do this for user threads, and only on machines where it's necessary, to avoid incurring much overhead. There is still a tiny bit of overhead when saving and restoring the FPU state by using a pointer indirection instead of arithmetic indirection for access to struct pcb::pcb_savefpu, but this is probably a drop in the bucket compared to the memory traffic incurred by the FPU state save/restore anyway. For now, these paths are mostly disabled on i386. We could enable them but it will require either rewriting cpu_uarea_alloc/free for i386, or adopting a guard page like amd64 does, which might be costly and so should be undertaken only with some thought and care. And since Intel AMX instructions only work in 64-bit mode, it's not likely to be useful on i386. PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu These changes, as a side effect, may fix: PR kern/57258: kthread_fpu_enter/exit problem by making sure to allocate an FPU save space that is large enough to guarantee fpu_kern_enter/leave work safely, instead of just using a union savefpu object on the stack (which, at 576 bytes, may be too small on some machines, particularly with AVX512 requiring ~2.5K). (But we'll have to do some extra work with kthread_fpu_enter/exit_md -- if we try doing them again on x86 -- to actually allocate the separate pcb on these machines!)
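The pointer indirection described in revisions 1.33 and 1.35 can be pictured with a small userland sketch. It is not the kernel code: `struct sketch_pcb`, `sketch_pcb_init`, `sketch_pcb_fini`, and the 576-byte inline capacity are hypothetical, and `aligned_alloc` stands in for whatever the kernel's uarea allocator actually does. Only the idea is taken from the log: a save-area pointer that targets either a small embedded buffer or a separately allocated large one, and that must be initialized before anything calls memset through it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity; the real layout lives in pcb.h. */
#define SKETCH_SMALL_FPU_SIZE 576

/* Simplified stand-in for struct pcb (assumption, not the real struct). */
struct sketch_pcb {
	void *savefpu;	/* points at savefpusmall or at a separate area */
	_Alignas(64) unsigned char savefpusmall[SKETCH_SMALL_FPU_SIZE];
};

/*
 * Point savefpu at the embedded buffer when the save size fits,
 * otherwise at a separately allocated 64-byte-aligned area.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
sketch_pcb_init(struct sketch_pcb *pcb, size_t fpu_save_size)
{
	if (fpu_save_size <= sizeof(pcb->savefpusmall)) {
		pcb->savefpu = pcb->savefpusmall;
	} else {
		/* aligned_alloc wants a size that is a multiple of 64. */
		size_t sz = (fpu_save_size + 63) & ~(size_t)63;

		pcb->savefpu = aligned_alloc(64, sz);
		if (pcb->savefpu == NULL)
			return -1;
	}
	/* Safe only because savefpu was set above; a null pointer here
	 * is the kind of crash that revision 1.35 describes. */
	memset(pcb->savefpu, 0, fpu_save_size);
	return 0;
}

/* Release the separate area, if any, and clear the pointer. */
static void
sketch_pcb_fini(struct sketch_pcb *pcb)
{
	if (pcb->savefpu != (void *)pcb->savefpusmall)
		free(pcb->savefpu);
	pcb->savefpu = NULL;
}
```

In this sketch a 512-byte save size stays inside the structure, while the 11008-byte size quoted in revision 1.33 takes the separately allocated path.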
1.32	17-Mar-2020	maxv	Add a redzone between the pcb and the stack. Sent to port-amd64@.
1.31	12-Oct-2019	christos	disable CTASSERT for lint
1.30	12-Oct-2019	maxv	Rewrite the FPU code on x86. This greatly simplifies the logic and removes the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on port-amd64 a week ago. Bump the kernel version to 9.99.16.
1.29	26-Jul-2018	maxv	Rework dbregs, to switch the registers during context switches, and not on each user->kernel transition via userret. Reloads of DR6/DR7 are expensive on both native and xen.
1.28	31-Dec-2017	maxv	branches: 1.28.2; 1.28.4; gc unused
1.27	31-Oct-2017	maxv	Don't embed our own values in the reserved fields of the XSAVE area, it really is a bad idea. Move them into the PCB.
1.26	23-Feb-2017	kamil	Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64. This interface is modeled after FreeBSD API with the usage. This replaced previous watchpoint API. The previous one was introduced recently in NetBSD-current and remove its spurs without any backward-compatibility. Design choices for Debug Register accessors: - exec() (TRAP_EXEC event) must remove debug registers from LWP - debug registers are only per-LWP, not per-process globally - debug registers must not be inherited after (v)forking a process - debug registers must not be inherited after forking a thread - a debugger is responsible to set global watchpoints/breakpoints with the debug registers, to achieve this PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event monitoring function is designed to be used - debug register traps must generate SIGTRAP with si_code TRAP_DBREG - debugger is responsible to retrieve debug register state to distinguish the exact debug register trap (DR6 is Status Register on x86) - kernel must not remove debug register traps after triggering a trap event; a debugger is responsible to detach this trap with appropriate PT_SETDBREGS call (DR7 is Control Register on x86) - debug registers must not be exposed in mcontext - userland must not be allowed to set a trap on the kernel Implementation notes on i386 and amd64: - the initial state of debug register is retrieved on boot and this value is stored in a local copy (initdbregs), this value is used to initialize dbreg context after PT_GETDBREGS - struct dbregs is stored in pcb as a pointer and by default not initialized - reserved registers (DR4-DR5, DR9-DR15) are ignored Further ideas: - restrict this interface with securelevel Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7). This commit enables 390 debug register ATF tests in kernel/arch/x86. All tests are passing. This commit does not cover netbsd32 compat code. Currently other interface PT_GET_SIGINFO/PT_SET_SIGINFO is required in netbsd32 compat code in order to validate reliably PT_GETDBREGS/PT_SETDBREGS. This implementation does not cover FreeBSD specific defines in their <x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1 etc. These values tend to be reinvented by each tracer on its own. GNU Debugger (GDB) works with NetBSD debug registers after adding this patch: --- gdb/amd64bsd-nat.c.orig 2016-02-10 03:19:39.000000000 +0000 +++ gdb/amd64bsd-nat.c @@ -167,6 +167,10 @@ amd64bsd_target (void) #ifdef HAVE_PT_GETDBREGS +#ifndef DBREG_DRX +#define DBREG_DRX(d,x) ((d)->dr[(x)]) +#endif + static unsigned long amd64bsd_dr_get (ptid_t ptid, int regnum) { Another reason to stop introducing unpopular defines covering machine specific register macros is that these values vary across generations of the same CPU family. GDB demo: (gdb) c Continuing. Watchpoint 2: traceme Old value = 0 New value = 16 main (argc=1, argv=0x7f7fff79fe30) at test.c:8 8 printf("traceme=%d\n", traceme); (Currently the GDB interface is not reliable due to NetBSD support bugs) Sponsored by <The NetBSD Foundation>
1.25	20-Feb-2014	dsl	branches: 1.25.6; 1.25.10; 1.25.14; Move the amd64 and i386 pcb to the bottom of the uarea, and move the kernel stack to the top. Change the pcb layouts so that fpu save area is at the end and is 64byte aligned ready for xsave (saving the ymm registers). Welcome to 6.99.32
1.24	11-Feb-2014	dsl	Move sys/arch/amd64/amd64/fpu.c and sys/arch/amd64/include/fpu.h into sys/arch/x86 in preparation for using the same code for i386.
1.23	07-Feb-2014	dsl	Convert the amd64 build to use x86/cpu_extended_state.h so that the fpu definitions match those of i386. Mostly just structure and field renames, in addition: 1) process_xmm_to_s87() and process_s87_to_xmm() moved into x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code. 2) The linux signal code simplified to use a structure copy for the fxsave data - it matches the hardware definition and won't change.
1.22	19-Jan-2014	dsl	Remove the unused 'struct md_coredump'.
1.21	11-Dec-2013	dsl	Remove the fields that were used to save the i387 fp state on interrupt. They were written but never read. Possibly they should be saved for 32 bit processes, but that might be a relic from real i387 where the fpu was actually asynchronous.
1.20	01-Dec-2013	christos	revert fpu/pcu changes until we figure out what's wrong; they cause random freezes
1.19	23-Oct-2013	drochner	Use the MI "pcu" framework for bookkeeping of npx/fpu states on x86. This reduces the amount of MD code enormously, and makes it easier to implement support for newer CPU features which require more fpu state, or for fpu usage by the kernel. For access to FPU state across CPUs, an xcall kthread is used now rather than a dedicated IPI. No user visible changes intended.
1.18	31-Dec-2012	dsl	branches: 1.18.2; Move the two fields used to save some i387 state on the last fpu trap into their own sub-structure of the pcb (from 'struct savefpu'). They only (seem) to be used in some code that generates core dumps for 32bit processes (code that might be broken as well!). 'struct savefpu' is now identical to 'struct fxsave64'. One (or both) needs extending to support AVX - might need to be dynamically sized. Removed all the __aligned(16) except for the one in struct pcb itself. Only the copy used for the fsave instruction need be aligned.
1.17	07-Jul-2010	chs	branches: 1.17.8; 1.17.18; add the guts of TLS support on amd64. based on joerg's patch, reworked by me to support 32-bit processes as well. we now keep %fs and %gs loaded with the user values while in the kernel, which means we don't need to reload them when returning to user mode.
1.16	27-Oct-2009	rmind	branches: 1.16.2; 1.16.4; Make pcb_ldt_sel, in amd64, an unused field. Unlike in i386, it was missed during clean-up of LDT handling.
1.15	26-Oct-2008	mrg	branches: 1.15.8; put the contents of these header files around #ifdef __x86_64__, and #include the <i386/foo.h> in the #else clause, making these files largely bit-size independent.
1.14	30-Apr-2008	ad	branches: 1.14.6; lcr0() was changed to take a u_long. pcb_cr0 was a 32-bit signed quantity. It was being sign extended in cpu_hatch() (CR0_PG is always set), causing systems to crash and reboot before going multiuser.
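The sign-extension problem in revision 1.14 can be reproduced in isolation. The sketch below is an illustration, not the kernel's code: `widen_signed`, `widen_unsigned`, and `has_high_bits` are hypothetical helpers. It relies on CR0.PG being bit 31 of CR0, so a signed 32-bit field holding a CR0 value with paging enabled is negative, and widening it to 64 bits sets all of the upper 32 bits.

```c
#include <stdint.h>

/* CR0.PG (paging enable) is bit 31 of CR0. */
#define SKETCH_CR0_PG UINT32_C(0x80000000)

/* Widening a signed 32-bit value to 64 bits replicates the sign bit. */
static inline uint64_t
widen_signed(int32_t cr0)
{
	return (uint64_t)cr0;
}

/* Widening an unsigned 32-bit value zero-fills the upper half. */
static inline uint64_t
widen_unsigned(uint32_t cr0)
{
	return (uint64_t)cr0;
}

/* Nonzero if any of bits 63:32 are set. */
static inline int
has_high_bits(uint64_t v)
{
	return (v >> 32) != 0;
}
```

With bit 31 set, `widen_signed` yields 0xffffffff80000000 while `widen_unsigned` yields 0x0000000080000000. Only the second form leaves the upper half of the register value clear, which is why the type of the stored field matters once lcr0() takes a u_long.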
1.13	28-Apr-2008	martin	Remove clause 3 and 4 from TNF licenses
1.12	16-Apr-2008	cegger	branches: 1.12.2; 1.12.4; use POSIX integer types
1.11	05-Jan-2008	yamt	branches: 1.11.6; - make amd64 use per-cpu tss. - fix iopl syscall for amd64+xen.
1.10	27-Nov-2007	christos	branches: 1.10.6; Shuffle things around so that pcb_savefpu goes back to being aligned on a 16-byte boundary. Noted by Arto Huusko.
1.9	26-Nov-2007	christos	make cr2 64 bits. Requested by fvdl.
1.8	24-Nov-2007	christos	preserve cr2 on pcb for the benefit of linux emulation.
1.7	17-Oct-2007	garbled	branches: 1.7.2; Merge the ppcoea-renovation branch to HEAD. This branch was a major cleanup and rototill of many of the various OEA cpu based PPC ports that focused on sharing as much code as possible between the various ports to eliminate near-identical copies of files in every tree. Additionally there is a new PIC system that unifies the interface to interrupt code for all different OEA ppc arches. The work for this branch was done by a variety of people, too long to list here. TODO: bebox still needs work to complete the transition to -renovation. ofppc still needs a bunch of work, which I will be looking at. ev64260 still needs to be renovated amigappc was not attempted. NOTES: pmppc was removed as an arch, and moved to a evbppc target.
1.6	17-May-2007	yamt	branches: 1.6.8; 1.6.10; merge yamt-idlelwp branch. asked by core@. some ports still needs work. from doc/BRANCHES: idle lwp, and some changes depending on it. 1. separate context switching and thread scheduling. (cf. gmcgarry_ctxsw) 2. implement idle lwp. 3. clean up related MD/MI interfaces. 4. make scheduler(s) modular.
1.5	04-Mar-2007	christos	branches: 1.5.2; 1.5.4; 1.5.10; Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
1.4	11-Dec-2005	christos	branches: 1.4.26; merge ktrace-lwp.
1.3	15-May-2005	fvdl	branches: 1.3.2; Optionally include saving and restoring the 64bit %gs and %fs base register values in the PCB. Do this in pmap_activate for now (XXX not a good place for it, but a convenient one).
1.2	07-Aug-2003	agc	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.
1.1	26-Apr-2003	fvdl	branches: 1.1.2; Rename the x86_64 port to amd64, as this is the actual name used for the processor family now. x86_64 is kept as the MACHINE_ARCH value, since it's already widely used (by e.g. the toolchain, etc), and by other operating systems.
1.1.2.4	10-Nov-2005	skrll	Sync with HEAD. Here we go again...
1.1.2.3	21-Sep-2004	skrll	Fix the sync with head I botched.
1.1.2.2	18-Sep-2004	skrll	Sync with HEAD.
1.1.2.1	03-Aug-2004	skrll	Sync with HEAD
1.3.2.3	21-Jan-2008	yamt	sync with head
1.3.2.2	07-Dec-2007	yamt	sync with head
1.3.2.1	03-Sep-2007	yamt	sync with head.
1.4.26.2	12-Mar-2007	rmind	Sync with HEAD.
1.4.26.1	03-Mar-2007	yamt	adapt amd64. XXX changes in identcpu.c are the minimum for MONITOR. XXX identcpu.c should be shared with i386.
1.5.10.1	22-May-2007	matt	Update to HEAD.
1.5.4.1	11-Jul-2007	mjf	Sync with head.
1.5.2.2	03-Dec-2007	ad	Sync with HEAD.
1.5.2.1	27-May-2007	ad	Sync with head.
1.6.10.2	09-Jan-2008	matt	sync with HEAD
1.6.10.1	06-Nov-2007	matt	sync with HEAD
1.6.8.1	27-Nov-2007	joerg	Sync with HEAD. amd64 Xen support needs testing.
1.7.2.2	18-Feb-2008	mjf	Sync with HEAD.
1.7.2.1	08-Dec-2007	mjf	Sync with HEAD.
1.10.6.1	08-Jan-2008	bouyer	Sync with HEAD
1.11.6.2	17-Jan-2009	mjf	Sync with HEAD.
1.11.6.1	02-Jun-2008	mjf	Sync with HEAD.
1.12.4.4	11-Aug-2010	yamt	sync with head.
1.12.4.3	11-Mar-2010	yamt	sync with head
1.12.4.2	04-May-2009	yamt	sync with head.
1.12.4.1	16-May-2008	yamt	sync with head.
1.12.2.1	18-May-2008	yamt	sync with head.
1.14.6.1	13-Dec-2008	haad	Update haad-dm branch to haad-dm-base2.
1.15.8.2	24-Oct-2010	jym	Sync with HEAD
1.15.8.1	01-Nov-2009	jym	Sync with HEAD.
1.16.4.1	05-Mar-2011	rmind	sync with head
1.16.2.1	17-Aug-2010	uebayasi	Sync with HEAD.
1.17.18.3	03-Dec-2017	jdolecek	update from HEAD
1.17.18.2	20-Aug-2014	tls	Rebase to HEAD as of a few days ago.
1.17.18.1	25-Feb-2013	tls	resync with head
1.17.8.2	22-May-2014	yamt	sync with head. for reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
1.17.8.1	23-Jan-2013	yamt	sync with head
1.18.2.1	18-May-2014	rmind	sync with head
1.25.14.1	21-Apr-2017	bouyer	Sync with HEAD
1.25.10.1	20-Mar-2017	pgoyette	Sync with HEAD
1.25.6.1	28-Aug-2017	skrll	Sync with HEAD
1.28.4.3	13-Apr-2020	martin	Mostly merge changes from HEAD up to 20200411
1.28.4.2	08-Apr-2020	martin	Merge changes from current as of 20200406
1.28.4.1	10-Jun-2019	christos	Sync with HEAD
1.28.2.1	28-Jul-2018	pgoyette	Sync with HEAD