History log of /src/sys/arch/amd64/include/frameasm.h
Revision  Date  Author  Comments
 1.55  30-Jul-2022  riastradh x86: Eliminate mfence hotpatch for membar_sync.

The more-compatible LOCK ADD $0,-N(%rsp) turns out to be cheaper
than MFENCE anyway. Let's save some space and maintenance and rip
out the hotpatching for it.
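
A minimal C sketch of the idea (function name hypothetical; the real
membar_sync lives in assembly): any LOCK-prefixed read-modify-write is
fully serializing for ordinary cacheable memory, and adding 0 to a dummy
stack slot (here N = 8) changes nothing.

	/* hypothetical name; behaves like MFENCE for normal memory,
	 * but cheaper on most x86 CPUs */
	static inline void
	membar_sync_sketch(void)
	{
		__asm__ __volatile__("lock; addl $0,-8(%%rsp)"
		    ::: "memory", "cc");
	}
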
 1.54  09-Apr-2022  riastradh x86: Every load is a load-acquire, so membar_consumer is a noop.

lfence is only needed for MD logic, such as operations on I/O memory
rather than normal cacheable memory, or special instructions like
RDTSC -- never for MI synchronization between threads/CPUs. No need
for hot-patching to do lfence here.

(The x86_lfence function might reasonably be patched on i386 to do
lfence for MD logic, but it isn't now and this doesn't change that.)
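
As a hedged sketch of the consequence (hypothetical name; the real
routine is assembly): once lfence is out of the MI picture,
membar_consumer only has to stop compiler reordering, so it can compile
to no instruction at all.

	static inline void
	membar_consumer_sketch(void)
	{
		/* x86 never reorders loads against other loads, so a
		 * compiler barrier is all that is needed */
		__asm__ __volatile__("" ::: "memory");
	}
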
 1.53  17-Apr-2021  rillig sys/arch/amd64: remove trailing whitespace
 1.52  19-Jul-2020  maxv Revert most of ad's movs/stos change. Instead do something a lot simpler:
declare svs_quad_copy(), used by SVS only, with no need for instrumentation,
because SVS is disabled when sanitizers are on.
 1.51  21-Jun-2020  bouyer Fix comment
 1.50  01-Jun-2020  ad Reported-by: syzbot+6dd5a230d19f0cbc7814@syzkaller.appspotmail.com

Instrument STOS/MOVS for KMSAN to unbreak it.
 1.49  26-Apr-2020  maxv Use the hotpatch framework for LFENCE/MFENCE.
 1.48  25-Apr-2020  bouyer Merge the bouyer-xenpvh branch, bringing in Xen PV drivers support under HVM
guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor
 1.47  17-Nov-2019  maxv branches: 1.47.6;
Disable KCOV - by raising the interrupt level - in the TLB IPI handler,
because this is only noise.
 1.46  14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

These shadows consume a significant amount of memory, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Unlike kASan, kMSan requires comprehensive coverage, i.e. we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Again unlike kASan, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding avoids consuming extra memory to associate information
with each cell, and produces precise output that can tell, for example,
the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.

kMSan is available with LLVM, but not with GCC.

The code is organized in a way similar to kASan and kCSan, so
architectures other than amd64 can be supported.
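
A conceptual C sketch of the instrumentation described above (all names
hypothetical; the real kMSan runtime differs): on each access, the
compiler-inserted check consults the 1:1 bit-granularity shad, and on a
hit reports the matching 4-byte orig cell.

	#include <stdint.h>

	extern uint8_t  *msan_shad;  /* 1:1 shadow, one bit per real bit */
	extern uint32_t *msan_orig;  /* 1:1 origins, one cell per 4 bytes */
	extern uintptr_t msan_base;  /* start of the shadowed kernel VA */

	void msan_report(const void *addr, uint32_t orig);

	static inline void
	msan_check_byte(const uint8_t *p)
	{
		uintptr_t off = (uintptr_t)p - msan_base;

		if (msan_shad[off] != 0)	/* any uninitialized bit? */
			msan_report(p, msan_orig[off / 4]);
	}
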
 1.45  12-Oct-2019  maxv Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.
 1.44  18-May-2019  maxv Two changes in the CPU mitigations:

* Micro-optimize: put every mitigation in the same branch. This removes
two branches in each exc/int return path, and removes all branches in
the syscall return path.

* Modify the SpectreV2 mitigation to be compatible with SpectreV4. I
recently realized that both couldn't be enabled at the same time on
Intel. This is because initially, when there was just SpectreV2, we
could reset the whole IA32_SPEC_CTRL MSR. But then Intel added another
bit in it for SpectreV4, so it isn't right to reset it entirely
anymore. SSBD needs to stay.
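
A hedged sketch of the compatibility point (helper names hypothetical;
MSR and bit numbers per the Intel SDM): clearing IBRS must now
read-modify-write IA32_SPEC_CTRL so that SSBD survives, instead of
blindly writing zero to the whole MSR.

	#include <stdint.h>

	#define MSR_IA32_SPEC_CTRL 0x48
	#define SPEC_CTRL_IBRS     (1ULL << 0)  /* SpectreV2 */
	#define SPEC_CTRL_SSBD     (1ULL << 2)  /* SpectreV4 */

	static inline uint64_t
	rdmsr_sketch(uint32_t msr)
	{
		uint32_t lo, hi;
		__asm__ __volatile__("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
		return ((uint64_t)hi << 32) | lo;
	}

	static inline void
	wrmsr_sketch(uint32_t msr, uint64_t v)
	{
		__asm__ __volatile__("wrmsr" :: "c"(msr),
		    "a"((uint32_t)v), "d"((uint32_t)(v >> 32)));
	}

	static void
	ibrs_leave_sketch(void)
	{
		/* kernel->user: drop IBRS, keep SSBD and any other bits */
		wrmsr_sketch(MSR_IA32_SPEC_CTRL,
		    rdmsr_sketch(MSR_IA32_SPEC_CTRL) & ~SPEC_CTRL_IBRS);
	}
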
 1.43  14-May-2019  maxv Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).

It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
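
An illustrative sketch (the selector value is a placeholder): with the
new microcode, one VERW on a valid data-segment selector flushes the
internal buffers, so the hotpatched return-to-userland path costs a
single extra instruction.

	#include <stdint.h>

	static inline void
	mds_flush_sketch(void)
	{
		/* 0x10 is a placeholder kernel data selector; the value
		 * is irrelevant, the memory-operand VERW form is what
		 * triggers the microcode's buffer flush */
		static const uint16_t sel = 0x10;

		__asm__ __volatile__("verw %0" :: "m"(sel) : "cc");
	}
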
 1.42  11-Feb-2019  cherry We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support for XEN in HVM mode.
XENPVH - sources required for support for XEN in PVH mode.
 1.41  12-Aug-2018  maxv Move the PCPU area from slot 384 to slot 510, to avoid creating too much
fragmentation in the slot space (384 is in the middle of the kernel half
of the VA).
 1.40  13-Jul-2018  martin Provide empty SVS_ENTER_NMI/SVS_LEAVE_NMI for kernels w/o options SVS
 1.39  12-Jul-2018  maxv Handle NMIs correctly when SVS is enabled. We store the kernel's CR3 at the
top of the NMI stack, and we unconditionally switch to it, because we don't
know with which page tables we received the NMI. Hotpatch the whole thing as
usual.

This restores the ability to use PMCs on Intel CPUs.
 1.38  28-Mar-2018  maxv branches: 1.38.2;
Add the IBRS mitigation for SpectreV2 on amd64.

Different operations are performed during context transitions:

user->kernel: IBRS <- 1
kernel->user: IBRS <- 0

And during context switches:

user->user: IBPB <- 1
kernel->user: IBPB <- 1
[user->kernel: IBPB <- 1, this one may not be needed]

We use two macros, IBRS_ENTER and IBRS_LEAVE, to set the IBRS bit. The
thing is hotpatched for better performance, like SVS.

The idea is that IBRS is a "privileged" bit, which is set to 1 in kernel
mode and 0 in user mode. To protect the branch predictor between user
processes (which are of the same privilege), we use the IBPB barrier.

The Intel manual also talks about (MWAIT/HLT)+HyperThreading, and says
that when using either of the two instructions IBRS must be disabled for
better performance on the core. I'm not totally sure about this part, so
I'm not adding it now.

IBRS is available only when the Intel microcode update is applied. The
mitigation must be enabled manually with machdep.spectreV2.mitigated.

Tested by msaitoh a week ago (but I adapted a few things since). Probably
more changes to come.
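
A sketch of the IBPB barrier mentioned above (names assumed; MSR number
per the Intel SDM): writing bit 0 of the write-only IA32_PRED_CMD MSR
flushes indirect-branch predictions, so one process cannot steer
another's speculation.

	#include <stdint.h>

	#define MSR_IA32_PRED_CMD 0x49
	#define PRED_CMD_IBPB     (1ULL << 0)

	static inline void
	ibpb_sketch(void)
	{
		uint64_t v = PRED_CMD_IBPB;

		/* write-only MSR: the write itself is the barrier */
		__asm__ __volatile__("wrmsr" :: "c"(MSR_IA32_PRED_CMD),
		    "a"((uint32_t)v), "d"((uint32_t)(v >> 32)));
	}
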
 1.37  25-Feb-2018  maxv branches: 1.37.2;
Remove INTRENTRY_L, it's not used anymore.
 1.36  22-Feb-2018  maxv Make the machdep.svs_enabled sysctl writable, and add the kernel code
needed to disable SVS at runtime.

We set 'svs_enabled' to false, and hotpatch the kernel entry/exit points
to eliminate the context switch code.

We need to make sure there is no remote CPU that is executing the code we
are hotpatching. So we use two barriers:

* After the first one each CPU is guaranteed to be executing in
svs_disable_cpu with interrupts disabled (this way it can't leave this
place).

* After the second one it is guaranteed that SVS is disabled, so we flush
the cache, enable interrupts and continue execution normally.

Between the two barriers, cpu0 will disable SVS (svs_enabled=false and
hotpatch), and each CPU will restore the generic syscall entry point.

Three notes:

* We should call svs_pgg_update(true) afterwards, to put back PG_G on
the kernel pages (for better performance). This will be done in another
commit.

* The fact that we disable interrupts does not prevent us from receiving
an NMI, and it would be problematic. So we need to add some code to
verify that PMCs are disabled before hotpatching. This will be done
in another commit.

* In svs_disable() we expect each CPU to be online. We need to add a
check to make sure they indeed are.

The sysctl allows only a 1->0 transition. There is no point in doing 0->1
transitions anyway, and it would be complicated to implement because we
need to re-synchronize the CPU user page tables with the current ones (we
lost track of them in the last 1->0 transition).
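
The shape of the two-barrier protocol, as a hedged C sketch with
hypothetical primitives (not the actual svs_disable() code). Between the
barriers no CPU can be executing the bytes being rewritten, which is
what makes the hotpatch safe.

	#include <stdatomic.h>
	#include <stdbool.h>

	extern unsigned ncpu;
	extern bool svs_enabled;

	/* hypothetical MD helpers */
	void intr_disable(void), intr_enable(void), cpu_pause(void);
	void hotpatch_svs_out(void), restore_generic_syscall(void);
	void flush_cache(void);

	static atomic_uint barrier1, barrier2;

	static void
	svs_disable_cpu_sketch(bool is_cpu0)
	{
		intr_disable();

		/* barrier 1: every CPU is parked here, interrupts off */
		atomic_fetch_add(&barrier1, 1);
		while (atomic_load(&barrier1) < ncpu)
			cpu_pause();

		if (is_cpu0) {
			svs_enabled = false;
			hotpatch_svs_out();
		}
		restore_generic_syscall();	/* per-CPU entry point */

		/* barrier 2: SVS is disabled everywhere */
		atomic_fetch_add(&barrier2, 1);
		while (atomic_load(&barrier2) < ncpu)
			cpu_pause();

		flush_cache();
		intr_enable();
	}
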
 1.35  22-Feb-2018  maxv Add a dynamic detection for SVS.

The SVS_* macros are now compiled as skip-noopt. When the system boots, if
the cpu is from Intel, they are hotpatched to their real content.
Typically:

jmp 1f
int3
int3
int3
... int3 ...
1:

gets hotpatched to:

movq SVS_UTLS+UTLS_KPDIRPA,%rax
movq %rax,%cr3
movq CPUVAR(KRSP0),%rsp

The two chunks of code are exactly the same size. We put int3 (0xCC)
to make sure we never execute there.

In the non-SVS (i.e. non-Intel) case, all it costs is one jump. Given that
the SVS_* macros are small, this jump will likely leave us in the same
icache line, so it's pretty fast.

The syscall entry point is special, because there we use a scratch uint64_t
not in curcpu but in the UTLS page, and it's difficult to hotpatch this
properly. So instead of hotpatching we declare the entry point as an ASM
macro, and define two functions: syscall and syscall_svs, the latter being
the one used in the SVS case.

While here 'syscall' is optimized not to contain an SVS_ENTER - this way
we don't even need to do a jump on the non-SVS case.

When adding pages in the user page tables, make sure we don't have PG_G,
now that it's dynamic.

A read-only sysctl is added, machdep.svs_enabled, that tells whether the
kernel uses SVS or not.

More changes to come, svs_init() is not very clean.
 1.34  27-Jan-2018  maxv Put the default %cs value in INTR_RECURSE_HWFRAME. Pushing an immediate
costs less than reading the %cs register and pushing its value. In this
path %cs can never be anything other than GSEL(GCODE_SEL,SEL_KPL).
 1.33  27-Jan-2018  maxv Declare and use INTR_RECURSE_ENTRY, an optimized version of INTRENTRY.
When processing deferred interrupts, we are always entering the new
handler in kernel mode, so there is no point performing the userland
checks.

Saves several instructions.
 1.32  27-Jan-2018  maxv Remove DO_DEFERRED_SWITCH and DO_DEFERRED_SWITCH_RETRY, unused.
 1.31  21-Jan-2018  maxv Unmap the kernel from userland in SVS, and leave only the needed
trampolines. As explained below, SVS should now completely mitigate
Meltdown on GENERIC kernels, even though it needs some more tweaking
for GENERIC_KASLR.

Until now the kernel entry points looked like:

FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
... handle interrupt ...
INTRFASTEXIT
END(intr)

With this change they are split and become:

FUNC(handle)
... handle interrupt ...
INTRFASTEXIT
END(handle)

TEXT_USER_BEGIN
FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
jmp handle
END(intr)
TEXT_USER_END

A new section is introduced, .text.user, that contains minimal kernel
entry/exit points. In order to choose what to put in this section, two
macros are introduced, TEXT_USER_BEGIN and TEXT_USER_END.

The section is mapped in userland with normal 4K pages.

In GENERIC, the section is 4K-page-aligned and embedded in .text, which
is mapped with large pages. That is to say, when an interrupt comes in,
the CPU has the user page tables loaded and executes the 'intr' functions
on 4K pages; after calling SVS_ENTER (in INTRENTRY) these 4K pages become
2MB large pages, and remain so when executing in kernel mode.

In GENERIC_KASLR, the section is 4K-page-aligned and independent from the
other kernel texts. The prekern just picks it up and maps it at a random
address.

In GENERIC, SVS should now completely mitigate Meltdown: what we put in
.text.user is not secret.

In GENERIC_KASLR, SVS would have to be improved a bit more: the
'jmp handle' instruction is actually secret, since it leaks the address
of the section we are jumping into. By exploiting Meltdown on Intel, this
theoretically allows a local user to reconstruct the address of the first
text section. But given that our KASLR produces several texts, and that
each section is not correlated with the others, the level of protection
KASLR provides is still good.
 1.30  20-Jan-2018  maxv Use .pushsection/.popsection; we will soon embed macros in several layers
of nested sections.
 1.29  18-Jan-2018  maxv Unmap the kernel heap from the user page tables (SVS).

This implementation is optimized and organized in such a way that we
don't need to copy the kernel stack to a safe place during user<->kernel
transitions. We create two VAs that point to the same physical page; one
will be mapped in userland and is offset in order to contain only the
trapframe, the other is mapped in the kernel and maps the entire stack.

Sent on tech-kern@ a week ago.
 1.28  11-Jan-2018  maxv Declare new SVS_* variants: SVS_ENTER_NOSTACK and SVS_LEAVE_NOSTACK. Use
SVS_ENTER_NOSTACK in the syscall entry point, and put it before the code
that touches curlwp. (curlwp is located in the direct map.)

Then, disable __HAVE_CPU_UAREA_ROUTINES (to be removed later). This moves
the kernel stack into pmap_kernel(), and not the direct map. That's a
change I've always wanted to make: because of the direct map we can't add
a redzone on the stack, and basically, a stack overflow can go very far
in memory without being detected (as far as erasing all of the system's
memory).

Finally, unmap the direct map from userland.
 1.27  07-Jan-2018  maxv Add a new option, SVS (for Separate Virtual Space), that unmaps kernel
pages when running in userland. For now, only the PTE area is unmapped.

Sent on tech-kern@.
 1.26  07-Jan-2018  maxv Switch x86_retpatch[] -> HOTPATCH().
 1.25  07-Jan-2018  maxv Switch x86_lockpatch[] -> HOTPATCH().
 1.24  07-Jan-2018  maxv Implement a real hotpatch feature.

Define a HOTPATCH() macro, that puts a label and additional information
in the new .rodata.hotpatch kernel section. In patch.c, scan the section
and patch what needs to be. Now it is possible to hotpatch the content of
a macro.

SMAP is switched to use this new system; this saves a call+ret in each
kernel entry/exit point.

Many other operating systems do the same.
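
A hedged sketch of the consuming side (struct layout and symbol names
hypothetical): each HOTPATCH() use emits a descriptor into
.rodata.hotpatch, the linker brackets the section with start/end
symbols, and boot code walks it and rewrites each site.

	#include <stdint.h>
	#include <string.h>

	struct hotpatch_desc {
		uint64_t addr;   /* VA of the patchable instructions */
		uint8_t  name;   /* which replacement to install */
		uint8_t  size;   /* bytes available at the site */
	} __attribute__((packed));

	extern const struct hotpatch_desc __hotpatch_start[], __hotpatch_end[];

	/* hypothetical: replacement bytes for a given patch name */
	const uint8_t *hotpatch_source(uint8_t name, uint8_t *len);

	static void
	hotpatch_apply_sketch(void)
	{
		const struct hotpatch_desc *h;
		uint8_t len;

		for (h = __hotpatch_start; h < __hotpatch_end; h++) {
			const uint8_t *src = hotpatch_source(h->name, &len);
			/* real code must make text writable and sync I$ */
			memcpy((void *)(uintptr_t)h->addr, src, len);
		}
	}
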
 1.23  17-Oct-2017  maxv Have the cpu clear PSL_D automatically when entering the kernel via a
syscall. Then, don't clear PSL_D and PSL_AC in the syscall entry point,
they are now both cleared by the cpu (faster). However they still need to
be manually cleared in the interrupt/trap entry points.
 1.22  17-Oct-2017  maxv Add support for SMAP on amd64.

PSL_AC is cleared from %rflags in each kernel entry point. In the copy
sections, a copy window is opened and the kernel can touch userland
pages. This window is closed when the kernel is done, either at the end
of the copy sections or in the fault-recover functions.

This implementation is not optimized yet, due to the fact that INTRENTRY
is a macro, and we can't hotpatch macros.

Sent on tech-kern@ a month or two ago, tested on a Kabylake.
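
A minimal sketch of the copy window (wrapper names assumed): STAC sets
PSL_AC so the kernel may touch user pages; CLAC closes the window again,
both at the end of the copy sections and in the fault-recover functions.

	static inline void
	smap_open_sketch(void)		/* begin touching userland */
	{
		__asm__ __volatile__("stac" ::: "cc", "memory");
	}

	static inline void
	smap_close_sketch(void)		/* done with userland */
	{
		__asm__ __volatile__("clac" ::: "cc", "memory");
	}
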
 1.21  15-Sep-2017  maxv Declare INTRFASTEXIT as a function, so that there is only one iretq in the
kernel. Then, check %rip against the address of this iretq instead of
disassembling (%rip) - which could fault again, or point at some random
address which happens to contain the iretq opcode. The same is true for gs
below, but I'll fix that in another commit.
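
A sketch of the resulting check (symbol name hypothetical): the fault
handler compares the trapping %rip with the single known iretq address
instead of decoding the bytes at (%rip).

	#include <stdint.h>

	extern const uint8_t do_iret[];	/* label on the kernel's only iretq */

	static inline int
	fault_was_iret_sketch(uint64_t rip)
	{
		return rip == (uint64_t)(uintptr_t)do_iret;
	}
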
 1.20  15-Jul-2012  dsl branches: 1.20.2; 1.20.32;
Rename MDP_IRET to MDL_IRET since it is an lwp flag, not a proc one.
Add an MDL_COMPAT32 flag to the lwp's md_flags, set it for 32bit lwps
and use it to force 'return to user' with iret (as is done when
MDL_IRET is set).
Split the iret/sysret code paths much later.
Remove all the replicated code for 32bit system calls - which was only
needed so that iret was always used.
frameasm.h for XEN contains '#define swapgs'; while XEN probably never
needs swapgs, this is likely to be confusing.
Add a SWAPGS which is a nop on XEN and swapgs otherwise.
(I've not yet checked all the swapgs in files that include frameasm.h)
Simple x86 programs still work.
Hijack 6.99.9 kernel bump (needed for compat32 modules)
 1.19  17-May-2012  dsl Let the user of INTRENTRY_L() place a label on the 'swapgs' used
when faulting from user space.
 1.18  07-May-2012  dsl Add a ';' that got deleted in a slight tidyup.
 1.17  07-May-2012  dsl Move all the XEN differences to a single conditional.
Merge the XEN/non-XEN versions of INTRFASTEXIT and
INTR_RECURSE_HWFRAME by using extra defines.
Split INTRENTRY so that code can insert extra instructions
inside user/kernel conditional.
 1.16  10-Aug-2011  cherry branches: 1.16.2; 1.16.6; 1.16.8;
Correct offset calculation for ci
 1.15  12-Jan-2011  joerg branches: 1.15.6;
Allow use of traditional CPP to be set on a per-platform basis in sys.mk.
Honour this for dependency processing in bsd.dep.mk. Switch i386 and
amd64 assembly to use ISO C90 preprocessor concat and drop the
-traditional-cpp on this platform.
 1.14  07-Jul-2010  chs add the guts of TLS support on amd64. based on joerg's patch,
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
 1.13  21-Nov-2008  ad branches: 1.13.4; 1.13.6; 1.13.8;
PR port-amd64/39991 modules/compat_linux: build fix
 1.12  21-Apr-2008  cegger branches: 1.12.2; 1.12.8; 1.12.10; 1.12.12; 1.12.14; 1.12.18;
Access Xen's vcpu info structure per-CPU.
Tested on i386 and amd64 (both dom0 and domU) by me.
Xen2 tested (both dom0 and domU) by bouyer.
OK bouyer
 1.11  29-Feb-2008  yamt branches: 1.11.2;
don't bother to check curlwp==NULL.
 1.10  21-Dec-2007  dsl branches: 1.10.2; 1.10.6;
Create the trap/syscall frame space for all the registers in one go.
Use the trap-frame offsets (TF_foo) for all references to the registers.
Sort the saving of the GP registers into the same order as the trap frame
because consecutive memory accesses are likely to be faster.
 1.9  21-Dec-2007  dsl Change the xen CLI() and STI() defines to only use one scratch register.
As well as saving an instruction, in one place it saves a push/pop pair.
 1.8  22-Nov-2007  bouyer branches: 1.8.2; 1.8.6;
Pull up the bouyer-xenamd64 branch to HEAD. This brings in amd64 support
to NetBSD/Xen, both Dom0 and DomU.
 1.7  14-Nov-2007  ad Clear the direction flag on entry to the kernel.
 1.6  18-Oct-2007  yamt branches: 1.6.2;
merge yamt-x86pmap branch.

- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
 1.5  17-Oct-2007  garbled Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.4  21-May-2007  fvdl branches: 1.4.8; 1.4.10; 1.4.12; 1.4.14;
Revert fs/gs changes until I figure out issues with them.
 1.3  11-May-2007  fvdl Don't save/restore %fs and %gs in trapframe. The kernel won't touch them.
Instead, save/restore them on context switch. For 32bit processes, save/restore
the selector values only, for 64bit processes, save/restore the appropriate
MSRs. Iff the defaults have been changed.
 1.2  09-Feb-2007  ad branches: 1.2.2; 1.2.6; 1.2.8; 1.2.14;
Merge newlock2 to head.
 1.1  26-Apr-2003  fvdl branches: 1.1.18; 1.1.48;
Rename the x86_64 port to amd64, as this is the actual name used for
the processor family now. x86_64 is kept as the MACHINE_ARCH value,
since it's already widely used (by e.g. the toolchain, etc), and
by other operating systems.
 1.1.48.1  20-Oct-2006  ad Make ASTs per-LWP.
 1.1.18.6  17-Mar-2008  yamt sync with head.
 1.1.18.5  21-Jan-2008  yamt sync with head
 1.1.18.4  07-Dec-2007  yamt sync with head
 1.1.18.3  15-Nov-2007  yamt sync with head.
 1.1.18.2  27-Oct-2007  yamt sync with head.
 1.1.18.1  26-Feb-2007  yamt sync with head.
 1.2.14.1  22-May-2007  matt Update to HEAD.
 1.2.8.1  11-Jul-2007  mjf Sync with head.
 1.2.6.3  03-Dec-2007  ad Sync with HEAD.
 1.2.6.2  03-Dec-2007  ad Sync with HEAD.
 1.2.6.1  23-Oct-2007  ad Sync with head.
 1.2.2.1  17-May-2007  yamt sync with head.
 1.4.14.4  18-Nov-2007  bouyer Sync with HEAD
 1.4.14.3  25-Oct-2007  bouyer Finish sync with HEAD. Especially use the new x86 pmap for xenamd64.
For this:
- rename pmap_pte_set() to pmap_pte_testset()
- make pmap_pte_set() a function or macro for non-atomic PTE write
- define and use pmap_pa2pte()/pmap_pte2pa() to read/write PTE entries
- define pmap_pte_flush() which is a nop in x86 case, and flush the
MMUops queue in the Xen case
 1.4.14.2  18-Oct-2007  bouyer Explicitly set the flag argument of HYPERVISOR_iret to 0.
 1.4.14.1  17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, all of l->l_addr has been zeroed out
while the process was on the queue!)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow merging xenamd64 back into xen or amd64.
 1.4.12.1  30-Sep-2007  yamt implement deferred pmap switching for amd64, and make amd64 use
x86 shared pmap code. it makes several i386 pmap improvements available
to amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
 1.4.10.3  23-Mar-2008  matt sync with HEAD
 1.4.10.2  09-Jan-2008  matt sync with HEAD
 1.4.10.1  06-Nov-2007  matt sync with HEAD
 1.4.8.3  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.4.8.2  14-Nov-2007  joerg Sync with HEAD.
 1.4.8.1  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.6.2.3  27-Dec-2007  mjf Sync with HEAD.
 1.6.2.2  08-Dec-2007  mjf Sync with HEAD.
 1.6.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.8.6.1  02-Jan-2008  bouyer Sync with HEAD
 1.8.2.1  26-Dec-2007  ad Sync with head.
 1.10.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.10.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.10.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.10.2.1  24-Mar-2008  keiichi sync with head.
 1.11.2.1  18-May-2008  yamt sync with head.
 1.12.18.1  12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily in the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.14.1  12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.12.1  12-Jun-2012  riz Pull up following revision(s) (requested by spz in ticket #1772):
sys/arch/amd64/amd64/trap.c: revision 1.71 via patch
sys/arch/amd64/amd64/vector.S: revision 1.41 via patch
sys/arch/amd64/include/frameasm.h: patch

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.
 1.12.10.1  19-Jan-2009  skrll Sync with HEAD.
 1.12.8.1  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.12.2.2  11-Aug-2010  yamt sync with head.
 1.12.2.1  04-May-2009  yamt sync with head.
 1.13.8.1  05-Mar-2011  rmind sync with head
 1.13.6.1  17-Aug-2010  uebayasi Sync with HEAD.
 1.13.4.3  27-Aug-2011  jym Sync with HEAD. Most notably: uvm/pmap work done by rmind@, and MP Xen
work of cherry@.

No regression observed on suspend/restore.
 1.13.4.2  28-Mar-2011  jym Sync with HEAD. TODO before merge:
- shortcut for suspend code in sysmon, when powerd(8) is not running.
Borrow ``xs_watch'' thread context?
- bug hunting in xbd + xennet resume. Rings are currently trashed upon
resume, so the current implementation force-flushes them on suspend. It's not
really needed.
 1.13.4.1  24-Oct-2010  jym Sync with HEAD
 1.15.6.1  03-Jun-2011  cherry Initial import of xen MP sources, with kernel and userspace tests.
- this is a source preview.
- boots to single user.
- spurious interrupt and pmap-related panics are normal
 1.16.8.1  03-Jun-2012  jdc Pull up revisions:
src/sys/arch/amd64/include/frameasm.h revision 1.17-1.19
src/sys/arch/amd64/amd64/vector.S revision 1.40-1.41
src/sys/arch/amd64/amd64/trap.c revision 1.71
(requested by dsl in ticket #280).

Move all the XEN differences to a single conditional.
Merge the XEN/non-XEN versions of INTRFASTEXIT and
INTR_RECURSE_HWFRAME by using extra defines.
Split INTRENTRY so that code can insert extra instructions
inside user/kernel conditional.

Add a ';' that got deleted in a slight tidyup.

Rejig the way TRAP() and ZTRAP() are defined and add Z/TRAP_NJ() that
excludes the 'jmp alltraps'.
Use the _NJ versions for trap entries with non-standard code.
Move all the KDTRACE_HOOKS code into a single block inside the
IDTVEC(trap03) code. This removes a mis-predicted branch from every
trap when KDTRACE_HOOKS are enabled.
Add a few blank lines, need some comments as well :-)
No functional changes intended.

Let the user of INTRENTRY_L() place a label on the 'swapgs' used
when faulting from user space.

If we get a fault setting the user %gs, or on an iret that is returning
to userspace, we must do a 'swapgs' to reload the kernel %gs_base.
Also save the %ds, %es, %fs, %gs selector values in the frame so
they can be restored if we finally return to user (probably after
an application SIGSEGV handler has fixed the error).
Without this any such fault leaves the kernel running with the wrong
%gs offset and it will most likely fault again early in trap().
Repeats until the stack tramples on something important.
iret change works, invalid %gs is a little harder to arrange.

Treat traps in kernel mode during the 'return to user' iret sequence
as user faults.
Based heavily on the i386 code with the correct opcode bytes inserted.
iret path tested, arranging for segment register errors is harder.
User %fs and %gs (32bit apps) are loaded much earlier and any errors
will generate kernel panics - there is probably code to try to stop
the invalid values being set.
 1.16.6.1  02-Jun-2012  mrg sync to latest -current.
 1.16.2.2  30-Oct-2012  yamt sync with head
 1.16.2.1  23-May-2012  yamt sync with head.
 1.20.32.4  14-May-2019  martin Pull up following revision(s) (requested by maxv in ticket #1269):

sys/arch/amd64/amd64/locore.S: revision 1.181 (adapted)
sys/arch/amd64/amd64/amd64_trap.S: revision 1.47 (adapted)
sys/arch/x86/include/specialreg.h: revision 1.144 (adapted)
sys/arch/amd64/include/frameasm.h: revision 1.43 (adapted)
sys/arch/x86/x86/spectre.c: revision 1.27 (adapted)

Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).
It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.
 1.20.32.3  14-Apr-2018  martin Pullup the following revisions via patch, requested by maxv in ticket #748:

sys/arch/amd64/amd64/copy.S 1.29 (adapted, via patch)
sys/arch/amd64/amd64/amd64_trap.S 1.16,1.19 (partial) (via patch)
sys/arch/amd64/amd64/trap.c 1.102,1.106 (partial),1.110 (via patch)
sys/arch/amd64/include/frameasm.h 1.22,1.24 (via patch)
sys/arch/x86/x86/cpu.c 1.137 (via patch)
sys/arch/x86/x86/patch.c 1.23,1.26 (partial) (via patch)

Backport of SMAP support.
 1.20.32.2  22-Mar-2018  martin Pull up the following revisions, requested by maxv in ticket #652:

sys/arch/amd64/amd64/amd64_trap.S upto 1.39 (partial, patch)
sys/arch/amd64/amd64/db_machdep.c 1.6 (patch)
sys/arch/amd64/amd64/genassym.cf 1.65,1.66,1.67 (patch)
sys/arch/amd64/amd64/locore.S upto 1.159 (partial, patch)
sys/arch/amd64/amd64/machdep.c 1.299-1.302 (patch)
sys/arch/amd64/amd64/trap.c upto 1.113 (partial, patch)
sys/arch/amd64/amd64/amd64/vector.S upto 1.61 (partial, patch)
sys/arch/amd64/conf/GENERIC 1.477,1.478 (patch)
sys/arch/amd64/conf/kern.ldscript 1.26 (patch)
sys/arch/amd64/include/frameasm.h upto 1.37 (partial, patch)
sys/arch/amd64/include/param.h 1.25 (patch)
sys/arch/amd64/include/pmap.h 1.41,1.43,1.44 (patch)
sys/arch/x86/conf/files.x86 1.91,1.93 (patch)
sys/arch/x86/include/cpu.h 1.88,1.89 (patch)
sys/arch/x86/include/pmap.h 1.75 (patch)
sys/arch/x86/x86/cpu.c 1.144,1.146,1.148,1.149 (patch)
sys/arch/x86/x86/pmap.c upto 1.289 (partial, patch)
sys/arch/x86/x86/vm_machdep.c 1.31,1.32 (patch)
sys/arch/x86/x86/x86_machdep.c 1.104,1.106,1.108 (patch)
sys/arch/x86/x86/svs.c 1.1-1.14
sys/arch/xen/conf/files.compat 1.30 (patch)

Backport SVS. Not enabled yet.
 1.20.32.1  07-Mar-2018  martin Pull up the following revisions (via patch), requested by maxv in ticket #610:

sys/arch/amd64/amd64/amd64_trap.S 1.8,1.10,1.12 (partial),1.13-1.15,
1.19 (partial),1.20,1.21,1.22,1.24
(via patch)
sys/arch/amd64/amd64/locore.S 1.129 (partial),1.132 (via patch)
sys/arch/amd64/amd64/trap.c 1.97 (partial),1.111 (via patch)
sys/arch/amd64/amd64/vector.S 1.54,1.55 (via patch)
sys/arch/amd64/include/frameasm.h 1.21,1.23 (via patch)
sys/arch/x86/x86/cpu.c 1.138 (via patch)
sys/arch/xen/conf/Makefile.xen 1.45 (via patch)

Rename and reorder several things in amd64_trap.S.
Compile amd64_trap.S as a file.
Introduce nmitrap and doubletrap.
Have the CPU clear PSL_D automatically in the syscall entry point.
 1.20.2.1  03-Dec-2017  jdolecek update from HEAD
 1.37.2.3  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.37.2.2  28-Jul-2018  pgoyette Sync with HEAD
 1.37.2.1  30-Mar-2018  pgoyette Resolve conflicts between branch and HEAD
 1.38.2.2  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.38.2.1  10-Jun-2019  christos Sync with HEAD
 1.47.6.1  11-Apr-2020  bouyer Include ci_isources[] for XenPV too.
Adjust spllower() to XenPV needs, and switch XenPV to the native spllower().
Remove xen_spllower().
