History log of /src/sys/kern/subr_msan.c
Revision  Date  Author  Comments
 1.19  11-Apr-2023  riastradh kmsan: Format exact instruction addresses relative to symbols.
 1.18  26-Oct-2022  riastradh ddb/db_active.h: New home for extern db_active.

This can be included unconditionally, and db_active can then be
queried unconditionally; if DDB is not in the kernel, then db_active
is a constant zero. Reduces need for #include opt_ddb.h, #ifdef DDB.
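
As a rough illustration of the header pattern described above, the sketch below uses hypothetical names (TOY_DDB, toy_db_active, toy_may_sleep) rather than the real contents of ddb/db_active.h, which are not shown in this log. It assumes the "constant zero" is provided by a macro when the debugger is not configured.

```c
/* Hypothetical stand-in for a header such as ddb/db_active.h. */
#ifdef TOY_DDB
extern int toy_db_active;	/* provided by the debugger when built in */
#else
#define toy_db_active 0		/* constant zero: tests on it fold away */
#endif

/*
 * A caller can query the flag unconditionally, with no #ifdef of
 * its own and no option header to include.
 */
static int
toy_may_sleep(void)
{
	return !toy_db_active;
}
```

Without TOY_DDB defined, the compiler sees a literal 0 and can drop the debugger-only path entirely.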
 1.17  11-Sep-2021  riastradh ksyms: Use pserialize(9) for kernel access to ksyms.

This makes it available in interrupt context, e.g. for printing
messages with kernel symbol names for return addresses as drm wants
to do.
 1.16  07-Sep-2021  riastradh Revert "ksyms: Use pserialize(9) for kernel access to ksyms."
 1.15  07-Sep-2021  riastradh ksyms: Use pserialize(9) for kernel access to ksyms.

This makes it available in interrupt context, e.g. for printing
messages with kernel symbol names for return addresses as drm wants
to do.
 1.14  09-Sep-2020  maxv kmsan: update the copyright notices
 1.13  05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.12  30-Jun-2020  maxv Make copystr() a MI C function, part of libkern and shared on all
architectures.

Notes:

- On alpha and ia64 the function is kept but gets renamed locally to avoid
symbol collision. This is because on these two arches, I am not sure
whether the ASM callers do not rely on fixed registers, so I prefer to
keep the ASM body for now.
- On Vax, only the symbol is removed, because the body is used from other
functions.
- On RISC-V, this change fixes a bug: copystr() was just a wrapper around
strlcpy(), but strlcpy() makes the operation less safe (strlen on the
source beyond its size).
- The kASan, kCSan and kMSan wrappers are removed, because now that
copystr() is in C, the compiler transformations are applied to it,
without the need for manual wrappers.
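
To make the strlcpy() remark above concrete, here is a minimal sketch of a bounded string copy in the spirit of copystr(). The name toy_copystr and its exact signature are assumptions for illustration, not the libkern code. The point it shows is that the loop never reads more than len bytes of the source, whereas strlcpy() runs strlen() on the whole source.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy at most len bytes, including the terminating NUL, from src to
 * dst. Returns 0 on success, or ENAMETOOLONG if no NUL was found
 * within len bytes. If done is not NULL, it receives the number of
 * bytes copied (counting the NUL when one was copied).
 */
static int
toy_copystr(const char *src, char *dst, size_t len, size_t *done)
{
	size_t i;
	int error = ENAMETOOLONG;

	for (i = 0; i < len; i++) {
		if ((dst[i] = src[i]) == '\0') {
			i++;		/* count the NUL as copied */
			error = 0;
			break;
		}
	}
	if (done != NULL)
		*done = i;
	return error;
}
```

Because this is plain C, a sanitizer-instrumenting compiler sees every load and store in the loop, which is why the commit can drop the hand-written wrappers.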

Could test on amd64 only, but should be fine.
 1.11  15-May-2020  maxv Use a generic description when scanning mbufs.
 1.10  15-Apr-2020  maxv Use large pages for the kMSan shadows. This greatly improves performance,
and slightly reduces memory consumption.
 1.9  03-Apr-2020  maxv branches: 1.9.2; 1.9.4;
Verify that the terminating '\0', too, is initialized.
 1.8  22-Feb-2020  maxv Be less strict: when copyinstr() returns ENAMETOOLONG, it does initialize
the buffer, so mark it as such.
 1.7  31-Jan-2020  maxv Be more informative.
 1.6  25-Jan-2020  maxv Actually, uio_vmspace is never NULL, the check should be against
pmap_kernel.
 1.5  08-Dec-2019  maxv branches: 1.5.2;
Use the inlines; it is actually fine, since the compiler drops the inlines
if the caller is kmsan-instrumented, forcing a white-listing of the memory
access.
 1.4  06-Dec-2019  maxv cast to proper type
 1.3  22-Nov-2019  maxv Ah, strcat/strchr/strrchr are ASM functions, so instrument them.
 1.2  15-Nov-2019  maxv Instrument ufetch/ustore in kMSan, these were the last remaining functions.
 1.1  14-Nov-2019  maxv Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.
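
The two shadows can be pictured with a small user-space model. Everything below (the toy_ names, the 64-byte region, using 0 to mean "no origin") is an assumption made for illustration and is not the subr_msan.c implementation. It only shows the granularity: one shadow bit per memory bit, and one origin cell per 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE 64		/* size of the modelled memory region */

static uint8_t  toy_shad[TOY_MEM_SIZE];		/* 1 shadow bit per memory bit */
static uint32_t toy_orig[TOY_MEM_SIZE / 4];	/* 1 origin cell per 4 bytes */

/* Mark [off, off + len) as uninitialized and record its origin. */
static void
toy_mark_uninit(size_t off, size_t len, uint32_t origin)
{
	if (len == 0)
		return;
	memset(&toy_shad[off], 0xff, len);
	for (size_t i = off / 4; i <= (off + len - 1) / 4; i++)
		toy_orig[i] = origin;
}

/* Mark [off, off + len) as initialized. */
static void
toy_mark_init(size_t off, size_t len)
{
	memset(&toy_shad[off], 0x00, len);
}

/*
 * Return the origin of the first byte in the range that has any
 * uninitialized bit, or 0 if the whole range is initialized.
 */
static uint32_t
toy_check(size_t off, size_t len)
{
	for (size_t i = off; i < off + len; i++) {
		if (toy_shad[i] != 0)
			return toy_orig[i / 4];
	}
	return 0;
}
```

In this model, an allocator would call toy_mark_uninit() on fresh buffers, stores would call toy_mark_init(), and a copyout-style exit point would call toy_check() before letting the bytes leave.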

The memory consumption of these shadows is considerable, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Contrary to kASan, kMSan requires comprehensive coverage, i.e. we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
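
One way to picture such a packed cell is sketched below. The bit widths, the type codes, and the choice to compress the pointer as an offset from a base address are all assumptions for illustration; the commit message does not state the actual layout.

```c
#include <stdint.h>

/* Hypothetical layout: 3 bits of type code, 29 bits of pointer offset. */
#define TOY_TYPE_BITS	3u
#define TOY_PTR_BITS	(32u - TOY_TYPE_BITS)
#define TOY_PTR_MASK	((UINT32_C(1) << TOY_PTR_BITS) - 1)

enum toy_type {
	TOY_TYPE_STACK = 1,
	TOY_TYPE_POOL = 2,
	TOY_TYPE_KMEM = 3
};

/*
 * Pack a type code and a pointer into one 32-bit origin cell. The
 * pointer is stored as an offset from base, so it must lie within
 * 2^TOY_PTR_BITS bytes above base.
 */
static uint32_t
toy_orig_encode(enum toy_type type, uintptr_t base, uintptr_t ptr)
{
	uint32_t off = (uint32_t)(ptr - base) & TOY_PTR_MASK;

	return ((uint32_t)type << TOY_PTR_BITS) | off;
}

/* Extract the memory type from an origin cell. */
static enum toy_type
toy_orig_type(uint32_t cell)
{
	return (enum toy_type)(cell >> TOY_PTR_BITS);
}

/* Recover the full pointer from an origin cell and the same base. */
static uintptr_t
toy_orig_ptr(uint32_t cell, uintptr_t base)
{
	return base + (cell & TOY_PTR_MASK);
}
```

The decoded pointer would then be interpreted according to the type: as the address of a variable-name string, or as a .text address to resolve to a symbol plus offset.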

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so
architectures other than amd64 can be supported.
 1.5.2.2  29-Feb-2020  ad Sync with head.
 1.5.2.1  25-Jan-2020  ad Sync with head.
 1.9.4.3  21-Apr-2020  martin Sync with HEAD
 1.9.4.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.9.4.1  03-Apr-2020  martin file subr_msan.c was added on branch phil-wifi on 2020-04-13 08:05:04 +0000
 1.9.2.1  20-Apr-2020  bouyer Sync with HEAD
