Home | History | Annotate | only in /src/sys/arch/amd64/stand/prekern
History log of /src/sys/arch/amd64/stand/prekern
RevisionDateAuthorComments
 1.9 24-Jun-2024  riastradh amd64/prekern: Add ldscript to DPADD since it affects build output.
 1.8 25-Jul-2018  kamil Specify NOLIBCSANITIZER in x86 bootloader-like code under sys/arch/

Set NOLIBCSANITIZER for i386 and amd64 specific bootloader-like code.
 1.7 02-Jun-2018  christos branches: 1.7.2;
Disable MKSANITIZER
 1.6 23-Dec-2017  ryoon branches: 1.6.2;
Use ldscript from src to fix build.sh build
 1.5 26-Nov-2017  maxv branches: 1.5.2;
Add a PRNG for the prekern, based on SHA512. The formula is basically:

Y0 = SHA512(entropy-file, 256bit rdseed, 64bit rdtsc)
Yn+1 = SHA512(256bit lowerhalf(Yn), 256bit rdseed, 64bit rdtsc)

On each round, random values are taken from the higher half of Yn. If
rdseed is not available, rdrand is used.

The SHA1 checksum of entropy-file is verified. However, the rndsave_t::data
field is not updated by the prekern, because the area is accessed via the
read-only view we created in locore. I like this design, so it will have
to be updated differently.
 1.4 17-Nov-2017  maxv style
 1.3 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.2 13-Nov-2017  maxv Link libkern in the prekern, and remove redefined functions.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.5.2.2 03-Dec-2017  jdolecek update from HEAD
 1.5.2.1 26-Nov-2017  jdolecek file Makefile was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.6.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.6.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.7.2.1 10-Jun-2019  christos Sync with HEAD
 1.7 04-May-2021  khorben prekern: add support for warning messages

As submitted on port-amd64@ (part 1/3)

Tested on NetBSD/amd64.
 1.6 23-May-2020  maxv branches: 1.6.6;
Bump copyrights.
 1.5 23-May-2020  maxv Extract putc().
 1.4 03-Apr-2019  maxv When scrolling the screen don't forget to update the last line. Whatever,
there is no case where the screen scrolls actually.
 1.3 17-Nov-2017  maxv branches: 1.3.2; 1.3.6;
style
 1.2 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.3.6.1 10-Jun-2019  christos Sync with HEAD
 1.3.2.2 03-Dec-2017  jdolecek update from HEAD
 1.3.2.1 17-Nov-2017  jdolecek file console.c was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.6.6.1 13-May-2021  thorpej Sync with HEAD.
 1.22 04-May-2021  khorben prekern: add support for warning messages

As submitted on port-amd64@ (part 1/3)

Tested on NetBSD/amd64.
 1.21 07-May-2020  maxv branches: 1.21.6;
Clarify.
 1.20 07-May-2020  maxv If we encounter relocations from a section that the bootloader dropped,
AND if the section is a note, then skip the relocations.

Considering a note that the bootloader dropped, there are two possible
sides for the relocations: (1) the relocations from the note towards the
rest of the binary, and (2) the relocations from the rest of the binary
towards the note.

We skip (1), which is correct, because the notes do not play any role at
run time. If we encounter (2) however then there is a bug in the kernel,
so add a sanity check against that.

This fixes KASLR since the latest Xen changes (which introduced .note.Xen).
 1.19 05-May-2020  maxv Gather the section filtering in a single function, and add a sanity check
when relocating, to make sure the section we're accessing is mappable.

Currently this check fails, because of the Xen section, which has RELAs but
is an unmappable unallocated note.

Also improve the prekern ASSERTs while here.
 1.18 05-Jan-2019  maxv Apply amd64/kobj_machdep.c::rev1.7 to the prekern too, to fix the
relocation with updated binutils.
 1.17 21-Nov-2017  maxv branches: 1.17.2; 1.17.4; 1.17.6;
Clean up and add some ASSERTs.
 1.16 17-Nov-2017  maxv style
 1.15 15-Nov-2017  maxv Small cleanup.
 1.14 15-Nov-2017  maxv Support large pages on KASLR kernels, in a way that does not reduce
randomness, but on the contrary that increases it.

The size of the kernel sub-blocks is changed to be 1MB. This produces a
kernel with sections that are always < 2MB in size, that can fit a large
page.

Each section is put in a 2MB physical chunk. In this chunk, there is a
padding of approximately 1MB. The prekern uses a random offset aligned to
sh_addralign, to shift the section in physical memory.

For example, physical memory layout created by the bootloader for .text.4
and .rodata.0:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|+---------------+ |+---------------+ |
|| .text.4 | PAD || .rodata.0 | PAD |
|+---------------+ |+---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

Then, physical memory layout, after having been shifted by the prekern:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| P +---------------+ | +---------------+ |
| A | .text.4 | PAD | PAD | .rodata.0 | PAD |
| D +---------------+ | +---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

The kernel maps these 2MB physical chunks with 2MB large pages. Therefore,
randomness is enforced at both the virtual and physical levels, and the
resulting entropy is higher than that of our current implementaion until
now.

The padding around the section is filled by the prekern. Not to consume
too much memory, the sections that are smaller than PAGE_SIZE are mapped
with normal pages - because there is no point in optimizing them. In these
normal pages, the same shift is applied.

This change has two additional advantages: (a) the cache attacks based on
the TLB are mostly mitigated, because even if you are able to determine
that a given page-aligned range is mapped as executable you don't know
where exactly within that range the section actually begins, and (b) given
that we are slightly randomizing the physical layout we are making some
rare physical attacks more difficult to conduct.

NOTE: after this change you need to update GENERIC_KASLR / prekern /
bootloader.
 1.13 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.12 13-Nov-2017  maxv One more ASSERT, won't hurt.
 1.11 13-Nov-2017  maxv Don't process ELF sections that don't have the ALLOC flag set.

NOTE: you need to update both the prekern and the bootloader after this
change.
 1.10 13-Nov-2017  maxv Change the mapping logic: don't group sections of the same type into
segments, and rather map each section independently at a random VA.

In particular, .data and .bss are not merged anymore and reside at
different addresses.
 1.9 09-Nov-2017  maxv Define utility functions as inlines in prekern.h.
 1.8 09-Nov-2017  maxv Fill in the page padding. Only .text is pre-filled by the ld script, but
this will change in the future.
 1.7 05-Nov-2017  maxv Mprotect the segments in mm.c using bootspace, and remove the now unused
fields of elfinfo.
 1.6 01-Nov-2017  maxv Handle absolute symbols. Since my linux_sigcode.S::rev1.4 there are two
Elf_Rela that point to the NULL symbol - which the prekern thought was an
external reference.

In the ELF spec, STN_UNDEF means the value of the symbol is zero.
 1.5 29-Oct-2017  maxv Fix a few error messages, and be a little more verbose.
 1.4 29-Oct-2017  maxv Randomize the kernel segments independently. That is to say, put text,
rodata and data at different addresses (and in a random order).

To achieve that, the mapping order in the prekern is changed. Until now,
we were creating the kernel map the following way:
-> choose a random VA
-> map [kernpa_start; kernpa_end[ at this VA
-> parse the ELF structures from there
-> determine where exactly the kernel segments are located
-> relocate etc
Now, we are doing:
-> create a read-only view of [kernpa_start; kernpa_end[
-> from this view, compute the size of the "head" region
-> choose a random VA in the HEAD window, and map the head there
-> for each region in (text, rodata, data, boot)
-> compute the size of the region from the RO view
-> choose a random VA in the KASLR window
-> map the region there
-> relocate etc

Each time we map a region, we initialize its bootspace fields right away.

The "head" region must be put before the other regions in memory, because
the kernel uses (headva + sh_offset) to get the addresses of the symbols,
and the offset is unsigned.

Given that the head does not have an mcmodel constraint, its location is
randomized in a window located below the KASLR window.

The rest of the regions being in the same window, we need to detect
collisions.

Note that the module map is embedded in the "boot" region, and that
therefore its location is randomized too.
 1.3 29-Oct-2017  maxv Add three functions and start using them; will be more useful soon.
 1.2 11-Oct-2017  maxv Make sure we're relocating a relocatable kernel.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.17.6.1 10-Jun-2019  christos Sync with HEAD
 1.17.4.1 18-Jan-2019  pgoyette Synch with HEAD
 1.17.2.2 03-Dec-2017  jdolecek update from HEAD
 1.17.2.1 21-Nov-2017  jdolecek file elf.c was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.21.6.1 13-May-2021  thorpej Sync with HEAD.
 1.11 19-Mar-2019  maxv Fix/remove some half-baked stuff I left in the prekern:

- Page-align the idt store, to be extra sure.
- Remove unneeded prototypes.
- Drop the TSS, we don't care and aren't even using it.
- Initialize %ss with a default value.
- Fix three exception handlers, no need to push an error code.

No actual impact, because these things are used only when returning from
exceptions received in the prekern; these exceptions are not supposed to
be ever received, never are, and if they were we wouldn't return anyway.
 1.10 09-Mar-2019  maxv Start replacing the x86 PTE bits.
 1.9 07-Mar-2019  maxv Drop PG_RO, PG_KR and PG_PROT, they are useless and create confusion.
 1.8 25-May-2018  maxv branches: 1.8.2;
Hide a bunch of local symbols.
 1.7 22-Dec-2017  maxv branches: 1.7.2;
Sync comments with reality.
 1.6 26-Nov-2017  maxv branches: 1.6.2;
Add rdrand.
 1.5 14-Nov-2017  maxv Remove XXX: set FRAMESIZE to the kernel value. Verily I don't understand
why we are doing that in the non-kaslr kernels, but let's just reproduce
the behavior.

jump_kernel is changed to use callq, so that the stack alignment is
preserved.
 1.4 10-Nov-2017  maxv Add cpuid and rdseed.
 1.3 29-Oct-2017  maxv Randomize the kernel segments independently. That is to say, put text,
rodata and data at different addresses (and in a random order).

To achieve that, the mapping order in the prekern is changed. Until now,
we were creating the kernel map the following way:
-> choose a random VA
-> map [kernpa_start; kernpa_end[ at this VA
-> parse the ELF structures from there
-> determine where exactly the kernel segments are located
-> relocate etc
Now, we are doing:
-> create a read-only view of [kernpa_start; kernpa_end[
-> from this view, compute the size of the "head" region
-> choose a random VA in the HEAD window, and map the head there
-> for each region in (text, rodata, data, boot)
-> compute the size of the region from the RO view
-> choose a random VA in the KASLR window
-> map the region there
-> relocate etc

Each time we map a region, we initialize its bootspace fields right away.

The "head" region must be put before the other regions in memory, because
the kernel uses (headva + sh_offset) to get the addresses of the symbols,
and the offset is unsigned.

Given that the head does not have an mcmodel constraint, its location is
randomized in a window located below the KASLR window.

The rest of the regions being in the same window, we need to detect
collisions.

Note that the module map is embedded in the "boot" region, and that
therefore its location is randomized too.
 1.2 11-Oct-2017  maxv Remove this #if, these options belong to the kernel and not the prekern.
No real change since eblob is always here. And I was apparently drunk
when writing some comments.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.6.2.2 03-Dec-2017  jdolecek update from HEAD
 1.6.2.1 26-Nov-2017  jdolecek file locore.S was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.7.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.8.2.1 10-Jun-2019  christos Sync with HEAD
 1.28 04-May-2021  khorben prekern: add support for warning messages

As submitted on port-amd64@ (part 1/3)

Tested on NetBSD/amd64.
 1.27 07-May-2020  maxv branches: 1.27.6;
Clarify.
 1.26 07-May-2020  maxv Explain more.
 1.25 15-Feb-2020  maxv Explain more.
 1.24 09-Mar-2019  maxv branches: 1.24.6;
Start replacing the x86 PTE bits.
 1.23 07-Mar-2019  maxv Drop PG_RO, PG_KR and PG_PROT, they are useless and create confusion.
 1.22 20-Jun-2018  maxv branches: 1.22.2;
Add and use bootspace.smodule. Initialize it in locore/prekern to better
hide the specifics from the "upper" layers. This allows for greater
flexibility.
 1.21 21-Dec-2017  maxv branches: 1.21.2;
Remove unused macros.
 1.20 26-Nov-2017  maxv branches: 1.20.2;
Oh, damn. Obviously I forgot one case here: an already-mapped region could
be contained entirely in the region we're trying to create. So go through
another round. While here add mm_reenter_pa, and make sure the va given to
mm_enter_pa does not already point to something.
 1.19 26-Nov-2017  maxv Add a PRNG for the prekern, based on SHA512. The formula is basically:

Y0 = SHA512(entropy-file, 256bit rdseed, 64bit rdtsc)
Yn+1 = SHA512(256bit lowerhalf(Yn), 256bit rdseed, 64bit rdtsc)

On each round, random values are taken from the higher half of Yn. If
rdseed is not available, rdrand is used.

The SHA1 checksum of entropy-file is verified. However, the rndsave_t::data
field is not updated by the prekern, because the area is accessed via the
read-only view we created in locore. I like this design, so it will have
to be updated differently.
 1.18 21-Nov-2017  maxv Clean up and add some ASSERTs.
 1.17 15-Nov-2017  maxv Small cleanup.
 1.16 15-Nov-2017  maxv Mmh, should be <=.
 1.15 15-Nov-2017  maxv Define MM_PROT_* locally.
 1.14 15-Nov-2017  maxv Support large pages on KASLR kernels, in a way that does not reduce
randomness, but on the contrary that increases it.

The size of the kernel sub-blocks is changed to be 1MB. This produces a
kernel with sections that are always < 2MB in size, that can fit a large
page.

Each section is put in a 2MB physical chunk. In this chunk, there is a
padding of approximately 1MB. The prekern uses a random offset aligned to
sh_addralign, to shift the section in physical memory.

For example, physical memory layout created by the bootloader for .text.4
and .rodata.0:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|+---------------+ |+---------------+ |
|| .text.4 | PAD || .rodata.0 | PAD |
|+---------------+ |+---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

Then, physical memory layout, after having been shifted by the prekern:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| P +---------------+ | +---------------+ |
| A | .text.4 | PAD | PAD | .rodata.0 | PAD |
| D +---------------+ | +---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

The kernel maps these 2MB physical chunks with 2MB large pages. Therefore,
randomness is enforced at both the virtual and physical levels, and the
resulting entropy is higher than that of our current implementaion until
now.

The padding around the section is filled by the prekern. Not to consume
too much memory, the sections that are smaller than PAGE_SIZE are mapped
with normal pages - because there is no point in optimizing them. In these
normal pages, the same shift is applied.

This change has two additional advantages: (a) the cache attacks based on
the TLB are mostly mitigated, because even if you are able to determine
that a given page-aligned range is mapped as executable you don't know
where exactly within that range the section actually begins, and (b) given
that we are slightly randomizing the physical layout we are making some
rare physical attacks more difficult to conduct.

NOTE: after this change you need to update GENERIC_KASLR / prekern /
bootloader.
 1.13 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.12 13-Nov-2017  maxv Change the mapping logic: don't group sections of the same type into
segments, and rather map each section independently at a random VA.

In particular, .data and .bss are not merged anymore and reside at
different addresses.
 1.11 11-Nov-2017  maxv Detect collisions from bootspace directly.
 1.10 11-Nov-2017  maxv Modify the layout of the bootspace structure, in such a way that it can
contain several kernel segments of the same type (eg several .text
segments). Some parts are still a bit messy but will be cleaned up soon.

I cannot compile-test this change on i386, but it seems fine enough.

NOTE: you need to rebuild and reinstall a new prekern after this change.
 1.9 09-Nov-2017  maxv Fill in the page padding. Only .text is pre-filled by the ld script, but
this will change in the future.
 1.8 05-Nov-2017  maxv Mprotect the segments in mm.c using bootspace, and remove the now unused
fields of elfinfo.
 1.7 29-Oct-2017  maxv Fix a few error messages, and be a little more verbose.
 1.6 29-Oct-2017  maxv Randomize the kernel segments independently. That is to say, put text,
rodata and data at different addresses (and in a random order).

To achieve that, the mapping order in the prekern is changed. Until now,
we were creating the kernel map the following way:
-> choose a random VA
-> map [kernpa_start; kernpa_end[ at this VA
-> parse the ELF structures from there
-> determine where exactly the kernel segments are located
-> relocate etc
Now, we are doing:
-> create a read-only view of [kernpa_start; kernpa_end[
-> from this view, compute the size of the "head" region
-> choose a random VA in the HEAD window, and map the head there
-> for each region in (text, rodata, data, boot)
-> compute the size of the region from the RO view
-> choose a random VA in the KASLR window
-> map the region there
-> relocate etc

Each time we map a region, we initialize its bootspace fields right away.

The "head" region must be put before the other regions in memory, because
the kernel uses (headva + sh_offset) to get the addresses of the symbols,
and the offset is unsigned.

Given that the head does not have an mcmodel constraint, its location is
randomized in a window located below the KASLR window.

The rest of the regions being in the same window, we need to detect
collisions.

Note that the module map is embedded in the "boot" region, and that
therefore its location is randomized too.
 1.5 28-Oct-2017  maxv Fix a mistake I made in the very first revision. The calculation of the
number of slots was incorrect in some cases, and it could cause the
prekern to fault right away at boot time, or the kernel to fault when
loading kernel modules near the end of the module map.

The variables are divided by PAGE_SIZE to prevent integer overflows.
 1.4 23-Oct-2017  maxv Add two XXXs, so that people don't get confused, a fifth region is needed
anyway.
 1.3 18-Oct-2017  maxv If a branch is already there, use it and don't create a new one. This way
we can call mm_map_tree twice with neighboring regions.
 1.2 15-Oct-2017  maxv Descend the page tree from L4 to L1, instead of allocating a separate
branch and linking it at the end. This way we don't need to allocate VA
from the (tiny) prekern map.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.20.2.2 03-Dec-2017  jdolecek update from HEAD
 1.20.2.1 26-Nov-2017  jdolecek file mm.c was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.21.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.22.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.22.2.1 10-Jun-2019  christos Sync with HEAD
 1.24.6.1 29-Feb-2020  ad Sync with head.
 1.27.6.1 13-May-2021  thorpej Sync with HEAD.
 1.8 21-Aug-2022  mlelstv Adapt to pmap/bootspace migrations.
 1.7 23-May-2020  maxv Bump copyrights.
 1.6 03-Nov-2018  maxv Remove VA_SIGN_POS from the computation of the indexes, it is not needed.
 1.5 12-Aug-2018  maxv Move the PTE area from slot 255 to slot 509. I've never understood why we
put it on 255; the "kernel" half of the VM space begins on slot 256, so
if anything, the PTE area should have been above it, not below.

Virtually extend the user slots in slotspace, because we don't want
(randomized) kernel mappings to land on slot 255.

The prekern is updated accordingly.

Tested on GENERIC, GENERIC_KASLR and XEN3_DOM0.
 1.4 21-Jan-2018  maxv branches: 1.4.2; 1.4.4;
Increase the size of the initial mapping of the kernel. KASLR kernels are
bigger than their GENERIC counterparts, and the limit will soon be hit on
them.
 1.3 17-Nov-2017  maxv branches: 1.3.2;
style
 1.2 05-Nov-2017  maxv Remove unused.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.3.2.2 03-Dec-2017  jdolecek update from HEAD
 1.3.2.1 17-Nov-2017  jdolecek file pdir.h was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.4.4.1 10-Jun-2019  christos Sync with HEAD
 1.4.2.2 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.4.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.14 04-May-2021  khorben prekern: add support for warning messages

As submitted on port-amd64@ (part 1/3)

Tested on NetBSD/amd64.
 1.13 23-May-2020  maxv branches: 1.13.6;
Bump copyrights.
 1.12 23-May-2020  maxv Hum, forgot to include this file in my "Clarify." commit on mm.c:rev1.27
and elf.c:rev1.21.
 1.11 19-Mar-2019  maxv Fix/remove some half-baked stuff I left in the prekern:

- Page-align the idt store, to be extra sure.
- Remove unneeded prototypes.
- Drop the TSS, we don't care and aren't even using it.
- Initialize %ss with a default value.
- Fix three exception handlers, no need to push an error code.

No actual impact, because these things are used only when returning from
exceptions received in the prekern; these exceptions are not supposed to
be ever received, never are, and if they were we wouldn't return anyway.
 1.10 12-Aug-2018  maxv Move the PTE area from slot 255 to slot 509. I've never understood why we
put it on 255; the "kernel" half of the VM space begins on slot 256, so
if anything, the PTE area should have been above it, not below.

Virtually extend the user slots in slotspace, because we don't want
(randomized) kernel mappings to land on slot 255.

The prekern is updated accordingly.

Tested on GENERIC, GENERIC_KASLR and XEN3_DOM0.
 1.9 02-Aug-2018  maxv Add a "version" field in the prekern_args structure. The kernel checks it,
and if it's not happy it returns back to the prekern.
 1.8 25-May-2018  maxv branches: 1.8.2;
Rename the entry points of the prekern, rename the array and move it into
.rodata.
 1.7 26-Nov-2017  maxv branches: 1.7.2; 1.7.4;
Add a PRNG for the prekern, based on SHA512. The formula is basically:

Y0 = SHA512(entropy-file, 256bit rdseed, 64bit rdtsc)
Yn+1 = SHA512(256bit lowerhalf(Yn), 256bit rdseed, 64bit rdtsc)

On each round, random values are taken from the higher half of Yn. If
rdseed is not available, rdrand is used.

The SHA1 checksum of entropy-file is verified. However, the rndsave_t::data
field is not updated by the prekern, because the area is accessed via the
read-only view we created in locore. I like this design, so it will have
to be updated differently.
 1.6 17-Nov-2017  maxv style
 1.5 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.4 05-Nov-2017  maxv Mprotect the segments in mm.c using bootspace, and remove the now unused
fields of elfinfo.
 1.3 29-Oct-2017  maxv Randomize the kernel segments independently. That is to say, put text,
rodata and data at different addresses (and in a random order).

To achieve that, the mapping order in the prekern is changed. Until now,
we were creating the kernel map the following way:
-> choose a random VA
-> map [kernpa_start; kernpa_end[ at this VA
-> parse the ELF structures from there
-> determine where exactly the kernel segments are located
-> relocate etc
Now, we are doing:
-> create a read-only view of [kernpa_start; kernpa_end[
-> from this view, compute the size of the "head" region
-> choose a random VA in the HEAD window, and map the head there
-> for each region in (text, rodata, data, boot)
-> compute the size of the region from the RO view
-> choose a random VA in the KASLR window
-> map the region there
-> relocate etc

Each time we map a region, we initialize its bootspace fields right away.

The "head" region must be put before the other regions in memory, because
the kernel uses (headva + sh_offset) to get the addresses of the symbols,
and the offset is unsigned.

Given that the head does not have an mcmodel constraint, its location is
randomized in a window located below the KASLR window.

The rest of the regions being in the same window, we need to detect
collisions.

Note that the module map is embedded in the "boot" region, and that
therefore its location is randomized too.
 1.2 29-Oct-2017  maxv Add a fifth region, called "head". On kaslr kernels it contains the ELF
Header and the ELF Section Headers. On normal kernels it is empty (the
headers are in the "boot" region).

Note: if you're using GENERIC_KASLR, you also need to rebuild the prekern.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.7.4.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.7.4.1 25-Jun-2018  pgoyette Sync with HEAD
 1.7.2.2 03-Dec-2017  jdolecek update from HEAD
 1.7.2.1 26-Nov-2017  jdolecek file prekern.c was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.8.2.1 10-Jun-2019  christos Sync with HEAD
 1.13.6.1 13-May-2021  thorpej Sync with HEAD.
 1.25 21-Aug-2022  mlelstv Adapt to pmap/bootspace migrations.
 1.24 04-May-2021  khorben prekern: add support for warning messages

As submitted on port-amd64@ (part 1/3)

Tested on NetBSD/amd64.
 1.23 23-May-2020  maxv branches: 1.23.6;
Bump copyrights.
 1.22 07-May-2020  maxv Forgot to commit this file as part of elf.c::rev1.21 mm.c::rev1.27.
 1.21 05-May-2020  maxv Gather the section filtering in a single function, and add a sanity check
when relocating, to make sure the section we're accessing is mappable.

Currently this check fails, because of the Xen section, which has RELAs but
is an unmappable unallocated note.

Also improve the prekern ASSERTs while here.
 1.20 20-Jun-2018  maxv Add and use bootspace.smodule. Initialize it in locore/prekern to better
hide the specifics from the "upper" layers. This allows for greater
flexibility.
 1.19 15-Jan-2018  christos branches: 1.19.2;
avoid typedef redefinitiones
 1.18 26-Nov-2017  maxv branches: 1.18.2;
Add a PRNG for the prekern, based on SHA512. The formula is basically:

Y0 = SHA512(entropy-file, 256bit rdseed, 64bit rdtsc)
Yn+1 = SHA512(256bit lowerhalf(Yn), 256bit rdseed, 64bit rdtsc)

On each round, random values are taken from the higher half of Yn. If
rdseed is not available, rdrand is used.

The SHA1 checksum of entropy-file is verified. However, the rndsave_t::data
field is not updated by the prekern, because the area is accessed via the
read-only view we created in locore. I like this design, so it will have
to be updated differently.
 1.17 26-Nov-2017  maxv Add rdrand.
 1.16 21-Nov-2017  maxv Clean up and add some ASSERTs.
 1.15 15-Nov-2017  maxv Small cleanup.
 1.14 15-Nov-2017  maxv Define MM_PROT_* locally.
 1.13 15-Nov-2017  maxv Support large pages on KASLR kernels, in a way that does not reduce
randomness, but on the contrary that increases it.

The size of the kernel sub-blocks is changed to be 1MB. This produces a
kernel with sections that are always < 2MB in size, that can fit a large
page.

Each section is put in a 2MB physical chunk. In this chunk, there is a
padding of approximately 1MB. The prekern uses a random offset aligned to
sh_addralign, to shift the section in physical memory.

For example, physical memory layout created by the bootloader for .text.4
and .rodata.0:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|+---------------+ |+---------------+ |
|| .text.4 | PAD || .rodata.0 | PAD |
|+---------------+ |+---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

Then, physical memory layout, after having been shifted by the prekern:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| P +---------------+ | +---------------+ |
| A | .text.4 | PAD | PAD | .rodata.0 | PAD |
| D +---------------+ | +---------------+ |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
PA PA+2MB PA+4MB

The kernel maps these 2MB physical chunks with 2MB large pages. Therefore,
randomness is enforced at both the virtual and physical levels, and the
resulting entropy is higher than that of our current implementaion until
now.

The padding around the section is filled by the prekern. Not to consume
too much memory, the sections that are smaller than PAGE_SIZE are mapped
with normal pages - because there is no point in optimizing them. In these
normal pages, the same shift is applied.

This change has two additional advantages: (a) the cache attacks based on
the TLB are mostly mitigated, because even if you are able to determine
that a given page-aligned range is mapped as executable you don't know
where exactly within that range the section actually begins, and (b) given
that we are slightly randomizing the physical layout we are making some
rare physical attacks more difficult to conduct.

NOTE: after this change you need to update GENERIC_KASLR / prekern /
bootloader.
 1.12 14-Nov-2017  maxv Add -Wstrict-prototypes, and fix each warning.
 1.11 13-Nov-2017  maxv Change the mapping logic: don't group sections of the same type into
segments, and rather map each section independently at a random VA.

In particular, .data and .bss are not merged anymore and reside at
different addresses.
 1.10 13-Nov-2017  maxv Link libkern in the prekern, and remove redefined functions.
 1.9 11-Nov-2017  maxv Modify the layout of the bootspace structure, in such a way that it can
contain several kernel segments of the same type (eg several .text
segments). Some parts are still a bit messy but will be cleaned up soon.

I cannot compile-test this change on i386, but it seems fine enough.

NOTE: you need to rebuild and reinstall a new prekern after this change.
 1.8 10-Nov-2017  maxv Implement memcpy, the builtin version does not work with variable sizes.
 1.7 10-Nov-2017  maxv Add cpuid and rdseed.
 1.6 09-Nov-2017  maxv Define utility functions as inlines in prekern.h.
 1.5 09-Nov-2017  maxv Fill in the page padding. Only .text is pre-filled by the ld script, but
this will change in the future.
 1.4 05-Nov-2017  maxv Mprotect the segments in mm.c using bootspace, and remove the now unused
fields of elfinfo.
 1.3 29-Oct-2017  maxv Randomize the kernel segments independently. That is to say, put text,
rodata and data at different addresses (and in a random order).

To achieve that, the mapping order in the prekern is changed. Until now,
we were creating the kernel map the following way:
-> choose a random VA
-> map [kernpa_start; kernpa_end[ at this VA
-> parse the ELF structures from there
-> determine where exactly the kernel segments are located
-> relocate etc
Now, we are doing:
-> create a read-only view of [kernpa_start; kernpa_end[
-> from this view, compute the size of the "head" region
-> choose a random VA in the HEAD window, and map the head there
-> for each region in (text, rodata, data, boot)
-> compute the size of the region from the RO view
-> choose a random VA in the KASLR window
-> map the region there
-> relocate etc

Each time we map a region, we initialize its bootspace fields right away.

The "head" region must be put before the other regions in memory, because
the kernel uses (headva + sh_offset) to get the addresses of the symbols,
and the offset is unsigned.

Given that the head does not have an mcmodel constraint, its location is
randomized in a window located below the KASLR window.

The rest of the regions being in the same window, we need to detect
collisions.

Note that the module map is embedded in the "boot" region, and that
therefore its location is randomized too.
 1.2 29-Oct-2017  maxv Add a fifth region, called "head". On kaslr kernels it contains the ELF
Header and the ELF Section Headers. On normal kernels it is empty (the
headers are in the "boot" region).

Note: if you're using GENERIC_KASLR, you also need to rebuild the prekern.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.18.2.2 03-Dec-2017  jdolecek update from HEAD
 1.18.2.1 26-Nov-2017  jdolecek file prekern.h was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.19.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.23.6.1 13-May-2021  thorpej Sync with HEAD.
 1.2 11-Oct-2017  maxv branches: 1.2.2;
Add an alignment to fill strictly all of the padding; does not increase
the size of the prekern.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.2.2.2 03-Dec-2017  jdolecek update from HEAD
 1.2.2.1 11-Oct-2017  jdolecek file prekern.ldscript was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.5 04-May-2021  khorben prekern: add warnings upon problems collecting entropy

As submitted on port-amd64@ (part 3/3)

Tested on NetBSD/amd64.
 1.4 04-May-2021  khorben prekern: do not choke on bad entropy files

As submitted on port-amd64@ (part 2/3)

Tested on NetBSD/amd64.
 1.3 21-May-2020  maxv branches: 1.3.6;
Mmh, should check cpuid_level first.
 1.2 26-Nov-2017  maxv branches: 1.2.2;
I forgot to say in my previous commit that the PRNG is inspired from a
conversation with Taylor and Thor on tech-kern@.

(just add a comment)
 1.1 26-Nov-2017  maxv Add a PRNG for the prekern, based on SHA512. The formula is basically:

Y0 = SHA512(entropy-file, 256bit rdseed, 64bit rdtsc)
Yn+1 = SHA512(256bit lowerhalf(Yn), 256bit rdseed, 64bit rdtsc)

On each round, random values are taken from the higher half of Yn. If
rdseed is not available, rdrand is used.

The SHA1 checksum of entropy-file is verified. However, the rndsave_t::data
field is not updated by the prekern, because the area is accessed via the
read-only view we created in locore. I like this design, so it will have
to be updated differently.
 1.2.2.2 03-Dec-2017  jdolecek update from HEAD
 1.2.2.1 26-Nov-2017  jdolecek file prng.c was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.3.6.1 13-May-2021  thorpej Sync with HEAD.
 1.3 23-May-2020  maxv Bump copyrights.
 1.2 14-Nov-2017  maxv branches: 1.2.2;
Remove XXX: set FRAMESIZE to the kernel value. Verily I don't understand
why we are doing that in the non-kaslr kernels, but let's just reproduce
the behavior.

jump_kernel is changed to use callq, so that the stack alignment is
preserved.
 1.1 10-Oct-2017  maxv Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.2.2.2 03-Dec-2017  jdolecek update from HEAD
 1.2.2.1 14-Nov-2017  jdolecek file redef.h was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.6 23-May-2020  maxv Bump copyrights.
 1.5 19-Mar-2019  maxv Fix/remove some half-baked stuff I left in the prekern:

- Page-align the idt store, to be extra sure.
- Remove unneeded prototypes.
- Drop the TSS, we don't care and aren't even using it.
- Initialize %ss with a default value.
- Fix three exception handlers, no need to push an error code.

No actual impact, because these things are used only when returning from
exceptions received in the prekern; these exceptions are not supposed to
be ever received, never are, and if they were we wouldn't return anyway.
 1.4 14-Jul-2018  maxv Drop NENTRY() from the x86 kernels, use ENTRY(). With PMCs (and other hardware
tracing facilities) we have a much better ways of monitoring the CPU activity
than GPROF, without software modification.

Also I think GPROF has never worked, because the 'start' functions of both
i386 and amd64 use ENTRY(), and it would have caused a function call while the
kernel was not yet relocated.
 1.3 25-May-2018  maxv branches: 1.3.2;
Rename the entry points of the prekern, rename the array and move it into
.rodata.
 1.2 22-Dec-2017  maxv branches: 1.2.2;
Sync comments with reality.
 1.1 10-Oct-2017  maxv branches: 1.1.2;
Add the amd64 prekern. It is a kernel relocator used for Kernel ASLR (see
tech-kern@). It works, but is not yet linked to the build system, because
I can't build a distribution right now.
 1.1.2.2 03-Dec-2017  jdolecek update from HEAD
 1.1.2.1 10-Oct-2017  jdolecek file trap.S was added on branch tls-maxphys on 2017-12-03 11:35:48 +0000
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 25-Jun-2018  pgoyette Sync with HEAD
 1.3.2.1 10-Jun-2019  christos Sync with HEAD

RSS XML Feed