Home | History | Annotate | Download | only in arm
History log of /src/sys/crypto/aes/arch/arm/aes_armv8_64.S
RevisionDateAuthorComments
 1.15  08-Sep-2020  riastradh aesarmv8: Reallocate registers to shave off unnecessary MOV.
 1.14  08-Sep-2020  riastradh aesarmv8: Issue two 4-register ld/st, not four 2-register ld/st.
 1.13  08-Sep-2020  riastradh aesarmv8: Adapt aes_armv8_64.S to big-endian.

Patch mainly from (and tested by) jakllsch@ with minor tweaks by me.
 1.12  08-Aug-2020  riastradh Fix ARM NEON implementations of AES and ChaCha on big-endian ARM.

New macros such as VQ_N_U32(a,b,c,d) for NEON vector initializers.
Needed because GCC and Clang disagree on the ordering of lanes,
depending on whether it's 64-bit big-endian, 32-bit big-endian, or
little-endian -- and, bizarrely, both of them disagree with the
architectural numbering of lanes.

Experimented with using

static const uint8_t x8[16] = {...};

uint8x16_t x = vld1q_u8(x8);

which doesn't require knowing anything about the ordering of lanes,
but this generates considerably worse code and apparently confuses
GCC into not recognizing the constant value of x8.

Fix some clang mistakes while here too.
 1.11  27-Jul-2020  riastradh Add RCSIDs to the AES and ChaCha .S sources.
 1.10  27-Jul-2020  riastradh Issue aese/aesmc and aesd/aesimc in pairs.

Advised by the aarch64 optimization guide; increases cgd throughput
by about 10%.
 1.9  27-Jul-2020  riastradh Align critical-path loops in AES and ChaCha.
 1.8  25-Jul-2020  riastradh Implement AES-CCM with ARMv8.5-AES.
 1.7  25-Jul-2020  riastradh Invert some loops to save a branch instruction on every iteration.
 1.6  22-Jul-2020  riastradh Fix register name in comment.

Some time ago I reallocated the registers to avoid inadvertently
clobbering the callee-saves v9, but neglected to update the comment.
 1.5  19-Jul-2020  ryo fix build with clang/llvm.

clang aarch64 assembler doesn't accept optional number of lanes of vector register.
(but ARMARM says that an assembler must accept it)
 1.4  30-Jun-2020  riastradh Reallocate registers to avoid abusing callee-saves registers, v8-v15.

Forgot to consult the AAPCS before committing this before -- oops!

While here, take advantage of the 32 aarch64 simd registers to avoid
all stack spills.
 1.3  30-Jun-2020  riastradh Use `.arch_extension aes' for aese/aesmc/aesd/aesimc.

Unlike `.arch_extension crypto', this works with clang; both work
with gas, so we'll go with this.

Clang still can't handle aes_armv8_64.S yet -- it gets confused by
dup and mov on lanes, but this makes progress.
 1.2  30-Jun-2020  riastradh Use .p2align rather than .align.

Apparently on arm, .align is actually an alias for .p2align, taking a
power of two rather than a number of bytes, so aes_armv8_64.o was
bloated to 32KB with obscene alignment when it only needed to be
barely past 4KB.

Do the same for the x86 aes_ni_64.S -- even though .align takes a
number of bytes rather than a power of two on x86, let's just stay
away from the temptations of the evil .align directive.
 1.1  29-Jun-2020  riastradh Implement AES in kernel using ARMv8.0-AES on aarch64.

RSS XML Feed