NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: port-amd64/58366: KASLR broken
Based on the attached sampling of ten boots, four failed and four
successful, tested by logix, it looks like the issue is alignment of
the PADDQ memory operand address.
The trapping instruction, at aes_sse2_selftest + 0xb9, is:
60 0f d4 05 .. .. .. .. paddq ........(%rip),%xmm0
where the ellipsis encodes the sign-extended displacement from the
starting address of the next instruction, which lies at
aes_sse2_selftest + 0xb9 + 8, to the address of a constant operand in
memory. In the sampling we find:
aes_sse2_selftest+0xb9 displacement operand address
ffffffffa0c9b146 3a8448fa ffffffffdb4dfa48 crash
ffffffffb92bb886 01fe0dd2 ffffffffbb29c660 boot
ffffffff924d4b86 455eb222 ffffffffd7abfdb0 boot
ffffffffd98cbbc6 fd1c9c6a ffffffffd6a95838 crash
ffffffff96ee5866 ee73e9a2 ffffffff85624210 boot
fffffffff0663406 9cc81eda ffffffff8d2e52e8 crash
ffffffffb584ed46 24a92c22 ffffffffda2e1970 boot
ffffffff884698a6 578434fa fffffffedfcacda8 crash
fffffffffc4b7d86 d83e72f2 ffffffffd489f080 boot
fffffffffa0af5a6 f9fe4002 fffffffff40935b0 boot
Normally, x86 isn't picky about alignment. But this looks like a
strong correlation between misalignment and crashes. The Intel manual
says:
Some instructions that operate on double quadwords require
memory operands to be aligned on a natural boundary. These
instructions generate a general-protection exception (#GP
[trap type T_PROTFLT=4 in NetBSD]) if an unaligned operand is
specified. (4.1.1 Alignment of Words, Doublewords, Quadwords,
and Double Quadwords, p. 4-2)
The address of a 128-bit packed memory operand must be aligned
on a 16-byte boundary, except in the following cases:
- a MOVUPD instruction which supports unaligned accesses
- scalar instructions that use an 8-byte memory operand that
is not subject to alignment requirements.
(11.3 SSE2 Data Types, p. 11-4)
--Intel 64 and IA-32 Architectures Software Developers Manual,
Volume 1: Basic Architecture, Order Number: 253665-077US,
April 2022
The AMD manual says:
Generally, legacy SSE instructions that attempt to access a
vector operand in memory that is not naturally aligned trigger
a general-protection fault (#GP). (4.3.2 Data Alignment,
p. 120)
--AMD64 Architecture Programmer's Manual, Volume 1:
Application Programming, Publication No. 24592,
Revision 3.23, October 2020
So that's a plausible reason for this trap to happen. The attached
program confirms that PADDQ with unaligned address gets SIGSEGV with
si_trap=4, i.e., T_PROTFLT. (Annoyingly, I don't see how to get at
the _memory operand_ address from siginfo -- si_addr is the
_instruction_ address in this case.)
Now why is the address misaligned? The aes_sse2_subr.S generated by
gcc contains:
.text
.globl aes_sse2_selftest
.type aes_sse2_selftest, @function
aes_sse2_selftest:
...
paddq .LC11(%rip), %xmm0
...
.section .rodata.cst16,"aM",@progbits,16
...
.align 16
.LC11:
.quad -1
.quad -1
So .LC11 _should_ be aligned on a 16-byte boundary inside the
.rodata.cst16 section. And `readelf -Ss aes_sse2_subr.o' confirms
(a) that the .rodata.cst16 section requests 16-byte alignment, and
(b) that the .LC11 symbol's address in the section has 16-byte
alignment in the .rodata.cst16 section:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[ 9] .rodata.cst16 PROGBITS 0000000000000000 000031a0
0000000000000020 0000000000000010 AM 0 0 16
...
Symbol table '.symtab' contains 68 entries:
Num: Value Size Type Bind Vis Ndx Name
...
6: 0000000000000010 0 NOTYPE LOCAL DEFAULT 9 .LC11
But when the kernel is linked with `--split-by-file=0x100000', the
combined .rodata section is split into multiple subsections sometimes
on _non-aligned_ boundaries with _less_ alignment:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[33] .rodata PROGBITS 00000000000022c0 00112700
000000000005cfe0 0000000000000000 A 0 0 64
...
[133] .rodata.0 PROGBITS 000000000005f2a0 0103aea0
00000000000e2c80 0000000000000000 A 0 0 32
...
[135] .rodata.1 PROGBITS 0000000000141f20 0111db20
00000000001000e0 0000000000000000 A 0 0 32
...
[137] .rodata.2 PROGBITS 0000000000242000 0121dc00
00000000000ffdc0 0000000000000000 A 0 0 64
...
[139] .rodata.3 PROGBITS 0000000000341dc0 0131d9c0
00000000001004f8 0000000000000000 A 0 0 64
...
[141] .rodata.4 PROGBITS 00000000004422b8 0141deb8
0000000000100bb0 0000000000000000 A 0 0 8
[142] .rodata.5 PROGBITS 0000000000542e68 0151ea68
00000000000231c8 0000000000000000 A 0 0 8
With -X omitted from the link flags so it doesn't delete local
symbols, we see that .LC11 winds up in .rodata.4 (not sure which .LC11
it is but all three are in .rodata.4):
Symbol table '.symtab' contains 56230 entries:
Num: Value Size Type Bind Vis Ndx Name
...
11530: 0000000000016548 0 NOTYPE LOCAL DEFAULT 141 .LC11
...
11566: 0000000000016778 0 NOTYPE LOCAL DEFAULT 141 .LC11
...
11574: 0000000000016868 0 NOTYPE LOCAL DEFAULT 141 .LC11
And for some reason, .rodata.4 only requests 8-byte alignment.
It looks like when ld splits sections, it sometimes chooses
non-aligned splitting points and then reduces the alignment of the
next section accordingly:
section address size align
.rodata 0x22c0 0x5cfe0 64
.rodata.0 0x5f2a0 0xe2c80 32
The starting address of .rodata is 64-byte-aligned, but its size is
only 32-byte-aligned. The starting address of .rodata.0, which starts
contiguously after .rodata in the virtual address space of the ELF
file, is only 32-byte-aligned. And when we get to .rodata.4, it's
gone down to only 8-byte alignment.
So when the KASLR bootloader (`prekern') randomizes the address space,
if it respects the requested alignment but roughly uniformly
randomizes everything else, there's a roughly 1/2 probability that the
.rodata.4 section will come out misaligned for PADDQ and the kernel
will crash at boot.
We can try removing `--split-by-file', but that will reduce the
efficacy of KASLR as a security measure, since it will only be able to
randomize .rodata (and .text and .data and ...) as a whole and not the
separate parts of each section independently.
But the right fix is probably to convince ld to insert appropriate
padding in the split sections so that the alignment can be maintained
(or convince ELF to support section alignment constraints of the form
`congruent to k modulo 2^n' and not just `congruent to 0 modulo 2^n',
but that might be a taller order).
--------------------------------------------------------------------------------
[ 1.2520095] trap type 4 code 0 rip 0xffffffffa0c9b146 cs 0x8 rflags 0x246 cr2 0 ilevel 0x6 rsp 0xffffffffa30ffa80
db{0}> print aes_sse2_selftest+0xb9
ffffffffa0c9b146
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 3a8448fa c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 d305df0f 663a8448
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
ffffffffb92bb886
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 1fe0dd2 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 ab05df0f 6601fe0d
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
ffffffff924d4b86
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 455eb222 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 fb05df0f 66455eb1
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
[ 1.2417008] trap type 4 code 0 rip 0xffffffffd98cbbc6 cs 0x8 rflags 0x246 cr2 0 ilevel 0x6 rsp 0xffffffffa2608a80
db{0}> print aes_sse2_selftest+0xb9
ffffffffd98cbbc6
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 fd1c9c6a c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 4305df0f 66fd1c9c
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
ffffffff96ee5866
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 ee73e9a2 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 7b05df0f 66ee73e9
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
[ 1.2660439] trap type 4 code 0 rip 0xfffffffff0663406 cs 0x8 rflags 0x246 cr2 0 ilevel 0x6 rsp 0xffffffffa2967a80
db{0}> print aes_sse2_selftest+0xb9
fffffffff0663406
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 9cc81eda c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 b305df0f 669cc81e
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
ffffffffb584ed46
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 24a92c22 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 fb05df0f 6624a92b
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
[ 1.2370654] trap type 4 code 0 rip 0xffffffff884698a6 cs 0x8 rflags 0x246 cr2 0 ilevel 0x6 rsp 0xffffffffaf0daa80
db{0}> print aes_sse2_selftest+0xb9
ffffffff884698a6
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 578434fa c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 d305df0f 66578434
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
fffffffffc4b7d86
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 d83e72f2 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 cb05df0f 66d83e72
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
db{0}> print aes_sse2_selftest+0xb9
fffffffffa0af5a6
db{0}> x/xb aes_sse2_selftest+0xb9,8
netbsd:aes_sse2_selftest+0xb9: 5d40f66 f9fe4002 c0700f66 6f0f664e
netbsd:aes_sse2_selftest+0xc9: f66d04d 6601f173 db05df0f 66f9fe3f
netbsd:aes_sse2_selftest+0xd9:
--------------------------------------------------------------------------------
#include <emmintrin.h>
#include <err.h>
#include <immintrin.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
__attribute__((noinline))
__m128i
paddq(const __m128i *p, __m128i x)
{
return _mm_add_epi64(*p, x);
}
static void
on_sigsegv(int signo, siginfo_t *si, void *ctx)
{
char buf[1024];
snprintf(buf, sizeof(buf), "SIGSEGV:"
" si_signo=%d si_errno=%d si_code=%d"
" si_addr=%p si_trap=%d\n",
si->si_signo, si->si_errno, si->si_code,
si->si_addr, si->si_trap);
(void)write(STDERR_FILENO, buf, strlen(buf));
_exit(0);
}
int
main(void)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = &on_sigsegv;
if (sigfillset(&sa.sa_mask) == -1)
err(1, "sigfillset");
sa.sa_flags = SA_SIGINFO;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
err(1, "sigaction");
char buf[17] __attribute__((aligned(16)));
volatile __m128i x = _mm_loadu_si128((const __m128i_u *)buf);
volatile __m128i y = paddq((const __m128i *)(buf + 1), x);
(void)y;
return 1;
}
Home |
Main Index |
Thread Index |
Old Index