NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-amd64/58366: KASLR broken



The following reply was made to PR port-amd64/58366; it has been noted by GNATS.

From: Taylor R Campbell <campbell%mumble.net@localhost>
To: Harold Gutch <logix%foobar.franken.de@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, port-amd64-maintainer%NetBSD.org@localhost,
	gnats-admin%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: port-amd64/58366: KASLR broken
Date: Sun, 30 Jun 2024 14:35:34 +0000

 This is a multi-part message in MIME format.
 --=_La7nkjIJilZyzQ04m7iYdJZjSuvrF8DG
 
 Based on the attached sampling of ten boots, four failed and four
 successful, tested by logix, it looks like the issue is alignment of
 the PADDQ memory operand address.
 
 The trapping instruction, at aes_sse2_selftest + 0xb9, is:
 
 	60 0f d4 05 .. .. .. ..		paddq	........(%rip),%xmm0
 
 where the ellipsis encodes the sign-extended displacement from the
 starting address of the next instruction, which lies at
 aes_sse2_selftest + 0xb9 + 8, to the address of a constant operand in
 memory.  In the sampling we find:
 
 aes_sse2_selftest+0xb9	displacement	operand address
 ffffffffa0c9b146	3a8448fa	ffffffffdb4dfa48	crash
 ffffffffb92bb886	01fe0dd2	ffffffffbb29c660	boot
 ffffffff924d4b86	455eb222	ffffffffd7abfdb0	boot
 ffffffffd98cbbc6	fd1c9c6a	ffffffffd6a95838	crash
 ffffffff96ee5866	ee73e9a2	ffffffff85624210	boot
 fffffffff0663406	9cc81eda	ffffffff8d2e52e8	crash
 ffffffffb584ed46	24a92c22	ffffffffda2e1970	boot
 ffffffff884698a6	578434fa	fffffffedfcacda8	crash
 fffffffffc4b7d86	d83e72f2	ffffffffd489f080	boot
 fffffffffa0af5a6	f9fe4002	fffffffff40935b0	boot
 
 Normally, x86 isn't picky about alignment.  But this looks like a
 strong correlation between misalignment and crashes.  The Intel manual
 says:
 
 	Some instructions that operate on double quadwords require
 	memory operands to be aligned on a natural boundary.  These
 	instructions generate a general-protection exception (#GP
 	[trap type T_PROTFLT=4 in NetBSD]) if an unaligned operand is
 	specified.  (4.1.1 Alignment of Words, Doublewords, Quadwords,
 	and Double Quadwords, p. 4-2)
 
 	The address of a 128-bit packed memory operand must be aligned
 	on a 16-byte boundary, except in the following cases:
 	- a MOVUPD instruction which supports unaligned accesses
 	- scalar instructions that use an 8-byte memory operand that
 	  is not subject to alignment requirements.
 	(11.3 SSE2 Data Types, p. 11-4)
 
 	--Intel 64 and IA-32 Architectures Software Developers Manual,
 	  Volume 1: Basic Architecture, Order Number: 253665-077US,
 	  April 2022
 
 The AMD manual says:
 
 	Generally, legacy SSE instructions that attempt to access a
 	vector operand in memory that is not naturally aligned trigger
 	a general-protection fault (#GP).  (4.3.2 Data Alignment,
 	p. 120)
 
 	--AMD64 Architecture Programmer's Manual, Volume 1:
 	  Application Programming, Publication No. 24592,
 	  Revision 3.23, October 2020
 
 So that's a plausible reason for this trap to happen.  The attached
 program confirms that PADDQ with unaligned address gets SIGSEGV with
 si_trap=4, i.e., T_PROTFLT.  (Annoyingly, I don't see how to get at
 the _memory operand_ address from siginfo -- si_addr is the
 _instruction_ address in this case.)
 
 
 Now why is the address misaligned?  The aes_sse2_subr.S generated by
 gcc contains:
 
 	.text
 	.globl	aes_sse2_selftest
 	.type	aes_sse2_selftest, @function
 aes_sse2_selftest:
 ...
 	paddq	.LC11(%rip), %xmm0
 ...
 	.section	.rodata.cst16,"aM",@progbits,16
 ...
 	.align 16
 .LC11:
 	.quad	-1
 	.quad	-1
 
 So .LC11 _should_ be aligned on a 16-byte boundary inside the
 .rodata.cst16 section.  And `readelf -Ss aes_sse2_subr.o' confirms
 (a) that the .rodata.cst16 section requests 16-byte alignment, and
 (b) that the .LC11 symbol's address in the section has 16-byte
 alignment in the .rodata.cst16 section:
 
 Section Headers:
   [Nr] Name              Type             Address           Offset
        Size              EntSize          Flags  Link  Info  Align
 ...
   [ 9] .rodata.cst16     PROGBITS         0000000000000000  000031a0
        0000000000000020  0000000000000010  AM       0     0     16
 ...
 
 Symbol table '.symtab' contains 68 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
 ...
      6: 0000000000000010     0 NOTYPE  LOCAL  DEFAULT    9 .LC11
 
 But when the kernel is linked with `--split-by-file=0x100000', the
 combined .rodata section is split into multiple subsections sometimes
 on _non-aligned_ boundaries with _less_ alignment:
 
 Section Headers:
   [Nr] Name              Type             Address           Offset
        Size              EntSize          Flags  Link  Info  Align
 ...
   [33] .rodata           PROGBITS         00000000000022c0  00112700
        000000000005cfe0  0000000000000000   A       0     0     64
 ...
   [133] .rodata.0         PROGBITS         000000000005f2a0  0103aea0
        00000000000e2c80  0000000000000000   A       0     0     32
 ...
   [135] .rodata.1         PROGBITS         0000000000141f20  0111db20
        00000000001000e0  0000000000000000   A       0     0     32
 ...
   [137] .rodata.2         PROGBITS         0000000000242000  0121dc00
        00000000000ffdc0  0000000000000000   A       0     0     64
 ...
   [139] .rodata.3         PROGBITS         0000000000341dc0  0131d9c0
        00000000001004f8  0000000000000000   A       0     0     64
 ...
   [141] .rodata.4         PROGBITS         00000000004422b8  0141deb8
        0000000000100bb0  0000000000000000   A       0     0     8
   [142] .rodata.5         PROGBITS         0000000000542e68  0151ea68
        00000000000231c8  0000000000000000   A       0     0     8
 
 With -X omitted from the link flags so it doesn't delete local
 symbols, we see that .LC11 winds up in .rodata.4 (not sure which .LC11
 it is but all three are in .rodata.4):
 
 Symbol table '.symtab' contains 56230 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
 ...
  11530: 0000000000016548     0 NOTYPE  LOCAL  DEFAULT  141 .LC11
 ...
  11566: 0000000000016778     0 NOTYPE  LOCAL  DEFAULT  141 .LC11
 ...
  11574: 0000000000016868     0 NOTYPE  LOCAL  DEFAULT  141 .LC11
 
 And for some reason, .rodata.4 only requests 8-byte alignment.
 
 It looks like when ld splits sections, it sometimes chooses
 non-aligned splitting points and then reduces the alignment of the
 next section accordingly:
 
 section		address		size		align
 .rodata		0x22c0		0x5cfe0		64
 .rodata.0	0x5f2a0		0xe2c80		32
 
 The starting address of .rodata is 64-byte-aligned, but its size is
 only 32-byte-aligned.  The starting address of .rodata.0, which starts
 contiguously after .rodata in the virtual address space of the ELF
 file, is only 32-byte-aligned.  And when we get to .rodata.4, it's
 gone down to only 8-byte alignment.
 
 So when the KASLR bootloader (`prekern') randomizes the address space,
 if it respects the requested alignment but roughly uniformly
 randomizes everything else, there's a roughly 1/2 probability that the
 .rodata.4 section will come out misaligned for PADDQ and the kernel
 will crash at boot.
 
 We can try removing `--split-by-file', but that will reduce the
 efficacy of KASLR as a security measure, since it will only be able to
 randomize .rodata (and .text and .data and ...) as a whole and not the
 separate parts of each section independently.
 
 But the right fix is probably to convince ld to insert appropriate
 padding in the split sections so that the alignment can be maintained
 (or convince ELF to support section alignment constraints of the form
 `congruent to k modulo 2^n' and not just `congruent to 0 modulo 2^n',
 but that might be a taller order).
 
 --=_La7nkjIJilZyzQ04m7iYdJZjSuvrF8DG
 Content-Type: text/plain; charset="ISO-8859-1"; name="sample"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="sample.txt"
 
 ---------------------------------------------------------------------------=
 -----
 [   1.2520095] trap type 4 code 0 rip 0xffffffffa0c9b146 cs 0x8 rflags 0x24=
 6 cr2 0 ilevel 0x6 rsp 0xffffffffa30ffa80
 
 db{0}> print aes_sse2_selftest+0xb9
 ffffffffa0c9b146
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     3a8448fa    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    d305df0f    663a8448
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 ffffffffb92bb886
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     1fe0dd2     c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    ab05df0f    6601fe0d
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 ffffffff924d4b86
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     455eb222    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    fb05df0f    66455eb1
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 [   1.2417008] trap type 4 code 0 rip 0xffffffffd98cbbc6 cs 0x8 rflags 0x24=
 6 cr2 0 ilevel 0x6 rsp 0xffffffffa2608a80
 
 db{0}> print aes_sse2_selftest+0xb9
 ffffffffd98cbbc6
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     fd1c9c6a    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    4305df0f    66fd1c9c
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 ffffffff96ee5866
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     ee73e9a2    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    7b05df0f    66ee73e9
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 [   1.2660439] trap type 4 code 0 rip 0xfffffffff0663406 cs 0x8 rflags 0x24=
 6 cr2 0 ilevel 0x6 rsp 0xffffffffa2967a80
 
 db{0}> print aes_sse2_selftest+0xb9
 fffffffff0663406
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     9cc81eda    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    b305df0f    669cc81e
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 ffffffffb584ed46
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     24a92c22    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    fb05df0f    6624a92b
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 [   1.2370654] trap type 4 code 0 rip 0xffffffff884698a6 cs 0x8 rflags 0x24=
 6 cr2 0 ilevel 0x6 rsp 0xffffffffaf0daa80
 
 db{0}> print aes_sse2_selftest+0xb9
 ffffffff884698a6
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     578434fa    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    d305df0f    66578434
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 fffffffffc4b7d86
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     d83e72f2    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    cb05df0f    66d83e72
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 db{0}> print aes_sse2_selftest+0xb9
 fffffffffa0af5a6
 db{0}> x/xb aes_sse2_selftest+0xb9,8
 netbsd:aes_sse2_selftest+0xb9:  5d40f66     f9fe4002    c0700f66    6f0f664e
 netbsd:aes_sse2_selftest+0xc9:  f66d04d     6601f173    db05df0f    66f9fe3f
 netbsd:aes_sse2_selftest+0xd9:
 ---------------------------------------------------------------------------=
 -----
 
 --=_La7nkjIJilZyzQ04m7iYdJZjSuvrF8DG
 Content-Type: text/plain; charset="ISO-8859-1"; name="paddq_unaligned"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="paddq_unaligned.c"
 
 #include <emmintrin.h>
 #include <err.h>
 #include <immintrin.h>
 #include <signal.h>
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>
 
 __attribute__((noinline))
 __m128i
 paddq(const __m128i *p, __m128i x)
 {
 	return _mm_add_epi64(*p, x);
 }
 
 static void
 on_sigsegv(int signo, siginfo_t *si, void *ctx)
 {
 	char buf[1024];
 
 	snprintf(buf, sizeof(buf), "SIGSEGV:"
 	    " si_signo=3D%d si_errno=3D%d si_code=3D%d"
 	    " si_addr=3D%p si_trap=3D%d\n",
 	    si->si_signo, si->si_errno, si->si_code,
 	    si->si_addr, si->si_trap);
 	(void)write(STDERR_FILENO, buf, strlen(buf));
 	_exit(0);
 }
 
 int
 main(void)
 {
 	struct sigaction sa;
 
 	memset(&sa, 0, sizeof(sa));
 	sa.sa_sigaction =3D &on_sigsegv;
 	if (sigfillset(&sa.sa_mask) =3D=3D -1)
 		err(1, "sigfillset");
 	sa.sa_flags =3D SA_SIGINFO;
 	if (sigaction(SIGSEGV, &sa, NULL) =3D=3D -1)
 		err(1, "sigaction");
 
 	char buf[17] __attribute__((aligned(16)));
 	volatile __m128i x =3D _mm_loadu_si128((const __m128i_u *)buf);
 	volatile __m128i y =3D paddq((const __m128i *)(buf + 1), x);
 	(void)y;
 
 	return 1;
 }
 
 --=_La7nkjIJilZyzQ04m7iYdJZjSuvrF8DG--
 


Home | Main Index | Thread Index | Old Index