NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
kern/60272: sys/atomic.h: unnecessary stack spillage
>Number: 60272
>Category: kern
>Synopsis: sys/atomic.h: unnecessary stack spillage
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat May 16 14:55:00 +0000 2026
>Originator: Taylor R Campbell
>Release: current, 11, 10, 9
>Organization:
The AtomicBSD Volition
>Environment:
>Description:
The sys/atomic.h load/store operations do unnecessary
load/store operations on the stack.
Example 1, atomic_load_relaxed:
https://nxr.netbsd.org/xref/src/sys/dev/ic/tpm.c?r=1.29#809
809 static void
810 tpm_rng_get(size_t nbytes, void *cookie)
811 {
812 struct tpm_softc *sc = cookie;
813
814 if (atomic_load_relaxed(&sc->sc_rnddisabled))
815 return; /* tough */
...
819 }
GCC generates the following code, with an unnecessary store to
stack and reload from stack marked with (*):
0000000000000480 <tpm_rng_get>:
tpm_rng_get():
/home/riastradh/netbsd/current/src/sys/dev/ic/tpm.c:811
480: a9bd7bfd stp x29, x30, [sp, #-48]!
484: 910003fd mov x29, sp
488: f9000bf3 str x19, [sp, #16]
48c: aa0103f3 mov x19, x1
/home/riastradh/netbsd/current/src/sys/dev/ic/tpm.c:814
490: 39445022 ldrb w2, [x1, #276]
494: 12001c42 and w2, w2, #0xff
498: 3900bfe2 strb w2, [sp, #47] // (*)
49c: 3940bfe1 ldrb w1, [sp, #47] // (*)
4a0: 72001c3f tst w1, #0xff
4a4: 54000080 b.eq 4b4 <tpm_rng_get+0x34> // b.none
Example 2, atomic_store_relaxed:
https://nxr.netbsd.org/xref/src/sys/dev/pci/if_aq.c?r=1.50#6050
6012 static void
6013 aq_handle_reset_work(struct work *work, void *arg)
6014 {
...
6048 IFNET_UNLOCK(ifp);
6049
6050 atomic_store_relaxed(&sc->sc_reset_pending, 0);
6051 }
GCC generates the following code:
/home/riastradh/netbsd/current/src/sys/dev/pci/if_aq.c:6050
5fd8: b9007fff str wzr, [sp, #124] // (*)
5fdc: f240067f tst x19, #0x3
5fe0: 54000101 b.ne 6000 <aq_handle_reset_work+0x1bc> // b.any
/home/riastradh/netbsd/current/src/sys/dev/pci/if_aq.c:6050 (discriminator 3)
5fe4: b9407fe0 ldr w0, [sp, #124] // (*)
5fe8: b90f2ac0 str w0, [x22, #3880]
This happens because of the following definitions (ignoring the
kcsan variants, and the hash-locked versions for hppa/sparc):
https://nxr.netbsd.org/xref/src/sys/sys/atomic.h#432
432 #define __BEGIN_ATOMIC_LOAD(p, v) \
433 __typeof__(*(p)) v = *(p)
434 #define __END_ATOMIC_LOAD(v) \
435 v
...
440 #define __DO_ATOMIC_STORE(p, v) \
441 *p = v
445 #define atomic_load_relaxed(p) \
446 ({ \
447 const volatile __typeof__(*(p)) *__al_ptr = (p); \
448 __ATOMIC_PTR_CHECK(__al_ptr); \
449 __BEGIN_ATOMIC_LOAD(__al_ptr, __al_val); \
450 __END_ATOMIC_LOAD(__al_val); \
451 })
...
471 #define atomic_store_relaxed(p,v) \
472 ({ \
473 volatile __typeof__(*(p)) *__as_ptr = (p); \
474 __typeof__(*(p)) __as_val = (v); \
475 __ATOMIC_PTR_CHECK(__as_ptr); \
476 __DO_ATOMIC_STORE(__as_ptr, __as_val); \
477 })
The net effect is that atomic_load_relaxed(p) roughly expands
to:
volatile typeof(*p) *__al_ptr = p;
typeof(*__al_ptr) __al_val = *p;
return __al_val;
And atomic_store_relaxed(p, v) roughly expands to:
volatile typeof(*p) *__as_ptr = p;
typeof(*p) __as_val = v;
*__as_ptr = __as_val;
Note that the intermediate variable __al_val/__as_val will have
all the qualifiers of p, which is generally volatile-qualified,
which provokes the compiler to issue an extra store/load.
>How-To-Repeat:
code inspection
>Fix:
Several options:
1. Teach the kernel build to just use C11 stdatomic.h.
The code is already there to take advantage of it -- it's
just not usable right now because of -nostdinc, so it's a
matter of setting up the toolchain the right way to make
stdatomic.h available.
2. Use the compiler __atomic_* builtins.
Requires some research to find how stable they are, and
whether they're compatible between gcc and clang (and,
ideally, how onerous they are for pcc).
3. Use __typeof_unqual__ instead of __typeof__ to strip the
volatile qualifier.
Caveat: This won't work in the version of clang we have in
tree. But it will work for gcc.
Home |
Main Index |
Thread Index |
Old Index