NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/60272: sys/atomic.h: unnecessary stack spillage



>Number:         60272
>Category:       kern
>Synopsis:       sys/atomic.h: unnecessary stack spillage
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 16 14:55:00 +0000 2026
>Originator:     Taylor R Campbell
>Release:        current, 11, 10, 9
>Organization:
The AtomicBSD Volition
>Environment:
>Description:

	The sys/atomic.h load/store operations do unnecessary
	load/store operations on the stack.

	Example 1, atomic_load_relaxed:

https://nxr.netbsd.org/xref/src/sys/dev/ic/tpm.c?r=1.29#809

    809 static void
    810 tpm_rng_get(size_t nbytes, void *cookie)
    811 {
    812 	struct tpm_softc *sc = cookie;
    813 
    814 	if (atomic_load_relaxed(&sc->sc_rnddisabled))
    815 		return;		/* tough */
...
    819 }

	GCC generates the following code, with an unnecessary store to
	stack and reload from stack marked with (*):

0000000000000480 <tpm_rng_get>:
tpm_rng_get():
/home/riastradh/netbsd/current/src/sys/dev/ic/tpm.c:811
     480:       a9bd7bfd        stp     x29, x30, [sp, #-48]!
     484:       910003fd        mov     x29, sp
     488:       f9000bf3        str     x19, [sp, #16]
     48c:       aa0103f3        mov     x19, x1
/home/riastradh/netbsd/current/src/sys/dev/ic/tpm.c:814
     490:       39445022        ldrb    w2, [x1, #276]
     494:       12001c42        and     w2, w2, #0xff
     498:       3900bfe2        strb    w2, [sp, #47]		// (*)
     49c:       3940bfe1        ldrb    w1, [sp, #47]		// (*)
     4a0:       72001c3f        tst     w1, #0xff
     4a4:       54000080        b.eq    4b4 <tpm_rng_get+0x34>  // b.none

	Example 2, atomic_store_relaxed:

https://nxr.netbsd.org/xref/src/sys/dev/pci/if_aq.c?r=1.50#6050

   6012 static void
   6013 aq_handle_reset_work(struct work *work, void *arg)
   6014 {
...
   6048 	IFNET_UNLOCK(ifp);
   6049 
   6050 	atomic_store_relaxed(&sc->sc_reset_pending, 0);
   6051 }

	GCC generates the following code:

/home/riastradh/netbsd/current/src/sys/dev/pci/if_aq.c:6050
    5fd8:       b9007fff        str     wzr, [sp, #124]		// (*)
    5fdc:       f240067f        tst     x19, #0x3
    5fe0:       54000101        b.ne    6000 <aq_handle_reset_work+0x1bc>  // b.any
/home/riastradh/netbsd/current/src/sys/dev/pci/if_aq.c:6050 (discriminator 3)
    5fe4:       b9407fe0        ldr     w0, [sp, #124]		// (*)
    5fe8:       b90f2ac0        str     w0, [x22, #3880]

	This happens because of the following definitions (ignoring the
	kcsan variants, and the hash-locked versions for hppa/sparc):

https://nxr.netbsd.org/xref/src/sys/sys/atomic.h#432

    432 #define __BEGIN_ATOMIC_LOAD(p, v) \
    433 	__typeof__(*(p)) v = *(p)
    434 #define __END_ATOMIC_LOAD(v) \
    435 	v
...
    440 #define __DO_ATOMIC_STORE(p, v) \
    441 	*p = v

    445 #define	atomic_load_relaxed(p)						      \
    446 ({									      \
    447 	const volatile __typeof__(*(p)) *__al_ptr = (p);		      \
    448 	__ATOMIC_PTR_CHECK(__al_ptr);					      \
    449 	__BEGIN_ATOMIC_LOAD(__al_ptr, __al_val);			      \
    450 	__END_ATOMIC_LOAD(__al_val);					      \
    451 })
...
    471 #define	atomic_store_relaxed(p,v)					      \
    472 ({									      \
    473 	volatile __typeof__(*(p)) *__as_ptr = (p);			      \
    474 	__typeof__(*(p)) __as_val = (v);				      \
    475 	__ATOMIC_PTR_CHECK(__as_ptr);					      \
    476 	__DO_ATOMIC_STORE(__as_ptr, __as_val);				      \
    477 })

	The net effect is that atomic_load_relaxed(p) roughly expands
	to:

		volatile typeof(*p) *__al_ptr = p;
		typeof(*__al_ptr) __al_val = *p;
		return __al_val;

	And atomic_store_relaxed(p, v) roughly expands to:

		volatile typeof(*p) *__as_ptr = p;
		typeof(*p) __as_val = v;
		*__as_ptr = __as_val;

	Note that the intermediate variable __al_val/__as_val will have
	all the qualifiers of p, which is generally volatile-qualified,
	which provokes the compiler to issue an extra store/load.

>How-To-Repeat:

	code inspection

>Fix:

	Several options:

	1. Teach the kernel build to just use C11 stdatomic.h.

	   The code is already there to take advantage of it -- it's
	   just not usable right now because of -nostdinc, so it's a
	   matter of setting up the toolchain the right way to make
	   stdatomic.h available.

	2. Use the compiler __atomic_* builtins.

	   Requires some research to find how stable they are, and
	   whether they're compatible between gcc and clang (and,
	   ideally, how onerous they are for pcc).

	3. Use __typeof_unqual__ instead of __typeof__ to strip the
	   volatile qualifier.

	   Caveat: This won't work in the version of clang we have in
	   tree.  But it will work for gcc.




Home | Main Index | Thread Index | Old Index