NetBSD-Bugs archive


Re: kern/57240: Missing store-before-store barriers in cpu_switchto on non-TSO MP platforms



The following reply was made to PR kern/57240; it has been noted by GNATS.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/57240: Missing store-before-store barriers in cpu_switchto on non-TSO MP platforms
Date: Thu, 23 Feb 2023 12:43:28 +0000

 I sat down this morning to try to prove that either mutex_vector_enter
 succeeds, or an eventual mutex_exit will wake it up.  But I failed.
 
 It looks like store-before-store barriers are not enough on their own,
 and never have been, and both sides (mutex_vector_enter and
 cpu_switchto) need the dreaded store-before-load barrier.
 
 Let's recap the algorithm (with some irrelevant bits omitted -- spin
 optimizations, KERNEL_LOCK coordination, and acquire/release membars).
 
 mutex_vector_enter:
 	N1. load owner := mtx->mtx_owner
 	N2. try CAS(&mtx->mtx_owner, 0, curlwp); on success, return
 	N3. lock turnstile
 	N4. try CAS(&mtx->mtx_owner, owner, owner|MUTEX_BIT_WAITERS);
 	    on failure, unlock turnstile and restart
 	N5. if owner->l_cpu->ci_curlwp == owner (owner on CPU), restart
 	N6. reload mtx->mtx_owner; if MUTEX_BIT_WAITERS clear, restart
 	N7. sleep (unlocks turnstile) and restart
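 
 To make the interleaving below easier to follow, here is a rough sketch
 of that slow path in C.  This is not the real kern_mutex.c code: it is a
 stand-in model using C11 relaxed atomics in place of the kernel's
 atomic_cas/load primitives (mirroring the omitted acquire/release
 membars), with the turnstile steps reduced to comments.  The step
 numbers match the list above.
 
 /* Stand-in model of the mutex_vector_enter slow path (N1-N7). */
 #include <stdatomic.h>
 #include <stdint.h>
 
 #define MUTEX_BIT_WAITERS	0x01UL	/* stand-in for the real flag bit */
 
 struct lwp;
 struct cpu_info { struct lwp *_Atomic ci_curlwp; };
 struct lwp { struct cpu_info *l_cpu; };
 struct kmutex { _Atomic uintptr_t mtx_owner; };
 
 extern struct lwp *curlwp;		/* stand-in for the curlwp macro */
 
 void
 mutex_vector_enter_model(struct kmutex *mtx)
 {
 	uintptr_t owner, zero;
 	struct lwp *l;
 
 	for (;;) {
 		/* N1: load the current owner. */
 		owner = atomic_load_explicit(&mtx->mtx_owner,
 		    memory_order_relaxed);
 
 		/* N2: if unowned, try to take the lock outright. */
 		if (owner == 0) {
 			zero = 0;
 			if (atomic_compare_exchange_strong_explicit(
 			    &mtx->mtx_owner, &zero, (uintptr_t)curlwp,
 			    memory_order_relaxed, memory_order_relaxed))
 				return;
 			continue;
 		}
 
 		/* N3: lock the turnstile (omitted here). */
 
 		/* N4: advertise that we intend to sleep. */
 		if (!atomic_compare_exchange_strong_explicit(
 		    &mtx->mtx_owner, &owner, owner | MUTEX_BIT_WAITERS,
 		    memory_order_relaxed, memory_order_relaxed))
 			continue;	/* unlock turnstile and restart */
 
 		/* N5: if the owner is still running on its CPU, restart
 		 * rather than sleep -- it may release the lock soon. */
 		l = (struct lwp *)(owner & ~MUTEX_BIT_WAITERS);
 		if (atomic_load_explicit(&l->l_cpu->ci_curlwp,
 		    memory_order_relaxed) == l)
 			continue;	/* unlock turnstile and restart */
 
 		/* N6: recheck that MUTEX_BIT_WAITERS is still set, so a
 		 * concurrent mutex_exit is obliged to wake us. */
 		if ((atomic_load_explicit(&mtx->mtx_owner,
 		    memory_order_relaxed) & MUTEX_BIT_WAITERS) == 0)
 			continue;	/* unlock turnstile and restart */
 
 		/* N7: sleep on the turnstile, then restart (omitted). */
 	}
 }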
 
 mutex_exit, surrounded by cpu_switchto before (from otherlwp) and
 after (back to otherlwp):
 	X1. [cpu_switchto] store curcpu()->ci_curlwp := owner
 	...
 	X2. Without preemption:
 		(a) load owner := mtx->mtx_owner
 		(b) if owner != curlwp, go to X3
 		(c) store mtx->mtx_owner := 0
 		(d) return
 	X3. lock turnstile
 	X4. wake waiters and unlock turnstile
 	...
 	X5. [cpu_switchto] store curcpu()->ci_curlwp := otherlwp
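 
 The exit side, in the same stand-in style (again, not the real code;
 the turnstile work in X3/X4 is reduced to a comment):
 
 /* Stand-in model of the curlwp hand-off in cpu_switchto (X1/X5) and
  * of mutex_exit (X2-X4), continuing the model above. */
 
 void
 cpu_switchto_model(struct cpu_info *ci, struct lwp *newlwp)
 {
 	/* X1/X5: publish the CPU's new curlwp. */
 	atomic_store_explicit(&ci->ci_curlwp, newlwp,
 	    memory_order_relaxed);
 }
 
 void
 mutex_exit_model(struct kmutex *mtx)
 {
 	/* X2(a): load the owner word. */
 	uintptr_t owner = atomic_load_explicit(&mtx->mtx_owner,
 	    memory_order_relaxed);
 
 	/* X2(b)-(d): if it is exactly curlwp (no waiters recorded),
 	 * release the lock and return. */
 	if (owner == (uintptr_t)curlwp) {
 		atomic_store_explicit(&mtx->mtx_owner, 0,
 		    memory_order_relaxed);
 		return;
 	}
 
 	/* X3/X4: otherwise MUTEX_BIT_WAITERS is set: lock the turnstile,
 	 * clear the owner, and wake the waiters (omitted here). */
 }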
 
 Here's a possible sequence of events on two CPUs:
 
 cpu1 (mutex_exit, lwp1)			cpu2 (mutex_enter)
 -----------------------			------------------
 X1. store curcpu()->ci_curlwp := lwp1
 	*** stuck in store buffer
 X2(a). load mtx->mtx_owner
 	(reads out lwp1)
 					N4. CAS store mtx->mtx_owner
 						:= lwp1|MUTEX_BIT_WAITERS
 X2(b). if owner != curlwp go to X3
 	(branch not taken)
 					N5. load owner->l_cpu->ci_curlwp
 						*** reads out otherlwp!
 X1'. store buffer finally flushes, too late
 					N6. reload mtx->mtx_owner
 						(MUTEX_BIT_WAITERS still set)
 X2(c). store mtx->mtx_owner := 0
 X2(d). return
 					N7. sleep forever
 
 It's not enough to issue a store-before-store barrier between X1
 (curcpu()->ci_curlwp = owner in cpu_switchto) and X2 (mutex_exit) --
 the stores are already ordered in this sequence of events.
 
 Inserting store-before-load barriers between X1 and X2, and between N4
 and N5, prevents this scenario: with these barriers, mutex_exit can't
 examine mtx->mtx_owner until mutex_vector_enter would see that the
 owner is running again, and mutex_vector_enter can't test whether the
 owner is running again until mutex_exit would see MUTEX_BIT_WAITERS.
 (The same issue applies to ci->ci_biglock_wanted for KERNEL_LOCK
 coordination.)
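 
 In the stand-in model above, that fix looks like this; the C11 seq_cst
 fence plays the role of the store-before-load barrier (presumably
 membar_sync() or the MD equivalent in the real kernel):
 
 /* Fixed hand-off: X1 followed by a full barrier before mutex_exit's
  * load of mtx->mtx_owner at X2(a). */
 void
 cpu_switchto_model_fixed(struct cpu_info *ci, struct lwp *newlwp)
 {
 	atomic_store_explicit(&ci->ci_curlwp, newlwp,
 	    memory_order_relaxed);
 	atomic_thread_fence(memory_order_seq_cst);	/* X1 -> X2(a) */
 }
 
 /* Matching barrier on the enter side: call this in place of the bare
  * ci_curlwp check at N5, immediately after the CAS at N4 that set
  * MUTEX_BIT_WAITERS. */
 int
 mutex_owner_running_model(struct lwp *owner)
 {
 	atomic_thread_fence(memory_order_seq_cst);	/* N4 -> N5 */
 	return atomic_load_explicit(&owner->l_cpu->ci_curlwp,
 	    memory_order_relaxed) == owner;
 }
 
 With both fences in the single total order of seq_cst fences, it can't
 happen that both loads miss the other side's store: either mutex_exit's
 X2(a) load sees MUTEX_BIT_WAITERS, or mutex_vector_enter's N5 load sees
 the owner running -- which is the property argued for above.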
 
 I couldn't find any way to avoid the store-before-load barriers in my
 proof sketches.  When I realized this, I decided to check what Solaris
 does, and sure enough, there's store-before-load on both sides, in
 mutex_vector_enter and in `resume' (equivalent of cpu_switchto), and a
 long comment in mutex.c explaining the reasoning I came upon:
 
 https://github.com/illumos/illumos-gate/blob/9da20b8a0eca3f70a8e2ba3892690259d3ec6cb6/usr/src/uts/common/os/mutex.c
 https://github.com/illumos/illumos-gate/blob/9da20b8a0eca3f70a8e2ba3892690259d3ec6cb6/usr/src/uts/common/os/mutex.c#L429-L431
 https://github.com/illumos/illumos-gate/blob/f0089e391b2bc4be2755f1a1b51fb4cd9b8f3988/usr/src/uts/sun4/ml/swtch.s#L296-L297
 https://github.com/illumos/illumos-gate/blob/f0089e391b2bc4be2755f1a1b51fb4cd9b8f3988/usr/src/uts/intel/ml/swtch.s#L467-L468
 
 So unless ad@ knows some magic proof techniques that bypass the
 barriers both I and the Solaris authors think are necessary, I think
 we need to put a store-before-load barrier in every cpu_switchto.
 

