Subject: Defered signal handling in SA libpthread
To: None <tech-userlevel@netbsd.org>
From: Bill Stouder-Studenmund <wrstuden@netbsd.org>
List: tech-userlevel
Date: 09/08/2007 21:45:33
--n+lFg1Zro7sl44OB
Content-Type: multipart/mixed; boundary="YToU2i3Vx8H2dn7O"
Content-Disposition: inline


--YToU2i3Vx8H2dn7O
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

One issue I still have to tackle in the wrstuden-fixsa branch is how to=20
deal with deferred signals. I have an idea I want to float here.

The key issue is that we need to be able to have the pthread_kill() call=20
trigger signal processing on an arbitrary thread. The problem comes when=20
our concurrency is greater than one. The thread we're targeting can be=20
running on another processor. Oops!

I see two prongs to how to deal with this. The general idea is to broaden=
=20
the scope of handling PT_FLAG_SIGDEFERRED.

Right now we are able to deliver a signal to a thread blocked in the=20
kernel. We post the signal to the thread and set PT_FLAG_SIGDEFERRED.

My idea is to do the same when the thread is running on another processor.

There are two parts to this issue. One part is the kernel part of how do=20
we get the running thread (well lwp) to notice the signal. "An upcall"=20
backed by an IPI should do it. The upcall handling code then deals with=20
ensuring the signal gets handled. But that's not today's topic.

Today's topic is the other half. The problem is there's a race. The kernel
can deliver an upcall to an lwp, but it can't ensure that a given thread
is (still) running on that lwp. We could have a situation where the target
thread is just about to switch itself off of the lwp when the killing=20
thread decides it needs to get the kernel to interrupt it. So close to=20
being switched off that it is switched off by the time that the kernel=20
interrupt actually gets there.

=46rom looking at the code, I think the thing to do is to add tests to see=
=20
if PT_FLAG_SIGDEFERRED is set while we're in the process of blocking, and=
=20
if so, rather than blocking we handle the signal.

=46rom looking at the code, the thing to do is make this test while we hold=
=20
our own pt_statelock lock.

I have coded most of the change to detect deferred signals, and I'd=20
appreciate comments. I'm attaching it below.

The parts I have left are a change so we can tell which lwp we're on and a=
=20
change in the assert in pthread_kill() so we only get upset if the target=
=20
thread isn't us and is running on our lwp.

Thoughts? I'll probably check this into the branch in a bit.

Take care,

Bill

--YToU2i3Vx8H2dn7O
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="defsig.diff"
Content-Transfer-Encoding: quoted-printable

? cscope.out
Index: pthread.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread.c,v
retrieving revision 1.48
diff -u -p -r1.48 pthread.c
--- pthread.c	24 Apr 2006 18:39:36 -0000	1.48
+++ pthread.c	9 Sep 2007 04:37:06 -0000
@@ -691,6 +691,14 @@ pthread_join(pthread_t thread, void **va
 			pthread_spinunlock(self, &thread->pt_join_lock);
 			pthread_exit(PTHREAD_CANCELED);
 		}
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &thread->pt_join_lock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &thread->pt_flaglock);
+			pthread_spinlock(self, &thread->pt_join_lock);
+			continue;
+		}
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D thread;
 		self->pt_sleepq =3D &thread->pt_joiners;
Index: pthread_barrier.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread_barrier.c,v
retrieving revision 1.6
diff -u -p -r1.6 pthread_barrier.c
--- pthread_barrier.c	8 Mar 2003 08:03:35 -0000	1.6
+++ pthread_barrier.c	9 Sep 2007 04:37:06 -0000
@@ -189,6 +189,16 @@ pthread_barrier_wait(pthread_barrier_t *
=20
 		pthread_spinlock(self, &self->pt_statelock);
=20
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &barrier->ptb_lock);
+			SDPRINTF(("(barrier wait %p) Caught defsig on %p\n",
+			    self, barrier));
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &barrier->ptb_lock);
+			continue;
+		}
+
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D barrier;
 		self->pt_sleepq =3D &barrier->ptb_waiters;
Index: pthread_cond.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread_cond.c,v
retrieving revision 1.18
diff -u -p -r1.18 pthread_cond.c
--- pthread_cond.c	6 Jan 2005 17:33:36 -0000	1.18
+++ pthread_cond.c	9 Sep 2007 04:37:06 -0000
@@ -134,6 +134,21 @@ pthread_cond_wait(pthread_cond_t *cond,=20
 		    "Multiple mutexes used for condition wait",=20
 		    cond->ptc_mutex =3D=3D mutex);
 #endif
+	if (pthread_check_defsig(self)) {
+		pthread_spinunlock(self, &self->pt_statelock);
+		pthread_spinunlock(self, &cond->ptc_lock);
+		SDPRINTF(("(cond wait %p) Caught defsig on %p, mutex %p\n",
+		    self, cond, mutex));
+		/* mutex still locked. Isn't supposed to matter for
+		 * signal hanlders. */
+		pthread__signal_deferred(self, self);
+		/*
+		 * We can return from spurious wakeups. So to make life simple,
+		 * do so. Coveres anything that may have happened while we
+		 * were delivering signals
+		 */
+		return 0;
+	}
 	self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 	self->pt_sleepobj =3D cond;
 	self->pt_sleepq =3D &cond->ptc_waiters;
@@ -213,6 +228,21 @@ pthread_cond_timedwait(pthread_cond_t *c
 		    "Multiple mutexes used for condition wait",
 		    cond->ptc_mutex =3D=3D mutex);
 #endif
+	if (pthread_check_defsig(self)) {
+		pthread_spinunlock(self, &self->pt_statelock);
+		pthread_spinunlock(self, &cond->ptc_lock);
+		/* mutex still locked. Isn't supposed to matter for
+		 * signal hanlders. */
+		SDPRINTF(("(cond timed wait %p) Caught defsig on %p, mutex %p"
+		    "\n", self, cond));
+		pthread__signal_deferred(self, self);
+		/*
+		 * We can return from spurious wakeups. So to make life simple,
+		 * do so. Coveres anything that may have happened while we
+		 * were delivering signals
+		 */
+		return 0;
+	}
 =09
 	pthread__alarm_add(self, &alarm, abstime, pthread_cond_wait__callback,
 	    &wait);
Index: pthread_int.h
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread_int.h,v
retrieving revision 1.34
diff -u -p -r1.34 pthread_int.h
--- pthread_int.h	3 Oct 2006 09:37:07 -0000	1.34
+++ pthread_int.h	9 Sep 2007 04:37:06 -0000
@@ -352,6 +352,12 @@ int	pthread__find(pthread_t self, pthrea
 	} 								\
         } while (/*CONSTCOND*/0)
=20
+/*
+ * You must hold t->pt_statelock when making this check.
+ */
+#define pthread_check_defsig(t) 					\
+	__predict_false((t)->pt_flags & PT_FLAG_SIGDEFERRED)
+
=20
=20
 /* These three routines are defined in processor-specific code. */
Index: pthread_mutex.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread_mutex.c,v
retrieving revision 1.22
diff -u -p -r1.22 pthread_mutex.c
--- pthread_mutex.c	22 Aug 2006 21:46:09 -0000	1.22
+++ pthread_mutex.c	9 Sep 2007 04:37:06 -0000
@@ -268,6 +268,13 @@ pthread_mutex_lock_slow(pthread_mutex_t=20
 			 * retry.
 			 */
 			pthread_spinlock(self, &self->pt_statelock);
+			if (pthread_check_defsig(self)) {
+				pthread_spinunlock(self, &self->pt_statelock);
+				PTQ_REMOVE(&mutex->ptm_blocked, self, pt_sleep);
+				pthread_spinunlock(self, &mutex->ptm_interlock);
+				pthread__signal_deferred(self, self);
+				continue;
+			}
 			self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 			self->pt_sleepobj =3D mutex;
 			self->pt_sleepq =3D &mutex->ptm_blocked;
Index: pthread_rwlock.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/pthread_rwlock.c,v
retrieving revision 1.13
diff -u -p -r1.13 pthread_rwlock.c
--- pthread_rwlock.c	19 Oct 2005 02:15:03 -0000	1.13
+++ pthread_rwlock.c	9 Sep 2007 04:37:06 -0000
@@ -119,6 +119,14 @@ pthread_rwlock_rdlock(pthread_rwlock_t *
 		PTQ_INSERT_TAIL(&rwlock->ptr_rblocked, self, pt_sleep);
 		/* Locking a rwlock is not a cancellation point; don't check */
 		pthread_spinlock(self, &self->pt_statelock);
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			PTQ_REMOVE(&rwlock->ptr_rblocked, self, pt_sleep);
+			pthread_spinunlock(self, &rwlock->ptr_interlock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &rwlock->ptr_interlock);
+			continue;
+		}
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D rwlock;
 		self->pt_sleepq =3D &rwlock->ptr_rblocked;
@@ -198,6 +206,14 @@ pthread_rwlock_wrlock(pthread_rwlock_t *
 		PTQ_INSERT_TAIL(&rwlock->ptr_wblocked, self, pt_sleep);
 		/* Locking a rwlock is not a cancellation point; don't check */
 		pthread_spinlock(self, &self->pt_statelock);
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			PTQ_REMOVE(&rwlock->ptr_wblocked, self, pt_sleep);
+			pthread_spinunlock(self, &rwlock->ptr_interlock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &rwlock->ptr_interlock);
+			continue;
+		}
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D rwlock;
 		self->pt_sleepq =3D &rwlock->ptr_wblocked;
@@ -287,11 +303,20 @@ pthread_rwlock_timedrdlock(pthread_rwloc
 		wait.ptw_thread =3D self;
 		wait.ptw_rwlock =3D rwlock;
 		wait.ptw_queue =3D &rwlock->ptr_rblocked;
+		/* Locking a rwlock is not a cancellation point; don't check */
+		pthread_spinlock(self, &self->pt_statelock);
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &rwlock->ptr_interlock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &rwlock->ptr_interlock);
+			continue;
+		}
+		/* XXX WRS I think it's ok to add an alarm w/ statelock held.
+		 * the cond_timedwait code does the same. */
 		pthread__alarm_add(self, &alarm, abs_timeout,
 		    pthread_rwlock__callback, &wait);
 		PTQ_INSERT_TAIL(&rwlock->ptr_rblocked, self, pt_sleep);
-		/* Locking a rwlock is not a cancellation point; don't check */
-		pthread_spinlock(self, &self->pt_statelock);
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D rwlock;
 		self->pt_sleepq =3D &rwlock->ptr_rblocked;
@@ -364,11 +389,20 @@ pthread_rwlock_timedwrlock(pthread_rwloc
 		wait.ptw_thread =3D self;
 		wait.ptw_rwlock =3D rwlock;
 		wait.ptw_queue =3D &rwlock->ptr_wblocked;
+		/* Locking a rwlock is not a cancellation point; don't check */
+		pthread_spinlock(self, &self->pt_statelock);
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &rwlock->ptr_interlock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &rwlock->ptr_interlock);
+			continue;
+		}
+		/* XXX WRS I think it's ok to add an alarm w/ statelock held.
+		 * the cond_timedwait code does the same. */
 		pthread__alarm_add(self, &alarm, abs_timeout,
 		    pthread_rwlock__callback, &wait);
 		PTQ_INSERT_TAIL(&rwlock->ptr_wblocked, self, pt_sleep);
-		/* Locking a rwlock is not a cancellation point; don't check */
-		pthread_spinlock(self, &self->pt_statelock);
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D rwlock;
 		self->pt_sleepq =3D &rwlock->ptr_wblocked;
Index: pthread_sig.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/Attic/pthread_sig.c,v
retrieving revision 1.47
diff -u -p -r1.47 pthread_sig.c
--- pthread_sig.c	24 Nov 2006 19:46:58 -0000	1.47
+++ pthread_sig.c	9 Sep 2007 04:37:07 -0000
@@ -266,6 +266,9 @@ __sigsuspend14(const sigset_t *sigmask)
 		pthread_spinunlock(self, &pt_sigsuspended_lock);
 		pthread_exit(PTHREAD_CANCELED);
 	}
+	if (pthread_check_defsig(self)) {
+		/* WRS Hmmm... how to handle... */
+	}
 	pthread_sigmask(SIG_SETMASK, sigmask, &oldmask);
=20
 	self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
@@ -399,6 +402,9 @@ sigtimedwait(const sigset_t * __restrict
=20
  block:
 		pthread_spinlock(self, &self->pt_statelock);
+		if (pthread_check_defsig(self)) {
+			/* WRS Hmmm... how to handle... */
+		}
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D &pt_sigwaiting_cond;
 		self->pt_sleepq =3D &pt_sigwaiting;
@@ -842,6 +848,9 @@ pthread__kill(pthread_t self, pthread_t=20
 		 * make the target runnable but do not deliver the signal.
 		 * Otherwise record the signal for later delivery.
 		 * XXX not MPsafe.
+		 * WRS - how is it not MP safe?
+		 * Hmm... We probably should hold the target's pt_statelock,
+		 * not our own.
 		 */
 		pthread_spinlock(self, &self->pt_statelock);
 		if (target->pt_state =3D=3D PT_STATE_BLOCKED_QUEUE &&
@@ -892,8 +901,11 @@ pthread__kill(pthread_t self, pthread_t=20
 		 * from its torpor, then mark the signal for
 		 * later processing.
 		 */
+		pthread_spinlock(self, &target->pt_siglock);
 		__sigaddset14(&target->pt_sigblocked, si->si_signo);
 		__sigaddset14(&target->pt_sigmask, si->si_signo);
+		pthread_spinunlock(self, &target->pt_siglock);
+
 		pthread_spinlock(self, &target->pt_flaglock);
 		target->pt_flags |=3D PT_FLAG_SIGDEFERRED;
 		pthread_spinunlock(self, &target->pt_flaglock);
@@ -1021,7 +1033,9 @@ pthread__signal_deferred(pthread_t self,
 		si.si_signo =3D i;
 		pthread__deliver_signal(self, t, &si);
 	}
+	pthread_spinlock(self, &t->pt_flaglock);
 	t->pt_flags &=3D ~PT_FLAG_SIGDEFERRED;
+	pthread_spinunlock(self, &t->pt_flaglock);
=20
 	pthread_spinunlock(self, &t->pt_siglock);
 }
Index: pthread_sleep.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/Attic/pthread_sleep.c,v
retrieving revision 1.7
diff -u -p -r1.7 pthread_sleep.c
--- pthread_sleep.c	19 Apr 2005 16:38:57 -0000	1.7
+++ pthread_sleep.c	9 Sep 2007 04:37:07 -0000
@@ -117,6 +117,13 @@ nanosleep(const struct timespec *rqtp, s
 			pthread_spinunlock(self, &pt_nanosleep_lock);
 			pthread_exit(PTHREAD_CANCELED);
 		}
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &pt_nanosleep_lock);
+			pthread__signal_deferred(self, self);
+			pthread_spinlock(self, &pt_nanosleep_lock);
+			pthread_spinlock(self, &self->pt_statelock);
+		}
 		pthread__alarm_add(self, &alarm, &sleeptime,
 		    pthread__nanosleep_callback, self);
 	=09
Index: sem.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/lib/libpthread/sem.c,v
retrieving revision 1.9
diff -u -p -r1.9 sem.c
--- sem.c	19 Oct 2005 02:15:03 -0000	1.9
+++ sem.c	9 Sep 2007 04:37:07 -0000
@@ -332,6 +332,13 @@ sem_wait(sem_t *sem)
 			break;
 		}
=20
+		if (pthread_check_defsig(self)) {
+			pthread_spinunlock(self, &self->pt_statelock);
+			pthread_spinunlock(self, &(*sem)->usem_interlock);
+			pthread__signal_deferred(self, self);
+			continue;
+		}
+
 		PTQ_INSERT_TAIL(&(*sem)->usem_waiters, self, pt_sleep);
 		self->pt_state =3D PT_STATE_BLOCKED_QUEUE;
 		self->pt_sleepobj =3D *sem;

--YToU2i3Vx8H2dn7O--

--n+lFg1Zro7sl44OB
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (NetBSD)

iD8DBQFG43ptWz+3JHUci9cRAipEAJ4q1Pi7B+bwe5OwESJOphIhCXQW7wCeKjaC
PPCVPjVarOX7Ob+XIXaWvV8=
=oosj
-----END PGP SIGNATURE-----

--n+lFg1Zro7sl44OB--