tech-userlevel: Re: SA pthread and runqueues

Subject: Re: SA pthread and runqueues
To: Bill Stouder-Studenmund <wrstuden@netbsd.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-userlevel
Date: 10/21/2007 22:41:14

On Sun, Oct 21, 2007 at 12:56:16PM -0700, Bill Stouder-Studenmund wrote:

> 4) I am confused as to what is the best way to handle pt_state_lock and
> the run queue lock. As you might expect, pt_state_lock locks the state of
> a thread and the runqueue lock locks the queue. The question though is how
> best to handle the locking order. Do you lock the state lock or the run
> queue lock first?

ETOOMANYLOCKS :-)

> The big problem comes when taking something off of the work queue, 
> especially when you're getting the next thread to run as you're going to 
> sleep. You've got the run queue locked, so you can find what exactly is 
> the thread to switch to. But you need to mark it as RUNNING not RUNABLE, 
> and that requires the state lock, which we can't take since we are holding 
> the run queue lock.
..
> So any suggestions on how to handle this?

The kernel solves this by using "lock chaining". Each thread has a lock
pointer that covers the scheduler related fields. As threads move between
synchronization objects (e.g. sleep queue -> run queue) the lock pointer
moves in step. So, while the thread is on a run queue it is locked by the
run queue lock. While it's on a sleep queue it's locked by the sleep queue
lock; while running it's locked by the CPU's lock, ...
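
To make the hand-over concrete, here is a rough userland sketch using plain
pthread mutexes. All the names (struct thread, struct queue, sleepq, runq,
wake_one) are made up for illustration and are not the kernel's; the only
point is that the lock pointer is switched while both the old and the new
lock are held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's definitions. */
enum tstate { T_SLEEPING, T_RUNNABLE };

struct thread {
	pthread_mutex_t *lockp;	/* lock currently covering this thread */
	struct thread *next;
	enum tstate state;
};

struct queue {
	pthread_mutex_t lock;	/* covers the queue and every thread on it */
	struct thread *head;
};

static struct queue sleepq = { PTHREAD_MUTEX_INITIALIZER, NULL };
static struct queue runq = { PTHREAD_MUTEX_INITIALIZER, NULL };

/*
 * Move the first sleeping thread to the run queue. The thread's lock
 * pointer moves with it. Lock order assumed here: sleepq, then runq.
 */
static struct thread *
wake_one(void)
{
	struct thread *t;

	pthread_mutex_lock(&sleepq.lock);
	if ((t = sleepq.head) == NULL) {
		pthread_mutex_unlock(&sleepq.lock);
		return NULL;
	}
	/* Found via the sleep queue, so it must be covered by that lock. */
	assert(t->lockp == &sleepq.lock);
	sleepq.head = t->next;

	pthread_mutex_lock(&runq.lock);
	t->state = T_RUNNABLE;
	t->next = runq.head;
	runq.head = t;
	t->lockp = &runq.lock;	/* hand over while holding both locks */
	pthread_mutex_unlock(&runq.lock);
	pthread_mutex_unlock(&sleepq.lock);
	return t;
}
```

With this layout there is no separate per-thread state lock to order against
the run queue lock: whoever holds the queue's lock may change the state of
any thread on that queue.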

If you find the thread via a sync object then you probably also have that
object locked, so all threads associated with it are also locked. The tricky
bit is when you come at the thread from some other direction - you can't
know in advance what it is locked by. You first have to take the lock its
pointer currently names, then check that the thread is still locked by the
lock you grabbed. If not, unlock and retry. One requirement for the scheme
is that the locks must be static, and so must persist for the entire
lifetime of the process: a retrying caller may briefly hold a lock that the
thread has already left.
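
The lock-and-recheck step might look roughly like this in userland C. This
is only a sketch under my own assumptions: thread_lock, thread_unlock and
thread_setlock are invented names, and I'm using a C11 atomic pointer so
that the lock pointer can be read before any lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical thread structure; only the lock pointer matters here. */
struct thread {
	_Atomic(pthread_mutex_t *) lockp;	/* lock currently covering us */
};

/*
 * Lock a thread found "from some other direction". Returns the lock
 * that was taken so the caller can release it later.
 */
static pthread_mutex_t *
thread_lock(struct thread *t)
{
	for (;;) {
		pthread_mutex_t *l = atomic_load(&t->lockp);

		/*
		 * The thread may move to another lock between the load
		 * above and the lock below. That is why the locks must
		 * never be freed: we may lock a stale one for a moment.
		 */
		pthread_mutex_lock(l);
		if (l == atomic_load(&t->lockp))
			return l;	/* still covered by l: done */
		pthread_mutex_unlock(l);	/* it moved: retry */
	}
}

static void
thread_unlock(pthread_mutex_t *l)
{
	pthread_mutex_unlock(l);
}

/*
 * Switch the thread to a new lock. The caller is assumed to hold both
 * the thread's current lock and newl.
 */
static void
thread_setlock(struct thread *t, pthread_mutex_t *newl)
{
	assert(atomic_load(&t->lockp) != NULL);
	atomic_store(&t->lockp, newl);
}
```

Once thread_lock() returns, the pointer can't change underneath you, because
changing it requires holding the lock you now have.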

Andrew