Subject: Re: Condition variables
To: Jason R Thorpe <thorpej@zembu.com>
From: Eduardo Horvath <eeh@turbolinux.com>
List: tech-kern
Date: 06/06/2000 08:58:19

On Mon, 5 Jun 2000, Jason R Thorpe wrote:

> I have written a condition variable implementation for the NetBSD kernel,
> based on the tsleep() implementation.  For those of you unfamiliar with
> condition variables, a condition variable is like a tsleep() "wait channel",
> except it is separate from the thing you're waiting for (e.g. if you're
> waiting for a VM page, you wait on `&pg->condvar', not `pg').  If you've
> ever programmed in pthreads, you understand how this all works.
> 
> These are important in an SMP environment for eliminating race conditions
> between a thread blocking and another thread unblocking blocked threads; when
> you block on a condition variable, you pass an interlock that is not released
> until you are on the sleep queue.
> 
> The API is pretty straightforward:
> 
> void	cond_init(condvar_t *cv, const char *name);
> int	cond_wait(condvar_t *cv, int prio, int timo,
> 	    __volatile struct simplelock *interlock);
> void	cond_signal(condvar_t *cv);
> void	cond_broadcast(condvar_t *cv);
> 
> The `prio' and `timo' arguments to cond_wait() are the same as the
> corresponding arguments for tsleep().  A new `prio' flag has also
> been added, PNORELOCK, to prevent the thread from attempting to
> relock the interlock when it is unblocked.

O.K.  So we're going for SolarisBSD? 8^)

1) Do you need to hold the synchronization lock over the calls to
cond_signal() and cond_broadcast()?  (Actually, it might be better to
simply pass the lock in as a parameter and let those functions release
it.  That way the woken process is not moved from the sleep queue straight
onto the lock's wait queue, only to block there until the lock is dropped.)

2) You don't have a `void cond_destroy(condvar_t *cv);' which you need for
bookkeeping reasons if nothing else.  (What happens if a structure is
freed with a process sleeping on a condition variable in it?)

3) What does this gain us over adding an interlock to tsleep()?  I know
what we lose: now every structure that a process might sleep on will
need at least one `condvar_t'.  Each of these
`condvar_t's needs to be explicitly created and destroyed, which can
add significantly to code complexity.
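
To make the alternative in point 3 concrete, here is a rough userland
sketch, written against pthreads rather than the kernel, of a tsleep()-style
wait channel that takes an interlock.  Every name in it (chan_sleep,
chan_wakeup, NBUCKETS, chan_demo, page_lock, page_busy) is made up for
illustration and is not an existing NetBSD interface.  The point is only
that the sleep queues can live in a hash table keyed by address, so the
object being waited on carries no extra state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NBUCKETS 16			/* hypothetical table size */

/* One sleep queue per hash bucket; nothing lives in the waited-on object. */
struct bucket {
	pthread_mutex_t	lock;		/* protects gen */
	pthread_cond_t	cv;
	unsigned long	gen;		/* bumped by every wakeup */
};

static struct bucket table[NBUCKETS];
static pthread_once_t table_once = PTHREAD_ONCE_INIT;

static void table_init(void)
{
	for (int i = 0; i < NBUCKETS; i++) {
		int rc = pthread_mutex_init(&table[i].lock, NULL);
		assert(rc == 0);
		rc = pthread_cond_init(&table[i].cv, NULL);
		assert(rc == 0);
		(void)rc;
		table[i].gen = 0;
	}
}

static struct bucket *lookup(const void *chan)
{
	pthread_once(&table_once, table_init);
	return &table[((uintptr_t)chan >> 4) % NBUCKETS];
}

/*
 * Sleep on chan.  The caller holds interlock; it is dropped only after the
 * bucket lock is held, so a wakeup that follows a state change made under
 * the interlock cannot slip in unnoticed.  The interlock is retaken before
 * returning.  Hash collisions cause spurious wakeups, so callers must
 * re-check their condition in a loop.
 */
void chan_sleep(const void *chan, pthread_mutex_t *interlock)
{
	struct bucket *b = lookup(chan);
	unsigned long gen;

	pthread_mutex_lock(&b->lock);
	gen = b->gen;
	pthread_mutex_unlock(interlock);
	while (b->gen == gen)
		pthread_cond_wait(&b->cv, &b->lock);
	pthread_mutex_unlock(&b->lock);
	pthread_mutex_lock(interlock);
}

/* Wake every sleeper whose channel hashes to the same bucket as chan. */
void chan_wakeup(const void *chan)
{
	struct bucket *b = lookup(chan);

	pthread_mutex_lock(&b->lock);
	b->gen++;
	pthread_cond_broadcast(&b->cv);
	pthread_mutex_unlock(&b->lock);
}

/* Demo: one thread waits for a "page" to stop being busy. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static int page_busy;

static void *releaser(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&page_lock);
	page_busy = 0;
	chan_wakeup(&page_busy);
	pthread_mutex_unlock(&page_lock);
	return NULL;
}

int chan_demo(void)
{
	pthread_t t;

	pthread_mutex_lock(&page_lock);
	page_busy = 1;
	if (pthread_create(&t, NULL, releaser, NULL) != 0) {
		pthread_mutex_unlock(&page_lock);
		return -1;
	}
	while (page_busy)		/* re-check after every wakeup */
		chan_sleep(&page_busy, &page_lock);
	pthread_mutex_unlock(&page_lock);
	pthread_join(t, NULL);
	return page_busy;		/* 0 once the page was released */
}
```

The cost of this shape is that unrelated sleepers can share a bucket and
get woken for nothing, which an embedded `condvar_t' would avoid.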

I suppose this is more of an implementation question.  The `cond_*'
interface is conceptually cleaner from a naming perspective (although I
prefer to use `cv_*' since that's less typing) but more complicated from a
programming perspective since you now need to allocate, initialize,
and keep track of `condvar_t's.  Do we gain significant performance
advantages from making this change, considering it will probably require
modifications to large numbers of kernel data structures?
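
As an illustration of the bookkeeping I mean, here is a minimal sketch of
what a structure with an embedded condition variable has to carry around.
Again it uses pthreads in userland as a stand-in, and `struct page',
page_create, page_wait, page_release and page_destroy are hypothetical
names, not proposed kernel code.  The waiter count is one possible answer
to the "freed with a process still sleeping" question from point 2.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct page {
	pthread_mutex_t	lock;		/* interlock for busy and waiters */
	pthread_cond_t	cv;		/* the embedded condition variable */
	int		busy;
	int		waiters;	/* threads currently in page_wait() */
};

/* Allocate a page and initialize its lock and condition variable. */
struct page *page_create(void)
{
	struct page *pg = malloc(sizeof(*pg));

	if (pg == NULL)
		return NULL;
	if (pthread_mutex_init(&pg->lock, NULL) != 0) {
		free(pg);
		return NULL;
	}
	if (pthread_cond_init(&pg->cv, NULL) != 0) {
		pthread_mutex_destroy(&pg->lock);
		free(pg);
		return NULL;
	}
	pg->busy = 0;
	pg->waiters = 0;
	return pg;
}

/* Block until the page is no longer busy. */
void page_wait(struct page *pg)
{
	assert(pg != NULL);
	pthread_mutex_lock(&pg->lock);
	pg->waiters++;
	while (pg->busy)
		pthread_cond_wait(&pg->cv, &pg->lock);
	pg->waiters--;
	pthread_mutex_unlock(&pg->lock);
}

/* Mark the page idle and wake everyone sleeping on it. */
void page_release(struct page *pg)
{
	pthread_mutex_lock(&pg->lock);
	pg->busy = 0;
	pthread_cond_broadcast(&pg->cv);
	pthread_mutex_unlock(&pg->lock);
}

/*
 * Tear the page down.  Returns EBUSY and leaves it intact if anyone is
 * still sleeping on it.  The caller must already guarantee that no new
 * thread can reach the page; the check only catches existing sleepers.
 */
int page_destroy(struct page *pg)
{
	pthread_mutex_lock(&pg->lock);
	if (pg->waiters != 0) {
		pthread_mutex_unlock(&pg->lock);
		return EBUSY;
	}
	pthread_mutex_unlock(&pg->lock);
	pthread_cond_destroy(&pg->cv);
	pthread_mutex_destroy(&pg->lock);
	free(pg);
	return 0;
}
```

Each structure that grows a condition variable needs its own version of
the create and destroy paths above, which is the complexity cost in
question.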

Eduardo Horvath