Subject: Re: Moving scheduler semantics from cpu_switch() to kern_synch.c
To: Daniel Sieger <dsieger@TechFak.Uni-Bielefeld.DE>
From: Gregory McGarry <>
List: tech-kern
Date: 09/11/2006 11:22:28
> based on what gmcgarry did in September 2002 I'm actually trying to
> move some of the scheduler semantics from cpu_switch() to
> During that time he wrote the functions nextrunqueue() and
> chooseproc().

cpu_idle() was also written on that branch.

> nextrunqueue() selects the next process (at that time there were no
> LWPs) from the highest priority runqueue. chooseproc() either calls
> nextrunqueue() and returns, or calls cpu_idle() if there is no
> runnable process.
> My problem now is that cpu_idle() is only declared as a prototype but
> is never actually defined (irritatingly, it has a man-page). So, what
> is the right call to let the cpu idle until there is a LWP on one of
> the runqueues?

Right.  Some of the man pages lingered in the main branch
after the changes were backed out and put on the branch.

> Do you have any general comments on what I'm trying to do? I think
> having this part of the scheduler written in C would be the first step
> necessary for any further improvements.

The original changes attempted to modularise the scheduler
so it could be replaced with other scheduler algorithms.  I
still believe that the branch still contains the best

The changes I originally made were quite aggressive in
eliminating blocking interrupts and calling into locore if
there weren't any other threads in the run queue.  I'm
confident it was these changes which were causing problems
on i386 with the interrupt mask getting fubar.

On the other platforms I tested, I saw a 2x speed-up in
context switch when LWPs came along:

 * min latency: 93.100000
 * max latency: 150.700000
 * mean latency: 100.857581
 * min latency: 49.350000
 * max latency: 76.750000
 * mean latency: 54.141626
 * min latency: 54.750000
 * max latency: 76.050000
 * mean latency: 60.088654
 * HP300_SA:
 * min latency: 352.560000
 * max latency: 402.960000
 * mean latency: 367.836250
 * min latency: 129.200000
 * max latency: 187.040000
 * mean latency: 142.528223
 * HP300_OLD:
 * min latency: 357.360000
 * max latency: 414.400000
 * mean latency: 372.436104

I still don't think anyone has addressed the performance
loss when LWPs arrived.

After the changes were backed out, thorpej made the comment
that it would be a simpler change to call up into the
scheduler from cpu_switch() rather than pushing the next
lwp down and modifying all versions of cpu_switch().  This
is what FreeBSD did at the time.  While this is true, I'm
not sure whether it would continue to be the best approach
as the scheduler becomes more advanced.  Additionally, you
don't get the switching-to-myself optimisations.
cpu_switchto() was introduced with LWPs anyway.  It is
essentially the same as cpu_switch() on the branch, and
duplicates code that is already in cpu_switch().
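
To make the "push the next lwp down" arrangement concrete, here is a
minimal user-space sketch, not the code from the branch.  The names
nextrunqueue(), chooseproc(), cpu_idle() and cpu_switchto() come from
this thread, but their bodies here are assumptions; runq_insert(),
switch_sketch(), pending_wakeup, NQS and the two counters are
hypothetical and exist only for illustration.  Priority 0 is taken to
be the highest.

```c
#include <assert.h>
#include <stddef.h>

#define NQS 32                      /* number of run queues (assumed) */

struct lwp {
    int         l_priority;         /* 0 = highest priority in this sketch */
    struct lwp *l_next;
};

struct lwp *runq_head[NQS], *runq_tail[NQS];
struct lwp *curlwp;                 /* lwp currently "on the CPU" */
struct lwp *pending_wakeup;         /* stands in for a wakeup from an interrupt */
int idle_calls, switches;           /* counters, for illustration only */

/* Append an lwp to the tail of the run queue for its priority. */
void runq_insert(struct lwp *l)
{
    int q = l->l_priority;

    assert(q >= 0 && q < NQS);
    l->l_next = NULL;
    if (runq_tail[q] != NULL)
        runq_tail[q]->l_next = l;
    else
        runq_head[q] = l;
    runq_tail[q] = l;
}

/* Remove and return the head of the highest-priority non-empty queue. */
struct lwp *nextrunqueue(void)
{
    for (int q = 0; q < NQS; q++) {
        struct lwp *l = runq_head[q];

        if (l != NULL) {
            runq_head[q] = l->l_next;
            if (runq_head[q] == NULL)
                runq_tail[q] = NULL;
            l->l_next = NULL;
            return l;
        }
    }
    return NULL;
}

/* Stub: a real cpu_idle() would wait for an interrupt.  Here a pending
 * wakeup, if one was set, is simply moved onto the run queue. */
void cpu_idle(void)
{
    idle_calls++;
    if (pending_wakeup != NULL) {
        runq_insert(pending_wakeup);
        pending_wakeup = NULL;
    }
}

/* Return the next lwp to run, idling while nothing is runnable. */
struct lwp *chooseproc(void)
{
    struct lwp *l;

    while ((l = nextrunqueue()) == NULL)
        cpu_idle();
    return l;
}

/* Stub for the machine-dependent switch: only records the change. */
void cpu_switchto(struct lwp *oldl, struct lwp *newl)
{
    (void)oldl;
    curlwp = newl;
    switches++;
}

/* The choice is made in C and the result is pushed down, so the
 * switching-to-myself case never reaches the MD code at all. */
void switch_sketch(void)
{
    struct lwp *oldl = curlwp;
    struct lwp *newl;

    runq_insert(oldl);              /* old lwp is still runnable */
    newl = chooseproc();
    if (newl == oldl)
        return;                     /* nothing to save or restore */
    cpu_switchto(oldl, newl);
}
```

With only one runnable lwp, switch_sketch() returns without ever
calling cpu_switchto(), which is the optimisation that is lost when
cpu_switch() calls up into the scheduler instead.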

I also implemented the SVR4 scheduler abstraction.  The
only issue is that there are some places in the kernel
with explicit pre-emption points which frob the run
queue.  They're easy enough to abstract but it's a wart on
an otherwise reasonable API.
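
As a rough illustration of that kind of abstraction, the sketch below
routes an explicit pre-emption point through a per-class operations
vector instead of letting it touch the run queue directly.  It is not
the API from the branch: struct sched_class, its sc_* members, the
FIFO class and preempt_point() are all hypothetical names, and the
FIFO policy is just the simplest thing that fits.

```c
#include <assert.h>
#include <stddef.h>

struct lwp {
    int         l_id;
    struct lwp *l_next;
};

/* Hypothetical operations vector: one instance per scheduling class. */
struct sched_class {
    const char  *sc_name;
    void        (*sc_enqueue)(struct lwp *);
    struct lwp *(*sc_dequeue)(void);
    void        (*sc_preempt)(struct lwp *);
};

/* A trivial FIFO class; only this code knows the queue layout. */
static struct lwp *fifo_head, *fifo_tail;

static void fifo_enqueue(struct lwp *l)
{
    assert(l != NULL);
    l->l_next = NULL;
    if (fifo_tail != NULL)
        fifo_tail->l_next = l;
    else
        fifo_head = l;
    fifo_tail = l;
}

static struct lwp *fifo_dequeue(void)
{
    struct lwp *l = fifo_head;

    if (l != NULL) {
        fifo_head = l->l_next;
        if (fifo_head == NULL)
            fifo_tail = NULL;
        l->l_next = NULL;
    }
    return l;
}

/* A pre-empted lwp goes to the back of the queue in this class. */
static void fifo_preempt(struct lwp *l)
{
    fifo_enqueue(l);
}

static const struct sched_class fifo_class = {
    "fifo", fifo_enqueue, fifo_dequeue, fifo_preempt
};

const struct sched_class *cur_sched = &fifo_class;

/* An explicit pre-emption point: it asks the current class to requeue
 * the running lwp and pick a successor, without frobbing the queue. */
struct lwp *preempt_point(struct lwp *cur)
{
    cur_sched->sc_preempt(cur);
    return cur_sched->sc_dequeue();
}
```

Swapping in another algorithm then means pointing cur_sched at a
different struct sched_class; callers such as preempt_point() stay
unchanged.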

-- Gregory McGarry <>
