Subject: SMP projects
To: None <tech-kern@netbsd.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-kern
Date: 03/16/2007 16:25:09
Hi,

I thought I'd send out a brief note on these in case anyone is looking for
something to do :-), and is interested in improving our SMP scalability.
There are three active branches and here's a rough outline of each:

=> yamt-idlelwp

This simplifies dispatching, and splits dispatching out from scheduling.
cpu_switch gets boiled down to cpu_switchto, and loses all knowledge of
locking, the run queues and so on. The benefits here are potentially
cheaper context switches, reduced locking, the potential for per-CPU
run queues and the ability to experiment with different schedulers.
Currently the i386 and amd64 ports have the necessary changes; some work
is needed to bring the others up to speed.
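
As a rough illustration of the split (a userspace toy, not code from the
branch): the scheduler side owns the run queue and picks the next LWP,
while the dispatcher side only switches to whatever it is handed. Every
toy_* name and structure below is made up for this sketch; only the idea
that cpu_switchto knows nothing about run queues or locking comes from
the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an LWP; not the kernel's struct lwp. */
struct toy_lwp {
    int id;
    struct toy_lwp *next;          /* link in a run queue */
};

/* A FIFO run queue; one per CPU in this toy. */
struct toy_runq {
    struct toy_lwp *head;
    struct toy_lwp *tail;
};

struct toy_cpu {
    struct toy_runq runq;
    struct toy_lwp *curlwp;        /* what this CPU is running now */
    struct toy_lwp idle;           /* runs when nothing else is runnable */
};

void toy_cpu_init(struct toy_cpu *ci)
{
    ci->runq.head = ci->runq.tail = NULL;
    ci->idle.id = -1;
    ci->idle.next = NULL;
    ci->curlwp = &ci->idle;
}

/* Scheduler side: policy and run queue manipulation live here. */
void toy_sched_enqueue(struct toy_cpu *ci, struct toy_lwp *l)
{
    assert(l != NULL);
    l->next = NULL;
    if (ci->runq.tail != NULL)
        ci->runq.tail->next = l;
    else
        ci->runq.head = l;
    ci->runq.tail = l;
}

struct toy_lwp *toy_sched_nextlwp(struct toy_cpu *ci)
{
    struct toy_lwp *l = ci->runq.head;

    if (l == NULL)
        return &ci->idle;          /* nothing runnable: pick the idle LWP */
    ci->runq.head = l->next;
    if (ci->runq.head == NULL)
        ci->runq.tail = NULL;
    l->next = NULL;
    return l;
}

/* Dispatcher side: only switches; no knowledge of queues or locks. */
struct toy_lwp *toy_cpu_switchto(struct toy_cpu *ci, struct toy_lwp *newlwp)
{
    struct toy_lwp *old = ci->curlwp;

    ci->curlwp = newlwp;           /* real code would swap register state */
    return old;
}

/* Glue: ask the scheduler what to run, then hand it to the dispatcher. */
struct toy_lwp *toy_mi_switch(struct toy_cpu *ci)
{
    return toy_cpu_switchto(ci, toy_sched_nextlwp(ci));
}
```

In this toy, swapping in a different scheduler means replacing only
toy_sched_enqueue and toy_sched_nextlwp; toy_cpu_switchto stays untouched.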

=> ad-audiomp

This puts MP locking into the audio subsystem and all of the drivers. The
main goal is to make audio interrupt handling MP safe, so that the kernel
lock doesn't have to be taken for audio interrupts. There are two reasons
for that: (1) it will reduce latency/skipping on MP systems, and (2) it
will make it possible for spin locks to be taken at IPL_VM without holding
the kernel lock.

The latter isn't possible right now, because audio ISRs can run with
higher priority than other device interrupts, and they take the kernel
lock. That breaks the lock ordering which says the kernel lock must
always be taken before other spin locks.
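
To make the ordering problem concrete, here is a small userspace model
(an illustration only, not kernel code). It tracks the locks held on one
CPU as a stack and flags an acquisition that takes the kernel lock after
a spin lock. The toy_* names and the two-level ranking are assumptions
made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Lower rank must be taken first: kernel lock, then spin locks. */
enum toy_rank {
    TOY_RANK_KERNEL_LOCK = 0,
    TOY_RANK_SPIN = 1
};

#define TOY_MAX_HELD 8

/* Locks currently held on one CPU, innermost last. */
struct toy_lockstack {
    enum toy_rank held[TOY_MAX_HELD];
    int depth;
};

/*
 * Record an acquisition.  Returns false if it violates the ordering,
 * i.e. a lower-ranked lock is taken while a higher-ranked one is held.
 */
bool toy_acquire(struct toy_lockstack *s, enum toy_rank r)
{
    bool ok = true;

    assert(s->depth < TOY_MAX_HELD);
    for (int i = 0; i < s->depth; i++) {
        if (s->held[i] > r)
            ok = false;
    }
    s->held[s->depth++] = r;
    return ok;
}

/* Drop the most recently acquired lock. */
void toy_release(struct toy_lockstack *s)
{
    assert(s->depth > 0);
    s->depth--;
}
```

The failing case corresponds to code holding a spin lock at IPL_VM without
the kernel lock, then being interrupted by an audio ISR that takes the
kernel lock: toy_acquire(&s, TOY_RANK_SPIN) followed by
toy_acquire(&s, TOY_RANK_KERNEL_LOCK) returns false on the second call.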

About half of the drivers are converted, but the MIDI part needs some more
work. It takes between a few minutes and an hour to convert a driver
depending on how complicated it is.

=> vmlocking

The basic idea here is to make the memory allocators safe to use without
holding the kernel lock (e.g. pool_get) and to make trap handling run
mostly without the kernel lock (page faults). There are three parts to
that:

(1) all of the existing spin locks in the kernel get converted to mutexes
and R/W locks. For those locks that are converted to adaptive/sleep locks,
there is no longer any ordering rule between them and the kernel_lock:
the kernel_lock can be taken at any time without worry. Since locks that
were never actively used before are now put into use, a sizeable chunk of
code needs to be audited.

(2) The kernel lock gets pushed back in a few places, e.g. into the VOP_*
wrappers and in a few places in UVM that deal with swap. As a fringe benefit,
it means that "simple" users of VFS like stat() can be made to run mostly
without the kernel lock, and so things like the namecache can come out from
under it too.

(3) The locking strategy in the VM system and pmap modules needs to be
tested thoroughly and any problems fixed up.
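
For readers unfamiliar with the adaptive locks mentioned in (1), the usual
idea is that a contended acquire spins while the owner is running on
another CPU and sleeps otherwise. The sketch below models only that
decision in userspace; it is not the kernel's mutex implementation, and
the toy_* types and names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What a thread does on one attempt to take the lock. */
enum toy_action { TOY_ACQUIRE, TOY_SPIN, TOY_SLEEP };

/* Hypothetical stand-in for a thread that may own the lock. */
struct toy_owner {
    bool on_cpu;                   /* currently running on some CPU? */
};

struct toy_adaptive_mutex {
    const struct toy_owner *owner; /* NULL when the lock is free */
};

/*
 * One acquisition attempt by 'self'.  A free lock is taken at once.
 * Otherwise: spin if the owner is running (it should release soon),
 * sleep if the owner is off-CPU (spinning would only waste cycles).
 */
enum toy_action toy_mutex_try(struct toy_adaptive_mutex *m,
    const struct toy_owner *self)
{
    assert(self != NULL);
    assert(m->owner != self);      /* no recursion in this toy */

    if (m->owner == NULL) {
        m->owner = self;
        return TOY_ACQUIRE;
    }
    return m->owner->on_cpu ? TOY_SPIN : TOY_SLEEP;
}

/* Release by the current owner. */
void toy_mutex_exit(struct toy_adaptive_mutex *m,
    const struct toy_owner *self)
{
    assert(m->owner == self);
    m->owner = NULL;
}
```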

Andrew