Subject: Lazy FP context switch reconsidered
To: None <port-alpha@netbsd.org>
From: Jason R Thorpe <thorpej@zembu.com>
List: port-alpha
Date: 07/13/2001 22:02:11
So, I want to stabilize the Alpha MP kernel a bit, and as part of this,
I want to reduce some of the complexity in the code that deals with
multiple processors.

One of the most complex things here is the MP-safe lazy FPU context
switch code.

The way we currently do lazy FP context switch is like so:

	(1) Each processor has two variables: curproc, and fpcurproc.
	    curproc is the process currently running on the processor,
	    and fpcurproc is the process whose FP state the processor
	    currently holds.

	(2) When a process returns to userspace, if it is fpcurproc,
	    then the FPU is enabled.  Otherwise it is disabled.

	(3) When a process uses the FPU, and the FPU is disabled,
	    an FEN trap is taken.  If the process has not yet used
	    the FPU, then it is marked as having done so.  If it
	    has used the FPU, the kernel determines the processor
	    on which the FP state resides.  If not `self', then a
	    "discard FPU" interrupt is sent to the processor that
	    has it, and `self' waits until the other processor has
	    sync'd it back to the process's PCB.  Once the state is
	    in the PCB, it is loaded into `self's FPU.  fpcurproc
	    is then set to curproc.  Then see step 2.
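
As a rough illustration of steps 1 through 3, here is a small user-space
model in C.  It is only a sketch, not the kernel code: the names
(lazy_proc, lazy_cpu, lazy_fpu_sync, and so on) are made up for the
example, a single long stands in for the FP register file, and the
"discard FPU" interrupt is modelled as a direct function call rather
than an IPI followed by a wait.

```c
#include <stddef.h>

#define LAZY_NCPU 2

struct lazy_proc {
	int used_fp;	/* has this process ever used the FPU? */
	int fp_cpu;	/* CPU holding its live FP state, -1 if in the PCB */
	long pcb_fp;	/* stand-in for the FP save area in the PCB */
};

struct lazy_cpu {
	struct lazy_proc *curproc;	/* process running on this CPU */
	struct lazy_proc *fpcurproc;	/* process whose FP state we hold */
	long fpu;			/* stand-in for the FP registers */
	int fpu_enabled;
};

static struct lazy_cpu lazy_cpus[LAZY_NCPU];

/* Write the FP state held by CPU `id' back to its owner's PCB. */
static void
lazy_fpu_sync(int id)
{
	struct lazy_cpu *ci = &lazy_cpus[id];

	if (ci->fpcurproc != NULL) {
		ci->fpcurproc->pcb_fp = ci->fpu;
		ci->fpcurproc->fp_cpu = -1;
		ci->fpcurproc = NULL;
	}
}

/* Step 2: enable the FPU only if this CPU holds curproc's FP state. */
static void
lazy_return_to_user(int id)
{
	struct lazy_cpu *ci = &lazy_cpus[id];

	ci->fpu_enabled = (ci->curproc != NULL &&
	    ci->fpcurproc == ci->curproc);
}

/* Step 3: curproc on CPU `self' touched the FPU while it was disabled. */
static void
lazy_fen_trap(int self)
{
	struct lazy_cpu *ci = &lazy_cpus[self];
	struct lazy_proc *p = ci->curproc;

	if (!p->used_fp)
		p->used_fp = 1;
	else if (p->fp_cpu != -1 && p->fp_cpu != self)
		lazy_fpu_sync(p->fp_cpu);	/* kernel: IPI, then wait */

	if (ci->fpcurproc != p) {
		lazy_fpu_sync(self);	/* save the previous owner, if any */
		ci->fpu = p->pcb_fp;	/* load the state from the PCB */
		ci->fpcurproc = p;
		p->fp_cpu = self;
	}
	lazy_return_to_user(self);
}
```

Even in this toy form, the trap handler has to reason about state owned
by another processor, which is where the real code gets tricky.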

Now, the code that does this is kind of complicated, and tricky to
get right.  I also suspect that there is a lot of unnecessary overhead
here.

This is exacerbated by the fact that GCC likes to emit FP insns for
inline block moves.  Thus, the number of processes that use FP is
inflated somewhat.

I decided to instrument this.  I added some code to the FEN trap
path that collects two different statistics:

	* "FP proc use" -- when a process uses FP for the first
	  time, this counter is incremented.

	* "FP proc re-use" -- when a process that has previously
	  used FP takes a FEN trap to be able to use it again,
	  this counter is incremented.
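
The counting logic amounts to something like the following.  This is a
stand-alone sketch with hypothetical names (fp_stats, fp_count_fen), not
the instrumentation that was actually added to the trap path.

```c
struct fp_stats {
	unsigned long use;	/* "FP proc use": first FP use by a process */
	unsigned long reuse;	/* "FP proc re-use": later FEN traps */
};

/*
 * Called once per FEN trap.  `used_fp' is the per-process flag that
 * records whether the process has used FP before.
 */
static void
fp_count_fen(struct fp_stats *st, int *used_fp)
{
	if (!*used_fp) {
		*used_fp = 1;
		st->use++;
	} else
		st->reuse++;
}
```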

I then booted the kernel and immediately built a GENERIC kernel.  I wanted
the number of context switches to be high, so I used "make -j4".

When the compile finished, I read some counters.  Here are the interesting
numbers:

	93470 cpu context switches (from vmstat -s)

	event                               total     rate type
	FP proc use                          7728        5 misc
	FP proc re-use                      44371       34 misc
	soft serial                          1833        1 intr
	soft net                             2544        1 intr
	soft clock                           4068        3 intr
	cpu0 clock                        1562602     1209 intr
	cpu0 device                         47304       36 intr
	kn300 irq 12                         4863        3 intr
	kn300 irq 16                           68        0 intr
	kn300 irq 36                           26        0 intr
	kn300 irq 40                        40514       31 intr
	isa irq 4                            1833        1 intr

So, this is how I have interpreted the numbers:

	(1) Nearly 1/2 of all context switches resulted in the
	    process using FP again, and having to take a trap
	    in order to do so.

	(2) The rate at which these traps happened is nearly
	    as high as interrupts from devices, and is higher
	    than the interrupt rate from the SCSI controller
	    to which the disks in the RAID volume holding the
	    source tree are attached.

	(3) Since the number of processes that use FP for the
	    first time is a fair bit smaller than the re-use
	    count, it suggests that processes that use FP once
	    are very likely to use it again.
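
The ratios behind points 1 and 3 can be recomputed from the figures
above.  The snippet below is just that arithmetic; the function names
are made up for the example.  It gives a re-use count of roughly 47% of
all context switches, and between five and six re-use traps for every
first use.

```c
#include <stdio.h>

/* Figures copied from the counter output above. */
static const double ctx_switches = 93470.0;	/* vmstat -s */
static const double fp_use = 7728.0;		/* "FP proc use" */
static const double fp_reuse = 44371.0;		/* "FP proc re-use" */

/* Fraction of context switches followed by an FP re-use trap. */
static double
reuse_per_switch(void)
{
	return fp_reuse / ctx_switches;
}

/* Average number of re-use traps per process that used FP at all. */
static double
reuse_per_first_use(void)
{
	return fp_reuse / fp_use;
}

static void
print_ratios(void)
{
	printf("re-use traps per context switch: %.3f\n",
	    reuse_per_switch());
	printf("re-use traps per first use:      %.2f\n",
	    reuse_per_first_use());
}
```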

What this suggests to me is that lazy FP context switching might not
be such a hot idea on the Alpha port.  What I'd like to do is change
the FP context switching algorithm to something like this:

	(1) When a process returns to userspace, if the process
	    has used FP, enable the FPU.

	(2) When a process is switched away from, if it has used the
	    FPU, save the FP state (thus releasing the FPU for someone
	    else to use).

	(3) When a process is switched to, if it has used the FPU,
	    restore the FP state.

	(4) When a process uses FP for the first time, simply mark it
	    as having used the FPU, and `restore' the FP state from
	    the PCB (the FP state is zero'd when a process exec's).

This method is a whole lot simpler, eliminates the need to deal with
other processors, and may in fact reduce the amount of overhead involved
for processes that do in fact use FP.
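
To make the comparison concrete, here is the proposed scheme as a small
user-space model in C.  Again this is only a sketch under assumptions:
the names (eager_proc, eager_cpu, eager_switch_to, and so on) are
invented for the example, and a single long stands in for the FP
register file.

```c
#include <stddef.h>

struct eager_proc {
	int used_fp;	/* has this process ever used the FPU? */
	long pcb_fp;	/* FP save area in the PCB; zero after exec */
};

struct eager_cpu {
	struct eager_proc *curproc;	/* process running on this CPU */
	long fpu;			/* stand-in for the FP registers */
	int fpu_enabled;
};

/* Step 1: the FPU is enabled iff the process has used FP. */
static void
eager_return_to_user(struct eager_cpu *ci)
{
	ci->fpu_enabled = (ci->curproc != NULL && ci->curproc->used_fp);
}

/* Step 2: save FP state when switching away from an FP user. */
static void
eager_switch_away(struct eager_cpu *ci)
{
	struct eager_proc *p = ci->curproc;

	if (p != NULL && p->used_fp)
		p->pcb_fp = ci->fpu;
	ci->curproc = NULL;
}

/* Step 3: restore FP state when switching to an FP user. */
static void
eager_switch_to(struct eager_cpu *ci, struct eager_proc *p)
{
	ci->curproc = p;
	if (p->used_fp)
		ci->fpu = p->pcb_fp;
}

/* Step 4: first FP use; mark the process and load the zeroed state. */
static void
eager_fen_trap(struct eager_cpu *ci)
{
	struct eager_proc *p = ci->curproc;

	p->used_fp = 1;
	ci->fpu = p->pcb_fp;
	eager_return_to_user(ci);
}
```

Note that no function here ever looks at another processor: the FP
state is always in the PCB whenever the process is not running.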

Thoughts/comments?

-- 
        -- Jason R. Thorpe <thorpej@zembu.com>