Subject: Optimising m68k's _splraise()
To: None <port-m68k@netbsd.org>
From: Steve Woodford <steve@mctavish.co.uk>
List: port-m68k
Date: 12/09/2000 23:04:36
Hi Folks,

[I've already discussed the following optimisation with Ignatios, but
I thought it prudent to throw it open to a wider audience before
committing...]

Looking at _splraise(), the current macro in psl.h is quite wasteful and
doesn't consider that in the *vast* majority of cases, it is called with a
constant value. By using a mixture of assembly code and C, the following
replacement gives the compiler a chance to optimise away a whole bunch of
useless instructions.

static __inline int
_splraise(int level)
{
	int sr;

	/*
	 * Get the current SR.
	 * Note: No need for the "clrl %0" since C code *never*
	 * interprets the SR value, and we don't care about the top
	 * 16-bits anyway.
	 */
	__asm __volatile("movw %%sr,%0" : "=d" (sr));

	/*
	 * The compiler will optimise this to 1 instruction if `level'
	 * is PSL_HIGHIPL (which is rarely used, but what the heck...)
	 *	movw	#0x2700,%sr
	 *
	 * For all other constant values, this compiles to the following:
	 * (Register names are examples only)
	 *
	 *	movl	#constant,%d0
	 *	movw	%sr,%d1
	 *	cmpw	%d0,%d1
	 *	bccs	1f
	 *	movw	#constant,%sr
	 *   1:
	 *
	 * For non-constant values, an extra compare and branch is
	 * inserted. Since non-constant values are *very* rarely used,
	 * this is no big deal. It's *still* less than the 9 instructions
	 * used by the current _splraise().
	 *
	 * You'll note that the compares will work irrespective of the
	 * contents of the low 8-bits of SR.
	 *
	 * XXX: I'd like to find a way to tell the compiler to do:
	 *	movw	#constant,%d0
	 * instead of a long move...
	 */
	if ((unsigned short)level >= PSL_HIGHIPL ||
	    (unsigned short)level > (unsigned short)sr)
		__asm __volatile("movw %0,%%sr" :: "di" (level));

	return sr;
}
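
For anyone who wants to poke at the comparison logic without a 68k to hand,
the following is a minimal host-side sketch, not the proposed kernel code.
It replaces the two asm statements with a plain variable standing in for
the SR. The names model_sr, model_splraise and the MODEL_* constants are
made up for illustration; the constant values assume the usual m68k layout
(supervisor bit 0x2000, IPL in bits 8-10).

```c
#include <assert.h>

/* Assumed values, mirroring the usual m68k SR layout. */
#define MODEL_PSL_S		0x2000	/* supervisor bit */
#define MODEL_PSL_IPL3		0x0300	/* interrupt priority level 3 */
#define MODEL_PSL_HIGHIPL	0x2700	/* supervisor + IPL 7 */

/* Hypothetical stand-in for the hardware status register. */
static unsigned short model_sr;

/*
 * Same decision as the proposed _splraise(), with the SR reads and
 * writes replaced by accesses to model_sr.
 */
static int
model_splraise(int level)
{
	int sr = model_sr;

	/*
	 * Level constants have a zero low byte, so an SR at the same
	 * IPL with any condition-code bits set compares >= level and
	 * is left alone.
	 */
	if ((unsigned short)level >= MODEL_PSL_HIGHIPL ||
	    (unsigned short)level > (unsigned short)sr)
		model_sr = (unsigned short)level;

	return sr;
}
```

Setting model_sr to 0x2104 and calling
model_splraise(MODEL_PSL_S | MODEL_PSL_IPL3) returns 0x2104 and leaves
model_sr at 0x2300, while starting from 0x2504 or 0x231f leaves model_sr
untouched.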

Similarly, _spl() doesn't need the "clrl %0" before grabbing the SR.

static __inline int
_spl(int s)
{
	int sr;
	__asm __volatile ("movew %%sr,%0; movew %1,%%sr" :
	    "=&d" (sr) : "di" (s));
	return sr;
}

I have a related optimisation for mvme68k's splx() function. It previously
*always* called into locore's spl0() function if dropping to IPLLOW,
regardless of whether or not a softint was pending. It now tests for a
softint inline. This may be useful for other m68k ports which have
simulated software interrupts.
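
The shape of that inline test can be sketched roughly as below. This is a
host-side model under stated assumptions, not the mvme68k code: the
pending flag, the model_spl0() slow path and all model_* names are
hypothetical, and MODEL_LOWIPL assumes that the lowest level is the bare
supervisor bit (0x2000).

```c
#include <assert.h>

#define MODEL_LOWIPL	0x2000	/* assumed: supervisor bit, IPL 0 */

/* Hypothetical stand-ins for the SR and the softint pending flag. */
static unsigned short model_sr = 0x2700;
static volatile int model_softint_pending;
static int model_spl0_calls;	/* counts trips through the slow path */

/* Stand-in for the out-of-line routine that runs pending softints. */
static void
model_spl0(void)
{
	model_spl0_calls++;
	model_softint_pending = 0;
	model_sr = MODEL_LOWIPL;
}

/*
 * Only take the slow path when dropping to the lowest level *and*
 * a soft interrupt is actually pending; otherwise just load the SR.
 */
static void
model_splx(int sr)
{
	if (sr == MODEL_LOWIPL && model_softint_pending) {
		model_spl0();
		return;
	}
	model_sr = (unsigned short)sr;
}
```

With no softint pending, model_splx(MODEL_LOWIPL) just stores the new SR
value; with one pending it goes through model_spl0() exactly once.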

Comments? I plan on committing this in the next couple of days if there
are no objections.

Cheers, Steve