Subject: C runqueue
To: None <port-i386@netbsd.org>
From: Gregory McGarry <g.mcgarry@ieee.org>
List: port-i386
Date: 10/23/2002 20:22:35
Some architectures actually moved from assembler implementations
of the runqueue-frobbing routines to the C versions.  I looked
at the i386 case at the time of their introduction and noticed
that gcc couldn't produce tighter code.

The main difference is the btrl instruction.  Last week I went
back and counted instruction cycles.  (I must have been bored.)
It seems that gcc was right.  So I'm sure there are some
micro-optimisation people out there who will enjoy this:

			pentium	i486	i386
	-----------------------------------------
	btrl mem,reg	13	13	13 cycles
	-----------------------------------------
	mov reg,imm	1	1	2
	rol reg,cl	4	3	3
	and mem,reg	3	3	7
	total		8	7	12 cycles
	-----------------------------------------

The instruction stream also doesn't permit U/V parallelism on
the pentium.
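
As a rough illustration (not taken from the NetBSD sources), the
mov/rol/and sequence in the table is presumably what the compiler
emits for an ordinary C bit-clear.  The function names and the plain
uint32_t bitmap below are hypothetical stand-ins, not the kernel's
actual run-queue code.  The sketch only shows that rotating the
constant ~1 left by n yields the same mask as ~(1 << n), which is
why a single rol can replace a shift followed by a not:

```c
#include <assert.h>
#include <stdint.h>

/* Clear bit n of *bitmap using the usual shift-and-complement mask. */
static void
clear_bit_shift(uint32_t *bitmap, unsigned n)
{
	assert(n < 32);
	*bitmap &= ~(UINT32_C(1) << n);
}

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t
rotl32(uint32_t v, unsigned n)
{
	n &= 31;
	return (v << n) | (v >> ((32 - n) & 31));
}

/*
 * Clear bit n of *bitmap by rotating the constant ~1 into place,
 * mirroring the "mov reg,imm / rol reg,cl / and mem,reg" sequence.
 */
static void
clear_bit_rotate(uint32_t *bitmap, unsigned n)
{
	assert(n < 32);
	*bitmap &= rotl32(~UINT32_C(1), n);
}
```

Both functions leave the bitmap in the same state for every n from
0 to 31; whether a given gcc version actually picks the rotate form
depends on the compiler and flags.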

So, we save a few cycles and gain a stackframe with C.  I'd
say it's worth moving i386 over too.

	-- Gregory McGarry <g.mcgarry@ieee.org>