Subject: Re: Software Floating Point for MPC821/823/860
To: Chris G. Demetriou <cgd@netbsd.org>
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
List: tech-kern
Date: 04/05/2000 10:21:29
On Tue, Apr 04, 2000 at 10:34:46AM -0700, Chris G. Demetriou wrote:
> Jason R Thorpe <thorpej@zembu.com> writes:
> > It also turned out that the softfloat-in-libc turned out to be a fair bit
> > faster than kernel fp emulation for *statically linked* binaries, but MUCH
> > much slower for *dynamically linked* binaries.  Obviously, the performance
> > of kernel fp emulation is the same for statically linked or dynamically
> > linked binaries.
> 
> Has anyone characterized why this is?
> 
> It's ... very counterintuitive, at least to me, that a couple of
> branches would be slower than a kernel trap...

I've seen something similar on the Motorola 68060. I tried to create a
68060-aware libm using the Motorola library emulation code, and found that
while some benchmarks were slightly faster, the setiathome code was
actually slower (by some 5%), and sometimes slower than our libm C code.

Back then, I guessed two reasons:

our C vs. Motorola assembler:
- some of the Motorola assembler code, needing to emulate M68K instructions,
  is actually slower than our IEEE C code which only needs to provide IEEE
  functions.

Motorola assembler in kernel vs. in libm:
- cache thrashing. The kernel emulation code always stays at the same
  physical addresses, sharing cache locations for _all_ programs that use
  it. The libm code is loaded at different places depending on the program
  calling it.

Regards,
	-is