port-powerpc: I want to rid ugly float load/stores used only for data movement

Subject: I want to rid ugly float load/stores used only for data movement
To: None <tech-toolchain@netbsd.org, port-macppc@netbsd.org,>
From: M L Riechers <mlr@rse.com>
List: port-powerpc
Date: 02/22/2003 17:16:25

First, my apologies for the triple post, but I'm desperate for an
answer.  Solving this problem is bullet 1 of five standing in the way
of completing a grossly overdue project.

I'm testing stuff on a powerpc port to the Motorola mbx860 card; this
is a powerpc mpc860 processor -- it doesn't have floating point
instructions.  Therein lies the rub.

Until the "soft-float" options work altogether properly for powerpc,
which looks like it means waiting for gcc-3.3 at the very least, I've
adapted the floating point kernel emulation routines from Walnut.  It
more or less works (shot 1 bug, diddled another, and at least 1 more
identified but not located), but it's molasses slow. Ok, I understand
that trips to the kernel are expensive.

But a lot of the drag happens because gcc uses something like a
lfd/stfd instruction sequence -- that is, a 64-bit load to a floating
point register followed by a 64-bit store -- solely for data movement
and having nothing to do with floating point calculations.  (A rather
long thread on this appeared December 1999 and March 2000 concerning
the effects of this on powerpc processors -- e.g. mpc604's -- that took
unaligned traps because of this behavior).  My testing indicates that
70% to 95% of the emulated software traps are due to this effort to
save a lwz/stw machine instruction pair.
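
For illustration only, here is a rough sketch of how an emulator could
recognize the lfd/stfd data-movement pattern described above.  It is not
taken from the Walnut routines; the function names are hypothetical.
It relies on the PowerPC D-form encodings, where the primary opcode is
the top 6 bits of the instruction word (50 for lfd, 54 for stfd) and
the floating point register number is the next 5 bits.  It assumes the
store immediately follows the load, which a scheduling compiler need
not guarantee, and it ignores the indexed forms (lfdx/stfdx).

```c
#include <stdint.h>

/* Primary opcodes (top 6 bits of the instruction word) for the D-form
 * double-precision floating point load and store. */
#define OP_LFD  50u
#define OP_STFD 54u

/* Extract the primary opcode from a 32-bit PowerPC instruction word. */
static unsigned primary_opcode(uint32_t insn)
{
    return (unsigned)(insn >> 26);
}

/* Extract the 5-bit register field that follows the primary opcode:
 * the floating point register being loaded (lfd) or stored (stfd). */
static unsigned fp_reg(uint32_t insn)
{
    return (unsigned)((insn >> 21) & 0x1fu);
}

/* Hypothetical helper: returns 1 if `first` is an lfd and `second` is
 * an stfd of the same floating point register, i.e. the pair looks
 * like a plain 64-bit memory-to-memory copy.  This is a heuristic:
 * it cannot tell whether the register is also used in a later
 * calculation. */
int is_fp_move_pair(uint32_t first, uint32_t second)
{
    return primary_opcode(first) == OP_LFD
        && primary_opcode(second) == OP_STFD
        && fp_reg(first) == fp_reg(second);
}
```

For example, 0xC8030000 encodes lfd f0,0(r3) and 0xD8040000 encodes
stfd f0,0(r4), so that pair would be classified as a move, while an
lfd followed by an fadd would not.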

What I want to do is modify gcc to stop generating code like that, but
not shut down generating float instructions entirely, as in
"soft-float".  That is, to do exactly whatever "soft-float" does for
data movement, but not change anything else.  I couldn't find any
command line options to do just that. My intention is to use this
modified gcc to recompile all of NetBSD 1.6 userland for both the
mpc860 and our macppc 7500's, which are identical. (As an aside, I
sincerely doubt that that change will cause a performance hit for a G3
processor with more than one integer unit, and might even help
performance.  I'd like to look at that, but I don't have the time.)  I
looked in the areas that would be affected by "soft-float" and
"cpu=mpc860", but I'm intimidated.
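
To make the kind of data movement in question concrete, here is a
small C sketch.  It is only an illustration from the source side, not
a description of what gcc does internally, and the function names are
made up.  The first function is an ordinary 8-byte copy, the sort of
code for which a compiler targeting a powerpc with an FPU may choose a
floating point load/store pair.  The second expresses the same copy as
two 32-bit words, which is the effect wanted from the compiler.  It
assumes double is a multiple of 4 bytes wide (8 on powerpc).  Nothing
in C forces the instruction selection either way, so the compiler
remains free to pick its own instructions for both functions.

```c
#include <stdint.h>
#include <string.h>

/* Plain 8-byte copy.  No floating point arithmetic is involved, but
 * the compiler may still move the value through an FP register. */
void copy_double_plain(double *dst, const double *src)
{
    *dst = *src;
}

/* The same copy expressed as 32-bit integer words.  The bit pattern
 * is moved unchanged; no floating point value is ever formed. */
void copy_double_as_words(double *dst, const double *src)
{
    uint32_t w[sizeof(double) / sizeof(uint32_t)];

    memcpy(w, src, sizeof w);   /* read the source as raw words  */
    memcpy(dst, w, sizeof w);   /* write the words to the target */
}
```

Both functions leave the destination with the same bit pattern as the
source; they differ only in how the copy is spelled.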

Would anyone point me in the correct direction?

Thanks,

-Mike