Port-sh3 archive


Re: NetBSD on SH-4A



Valeriy E. Ushakov wrote:
On Mon, Jun 15, 2009 at 15:18:01 -0700, David S. Alessio wrote:

I'd like to see the SH-4A fully supported. Any suggestions on the best path to get there?

The approach that first comes to my mind is:
1)   Add kernel support for the FPU to the SH-4 arch
Has anyone examined this? If so, any thoughts/suggestions on the "right way" to do it?
   I'd like to see a lazy FPU context switch.

Look at how other arches do it and do the same for SH-4 FPU.  FPU
autodetection would be nice, but a compile-time kernel option could do
for now.  Shouldn't be hard, but I never got around to it as real life
keeps interfering.  If you could work on it and contribute the code
back, that would be very helpful.
How would autodetection be helpful for SH-4 and SH-4A? I don't think we'll see many [any] SH-4A-DSP chips; and all SH-4/4A have an FPU.


(we don't have many NetBSD/sh3
developers).

Well, you've got one more now ;)
2)  Recompile the SH-4 world with FPU support
   (gcc default integer division should not use the FPU)

Not sure it's possible without hacking gcc as with -m4 it generates
calls to __sdivsi3_i4 and expects its result in FPUL.
gcc has an SH-specific option, "-mdiv=STRATEGY", where STRATEGY selects one of several division strategies. I believe gcc-4.3.3 defaults to "-mdiv=call", which implements the inv:minlat strategy (integer division instructions).

Which raises the questions: When will NetBSD/ports support gcc-4.3.3? Is there anything standing in the way of updating to 4.3.3 as the default gcc version?

Why this particular requirement?
A few reasons:
o Any integer division in kernel code should use DIV0, DIV1 instructions (not the FPU).
o Using the FPU for integer division results in a slight win at a micro-benchmark level, but results in a net loss at the macro-benchmark level. If we assume several applications running, every thread performing at least one integer division per time slot, then every thread will dirty the FPU and trigger an FPU context save. This is expensive, and doubly expensive if we use lazy FPU context save: we'd get hit with an FPU exception on just about every context switch.
o Most apps/threads don't use float/double vars and would not need the FPU.
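The cost argument above can be sketched with a toy model in C. This is not NetBSD kernel code: struct sim, the sim_* functions, and the round-robin schedule are all assumptions made up for illustration. The model only counts FPU-disabled traps and FPU context saves under a lazy scheme, where the FPU is left alone at context switch and state is saved only when a different thread first touches it.

```c
#include <assert.h>

#define NTHREADS 4

/* Toy model of lazy FPU switching; every name here is hypothetical. */
struct sim {
	int fpu_owner;		/* thread whose state is live in the FPU, -1 if none */
	int fpu_enabled;	/* nonzero if the current thread may use the FPU */
	int current;		/* currently running thread, -1 before first switch */
	unsigned traps;		/* FPU-disabled exceptions taken */
	unsigned saves;		/* FPU context saves performed */
};

static void
sim_init(struct sim *s)
{
	s->fpu_owner = -1;
	s->fpu_enabled = 0;
	s->current = -1;
	s->traps = 0;
	s->saves = 0;
}

/* Context switch: leave FPU state alone, enable it only for its owner. */
static void
sim_switch_to(struct sim *s, int tid)
{
	s->current = tid;
	s->fpu_enabled = (s->fpu_owner == tid);
}

/* The current thread executes one FPU instruction. */
static void
sim_use_fpu(struct sim *s)
{
	assert(s->current >= 0);
	if (s->fpu_enabled)
		return;			/* already the owner: no cost */
	s->traps++;			/* FPU-disabled exception */
	if (s->fpu_owner != -1)
		s->saves++;		/* save the previous owner's state */
	s->fpu_owner = s->current;	/* (restore of our own state goes here) */
	s->fpu_enabled = 1;
}

/*
 * Run `rounds` round-robin passes over NTHREADS threads. Threads with
 * index < fpu_users touch the FPU once per time slot, e.g. because an
 * integer division was compiled to use it.
 */
static void
sim_run(struct sim *s, int rounds, int fpu_users)
{
	int r, t;

	for (r = 0; r < rounds; r++) {
		for (t = 0; t < NTHREADS; t++) {
			sim_switch_to(s, t);
			if (t < fpu_users)
				sim_use_fpu(s);
		}
	}
}
```

In this model, sim_run(&s, 100, 4) (every thread dirties the FPU each slot) ends with 400 traps and 399 saves, while sim_run(&s, 100, 1) (a single real FPU user) ends with 1 trap and 0 saves.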

-david




