Port-arm archive


Re: vfp_fpscr_handler() from vfp_handler()



On Mon, Oct 16, 2017 at 04:59:02PM +0200, Manuel Bouyer wrote:
> Hello,
> in vfp_handler() (the handler for FPU fault, which will manage the FPU
> lazy context switch), we call vfp_fpscr_handler() at the beginning.
> 
> The point should be to avoid a full FPU context switch if we just
> want to read or write to the FPSCR from user code.
> 
> But a printf() here shows that vfp_fpscr_handler() can successfully emulate
> the instruction only once in a while. so we're paying a function call and
> a few tests in the FPU handler for nothing.
> 
> After 15mn from a clean boot and running opencpn:
> chartplotter-dd:/home/bouyer>vmstat -e | grep fp
> cpu0 vfp coproc use                   284    0 misc
> cpu0 vfp coproc re-use              11749   12 misc
> cpu1 vfp coproc use                   232    0 misc
> cpu1 vfp coproc re-use               9432    9 misc
> 
> but nothing from vfp_fpscr_handler(). I actually have this printf in place
> for a few days on different boards and saw it fire only a couple of times.
> 
> Would anyone mind if I remove this call from vfp_handler() ? It should save
> a few cycles for threads using the FPU ...

I looked into this and talked with gimpy about it a while back:

the reason for this emulation of FPSCR is to provide a way to maintain
per-thread rounding and exception state for softfloat.  gimpy's plan was
to use the FPSCR hardfloat instructions in userland to get and set those bits
even for softfloat, and have the kernel emulate the instructions and store
the FPSCR bits in the PCB on systems that did not actually have an FPSCR
register.  the softfloat libc code does not support per-thread
rounding/exception state currently; there's just a process-wide global
variable.  gimpy also added a hook to the softfloat code to allow MD code to
override the global variable for rounding/exception state (see
"set_float_rounding_mode") but nothing actually overrides that currently.
it looks like some groundwork was laid but this plan was never completed.
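
To make the emulation idea above concrete, here is a rough user-space sketch of what "emulate the FPSCR instructions and keep the bits in the PCB" could look like. It is not the code in vfp_fpscr_handler(). The struct names, the function name and the macros are hypothetical stand-ins, and the sketch assumes the A32 VMRS/VMSR encodings for FPSCR (0xeef10a10 and 0xeee10a10 with Rt = r0 and condition AL). It also ignores the condition field and does not handle Rt = 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a trapframe and a PCB; not NetBSD's structs. */
struct demo_trapframe {
    uint32_t r[15];          /* r0..r14 */
    uint32_t pc;
};
struct demo_pcb {
    uint32_t saved_fpscr;    /* emulated FPSCR bits kept per thread */
};

/* Assumed A32 encoding of VMRS/VMSR with FPSCR: ignore cond, bit 20, Rt. */
#define FPSCR_XFER_MASK  0x0fef0fffu
#define FPSCR_XFER_VALUE 0x0ee10a10u
#define XFER_TO_ARM      0x00100000u   /* bit 20 set: VMRS (read FPSCR) */

/*
 * Returns true if insn was an FPSCR read/write and was emulated against
 * the saved copy; false means "not ours, fall through to the full path".
 */
static bool
demo_emulate_fpscr(uint32_t insn, struct demo_trapframe *tf,
    struct demo_pcb *pcb)
{
    if ((insn & FPSCR_XFER_MASK) != FPSCR_XFER_VALUE)
        return false;

    unsigned rt = (insn >> 12) & 0xf;
    if (rt == 15)
        return false;        /* flag transfer / PC operand: not handled here */

    if (insn & XFER_TO_ARM)
        tf->r[rt] = pcb->saved_fpscr;   /* VMRS Rt, FPSCR */
    else
        pcb->saved_fpscr = tf->r[rt];   /* VMSR FPSCR, Rt */

    tf->pc += 4;             /* skip the emulated instruction */
    return true;
}
```

The point of the sketch is only the shape of the fast path: one mask-and-compare on the faulting instruction, then a register copy to or from the per-thread saved value, with everything else falling through.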

it seems better to me to use TLS in libc to implement per-thread
rounding/exception state rather than having the kernel emulate the FPSCR
with vfp_fpscr_handler().  I somewhat intended to change things to do that
when I was fixing the fenv stuff back in february/march this year,
but I guess I ran out of steam.
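
A minimal sketch of the TLS alternative, assuming a C11 compiler with _Thread_local and POSIX threads. All names here (sf_round, sf_set_round, sf_demo_two_threads and so on) are hypothetical and are not the actual libc softfloat symbols; the only point is that each thread gets its own rounding mode and exception flags without any kernel involvement.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical rounding modes; not the real softfloat constants. */
enum sf_round {
    SF_ROUND_NEAREST = 0,
    SF_ROUND_UP,
    SF_ROUND_DOWN,
    SF_ROUND_ZERO
};

/* One copy per thread instead of one process-wide global. */
static _Thread_local enum sf_round sf_rounding_mode = SF_ROUND_NEAREST;
static _Thread_local unsigned sf_exception_flags = 0;

static void
sf_set_round(enum sf_round m)
{
    assert(m >= SF_ROUND_NEAREST && m <= SF_ROUND_ZERO);
    sf_rounding_mode = m;
}

static enum sf_round
sf_get_round(void)
{
    return sf_rounding_mode;
}

static void
sf_raise(unsigned flags)
{
    sf_exception_flags |= flags;     /* sticky, per thread */
}

static unsigned
sf_test(unsigned flags)
{
    return sf_exception_flags & flags;
}

/* Worker changes its own mode and reports what it sees. */
static void *
sf_demo_worker(void *arg)
{
    sf_set_round(SF_ROUND_ZERO);
    sf_raise(0x1);
    *(int *)arg = (int)sf_get_round();
    return NULL;
}

/* Returns 0 on success; the two outputs show the modes are independent. */
static int
sf_demo_two_threads(int *worker_mode, int *main_mode)
{
    pthread_t t;

    sf_set_round(SF_ROUND_UP);
    if (pthread_create(&t, NULL, sf_demo_worker, worker_mode) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    *main_mode = (int)sf_get_round();
    return 0;
}
```

After sf_demo_two_threads() returns, the calling thread still sees SF_ROUND_UP and no raised flags, while the worker saw SF_ROUND_ZERO, which is the per-thread behaviour the FPSCR emulation was meant to provide.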

-Chuck

