Source-Changes archive


CVS commit: src/sys/arch/x86/x86



Module Name:    src
Committed By:   maxv
Date:           Sat Nov  4 08:58:30 UTC 2017

Modified Files:
        src/sys/arch/x86/x86: fpu.c

Log Message:
Add support for XSAVEOPT. This instruction speeds up context switches by
not saving to memory FPU registers that are known to be in their initial
state, or known not to have changed since the last time they were saved
to memory.
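
To make the two optimizations concrete, here is a minimal sketch of the save
decision as a bitmask computation. It is a simplified model, not kernel code
and not an exact description of the hardware: the function names and the
explicit "modified" mask are assumptions made for illustration (the processor
tracks that information internally and may be more conservative). Only the
component bit positions (x87 = bit 0, SSE = bit 1, AVX = bit 2) come from the
x86 architecture.

```c
#include <assert.h>
#include <stdint.h>

/* State-component bits as defined by the x86 XSAVE architecture. */
#define XCR0_X87 (UINT64_C(1) << 0)
#define XCR0_SSE (UINT64_C(1) << 1)
#define XCR0_AVX (UINT64_C(1) << 2)

/*
 * Hypothetical model: components written by a plain XSAVE.
 * Every requested component is written, whatever its state.
 */
static uint64_t
model_xsave_written(uint64_t rfbm)
{
	return rfbm;
}

/*
 * Hypothetical model: components written by XSAVEOPT.
 *   rfbm     - requested-feature bitmap
 *   xinuse   - components that are not in their initial state
 *   modified - components changed since the last save/restore
 *              (an illustrative stand-in for the internal tracking)
 * A component is skipped if it is in its initial state or unmodified.
 */
static uint64_t
model_xsaveopt_written(uint64_t rfbm, uint64_t xinuse, uint64_t modified)
{
	uint64_t all = XCR0_X87 | XCR0_SSE | XCR0_AVX;

	assert((rfbm & ~all) == 0);	/* the model only knows three components */
	return rfbm & xinuse & modified;
}
```

With all three components requested, x87 in its initial state, and SSE and
AVX in use but only SSE modified, the model writes SSE alone, whereas the
plain XSAVE model writes all three.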

Our code is now compatible with the internal state tracking engine:
 - We don't modify the in-memory FPU state after doing an XSAVE/XSAVEOPT.
   That is to say, we always call XRSTOR first.
 - During a fork, the whole in-memory FPU state area is copied into the
   new PCB, and CR0_TS is set. The next time the forked thread uses the
   FPU, it faults; we then migrate the area, call XRSTOR and clear CR0_TS.
   During this XRSTOR, XSTATE_BV still contains the initial values, which
   forces a reload of XINUSE.
 - Whenever software wants to change the in-memory FPU state, it manually
   sets XSTATE_BV[i]=1, which forces XINUSE[i]=1.
 - The address of the state passed to XRSTOR is always the same for a
   given LWP.
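
The XSTATE_BV rule in the list above can be illustrated with a toy model of
XRSTOR for a single component (the x87 control word). This is a sketch under
stated assumptions, not the code in fpu.c: the toy_* types and functions are
hypothetical, and the real save area and instruction handle many more fields.
What the sketch does take from the architecture is that XSTATE_BV bit 0
covers x87 state and that the initial x87 control word is 0x037F.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSTATE_BV_X87	(UINT64_C(1) << 0)	/* x87 state component bit */
#define X87_INIT_CW	0x037F			/* control word after FNINIT */

/* Hypothetical, heavily reduced save area and CPU state. */
struct toy_area { uint64_t xstate_bv; uint16_t fcw; };
struct toy_cpu  { uint64_t xinuse;    uint16_t fcw; };

/*
 * Toy XRSTOR: load the component from memory only if its XSTATE_BV bit
 * is set; otherwise put it back in its initial state.
 */
static void
toy_xrstor(struct toy_cpu *cpu, const struct toy_area *area)
{
	assert(cpu != NULL && area != NULL);
	if (area->xstate_bv & XSTATE_BV_X87) {
		cpu->fcw = area->fcw;
		cpu->xinuse |= XSTATE_BV_X87;
	} else {
		cpu->fcw = X87_INIT_CW;
		cpu->xinuse &= ~XSTATE_BV_X87;
	}
}

/* Software-side update: change the value and mark the component valid. */
static void
toy_set_cw(struct toy_area *area, uint16_t cw)
{
	assert(area != NULL);
	area->fcw = cw;
	area->xstate_bv |= XSTATE_BV_X87;
}
```

In this model, writing area.fcw directly without setting the bit is lost at
the next toy_xrstor (the CPU ends up with 0x037F), while going through
toy_set_cw makes the new value take effect and leaves the component marked
in use.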

fpu_save_area_clear no longer forces a reload of CW if fx_cw is the
standard FPU value. This way XINUSE[i]=0 for x87, and XSAVEOPT can skip
saving this state.
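
A rough sketch of that behaviour, assuming a much smaller save area than the
real one: the structure and function below are hypothetical stand-ins, and
the actual fpu_save_area_clear initializes more fields than shown here. The
value 0x037F is the x87 control word after FNINIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSTATE_BV_X87	(UINT64_C(1) << 0)	/* x87 state component bit */
#define X87_INIT_CW	0x037F			/* standard x87 control word */

/* Hypothetical reduced save area: one legacy field plus the header bitmap. */
struct toy_save_area {
	uint16_t fx_cw;
	uint64_t xstate_bv;
};

/*
 * Reset the area and record the requested control word. The x87 bit in
 * XSTATE_BV is set only for a non-standard control word, so the common
 * case leaves x87 in its initial state.
 */
static void
toy_save_area_clear(struct toy_save_area *area, uint16_t cw)
{
	assert(area != NULL);
	memset(area, 0, sizeof(*area));
	area->fx_cw = cw;
	if (cw != X87_INIT_CW)
		area->xstate_bv |= XSTATE_BV_X87;
}
```

Calling toy_save_area_clear with 0x037F leaves xstate_bv at zero, so a later
restore treats x87 as initial; any other control word sets the x87 bit and
is reloaded from memory.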

Small benchmark:
        switch lwp to cpu2
        do float operation
        switch lwp to cpu3
        do float operation
Running this loop 10^6 times, the average time on my CPU drops from
28.2 seconds to 20.8 seconds.
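
For scale, the figures above work out to about 28.2 and 20.8 microseconds
per loop iteration, a reduction of roughly 26%. The helpers below are only
an illustrative way to redo that arithmetic; their names are made up for
this example and they are not part of the benchmark itself.

```c
#include <assert.h>

/* Average cost of one loop iteration, in microseconds. */
static double
per_iteration_us(double total_seconds, double iterations)
{
	assert(iterations > 0.0);
	return total_seconds / iterations * 1e6;
}

/* Fraction of the original run time that was saved (0.0 to 1.0). */
static double
relative_reduction(double before_seconds, double after_seconds)
{
	assert(before_seconds > 0.0);
	return (before_seconds - after_seconds) / before_seconds;
}
```

For example, relative_reduction(28.2, 20.8) is 7.4 / 28.2, which is a little
over 0.26. Each iteration includes two CPU migrations, so the per-switch
saving is about half of the 7.4 microsecond difference.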


To generate a diff of this commit:
cvs rdiff -u -r1.23 -r1.24 src/sys/arch/x86/x86/fpu.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



