NetBSD-Bugs archive


Re: port-arm/52603 (arm(v7?) vfp register corruption)



Synopsis: arm(v7?) vfp register corruption

Responsible-Changed-From-To: port-arm-maintainer->bouyer
Responsible-Changed-By: bouyer%NetBSD.org@localhost
Responsible-Changed-When: Mon, 16 Oct 2017 14:05:06 +0000
Responsible-Changed-Why:
.


State-Changed-From-To: open->analyzed
State-Changed-By: bouyer%NetBSD.org@localhost
State-Changed-When: Mon, 16 Oct 2017 14:05:06 +0000
State-Changed-Why:
The race scenario is the following:

LWP L is runnable but not on a CPU. Its FPU state is still held on CPU2,
which has not released it yet, so the PCB copy of fpexc still has
VFP_FPEXC_EN set.

LWP L is scheduled on CPU1, CPU1 calls cpu_switchto() for L in mi_switch().
cpu_switchto() will set VFP_FPEXC_EN in the FPU's fpexc register per the
PCB fpexc copy.

Before CPU1 calls pcu_switchpoint() for L, CPU2 calls
pcu_do_op(PCU_CMD_SAVE | PCU_CMD_RELEASE) for L because it still holds L's
FPU state and wants to load another lwp. This causes VFP_FPEXC_EN to
be cleared in the PCB copy, but not in CPU1's register. L's l_pcu_cpu is
set to NULL.

When CPU1 calls pcu_switchpoint() for L, it sees that l_pcu_cpu is NULL and
doesn't call the release callback.

Now CPU1 has its FPU enabled but with the wrong FPU state.
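
The interleaving above can be replayed in a small user-space model. This is
only a sketch: the struct and function names (model_cpu, model_lwp,
model_race() and so on) are made up for illustration and are not the kernel
code. Only VFP_FPEXC_EN and the order of the steps come from the description
above, and the assumption that the release callback clears the enable bit on
the calling CPU is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VFP_FPEXC_EN 0x40000000u	/* FPEXC.EN, bit 30 */

struct model_lwp;

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_cpu {
	uint32_t fpexc;				/* this CPU's fpexc register */
	const struct model_lwp *fpu_owner;	/* whose state is in the FPU */
};

struct model_lwp {
	uint32_t pcb_fpexc;			/* PCB copy of fpexc */
	struct model_cpu *pcu_cpu;		/* models l_pcu_cpu */
};

/* cpu_switchto(): the register is loaded from the PCB copy. */
void model_cpu_switchto(struct model_cpu *ci, const struct model_lwp *l)
{
	ci->fpexc = l->pcb_fpexc;
}

/* Save + release done by the CPU that still holds the FPU state. */
void model_save_release(struct model_cpu *holder, struct model_lwp *l)
{
	assert(l->pcu_cpu == holder);
	l->pcb_fpexc &= ~VFP_FPEXC_EN;
	l->pcu_cpu = NULL;
	holder->fpexc &= ~VFP_FPEXC_EN;
	holder->fpu_owner = NULL;
}

/* pcu_switchpoint() as described: nothing happens if l_pcu_cpu is NULL. */
void model_pcu_switchpoint(struct model_cpu *ci, struct model_lwp *l)
{
	if (l->pcu_cpu == NULL || l->pcu_cpu == ci)
		return;
	/* Assumed effect of the release callback on the current CPU. */
	l->pcb_fpexc &= ~VFP_FPEXC_EN;
	ci->fpexc &= ~VFP_FPEXC_EN;
}

/*
 * Returns true if CPU1 ends up with the FPU enabled although L's state
 * is not loaded on it. racy selects the interleaving described above.
 */
bool model_race(bool racy)
{
	struct model_cpu cpu1 = { 0, NULL }, cpu2 = { VFP_FPEXC_EN, NULL };
	struct model_lwp l = { VFP_FPEXC_EN, &cpu2 };

	cpu2.fpu_owner = &l;
	model_cpu_switchto(&cpu1, &l);		/* CPU1 */
	if (racy)
		model_save_release(&cpu2, &l);	/* CPU2 gets in between */
	model_pcu_switchpoint(&cpu1, &l);	/* CPU1 */
	if (!racy)
		model_save_release(&cpu2, &l);	/* CPU2, too late to matter */

	return (cpu1.fpexc & VFP_FPEXC_EN) != 0 && cpu1.fpu_owner != &l;
}
```

In this model, model_race(true) reports the bad state (FPU enabled on CPU1
without L's registers), while model_race(false), where CPU2 acts only after
pcu_switchpoint(), does not.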

I see the following ways to fix this:
a) go to splhigh() before cpu_switchto() and splx() after pcu_switchpoint().
   I'm not sure it's a good idea to block all interrupts here.
b) in pcu_switchpoint() call the release callback even if l_pcu_cpu is NULL,
   to make sure the PCU of the current CPU is released too.
c) call pcu_switchpoint() for newl before cpu_switchto() in mi_switch()
d) in cpu_switchto() always set fpexc to 0
e) in cpu_switchto() check l_pcu_cpu before using the PCB fpexc copy,
   and set fpexc to 0 if the PCB copy is not from this CPU.

I think e) is too expensive for cpu_switchto().
d) will always cause an FPU trap, even if the FPU state is loaded on the CPU.

I tested b) and it works, and it looks like the cleanest way to fix this.
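
A minimal sketch of what b) changes, again as a user-space model rather than
the actual patch. All sketch_* names are hypothetical, and the effect of the
release callback (clearing VFP_FPEXC_EN in the PCB copy and in the current
CPU's register) is an assumption of the model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VFP_FPEXC_EN 0x40000000u	/* FPEXC.EN, bit 30 */

/* Hypothetical simplified model, not the kernel's data structures. */
struct sketch_cpu {
	uint32_t fpexc;			/* this CPU's fpexc register */
};

struct sketch_lwp {
	uint32_t pcb_fpexc;		/* PCB copy of fpexc */
	struct sketch_cpu *pcu_cpu;	/* models l_pcu_cpu */
};

/* Assumed effect of the release callback on the current CPU. */
void sketch_release(struct sketch_cpu *ci, struct sketch_lwp *l)
{
	l->pcb_fpexc &= ~VFP_FPEXC_EN;
	ci->fpexc &= ~VFP_FPEXC_EN;
}

void sketch_switchpoint(struct sketch_cpu *ci, struct sketch_lwp *l,
    bool option_b)
{
	if (l->pcu_cpu == ci)
		return;		/* state is loaded here: keep FPU enabled */
	if (l->pcu_cpu == NULL && !option_b)
		return;		/* behaviour described in the analysis */
	sketch_release(ci, l);	/* with b): also done for NULL */
}

/* Replays the race; returns whether CPU1's FPU is still enabled. */
bool sketch_cpu1_enabled_after_race(bool option_b)
{
	struct sketch_cpu cpu1 = { 0 }, cpu2 = { VFP_FPEXC_EN };
	struct sketch_lwp l = { VFP_FPEXC_EN, &cpu2 };

	cpu1.fpexc = l.pcb_fpexc;	/* CPU1: cpu_switchto() */
	l.pcb_fpexc &= ~VFP_FPEXC_EN;	/* CPU2: save + release */
	cpu2.fpexc &= ~VFP_FPEXC_EN;
	l.pcu_cpu = NULL;
	sketch_switchpoint(&cpu1, &l, option_b);	/* CPU1 */

	return (cpu1.fpexc & VFP_FPEXC_EN) != 0;
}

/* With b), an lwp whose state is on this CPU keeps the FPU enabled. */
bool sketch_local_state_stays_enabled(void)
{
	struct sketch_cpu cpu1 = { VFP_FPEXC_EN };
	struct sketch_lwp l = { VFP_FPEXC_EN, &cpu1 };

	cpu1.fpexc = l.pcb_fpexc;	/* cpu_switchto() */
	sketch_switchpoint(&cpu1, &l, true);

	return (cpu1.fpexc & VFP_FPEXC_EN) != 0;
}
```

In the model, the race leaves CPU1's FPU enabled without b) and disabled with
b), so the next FPU use traps and reloads the state. Unlike d), an lwp whose
state is already loaded on the current CPU does not take an extra trap.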




