NetBSD-Bugs archive

port-amd64/49853: x86 fpu save loop may terminate early?



>Number:         49853
>Category:       port-amd64
>Synopsis:       x86 fpu save loop may terminate early?
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Apr 25 12:20:00 +0000 2015
>Originator:     Martin Husemann
>Release:        NetBSD 7.99.12
>Organization:
The NetBSD Foundation, Inc
>Environment:
System: NetBSD martins.aprisoft.de 7.99.12 NetBSD 7.99.12 (GENERIC) #18: Sat Apr 25 14:06:36 CEST 2015 martin%martins.aprisoft.de@localhost:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

When FPU ownership is removed from a remote CPU, there is a loop in
x86/x86/fpu.c:fpusave_lwp that waits for the other CPU to clear pcb->pcb_fpcpu.
IIUC the loop has a second exit condition to protect against unresponsive
CPUs and avoid an endless loop:

               while (pcb->pcb_fpcpu == oci && ticks == hardclock_ticks) {
                       x86_pause();
                       spins++;
               }

That is: give up waiting once hardclock_ticks has increased. Now I don't
understand what prevents this clock tick from happening at basically the same
moment that we send the IPI. This would cause the loop to exit early, and
the function to return while the other CPU is not done saving FPU state.

Should this read something like:

               while (pcb->pcb_fpcpu == oci && (ticks+1) >= hardclock_ticks) {
                       x86_pause();
                       spins++;
               }

instead?
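
To illustrate the concern, here is a small user-space model of the two exit
conditions. It is only a sketch, not kernel code. It assumes that one loop
iteration takes one time unit, that the tick counter advances every `period`
units, and that the remote CPU never clears ownership, so only the timeout
can end the loop. The names spins_before_giveup, exit_rule, period and phase
are hypothetical and do not exist in fpu.c.

```c
#include <assert.h>

/* Which timeout condition the simulated wait loop uses. */
enum exit_rule {
	EXIT_ON_FIRST_TICK,	/* while (ticks == hardclock_ticks) */
	EXIT_ON_SECOND_TICK	/* while ((ticks + 1) >= hardclock_ticks) */
};

/*
 * Return how many iterations the waiter spins before giving up when the
 * remote CPU never responds.  `phase` is how far into the current tick
 * period the snapshot of the counter is taken (0 <= phase < period).
 */
static int
spins_before_giveup(enum exit_rule rule, int period, int phase)
{
	int hardclock_ticks = 100;	/* arbitrary starting value */
	int ticks = hardclock_ticks;	/* snapshot taken before waiting */
	int now = phase;		/* time units since the last tick */
	int spins = 0;

	assert(period > 0 && phase >= 0 && phase < period);

	for (;;) {
		int keep_waiting = (rule == EXIT_ON_FIRST_TICK)
		    ? (ticks == hardclock_ticks)
		    : ((ticks + 1) >= hardclock_ticks);

		if (!keep_waiting)
			break;
		spins++;
		/* Advance simulated time; tick at the period boundary. */
		if (++now == period) {
			now = 0;
			hardclock_ticks++;
		}
	}
	return spins;
}
```

In this model, with period = 1000 and phase = 999 (snapshot taken just before
a tick), the first rule gives up after a single iteration, while the second
rule waits 1001 iterations. Under the second rule the wait is always longer
than one full period, whatever the phase.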

>How-To-Repeat:
code inspection

>Fix:
see above?


