NetBSD-Bugs archive


port-amd64/52966: amd64 FPU handling broken on AMD



>Number:         52966
>Category:       port-amd64
>Synopsis:       amd64 FPU handling broken on AMD
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-amd64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jan 30 17:55:00 +0000 2018
>Originator:     Michael van Elst
>Release:        NetBSD 8.99.12
>Organization:
--
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."
>Environment:
	
	
System: NetBSD slowpoke 8.99.12 NetBSD 8.99.12 (SLOWPOKE) #19: Tue Jan 30 13:57:16 CET 2018 mlelstv@gossam:/home/netbsd-current/obj.amd64/home/netbsd-current/src/sys/arch/amd64/compile/SLOWPOKE amd64
Architecture: x86_64
Machine: amd64
>Description:
A version of the STREAM benchmark fails on AMD Ryzen CPUs. The benchmark
performs multi-threaded floating-point operations (using OpenMP) to measure
memory bandwidth and validates the result by comparing it with a scalar
computation. The benchmark itself runs to completion, but the validation
fails when multiple threads are used. With more than 4 threads it almost
always fails, with fewer threads it sometimes succeeds, and with a single
thread it succeeds.
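
For readers unfamiliar with the benchmark, the kind of check that fails can be
sketched roughly as follows. This is not the code from stream.c; the function
names triad() and count_mismatches() and the tolerance parameter are
hypothetical, chosen only to illustrate a threaded floating-point kernel whose
output is compared against a value computed in scalar code.

```c
#include <stddef.h>

/*
 * Illustrative triad kernel: a[i] = b[i] + scalar * c[i].
 * With -fopenmp the loop is split across threads; without it the
 * pragma is ignored and the loop runs serially.
 */
static void
triad(double *a, const double *b, const double *c, double scalar, size_t n)
{
	long i;

#pragma omp parallel for
	for (i = 0; i < (long)n; i++)
		a[i] = b[i] + scalar * c[i];
}

/*
 * Count the elements that deviate from the scalar reference value by
 * more than a relative tolerance eps (expected is assumed positive).
 */
static size_t
count_mismatches(const double *a, size_t n, double expected, double eps)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++) {
		double d = a[i] - expected;

		if (d < 0)
			d = -d;
		if (d > eps * expected)
			bad++;
	}
	return bad;
}
```

If FPU register contents are lost or corrupted while a thread is in the middle
of such a loop, some elements end up with wrong values and a check of this
kind reports a non-zero mismatch count, even though the program itself never
crashes.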

>How-To-Repeat:

Get source from

http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/stream.c

which has been slightly adjusted from the original to compile without -lnuma,
and compile with:

gcc -O3 -std=c99 -fopenmp -DNON_NUMA -DN=80000000 -DNTIMES=100 stream.c -o stream

and let it run. With that value of N you need about 1.8GB RAM.
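
The memory figure can be checked with a short calculation. The sketch below
assumes, as in the standard STREAM benchmark, three arrays of N doubles; the
adjusted stream.c may allocate differently, and stream_bytes() is a
hypothetical helper, not part of the benchmark.

```c
#include <stdint.h>

/* Bytes needed for narrays arrays of n doubles each (assumed layout). */
static uint64_t
stream_bytes(uint64_t n, unsigned narrays)
{
	return n * narrays * (uint64_t)sizeof(double);
}
```

With N=80000000 and three arrays of 8-byte doubles this gives 1920000000
bytes, or roughly 1.8 GiB, which is consistent with the figure above.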

When the validation succeeds the program reports "Solution validates",
otherwise it reports the error. On the Ryzen system the errors are somewhat
random.

The same machine runs the benchmark correctly with the latest netbsd-8
kernel, which predates XSAVEOPT support.

>Fix:

A workaround suggested by maxv@ is to disable the use of XSAVEOPT by
commenting out:

        if (descs[0] & CPUID_PES1_XSAVEOPT)
                x86_fpu_save = FPU_SAVE_XSAVEOPT;

in sys/arch/x86/x86/identcpu.c. The kernel then falls back to using XSAVE
to save and restore the FPU registers.
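
The effect of the workaround can be sketched in isolation. This is not the
kernel code; the enum, the macro XSAVEOPT_BIT and the function
pick_save_method() are hypothetical names used only for illustration. The one
hardware fact relied on is that CPUID leaf 0x0D, sub-leaf 1 reports XSAVEOPT
support in bit 0 of EAX, which is presumably what descs[0] holds in the
snippet above.

```c
#include <stdint.h>

/* CPUID.(EAX=0x0D, ECX=1):EAX bit 0 advertises XSAVEOPT. */
#define XSAVEOPT_BIT	0x00000001u	/* hypothetical name */

enum fpu_save_method { SAVE_XSAVE, SAVE_XSAVEOPT };	/* hypothetical */

/*
 * Choose the save instruction from the CPUID feature word.
 * Passing allow_xsaveopt = 0 models the workaround: the feature bit is
 * ignored and plain XSAVE is used even on CPUs that advertise XSAVEOPT.
 */
static enum fpu_save_method
pick_save_method(uint32_t leaf_d_sub1_eax, int allow_xsaveopt)
{
	if (allow_xsaveopt && (leaf_d_sub1_eax & XSAVEOPT_BIT))
		return SAVE_XSAVEOPT;
	return SAVE_XSAVE;
}
```

Commenting out the two lines in identcpu.c corresponds to calling this with
allow_xsaveopt set to 0 on every CPU.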

>Unformatted:
 	
 	

