Source-Changes archive


CVS commit: src/sys/arch



Module Name:    src
Committed By:   riastradh
Date:           Thu Apr 24 01:50:39 UTC 2025

Modified Files:
        src/sys/arch/amd64/amd64: genassym.cf
        src/sys/arch/amd64/include: pcb.h
        src/sys/arch/i386/i386: genassym.cf
        src/sys/arch/i386/include: pcb.h
        src/sys/arch/x86/include: cpu.h cpu_extended_state.h
        src/sys/arch/x86/x86: fpu.c vm_machdep.c

Log Message:
amd64: Allocate FPU save state outside pcb if it's too large.

We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
state.
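
As a rough userland illustration of where that number comes from, the
sketch below reads CPUID leaf 0x0d, sub-leaf 0, and returns ECX.  It
assumes a GCC- or Clang-compatible compiler that provides <cpuid.h>;
the function name xsave_max_area_size is made up for this example and
is not how the kernel computes x86_fpu_save_size.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper; assumed to be available */
#endif

/*
 * Return CPUID[EAX=0x0d, ECX=0].ECX, the XSAVE area size in bytes
 * needed for every state component the CPU supports, or 0 if the
 * leaf is unavailable (or this is not an x86 build).
 * Hypothetical helper for illustration only.
 */
static unsigned
xsave_max_area_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned eax, ebx, ecx, edx;

	/* __get_cpuid_count returns 0 if the leaf is not supported. */
	if (!__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return ecx;
#else
	return 0;
#endif
}
```

On a machine that advertises AMX TILEDATA, a value in the neighbourhood
of the 11008 bytes mentioned above would be expected; on older CPUs it
is much smaller.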

We only do this for user threads, and only on machines where it's
necessary, to avoid incurring much overhead.  There is still a tiny
bit of overhead when saving and restoring the FPU state, because
struct pcb::pcb_savefpu is now reached through a pointer rather than
at a fixed offset from the pcb, but this is probably a drop in the
bucket compared to the memory traffic incurred by the FPU state
save/restore anyway.
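
The trade-off can be sketched in plain C as follows.  This is only a
toy model under stated assumptions, not the layout this commit uses:
struct toy_pcb, its fields, and the toy_pcb_* functions are all
hypothetical names, aligned_alloc stands in for whatever allocator the
kernel uses, and the 576-byte inline buffer is borrowed from the size
of union savefpu quoted later in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_INLINE_FPU_BYTES	576	/* assumed inline capacity */

/* Hypothetical pcb: save area is always reached through fpu_area. */
struct toy_pcb {
	void		*fpu_area;	/* inline buffer or separate block */
	size_t		fpu_size;	/* bytes of state to save/restore */
	_Alignas(64) unsigned char fpu_inline[TOY_INLINE_FPU_BYTES];
};

/* Nonzero if the save area lives outside the pcb. */
static int
toy_pcb_fpu_is_external(const struct toy_pcb *pcb)
{
	return pcb->fpu_area != (const void *)pcb->fpu_inline;
}

/* Use the inline buffer when it is big enough, else allocate. */
static int
toy_pcb_init(struct toy_pcb *pcb, size_t save_size)
{
	assert(save_size > 0);
	pcb->fpu_size = save_size;
	if (save_size <= sizeof(pcb->fpu_inline)) {
		pcb->fpu_area = pcb->fpu_inline;
	} else {
		/* XSAVE wants 64-byte alignment; round the size up too. */
		size_t rounded = (save_size + 63) & ~(size_t)63;

		pcb->fpu_area = aligned_alloc(64, rounded);
		if (pcb->fpu_area == NULL)
			return -1;
	}
	memset(pcb->fpu_area, 0, save_size);
	return 0;
}

/* Release the separate block, if there is one. */
static void
toy_pcb_fini(struct toy_pcb *pcb)
{
	if (toy_pcb_fpu_is_external(pcb))
		free(pcb->fpu_area);
	pcb->fpu_area = NULL;
}
```

In this model every save or restore pays one extra pointer load, but
only a state too large for the inline buffer costs a separate
allocation.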

For now, these paths are mostly disabled on i386.  We could enable
them, but that would require either rewriting cpu_uarea_alloc/free
for i386 or adopting a guard page as amd64 does, which might be
costly and so should be undertaken only with some thought and care.
And since Intel AMX instructions only work in 64-bit mode, enabling
them is not likely to be useful on i386.

PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu

These changes, as a side effect, may fix:

PR kern/57258: kthread_fpu_enter/exit problem

by making sure to allocate an FPU save space that is large enough to
guarantee fpu_kern_enter/leave work safely, instead of just using a
union savefpu object on the stack (which, at 576 bytes, may be too
small on some machines, particularly with AVX512 requiring ~2.5K).
(But we'll have to do some extra work with kthread_fpu_enter/exit_md
-- if we try doing them again on x86 -- to actually allocate the
separate pcb on these machines!)


To generate a diff of this commit:
cvs rdiff -u -r1.98 -r1.99 src/sys/arch/amd64/amd64/genassym.cf
cvs rdiff -u -r1.32 -r1.33 src/sys/arch/amd64/include/pcb.h
cvs rdiff -u -r1.136 -r1.137 src/sys/arch/i386/i386/genassym.cf
cvs rdiff -u -r1.59 -r1.60 src/sys/arch/i386/include/pcb.h
cvs rdiff -u -r1.139 -r1.140 src/sys/arch/x86/include/cpu.h
cvs rdiff -u -r1.18 -r1.19 src/sys/arch/x86/include/cpu_extended_state.h
cvs rdiff -u -r1.89 -r1.90 src/sys/arch/x86/x86/fpu.c
cvs rdiff -u -r1.46 -r1.47 src/sys/arch/x86/x86/vm_machdep.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



