NetBSD-Bugs archive


port-amd64/59299: Support Intel AMX CPU state (TILECFG/TILEDATA)



>Number:         59299
>Category:       port-amd64
>Synopsis:       Support Intel AMX CPU state (TILECFG/TILEDATA)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 16 00:20:00 +0000 2025
>Originator:     Taylor R Campbell
>Release:        current
>Organization:
The TileCFG Foundation
>Environment:
>Description:
Intel AMX (Advanced Matrix Extensions) extends the CPU state with:

- a 64-byte TILECFG register
- an 8192-byte TILEDATA register

which a new set of tile computation instructions operates on.  These registers work much like the xmm/ymm/zmm extended SIMD registers: they are saved and restored with XSAVE/XRSTOR; support for them is indicated in CPUID[EAX=0x0d, ECX=0], with the size and offset of each component reported in the other CPUID[EAX=0x0d, ECX=...] outputs; and access to them is controlled via XCR0 bits.
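As a rough sketch of the XCR0 side of this: per the Intel SDM, TILECFG is state component 17 and TILEDATA is state component 18, and the two XCR0 bits must be set or cleared together.  The macro and function names below are made up for illustration and are not taken from the NetBSD headers; hw_mask stands for the EDX:EAX value returned by CPUID[EAX=0x0d, ECX=0], and sw_mask for whatever set of components the kernel knows how to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * State-component numbers (XCR0 bit positions) from the Intel SDM.
 * The macro names are illustrative, not existing NetBSD definitions.
 */
#define XSAVE_TILECFG	17
#define XSAVE_TILEDATA	18
#define XCR0_AMX_MASK \
	((UINT64_C(1) << XSAVE_TILECFG) | (UINT64_C(1) << XSAVE_TILEDATA))

/* True if the hardware mask advertises both AMX state components. */
static bool
amx_state_supported(uint64_t hw_mask)
{
	return (hw_mask & XCR0_AMX_MASK) == XCR0_AMX_MASK;
}

/*
 * Choose the XCR0 value to enable: the components supported by both
 * hardware and software, with the AMX pair dropped unless both bits
 * survive, since XCR0[18:17] may only be 00b or 11b.
 */
static uint64_t
xcr0_select(uint64_t hw_mask, uint64_t sw_mask)
{
	uint64_t xcr0 = hw_mask & sw_mask;

	if ((xcr0 & XCR0_AMX_MASK) != XCR0_AMX_MASK)
		xcr0 &= ~XCR0_AMX_MASK;

	/* Postcondition: AMX bits are both set or both clear. */
	assert((xcr0 & XCR0_AMX_MASK) == 0 ||
	    (xcr0 & XCR0_AMX_MASK) == XCR0_AMX_MASK);
	return xcr0;
}
```

For example, with a hardware mask of 0x600e7 (x87, SSE, AVX, AVX-512 and AMX) and a software mask that lacks TILEDATA, xcr0_select() falls back to 0xe7 rather than enabling TILECFG alone.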

With the patches for

PR kern/57661: Crash when booting on Xeon Silver 4416+ in KVM/Qemu
https://gnats.netbsd.org/57661

we can save and restore the AMX state naively, but we should also expose it to ptrace(2) for debuggers.
>How-To-Repeat:
do matrixy stuff, I dunno
>Fix:
1. Define XSAVE_* numbers for TILECFG and TILEDATA.
2. Extend `struct xstate' (NetBSD software representation of the XSAVE area with fixed offsets) with TILECFG and TILEDATA components.
3. At boot time, compute the size of the smallest prefix of `struct xstate' that fits all the XSAVE_* components supported by both the hardware and the software, say x86_xstate_size.
4. Change sizeof(struct xstate) to x86_xstate_size in various places, and use kmem_zalloc(x86_xstate_size, KM_SLEEP) instead of stack-allocated struct xstate objects (now that they're well over 10 KiB).
5. Add automatic tests of ptrace access to TILECFG/TILEDATA, which can be run on new enough CPUs.
6. Make sure gdb knows what to do.
7. Go back after all this work and discover that Intel has totally deprecated AMX and decided to ditch it like it did a few years ago with MPX.
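
The size computation in step 3 could look roughly like the following.  This is only a sketch: struct xstate_component, example_components and xstate_prefix_size are hypothetical names, and the offsets in the table are the standard-format values commonly reported by CPUID on current Intel parts, used here purely as an example layout; the fixed offsets NetBSD picks for `struct xstate' may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy FXSAVE region (512 bytes) plus XSAVE header (64 bytes). */
#define XSTATE_MIN_SIZE	576

/* Hypothetical description of one component in the software layout. */
struct xstate_component {
	unsigned bit;	/* XSAVE_* number, i.e. XCR0 bit position */
	size_t offset;	/* fixed offset within the save area */
	size_t size;	/* size in bytes */
};

/* Example layout only; not copied from NetBSD's struct xstate. */
static const struct xstate_component example_components[] = {
	{  2,  576,  256 },	/* YMM_Hi128 */
	{  5,  832,   64 },	/* opmask */
	{  6,  896,  512 },	/* ZMM_Hi256 */
	{  7, 1408, 1024 },	/* Hi16_ZMM */
	{ 17, 2752,   64 },	/* TILECFG */
	{ 18, 2816, 8192 },	/* TILEDATA */
};
#define EXAMPLE_NCOMPONENTS \
	(sizeof(example_components) / sizeof(example_components[0]))

/*
 * Smallest prefix of the save area that covers every component
 * enabled in both the hardware and the software masks.
 */
static size_t
xstate_prefix_size(const struct xstate_component *c, size_t n,
    uint64_t hw_mask, uint64_t sw_mask)
{
	const uint64_t mask = hw_mask & sw_mask;
	size_t total = XSTATE_MIN_SIZE;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(c[i].bit < 64);
		if ((mask & (UINT64_C(1) << c[i].bit)) == 0)
			continue;	/* not enabled; needs no space */
		if (c[i].offset + c[i].size > total)
			total = c[i].offset + c[i].size;
	}
	return total;
}
```

With this example table, a CPU with AVX-512 but no AMX needs 2432 bytes, while enabling TILECFG and TILEDATA brings the total to 11008 bytes, which is why step 4 moves these objects off the stack.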


