tech-kern archive
Straw proposal: MI kthread vector/fp unit API
Here's a straw proposal for an MI API to allow a kthread to use any
vector or floating-point unit on the CPU -- call it the `FPU' for
brevity.
The MI concept of `the FPU' encompasses _all_ vector or floating-point
units that userland threads would have access to, so we don't have to
complicate it by distinguishing, e.g., the crypto registers from the
floating-point registers on Cavium -- it's all or nothing.
1. New kthread flag KTHREAD_FPU.
Any kthread created with this flag will have its FPU state saved
and restored like a userland thread.
The implementation would be a new lwp l_pflag, say LP_SYSTEM_FPU.
MD FPU traps which currently panic on LP_SYSTEM lwps will panic
only if LP_SYSTEM && !LP_SYSTEM_FPU.
2. New functions
s = kthread_fpu_enter();
...
kthread_fpu_exit(s);
Between the two calls, the calling kthread behaves as if it had been
created with KTHREAD_FPU. kthread_fpu_enter/exit nest.
kthread_fpu_exit additionally zeroes
the FPU registers to avoid leaking secrets through Spectre-class
vulnerabilities in case an adversary can control speculative FPU
execution before the next FPU-changing context switch.
3. New workqueue flag WQ_FPU passes KTHREAD_FPU to all the internal
kthreads. Threadpools do not have any new flag -- they can use
kthread_fpu_enter/exit in the job function, since different
threadpool jobs by design share kthreads with one another.
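To make the flag and nesting semantics in items 1 and 2 concrete, here
is a rough userland model, not kernel code. Everything prefixed
model_/MODEL_ is hypothetical, including the bit values and the register
array. It also assumes two things the proposal does not spell out: that
the value returned by enter is simply whether the FPU flag was already
set, and that only the outermost exit zeroes the registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Model flag bits; the values are made up for this sketch. */
#define MODEL_LP_SYSTEM     0x1u
#define MODEL_LP_SYSTEM_FPU 0x2u
#define MODEL_NFPREGS       4

/* Stand-in for an lwp: private flags plus a toy FPU register file. */
struct model_lwp {
	unsigned pflag;
	unsigned long fpregs[MODEL_NFPREGS];
};

/* Item 1: an FPU trap panics only if LP_SYSTEM && !LP_SYSTEM_FPU. */
static bool
model_fpu_trap_panics(const struct model_lwp *l)
{
	return (l->pflag & MODEL_LP_SYSTEM) != 0 &&
	    (l->pflag & MODEL_LP_SYSTEM_FPU) == 0;
}

/* Item 2: remember the previous state (assumed meaning of s), set flag. */
static int
model_kthread_fpu_enter(struct model_lwp *l)
{
	int s = (l->pflag & MODEL_LP_SYSTEM_FPU) != 0;

	l->pflag |= MODEL_LP_SYSTEM_FPU;
	return s;
}

/* Restore the previous state; zero registers only on the outermost exit. */
static void
model_kthread_fpu_exit(struct model_lwp *l, int s)
{
	assert((l->pflag & MODEL_LP_SYSTEM_FPU) != 0);
	if (s)
		return;		/* nested exit: an outer section is still open */
	memset(l->fpregs, 0, sizeof(l->fpregs));
	l->pflag &= ~MODEL_LP_SYSTEM_FPU;
}

/* Walk through one nested pair and check each intermediate state. */
static bool
model_nesting_demo(void)
{
	struct model_lwp l = { MODEL_LP_SYSTEM, { 0 } };
	bool ok = model_fpu_trap_panics(&l);	/* plain kthread: would panic */
	int s0, s1;

	s0 = model_kthread_fpu_enter(&l);
	l.fpregs[0] = 0x5ec2e7UL;		/* pretend secret in a register */
	s1 = model_kthread_fpu_enter(&l);
	model_kthread_fpu_exit(&l, s1);
	/* Inner exit leaves the flag set and the registers untouched. */
	ok = ok && !model_fpu_trap_panics(&l) && l.fpregs[0] == 0x5ec2e7UL;
	model_kthread_fpu_exit(&l, s0);
	/* Outer exit clears the flag and zeroes the registers. */
	ok = ok && model_fpu_trap_panics(&l) && l.fpregs[0] == 0;
	return ok;
}
```

Returning the previous state, rather than keeping a depth counter in the
lwp, is just one way to get nesting. It mirrors the s = ...; ...(s)
shape of the calls above.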
There may also be MD functions like x86 fpu_kern_enter to use the FPU
with preemption disabled. They may be limited to a single type of FPU
or vector unit, e.g. just Cavium crypto but not MIPS floating-point.
These functions can avoid disabling preemption -- and avoid zeroing
the FPU registers -- in FPU-enabled kthreads.
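As a sketch of that fast path, here is a userland model of an MD
enter/exit pair that skips the expensive steps when the caller is
already FPU-enabled. The names md_model_* are hypothetical and do not
reflect the actual x86 fpu_kern_enter implementation. The counters only
stand in for disabling preemption and zeroing registers.

```c
#include <stdbool.h>

/* Stand-in for the calling lwp's state, as far as this sketch cares. */
struct md_model_lwp {
	bool fpu_enabled;	/* KTHREAD_FPU or inside kthread_fpu_enter */
	int nopreempt;		/* preemption-disable depth */
	int zeroings;		/* how often the registers were zeroed */
};

/* MD enter: only a non-FPU-enabled caller has to disable preemption. */
static void
md_model_fpu_enter(struct md_model_lwp *l)
{
	if (l->fpu_enabled)
		return;		/* state is saved/restored on switch anyway */
	l->nopreempt++;
}

/* MD exit: only a non-FPU-enabled caller zeroes and re-enables. */
static void
md_model_fpu_exit(struct md_model_lwp *l)
{
	if (l->fpu_enabled)
		return;
	l->zeroings++;
	l->nopreempt--;
}

/* Number of register zeroings paid for one enter/exit pair. */
static int
md_model_pair_zeroings(bool fpu_enabled)
{
	struct md_model_lwp l = { fpu_enabled, 0, 0 };

	md_model_fpu_enter(&l);
	md_model_fpu_exit(&l);
	return l.zeroings;
}
```

In this model an FPU-enabled caller pays for neither step, while any
other caller pays for both on every pair.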
That way, for example, you can use (say) an AES encryption routine
aes_enc as a subroutine anywhere in the kernel, and an MD definition
of aes_enc can internally use AES-NI with the appropriate MD
fpu_kern_enter -- but it's a little cheaper to use aes_enc in an
FPU-enabled kthread. This gave a modest measurable boost to cgd(4)
throughput in my preliminary experiments.
Thoughts?