[patch 0/6] x86/fpu: Preparatory changes for guest AMX support

From: Thomas Gleixner
Date: Mon Dec 13 2021 - 21:50:25 EST


Folks,

this is a follow-up to the initial sketch of patches which was picked up by
Jing and posted in combination with the KVM parts:

https://lore.kernel.org/r/20211208000359.2853257-1-yang.zhong@xxxxxxxxx

This update touches only the x86/fpu code and changes nothing on the KVM
side.

BIG FAT WARNING: This is compile tested only!

In the course of the discussion of the above patchset it turned out that
there are a few conceptual issues vs. hardware and software state and also
vs. guest restore.

This series addresses this with the following changes vs. the original
approach:

1) fpstate reallocation is now independent of fpu_swap_kvm_fpstate()

   It is triggered directly via XSETBV and XFD MSR write emulation which
   are used both for runtime and restore purposes.

   For this it provides two wrappers around a common update function, one
   for XCR0 and one for XFD.

   Both check the validity of the arguments and the correct sizing of the
   guest FPU fpstate. If the size is not sufficient, fpstate is
   reallocated.

   The functions can fail.

2) XFD synchronization

   KVM must neither touch the XFD MSR nor the fpstate->xfd software state
   in order to guarantee state consistency.

   In the MSR write emulation case the XFD specific update handler has to
   be invoked. See #1

   If MSR write emulation is disabled because the buffer size is
   sufficient for all use cases, i.e.:

        guest_fpu::xfeatures == guest_fpu::perm

   then there is no guarantee that the XFD software state on VMEXIT is
   the same as the state on VMENTER.

   A separate synchronization function is provided which reads the XFD
   MSR and updates the relevant software state. This function has to be
   invoked after a VMEXIT before reenabling interrupts.

With that the KVM logic looks like this:

    xsetbv_emulate()
        ret = fpu_update_guest_xcr0(&vcpu->arch.guest_fpu, xcr0);
        if (ret)
            handle_fail()
        ....


    kvm_emulate_wrmsr()
        ....
        case MSR_IA32_XFD:
            ret = fpu_update_guest_xfd(&vcpu->arch.guest_fpu, vcpu->arch.xcr0, msrval);
            if (ret)
                handle_fail()
            ....

This covers both the case of a running vCPU and the case of restore.
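
To make the "two wrappers around a common update function" idea from #1
concrete, here is a small userspace model. It is only a sketch and not the
code in this series: struct model_guest_fpu, the model_* function names, the
toy sizing rule and the choice of -EINVAL/-ENOMEM are all assumptions made
for illustration. It also assumes that the buffer has to cover the features
enabled in XCR0 which are not disabled via XFD.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the guest FPU container (not struct fpu_guest) */
struct model_guest_fpu {
	uint64_t perm;       /* features the guest is permitted to use */
	uint64_t xfeatures;  /* features the current buffer is sized for */
	uint64_t xfd;        /* last accepted XFD value (software state) */
	size_t   size;       /* current buffer size in bytes */
	void    *buf;        /* stand-in for the fpstate buffer */
};

/* Toy sizing rule (assumption): 64 byte header plus 64 bytes per feature */
static size_t model_required_size(uint64_t features)
{
	size_t size = 64;

	for (; features; features &= features - 1)
		size += 64;
	return size;
}

/* Common update: validate the arguments, grow the buffer if required */
static int model_update_guest(struct model_guest_fpu *g, uint64_t xcr0,
			      uint64_t xfd)
{
	uint64_t needed = xcr0 & ~xfd;	/* enabled and not disabled by XFD */
	size_t size;
	void *buf;

	assert(g != NULL);

	if (xcr0 & ~g->perm)
		return -EINVAL;		/* feature without permission */

	if (!(needed & ~g->xfeatures))
		return 0;		/* buffer is already large enough */

	size = model_required_size(g->xfeatures | needed);
	buf = realloc(g->buf, size);
	if (!buf)
		return -ENOMEM;		/* old buffer stays valid */

	g->buf = buf;
	g->size = size;
	g->xfeatures |= needed;
	return 0;
}

/* Wrapper for the XSETBV path: new XCR0, XFD taken from software state */
static int model_update_guest_xcr0(struct model_guest_fpu *g, uint64_t xcr0)
{
	return model_update_guest(g, xcr0, g->xfd);
}

/* Wrapper for the XFD MSR write path: records XFD only on success */
static int model_update_guest_xfd(struct model_guest_fpu *g, uint64_t xcr0,
				  uint64_t xfd)
{
	int ret = model_update_guest(g, xcr0, xfd);

	if (!ret)
		g->xfd = xfd;
	return ret;
}
```

Both entry points can fail, so a caller has to check the return value in the
same way as the xsetbv_emulate() and kvm_emulate_wrmsr() outlines above do.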

The XFD synchronization mechanism is only relevant for a running vCPU after
VMEXIT when XFD MSR write emulation is disabled:

    vcpu_run()
      vcpu_enter_guest()
        for (;;) {
            ...
            vmenter();
            ...
        };
        ...

        if (!xfd_write_emulated(vcpu))
            fpu_sync_guest_vmexit_xfd_state();

        local_irq_enable();

It has no relevance for the guest restore case.

With that all XFD/fpstate related issues should be covered in a consistent
way.

CPUID validation can be done without exporting yet more FPU functions:

    if (requested_xfeatures & ~vcpu->arch.guest_fpu.perm)
        return -ENOPONY;

That has been the purpose of fpu_guest::perm from the beginning, along with
fpu_guest::xfeatures for other validation purposes.
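
As a worked example of that check, here is a standalone sketch. The function
name and the use of -EINVAL as the error code are placeholders picked for
illustration; the bit positions are the architectural XSAVE feature numbers
for the AMX tile config (17) and tile data (18) components.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_XFEATURE_MASK_FPSSE	0x3ULL		/* x87 and SSE */
#define MODEL_XFEATURE_MASK_XTILE_CFG	(1ULL << 17)	/* AMX tile config */
#define MODEL_XFEATURE_MASK_XTILE_DATA	(1ULL << 18)	/* AMX tile data */

/* Reject a CPUID feature set which exceeds the guest permissions */
static int model_check_cpuid_xfeatures(uint64_t requested, uint64_t perm)
{
	if (requested & ~perm)
		return -EINVAL;	/* placeholder error code */
	return 0;
}
```

A guest which only has permission for x87 and SSE passes with a request for
exactly those features and is rejected once the request includes the tile
data component.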

XFD_ERR MSR handling is completely separate and, as discussed, a KVM-only
issue for now. KVM has to ensure that the MSR is 0 before interrupts are
enabled. So this is not touched here.

The only remaining issue is the KVM XSTATE save/restore size checking which
probably requires some FPU core assistance. That needs some more thought
vs. the IOCTL interface extension, and once that is settled it has to be
solved in one go. But that's an orthogonal issue to the above.

The series is also available from git:

git://git.kernel.org/pub/scm/linux/kernel/git/people/tglx/devel.git x86/fpu-kvm

Thanks,

tglx
---
include/asm/fpu/api.h | 63 ++++++++++++++++++++++++
include/asm/fpu/types.h | 22 ++++++++
include/uapi/asm/prctl.h | 26 +++++----
kernel/fpu/core.c | 123 ++++++++++++++++++++++++++++++++++++++++++++---
kernel/fpu/xstate.c | 118 +++++++++++++++++++++++++++------------------
kernel/fpu/xstate.h | 20 ++++++-
kernel/process.c | 2
7 files changed, 307 insertions(+), 67 deletions(-)