Re: [PATCH AUTOSEL 5.16 07/28] x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0

From: Paolo Bonzini
Date: Tue Jun 07 2022 - 08:55:13 EST


On 6/6/22 23:27, Peter Xu wrote:
> On Mon, Jun 06, 2022 at 06:18:12PM +0200, Paolo Bonzini wrote:
>>> However there seems to be something missing at least to me, on why it'll
>>> fail a migration from 5.15 (without this patch) to 5.18 (with this patch).
>>> In my test case, user_xfeatures will be 0x7 (FP|SSE|YMM) if without this
>>> patch, but 0x0 if with it.
>>
>> What CPU model are you using for the VM?
>
> I didn't specify it, assuming it's qemu64 with no extra parameters.

Ok, so indeed it lacks AVX and this patch can have an effect.
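
To make the mask arithmetic concrete, here is a minimal sketch of how
limiting user_xfeatures to the guest's supported XCR0 bits could turn 0x7
into 0x0. The bit positions are the architectural XCR0 ones; the helper
names and the guest mask of 0 are assumptions chosen only to match the
numbers reported above, not the actual kernel code.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural XCR0 component bits. */
#define XFEATURE_MASK_FP   (1ULL << 0)  /* x87 state */
#define XFEATURE_MASK_SSE  (1ULL << 1)  /* XMM registers */
#define XFEATURE_MASK_YMM  (1ULL << 2)  /* upper halves of YMM (AVX) */

/* Hypothetical helper: keep only the bits the guest CPU model supports. */
static uint64_t limit_user_xfeatures(uint64_t host_xfeatures,
                                     uint64_t guest_supported_xcr0)
{
    return host_xfeatures & guest_supported_xcr0;
}

/* Hypothetical helper: render a mask as text such as "FP|SSE|YMM". */
static void describe_xfeatures(uint64_t mask, char *buf, size_t len)
{
    static const char *const names[] = { "FP", "SSE", "YMM" };
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (unsigned int i = 0; i < 3 && used < len; i++) {
        int n;

        if (!(mask & (1ULL << i)))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", names[i]);
        if (n < 0)
            break;
        used += (size_t)n;
    }
}
```

With these helpers, limit_user_xfeatures(0x7, 0x0) yields 0x0, while a
guest mask of FP|SSE (0x3) would yield 0x3, i.e. only the YMM bit is
dropped.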

>> For example, if the source lacks this patch but the destination has it,
>> the source will transmit YMM registers, but the destination will fail to
>> set them if they are not available for the selected CPU model.
>>
>> See the commit message: "As a bonus, it will also fail if userspace tries to
>> set fpu features (with the KVM_SET_XSAVE ioctl) that are not compatible to
>> the guest configuration. Such features will never be returned by
>> KVM_GET_XSAVE or KVM_GET_XSAVE2."
>
> IIUC you meant we should have failed KVM_SET_XSAVE when they're not aligned
> (probably by failing validate_user_xstate_header when checking against the
> user_xfeatures on dest host). But that's probably not my case, because here
> KVM_SET_XSAVE succeeded, it's just that the guest gets a double fault after
> the precopy migration completes (or for postcopy when the switchover is
> done).
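
For reference, the kind of check being discussed can be modelled roughly
as below. This is a simplified userspace sketch, not the kernel's
validate_user_xstate_header(); the struct and function names are made up
for illustration and only two header fields are modelled. It shows the
failure that would be expected when the incoming header carries bits
outside the destination's user_xfeatures. It does not explain a case
where KVM_SET_XSAVE succeeds and the guest faults later.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for the XSAVE header; only the fields used here. */
struct xstate_header_sketch {
    uint64_t xfeatures;  /* XSTATE_BV: components present in the buffer */
    uint64_t xcomp_bv;   /* nonzero only for the compacted format */
};

/* Hypothetical model of the validation step on the destination. */
static int validate_header_sketch(const struct xstate_header_sketch *hdr,
                                  uint64_t user_xfeatures)
{
    /* Reject any component the destination does not permit. */
    if (hdr->xfeatures & ~user_xfeatures)
        return -EINVAL;

    /* Assumption: userspace buffers use the standard, non-compacted format. */
    if (hdr->xcomp_bv)
        return -EINVAL;

    return 0;
}
```

In this model, a header with xfeatures = 0x7 passes against a
user_xfeatures of 0x7 but is rejected with -EINVAL against 0x3 or 0x0.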

It's difficult to say what's happening without seeing at least the guest
code around the double fault, and possibly which exception was the first
one that contributed to it. (Above you said "fail a migration", and I
thought that was a different scenario than the double fault.)

Paolo