Re: [PATCH RFC 2/7] kvm: x86: Introduce XFD MSRs as passthrough to guest

From: Liu, Jing2
Date: Tue Jul 06 2021 - 03:34:11 EST

On 6/30/2021 1:58 AM, Dave Hansen wrote:
On 6/27/21 7:00 PM, Liu, Jing2 wrote:
On 6/24/2021 1:50 AM, Dave Hansen wrote:
On 5/24/21 2:43 PM, Sean Christopherson wrote:
On Sun, Feb 07, 2021, Jing Liu wrote:
Passthrough both MSRs to let guest access and write without vmexit.
Why? Except for read-only MSRs, e.g. MSR_CORE_C1_RES,
passthrough MSRs are costly to support because KVM must context
switch the MSR (which, by the by, is completely missing from the

In other words, if these MSRs are full RW passthrough, guests
with XFD enabled will need to load the guest value on entry, save
the guest value on exit, and load the host value on exit. That's
in the neighborhood of a 40% increase in latency for a single
VM-Enter/VM-Exit roundtrip (~1500 cycles => 2000 cycles).
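
To make the cost described above concrete, here is a minimal user-space model of the three extra MSR accesses per roundtrip (load guest value on entry, save guest value on exit, load host value on exit). It is only a sketch: struct xfd_model and the model_*() helpers are hypothetical names for illustration, not KVM code, and the "MSR" is a plain variable.

```c
#include <stdint.h>

/* Hypothetical model, not KVM code: one simulated MSR plus access counters. */
struct xfd_model {
    uint64_t hw_xfd;       /* value currently held by the simulated MSR */
    uint64_t guest_xfd;    /* guest value saved by the hypervisor */
    uint64_t host_xfd;     /* host value to restore after an exit */
    unsigned rdmsr_count;  /* number of simulated RDMSRs */
    unsigned wrmsr_count;  /* number of simulated WRMSRs */
};

static uint64_t model_rdmsr(struct xfd_model *m)
{
    m->rdmsr_count++;
    return m->hw_xfd;
}

static void model_wrmsr(struct xfd_model *m, uint64_t val)
{
    m->wrmsr_count++;
    m->hw_xfd = val;
}

/* Entry: load the guest value into the MSR (one WRMSR). */
static void model_vm_enter(struct xfd_model *m)
{
    model_wrmsr(m, m->guest_xfd);
}

/*
 * Exit: the guest may have written the MSR without a vmexit, so the
 * current value must be read back (one RDMSR) before the host value
 * is restored (one WRMSR).
 */
static void model_vm_exit(struct xfd_model *m)
{
    m->guest_xfd = model_rdmsr(m);
    model_wrmsr(m, m->host_xfd);
}
```

In this model every enter/exit pair costs one RDMSR and two WRMSRs, which is the overhead being objected to for full RW passthrough.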
I'm not taking a position as to whether these _should_ be passthrough or
not.  But, if they are, I don't think you strictly need to do the
RDMSR/WRMSR at VM-Exit time.
Hi Dave,

Thanks for reviewing the patches.

On vmexit, clearing XFD (because KVM thinks the guest has requested AMX) can
be deferred until the host does XSAVES, but this requires a new flag in the
common "fpu" structure, or a common per-thread macro, that exists only for
the KVM case, plus a check of that flag in 1) switch_fpu_prepare() and
2) kernel_fpu_begin(). This is my concern.
Why is this a concern? You're worried about finding a single bit worth
of space somewhere?
A bit for the flag can be found for now, though space is somewhat tight. What I
am worried about is that we introduce a per-thread flag and add the check in core
places like the softirq path and the context switch path, to handle a case that
only applies to a KVM thread + XFD=1 + AMX usage in the guest. This is not a very
frequent case, but we would need to check every time for every thread.
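
A rough model of the deferred approach, only to illustrate where the cost lands. Everything here is hypothetical (struct thread_model, the xfd_clear_pending flag, and the model_*() helpers are made-up names, and the "MSR" is a plain variable). model_before_fpu_save() stands in for the check that would sit in switch_fpu_prepare() and kernel_fpu_begin().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state; not the kernel's struct fpu. */
struct thread_model {
    bool xfd_clear_pending;  /* hypothetical flag set on vmexit */
    uint64_t hw_xfd;         /* simulated XFD MSR value */
    unsigned flag_checks;    /* how often the flag was tested */
    unsigned wrmsr_count;    /* how often the MSR was actually written */
};

/* On vmexit: only record that XFD must be cleared before the next save. */
static void model_vmexit_defer(struct thread_model *t, bool guest_used_amx)
{
    if (guest_used_amx)
        t->xfd_clear_pending = true;
}

/*
 * Stand-in for the check in the FPU save paths. The flag test runs on
 * every call for every thread; the write happens only when the flag
 * is set.
 */
static void model_before_fpu_save(struct thread_model *t)
{
    t->flag_checks++;
    if (t->xfd_clear_pending) {
        t->hw_xfd = 0;
        t->wrmsr_count++;
        t->xfd_clear_pending = false;
    }
}
```

The counters show the asymmetry: flag_checks grows on every save path call, while wrmsr_count grows only in the KVM + AMX case.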

I am considering using XGETBV(1) (~24 cycles) to detect whether KVM really needs
wrmsr(0) to clear XFD for guest AMX state on vmexit. I think this is not a very
frequent case. My only concern is: does (or will) the kernel check anywhere that a
thread's in-memory fpu buffer is already large while the thread's XFD=1? (I believe not.)
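
A minimal sketch of the XGETBV(1) idea, under stated assumptions. XGETBV with ECX=1 returns XCR0 ANDed with XINUSE on CPUs that enumerate it (CPUID.(EAX=0DH,ECX=1):EAX[2]), and AMX TILEDATA is state component 18. The names read_xinuse() and need_xfd_clear() are hypothetical, and the exact condition KVM would want is an assumption here: write the MSR only when tile data is in use and the XFD value in question still has the TILEDATA bit armed.

```c
#include <stdbool.h>
#include <stdint.h>

/* AMX TILEDATA is XSAVE state component 18. */
#define XTILE_DATA_BIT  (1ULL << 18)

#if defined(__x86_64__) || defined(__i386__)
/*
 * XGETBV with ECX=1 returns XCR0 & XINUSE. The caller must first
 * confirm support via CPUID.(EAX=0DH,ECX=1):EAX[2]; otherwise the
 * instruction faults.
 */
static inline uint64_t read_xinuse(void)
{
    uint32_t lo, hi;

    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(1u));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/*
 * Hypothetical decision helper: a WRMSR to clear XFD is needed only
 * if the tile data component is in use and XFD still has that bit set.
 */
static bool need_xfd_clear(uint64_t xinuse, uint64_t xfd)
{
    return (xinuse & XTILE_DATA_BIT) && (xfd & XTILE_DATA_BIT);
}
```

On vmexit this would be used roughly as need_xfd_clear(read_xinuse(), xfd), so the common path pays only for the XGETBV and the WRMSR is skipped unless the guest actually has live tile state.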