Re: Thoughts of AMX KVM support based on latest kernel
From: Nakajima, Jun
Date: Tue Nov 16 2021 - 17:11:54 EST
>
> On Nov 16, 2021, at 12:36 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Paolo,
>
> On Tue, Nov 16 2021 at 20:49, Paolo Bonzini wrote:
>> On 11/16/21 19:55, Thomas Gleixner wrote:
>>> We can do that, but I'm unhappy about this conditional in schedule(). So
>>> I was asking for doing a simple KVM only solution first:
>>>
>>> vcpu_run()
>>> kvm_load_guest_fpu()
>>> wrmsrl(XFD, guest_fpstate->xfd);
>>> XRSTORS
>>>
>>> do {
>>>
>>> local_irq_disable();
>>>
>>> if (test_thread_flag(TIF_NEED_FPU_LOAD))
>>> switch_fpu_return()
>>> wrmsrl(XFD, guest_fpstate->xfd);
>>>
>>> do {
>>> vmenter(); // Guest modifies XFD
>>> } while (reenter);
>>>
>>> update_xfd_state(); // Restore consistency
>>>
>>> local_irq_enable();
>>>
>>> and check how bad that is for KVM in terms of overhead on AMX systems.
>>
>> I agree, this is how we handle SPEC_CTRL for example and it can be
>> extended to XFD.
>
> SPEC_CTRL is different because it's done right after each VMEXIT.
>
> XFD can be done lazy when breaking out of the exit fastpath loop before
> enabling interrupts.
I agree. The XFD features are for user-space.
>
>> We should first do that, then switch to the MSR lists.
>> Hacking into schedule() should really be the last resort.
>>
>>> local_irq_enable(); <- Problem starts here
>>>
>>> preempt_enable(); <- Becomes wider here
>>
>> It doesn't become that much wider because there's always preempt
>> notifiers. So if it's okay to save XFD in the XSAVES wrapper and in
>> kvm_arch_vcpu_put(), that might be already remove the need to do it
>> schedule().
>
> Did not think about preemption notifiers. Probably because I hate
> notifiers with a passion since I had to deal with the CPU hotplug
> notifier trainwreck.
>
> But yes that would work. So the places to do that would be:
>
> 1) kvm_sched_out() -> kvm_arch_vcpu_put()
> 2) kernel_fpu_begin_mask()
> 3) kvm_put_guest_fpu()
>
> But I really would start with the trivial version I suggested because
> that's already in the slow path and not at every VMEXIT.
>
> I'd be really surprised if that RDMSR is truly noticeable within all the
> other crud this path is doing.
>
I also agree here, and we’ll measure the effect to double-check.
We don’t want to complicate or optimize the system for very rare cases.
I like your "trivial version" because all the things KVM needs to do is just restore the consistent state.
---
Jun