Re: [PATCH v2 1/2] KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpath

From: Sean Christopherson
Date: Wed Nov 20 2019 - 12:02:33 EST

On Wed, Nov 20, 2019 at 11:49:36AM +0800, Wanpeng Li wrote:
> On Tue, 19 Nov 2019 at 20:11, Liran Alon <liran.alon@xxxxxxxxxx> wrote:
> > > +
> > > +static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu, u32 *exit_reason)
> > > {
> > > struct vcpu_vmx *vmx = to_vmx(vcpu);
> > >
> > > @@ -6231,6 +6263,8 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
> > > handle_external_interrupt_irqoff(vcpu);
> > > else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
> > > handle_exception_nmi_irqoff(vmx);
> > > + else if (vmx->exit_reason == EXIT_REASON_MSR_WRITE)
> > > + *exit_reason = handle_ipi_fastpath(vcpu);
> >
> > 1) This case requires a comment as the only reason it is called here is an
> > optimisation. In contrast to the other cases which must be called before
> > interrupts are enabled on the host.
> >
> > 2) I would rename handler to handle_accel_set_msr_irqoff(). To signal this
> > handler runs with host interrupts disabled and to make it a general place
> > for accelerating WRMSRs in case we would require more in the future.
> Yes, TSCDEADLINE/VMX PREEMPTION TIMER is in my todo list after this merged
> upstream, handle all the comments in v3, thanks for making this nicer
> further. :)

Handling those is very different than what is being proposed here though.
For this case, only the side effect of the WRMSR is being expedited; KVM
still goes through the heavy VM-Exit handler path to handle emulating the
WRMSR itself.
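
To illustrate that split, here is a stand-alone mock-up. It is not kernel
code: struct mock_vcpu, its fields, and the mock_* helpers are hypothetical
stand-ins, and only the EXIT_REASON_MSR_WRITE value mirrors the real
definition. It only shows that the IPI is sent early while the WRMSR still
takes a trip through the normal exit handler:

```c
#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_MSR_WRITE 32

/* Hypothetical stand-in for vCPU state; NOT the real struct kvm_vcpu. */
struct mock_vcpu {
	uint32_t exit_reason;
	bool ipi_sent;        /* side effect: IPI delivered to the target */
	bool wrmsr_emulated;  /* full WRMSR emulation has completed */
	int slow_path_runs;   /* trips through the heavy exit handler */
};

/* Stage 1: runs with host IRQs still disabled; only the side effect. */
static void mock_handle_exit_irqoff(struct mock_vcpu *vcpu)
{
	if (vcpu->exit_reason == EXIT_REASON_MSR_WRITE)
		vcpu->ipi_sent = true;
}

/* Stage 2: the normal exit handler, still needed to finish the WRMSR. */
static void mock_handle_exit(struct mock_vcpu *vcpu)
{
	vcpu->slow_path_runs++;
	if (vcpu->exit_reason == EXIT_REASON_MSR_WRITE)
		vcpu->wrmsr_emulated = true;
}

/* One VM-Exit: the early hook runs first, then the full handler. */
static void mock_run_exit(struct mock_vcpu *vcpu)
{
	mock_handle_exit_irqoff(vcpu);
	mock_handle_exit(vcpu);
}
```

Even with the early hook, slow_path_runs still increments once per exit,
which is the cost the proposed fast path does not remove.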

To truly expedite things like TSCDEADLINE, the entire emulation of WRMSR
would need to be handled without going through the standard VM-Exit handler,
which is a much more fundamental change to vcpu_enter_guest() and has
different requirements. For example, keeping IRQs disabled is pointless
for generic WRMSR emulation since the interrupt will fire as soon as KVM
resumes the guest, whereas keeping IRQs disabled for processing ICR writes
is a valid optimization since recognition of the IPI on the dest vCPU
isn't dependent on KVM resuming the current vCPU.

Rather than optimizing full emulation flows one at a time, i.e. exempting
the ICR case, I wonder if we're better off figuring out a way to improve
the performance of VM-Exit handling at a larger scale, e.g. avoid locking
kvm->srcu unnecessarily, Andrea's retpoline changes, etc...

Oh, a random thought: this fast path needs to be skipped if KVM is
running L2, i.e. if is_guest_mode(vcpu) is true.
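
Roughly, the check would sit in front of the fast path call. The snippet
below is a stand-alone sketch, not a patch: struct l2_mock_vcpu and
mock_is_guest_mode() are hypothetical stand-ins for the real vCPU state and
is_guest_mode(vcpu), and the counter just marks where the fast path handler
would be invoked:

```c
#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_MSR_WRITE 32

/* Hypothetical stand-in; NOT the real struct kvm_vcpu. */
struct l2_mock_vcpu {
	uint32_t exit_reason;
	bool guest_mode;     /* true while the vCPU is running L2 */
	int fastpath_runs;   /* marks where the fast path would be called */
};

/* Stand-in for the real is_guest_mode(vcpu) helper. */
static bool mock_is_guest_mode(const struct l2_mock_vcpu *vcpu)
{
	return vcpu->guest_mode;
}

/* Take the fast path only for WRMSR exits that did not come from L2. */
static void mock_exit_irqoff_guarded(struct l2_mock_vcpu *vcpu)
{
	if (vcpu->exit_reason == EXIT_REASON_MSR_WRITE &&
	    !mock_is_guest_mode(vcpu))
		vcpu->fastpath_runs++;
}
```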