Re: [PATCH 5/5] kvm/x86: rework guest entry logic

From: Mark Rutland
Date: Fri Jan 14 2022 - 07:05:48 EST


On Thu, Jan 13, 2022 at 08:50:00PM +0000, Sean Christopherson wrote:
> On Tue, Jan 11, 2022, Mark Rutland wrote:
> > For consistency and clarity, migrate x86 over to the generic helpers for
> > guest timing and lockdep/RCU/tracing management, and remove the
> > x86-specific helpers.
> >
> > Prior to this patch, the guest timing was entered in
> > kvm_guest_enter_irqoff() (called by vmx_vcpu_enter_exit() and
> > svm_vcpu_enter_exit()), and was exited by the call to
> > vtime_account_guest_exit() within vcpu_enter_guest().
> >
> > To minimize duplication and to more clearly balance entry and exit, both
> > entry and exit of guest timing are placed in vcpu_enter_guest(), using
> > the new guest_timing_{enter,exit}_irqoff() helpers. This may result in a
> > small amount of additional time being accounted towards guests.
>
> This can be further qualified to state that it only affects time accounting when
> using context tracking; tick-based accounting is unaffected because IRQs are
> disabled the entire time.

Ok. I'll replace that last sentence with:

When context tracking is used, a small amount of additional time will be
accounted towards guests; tick-based accounting is unaffected as IRQs are
disabled at this point and not enabled until after the return from the guest.

>
> And this might actually be a (benign?) bug fix for context tracking accounting in
> the EXIT_FASTPATH_REENTER_GUEST case (commits ae95f566b3d2 "KVM: X86: TSCDEADLINE
> MSR emulation fastpath" and 26efe2fd92e5, "KVM: VMX: Handle preemption timer
> fastpath"). In those cases, KVM will enter the guest multiple times without
> bouncing through vtime_account_guest_exit(). That means vtime_guest_enter() will
> be called when the CPU is already "in guest", and call vtime_account_system()
> when it really should call vtime_account_guest(). account_system_time() does
> check PF_VCPU and redirect to account_guest_time(), so it appears to be benign,
> but it's at least odd.
>
> > Other than this, there should be no functional change as a result of
> > this patch.

I've added wording:

This also corrects (benign) mis-balanced context tracking accounting
introduced in commits:

ae95f566b3d22ade ("KVM: X86: TSCDEADLINE MSR emulation fastpath")
26efe2fd92e50822 ("KVM: VMX: Handle preemption timer fastpath")

Where KVM can enter a guest multiple times, calling vtime_guest_enter()
without a corresponding call to vtime_account_guest_exit(), and with
vtime_account_system() called when vtime_account_guest() should be used.
As account_system_time() checks PF_VCPU and calls account_guest_time(),
this doesn't result in any functional problem, but is unnecessarily
confusing.

... and deleted the "no functional change" line for now.
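
To illustrate why the mis-balance is benign, here is a toy userspace model in
C. It is not kernel code: the struct, the tick values, the `pending` parameter,
and the run_unbalanced()/run_balanced() helpers are assumptions made purely for
illustration, and the function bodies are simplified to the behaviour described
above. Only the accounting function names are borrowed from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state the accounting code looks at. */
struct task {
	bool pf_vcpu;              /* models the PF_VCPU task flag */
	unsigned long guest_time;  /* ticks charged to the guest */
	unsigned long system_time; /* ticks charged to the host kernel */
};

static void account_guest_time(struct task *t, unsigned long delta)
{
	t->guest_time += delta;
}

/* Models the PF_VCPU check: system time is redirected while "in guest". */
static void account_system_time(struct task *t, unsigned long delta)
{
	if (t->pf_vcpu) {
		account_guest_time(t, delta);
		return;
	}
	t->system_time += delta;
}

/* Simplified entry: flush pending time as system time, then mark "in guest". */
static void vtime_guest_enter(struct task *t, unsigned long pending)
{
	account_system_time(t, pending);
	t->pf_vcpu = true;
}

/* Simplified exit: flush pending time as guest time, then clear the mark. */
static void vtime_guest_exit(struct task *t, unsigned long pending)
{
	assert(t->pf_vcpu);
	account_guest_time(t, pending);
	t->pf_vcpu = false;
}

/* Fastpath re-entry: enter is called twice with no exit in between. */
struct task run_unbalanced(void)
{
	struct task t = { 0 };

	vtime_guest_enter(&t, 10); /* 10 host ticks before the first entry */
	vtime_guest_enter(&t, 5);  /* re-entry: 5 guest ticks, flag already set */
	vtime_guest_exit(&t, 7);   /* 7 more guest ticks, then leave */
	return t;
}

/* Balanced: one enter before the loop, one exit after it. */
struct task run_balanced(void)
{
	struct task t = { 0 };

	vtime_guest_enter(&t, 10); /* same 10 host ticks */
	vtime_guest_exit(&t, 12);  /* all 12 guest ticks flushed at exit */
	return t;
}
```

Calling both helpers from a small main() and comparing the returned structs
shows the same totals in this model, because the second "enter" is redirected
to guest time by the flag check. The balanced version just gets there without
relying on that redirect.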

I assume that other than the naming of the entry/exit functions you're happy
with this patch?

Thanks,
Mark.

> ...
>
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index e50e97ac4408..bd3873b90889 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9876,6 +9876,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  		set_debugreg(0, 7);
> >  	}
> >
> > +	guest_timing_enter_irqoff();
> > +
> >  	for (;;) {
> >  		/*
> >  		 * Assert that vCPU vs. VM APICv state is consistent. An APICv
> > @@ -9949,7 +9951,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  	 * of accounting via context tracking, but the loss of accuracy is
> >  	 * acceptable for all known use cases.
> >  	 */
> > -	vtime_account_guest_exit();
> > +	guest_timing_exit_irqoff();
> >
> >  	if (lapic_in_kernel(vcpu)) {
> >  		s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;