Re: [RFC PATCH v2 0/3] KVM: x86: add per-vCPU exits disable capability

From: Michael S. Tsirkin
Date: Mon Jan 10 2022 - 16:18:28 EST


On Tue, Dec 21, 2021 at 01:04:46AM -0800, Kechen Lu wrote:
> Summary
> ===========
> Introduce a vCPU-scoped ioctl for the KVM_CAP_X86_DISABLE_EXITS
> capability, so that VM exits can be disabled at per-vCPU granularity
> instead of for the whole guest. This patch series enables the
> vCPU-scoped control for HLT VM-exits.
>
> Motivation
> ============
> In use cases like a Windows guest running heavy CPU-bound
> workloads, disabling HLT VM-exits can mitigate host scheduler context
> switch overhead. Simply disabling HLT exits on all vCPUs can bring
> performance benefits, but if no pCPUs are reserved for host threads,
> it can lead to forced preemption, as the host does not know when to
> schedule other host threads that want to run. With this patch, we
> can disable HLT exits on only a subset of a guest's vCPUs; this still
> keeps the performance benefits, and also shows resiliency to a host
> stress workload running at the same time.
>
> Performance and Testing
> =========================
> In the host stress workload experiment with a Windows guest running
> heavy CPU-bound workloads, it shows good resiliency and a ~3%
> performance improvement. E.g. Passmark running in a Windows guest
> with this patch disabling HLT exits on only half of the vCPUs still
> shows a 2.4% higher main score vs. the baseline.
>
> Tested everything on AMD machines.
>
>
> v1->v2 (Sean Christopherson) :
> - Add an explicit restriction that VM-scoped exits disabling must be
> called before vCPU creation (patch 1)
> - Use a vCPU ioctl instead of a 64-bit vCPU bitmask (patch 3), and make
> the exits-disable flag checks purely per-vCPU instead of per-VM (patch 2)
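
For anyone following along, here is roughly how I read the userspace
side of the above. This is an untested sketch, not the series' code: it
assumes that with these patches applied KVM_ENABLE_CAP is accepted on a
vCPU fd for KVM_CAP_X86_DISABLE_EXITS, and the helper names are made up.
The capability and flag constants are the existing VM-scoped uapi ones.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallback in case the installed headers keep this flag elsewhere;
 * the value matches the existing VM-scoped uapi definition. */
#ifndef KVM_X86_DISABLE_EXITS_HLT
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
#endif

/* Hypothetical helper: build the KVM_ENABLE_CAP argument for HLT. */
static void fill_disable_hlt(struct kvm_enable_cap *cap)
{
	assert(cap != NULL);
	memset(cap, 0, sizeof(*cap));
	cap->cap = KVM_CAP_X86_DISABLE_EXITS;
	cap->args[0] = KVM_X86_DISABLE_EXITS_HLT;
}

/*
 * ASSUMPTION: the capability can be enabled on a vCPU fd, as the cover
 * letter proposes. Returns 0 on success, -1 with errno set on failure.
 */
static int vcpu_disable_hlt_exits(int vcpu_fd)
{
	struct kvm_enable_cap cap;

	fill_disable_hlt(&cap);
	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
}
```

A VMM would call vcpu_disable_hlt_exits() only on the vCPU fds it wants
to keep spinning in the guest, and leave the rest alone.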

This is still quite blunt and assumes a ton of configuration on the host
exactly matching the workload within the guest, which seems a waste since
guests actually have the smarts to know what's happening within them.

If you are going to allow the guest to halt a vCPU, how about
working on exposing mwait to the guest cleanly instead?
The idea is to expose this in ACPI - Linux guests
ignore ACPI and go by CPUID but Windows guests follow
ACPI. Linux can be patched ;)
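
To be concrete about the CPUID side: the feature bit in question is
MONITOR/MWAIT, CPUID.01H:ECX bit 3. A minimal sketch of the check, with
a made-up helper name, taking the ECX value of leaf 1 as input so it
does not depend on how the caller executes CPUID:

```c
#include <stdint.h>

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define CPUID_1_ECX_MONITOR_MWAIT (UINT32_C(1) << 3)

/* Hypothetical helper: leaf1_ecx is the ECX output of CPUID leaf 1. */
static int cpuid_has_mwait(uint32_t leaf1_ecx)
{
	return (leaf1_ecx & CPUID_1_ECX_MONITOR_MWAIT) != 0;
}
```

A guest that goes by CPUID alone sees only this one bit, which is why
the ACPI route gives more control over what the guest does when idle.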

What we would have is a mirror of host ACPI states,
such that lower states invoke HLT and exit, higher
power states invoke mwait and wait within the guest.
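
As a rough illustration of that split (a sketch only, all names are
made up, and it assumes "higher power" means the shallow states with a
small C-state index, with a single cut-off chosen by the host):

```c
#include <assert.h>

enum idle_method {
	IDLE_MWAIT_IN_GUEST,	/* wait inside the guest, no VM exit */
	IDLE_HLT_EXIT,		/* HLT, exit to the host so it can schedule */
};

/*
 * ASSUMPTION: "higher power" states are the shallow ones (small C-state
 * index); everything deeper than max_mwait_cstate halts and exits.
 * cstate is the ACPI C-state index: 1 for C1, 2 for C2, and so on.
 */
static enum idle_method pick_idle_method(int cstate, int max_mwait_cstate)
{
	assert(cstate >= 1);
	return cstate <= max_mwait_cstate ? IDLE_MWAIT_IN_GUEST : IDLE_HLT_EXIT;
}
```

With max_mwait_cstate set to 1, a guest entering C1 would stay in the
guest with mwait, and anything deeper would HLT and give the pCPU back
to the host.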

The nice thing with this approach is that it's already supported
by the host kernel, so it's just a question of coding up ACPI.



>
> Best Regards,
> Kechen
>
> Kechen Lu (3):
>   KVM: x86: only allow exits disable before vCPUs created
>   KVM: x86: move ()_in_guest checking to vCPU scope
>   KVM: x86: add vCPU ioctl for HLT exits disable capability
>
>  Documentation/virt/kvm/api.rst     |  4 +++-
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  7 +++++++
>  arch/x86/kvm/cpuid.c               |  2 +-
>  arch/x86/kvm/lapic.c               |  2 +-
>  arch/x86/kvm/svm/svm.c             | 20 +++++++++++++++-----
>  arch/x86/kvm/vmx/vmx.c             | 26 ++++++++++++++++++--------
>  arch/x86/kvm/x86.c                 | 24 +++++++++++++++++++++++-
>  arch/x86/kvm/x86.h                 | 16 ++++++++--------
>  9 files changed, 77 insertions(+), 25 deletions(-)
>
> --
> 2.30.2