Re: [PATCH v3] kvm: better MWAIT emulation for guests
From: Michael S. Tsirkin
Date: Tue Mar 14 2017 - 11:35:07 EST
On Tue, Mar 14, 2017 at 02:58:24PM +0100, Radim Krčmář wrote:
> 2017-03-14 01:44+0200, Michael S. Tsirkin:
> > Guests running Mac OS X 10.5, 10.6, and 10.7 (Leopard through Lion) have a problem:
> > unless explicitly provided with kernel command line argument
> > "idlehalt=0" they'd implicitly assume MONITOR and MWAIT availability,
> > without checking CPUID.
> >
> > We currently emulate that as a NOP but on VMX we can do better: let
> > guest stop the CPU until timer, IPI or memory change. CPU will be busy
> > but that isn't any worse than a NOP emulation.
> >
> > Note that mwait within guests is not the same as on real hardware
> > because halt causes an exit while mwait doesn't. For this reason it
> > might not be a good idea to use the regular MWAIT flag in CPUID to
> > signal this capability. Add a flag in the hypervisor leaf instead.
> >
> > Additionally, we add a capability for QEMU - e.g. if it knows there's an
> > isolated CPU dedicated for the VCPU it can set the standard MWAIT flag
> > to improve guest behaviour.
> >
> > Reported-by: "Gabriel L. Somlo" <gsomlo@xxxxxxxxx>
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > ---
> >
> > Note: SVM bits are untested at this point. Seems pretty
> > obvious though.
> >
> > changes from v2:
> > - add a capability to allow host userspace to detect new kernels
> > - more documentation to clarify the semantics of the feature flag
> > and why it's useful
> > - svm support as suggested by Radim
> >
> > changes from v1:
> > - typo fix resulting in rest of leaf flags being overwritten
> > Reported by: Wanpeng Li <kernellwp@xxxxxxxxx>
> > - updated commit log with data about guests helped by this feature
> > - better document differences between mwait and halt for guests
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > @@ -4135,11 +4135,11 @@ available, means that that the kernel can support guests using the
> > radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
> > processor).
> >
> > -8.4 KVM_CAP_PPC_HASH_MMU_V3
>
> This patch should not remove the PPC capability from docs.
>
> (The right name is KVM_CAP_PPC_HASH_V3, but that is for another patch.)
Oops my bad. If you do decide you want me to respin because of this,
pls let me know.
> > +8.5 KVM_CAP_X86_GUEST_MWAIT
> >
> > -Architectures: ppc
> > +Architectures: x86
> >
> > -This capability, if KVM_CHECK_EXTENSION indicates that it is
> > -available, means that that the kernel can support guests using the
> > -hashed page table MMU defined in Power ISA V3.00 (as implemented in
> > -the POWER9 processor), including in-memory segment tables.
> > +This capability indicates that a guest using memory monitoring instructions
> > +(MWAIT/MWAITX) to stop the virtual CPU will not cause a VM exit. As such, time
> > +spent while the virtual CPU is halted in this way will then be accounted for as
> > +guest running time on the host (as opposed to e.g. HLT).
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > @@ -2684,6 +2684,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > case KVM_CAP_ADJUST_CLOCK:
> > r = KVM_CLOCK_TSC_STABLE;
> > break;
> > + case KVM_CAP_X86_GUEST_MWAIT:
> > + r = !!this_cpu_has(X86_FEATURE_MWAIT);
>
> this_cpu_has already returns bool, so !! is not needed.
>
> I can fix both while applying.
OK, pls let me know if you need any more.
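For the record, a minimal userspace sketch of the point about !!. This is
not kernel code: cpu_has_bit() is a hypothetical stand-in for
this_cpu_has(), and the function names are made up for illustration. The
only hardware fact it relies on is that MONITOR/MWAIT is reported in bit 3
of CPUID.01H:ECX. Since the helper returns bool, the result is already
normalized to 0 or 1, so the double negation changes nothing.

```c
#include <assert.h>
#include <stdbool.h>

#define MWAIT_ECX_BIT 3 /* CPUID.01H:ECX bit 3 = MONITOR/MWAIT */

/*
 * Hypothetical stand-in for this_cpu_has(). Because the return type is
 * bool, any nonzero mask result is converted to exactly 1.
 */
bool cpu_has_bit(unsigned int ecx, unsigned int bit)
{
	return ecx & (1u << bit);
}

/* What the patch does: normalize a value that is already 0 or 1. */
int cap_double_negated(unsigned int ecx)
{
	return !!cpu_has_bit(ecx, MWAIT_ECX_BIT);
}

/* What the review suggests: use the bool result directly. */
int cap_plain(unsigned int ecx)
{
	return cpu_has_bit(ecx, MWAIT_ECX_BIT);
}

/* Both variants agree, and the result is always 0 or 1. */
void check_equivalence(unsigned int ecx)
{
	assert(cap_plain(ecx) == cap_double_negated(ecx));
	assert(cap_plain(ecx) == 0 || cap_plain(ecx) == 1);
}
```

The !! would only matter if the helper returned a raw masked integer
instead of bool.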
> > + break;
> > case KVM_CAP_X86_SMM:
> > /* SMBASE is usually relocated above 1M on modern chipsets,
> > * and SMM handlers might indeed rely on 4G segment limits,