[PATCH v2] kvm: better MWAIT emulation for guests

From: Michael S. Tsirkin
Date: Mon Mar 13 2017 - 12:19:33 EST


Guests running Mac OS 5, 6, and 7 (Leopard through Lion) have a problem:
unless explicitly provided with kernel command line argument
"idlehalt=0" they'd implicitly assume MONITOR and MWAIT availability,
without checking CPUID.

We currently emulate that as a NOP but on VMX we can do better: let
guest stop the CPU until timer, IPI or memory change. CPU will be busy
but that isn't any worse than a NOP emulation.

Note that mwait within guests is not the same as on real hardware
because halt causes an exit while mwait doesn't. For this reason it
might not be a good idea to use the regular MWAIT flag in CPUID to
signal this capability: halt is sometimes better as you are giving up
the rest of your CPU slice. Add a flag in the hypervisor leaf instead.

Reported-by: "Gabriel L. Somlo" <gsomlo@xxxxxxxxx>
Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
---

changes from v1:
- typo fix resulting in rest of leaf flags being overwritten
Reported by: Wanpeng Li <kernellwp@xxxxxxxxx>
- updated commit log with data about guests helped by this feature
- better document differences between mwait and halt for guests



Documentation/virtual/kvm/cpuid.txt | 3 +++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kvm/cpuid.c | 3 +++
arch/x86/kvm/vmx.c | 4 ----
4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..5caa234 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,9 @@ KVM_FEATURE_PV_UNHALT || 7 || guest checks this feature bit
|| || before enabling paravirtualized
|| || spinlock support.
------------------------------------------------------------------------------
+KVM_FEATURE_MWAIT || 8 || guest can use monitor/mwait
+ || || to halt the VCPU.
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index cff0bb6..9cc77a7 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
#define KVM_FEATURE_STEAL_TIME 5
#define KVM_FEATURE_PV_EOI 6
#define KVM_FEATURE_PV_UNHALT 7
+#define KVM_FEATURE_MWAIT 8

/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index efde6cc..3c7fca83 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -594,6 +594,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);

+ if (this_cpu_has(X86_FEATURE_MWAIT))
+ entry->eax |= (1 << KVM_FEATURE_MWAIT);
+
entry->ebx = 0;
entry->ecx = 0;
entry->edx = 0;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4bfe349..b167aba 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3547,13 +3547,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
CPU_BASED_USE_IO_BITMAPS |
CPU_BASED_MOV_DR_EXITING |
CPU_BASED_USE_TSC_OFFSETING |
- CPU_BASED_MWAIT_EXITING |
- CPU_BASED_MONITOR_EXITING |
CPU_BASED_INVLPG_EXITING |
CPU_BASED_RDPMC_EXITING;

- printk(KERN_ERR "cleared CPU_BASED_MWAIT_EXITING + CPU_BASED_MONITOR_EXITING\n");
-
opt = CPU_BASED_TPR_SHADOW |
CPU_BASED_USE_MSR_BITMAPS |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
--
MST