[PATCH 3.2 27/27] x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only

From: Ben Hutchings
Date: Sun Dec 28 2014 - 20:18:02 EST

3.2.66-rc1 review patch. If anyone has any objections, please let me know.


From: Paolo Bonzini <pbonzini@xxxxxxxxxx>

commit c1118b3602c2329671ad5ec8bdf8e374323d6343 upstream.

On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
In that case, KVM will fail to patch VMCALL instructions to VMMCALL
as required on AMD processors.

The failure mode is currently a divide-by-zero exception, which obviously
is a KVM bug that has to be fixed. However, picking the right instruction
between VMCALL and VMMCALL will be faster and will help if you cannot upgrade
the hypervisor.

Reported-by: Chris Webb <chris@xxxxxxxxxxxx>
Tested-by: Chris Webb <chris@xxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Acked-by: Borislav Petkov <bp@xxxxxxx>
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
 arch/x86/include/asm/cpufeature.h |  1 +
 arch/x86/include/asm/kvm_para.h   | 10 ++++++++--
 arch/x86/kernel/cpu/amd.c         |  7 +++++++
 3 files changed, 16 insertions(+), 2 deletions(-)
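
As a rough illustration only (not part of the patch), the effect of the ALTERNATIVE() use in kvm_para.h can be modelled in userspace C. The two byte sequences are the ones in the KVM_HYPERCALL definition; the struct and the site_* helpers are hypothetical names invented for this sketch and do not exist in the kernel. The assumption being modelled is that the instruction is chosen once from a CPU feature bit, while the text is still writable, so no later rewrite by the hypervisor is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HYPERCALL_LEN 3

/* Encodings taken from the KVM_HYPERCALL definition in the patch. */
static const uint8_t vmcall_insn[HYPERCALL_LEN]  = { 0x0f, 0x01, 0xc1 };
static const uint8_t vmmcall_insn[HYPERCALL_LEN] = { 0x0f, 0x01, 0xd9 };

/*
 * Hypothetical stand-in for one patch site: three bytes of "kernel text"
 * that initially hold the default instruction.
 */
struct hypercall_site {
	uint8_t text[HYPERCALL_LEN];
};

/* Emit the default (VMCALL) encoding, as the assembler would. */
static void site_init(struct hypercall_site *site)
{
	memcpy(site->text, vmcall_insn, HYPERCALL_LEN);
}

/*
 * Model of the boot-time replacement: if the feature bit is set,
 * overwrite the default with the VMMCALL encoding; otherwise leave it.
 */
static void site_apply_alternative(struct hypercall_site *site, bool has_vmmcall)
{
	if (has_vmmcall)
		memcpy(site->text, vmmcall_insn, HYPERCALL_LEN);
}

/* Report which encoding the site currently holds. */
static bool site_is_vmmcall(const struct hypercall_site *site)
{
	return memcmp(site->text, vmmcall_insn, HYPERCALL_LEN) == 0;
}
```

Calling site_init() and then site_apply_alternative(&site, true) leaves the VMMCALL bytes in place, which corresponds to the AMD case; passing false keeps VMCALL, which corresponds to the Intel case.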

--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -193,6 +193,7 @@
#define X86_FEATURE_DECODEASSISTS (8*32+12) /* AMD Decode Assists support */
#define X86_FEATURE_PAUSEFILTER (8*32+13) /* AMD filtered pause intercept */
#define X86_FEATURE_PFTHRESHOLD (8*32+14) /* AMD pause filter threshold */
+#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */

/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -91,15 +91,21 @@ struct kvm_vcpu_pv_apf_data {

#ifdef __KERNEL__
#include <asm/processor.h>
+#include <asm/alternative.h>

extern void kvmclock_init(void);
extern int kvm_register_clock(char *txt);

-/* This instruction is vmcall. On non-VT architectures, it will generate a
- * trap that we will then rewrite to the appropriate instruction.
+#ifdef CONFIG_DEBUG_RODATA
+#define KVM_HYPERCALL \
+ ALTERNATIVE(".byte 0x0f,0x01,0xc1", ".byte 0x0f,0x01,0xd9", X86_FEATURE_VMMCALL)
+#else
+/* On AMD processors, vmcall will generate a trap that we will
+ * then rewrite to the appropriate instruction.
 */
#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
+#endif

/* For KVM hypercalls, a three-byte sequence of either the vmrun or the vmmrun
* instruction. The hypervisor may replace it with something else but only the
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -469,6 +469,13 @@ static void __cpuinit early_init_amd(str
set_cpu_cap(c, X86_FEATURE_EXTD_APICID);
}
#endif
+
+ /*
+ * This is only needed to tell the kernel whether to use VMCALL
+ * and VMMCALL. VMMCALL is never executed except under virt, so
+ * we can set it unconditionally.
+ */
+ set_cpu_cap(c, X86_FEATURE_VMMCALL);
}

static void __cpuinit init_amd(struct cpuinfo_x86 *c)
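
For readers unfamiliar with the (8*32+15) notation used for X86_FEATURE_VMMCALL, the following is a small userspace sketch of how such a value splits into a capability word and a bit position, and how setting it amounts to setting one bit in an array of 32-bit words. All SKETCH_/sketch_ names are hypothetical, and the array size of ten words is an assumption made for the example; the kernel's real set_cpu_cap() operates on struct cpuinfo_x86 and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: ten 32-bit capability words (words 0..9). */
#define SKETCH_NCAPINTS 10

/* Same arithmetic as the X86_FEATURE_VMMCALL definition in the patch. */
#define SKETCH_FEATURE_VMMCALL (8 * 32 + 15)

/* A feature number encodes word * 32 + bit. */
static unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;
}

static unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Hypothetical analogue of setting a capability: set one bit in one word. */
static void sketch_set_cap(uint32_t caps[SKETCH_NCAPINTS], unsigned int feature)
{
	caps[sketch_feature_word(feature)] |= UINT32_C(1) << sketch_feature_bit(feature);
}

/* Hypothetical analogue of testing a capability. */
static bool sketch_has_cap(const uint32_t caps[SKETCH_NCAPINTS], unsigned int feature)
{
	return (caps[sketch_feature_word(feature)] >> sketch_feature_bit(feature)) & 1u;
}
```

With these definitions the feature number works out to word 8, bit 15, in line with its placement next to the other word-8 entries in the cpufeature.h hunk.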
