Re: [PATCH v3 08/22] KVM: Do compatibility checks on hotplugged CPUs
From: Yuan Yao
Date: Mon Sep 05 2022 - 21:25:35 EST
On Thu, Sep 01, 2022 at 07:17:43PM -0700, isaku.yamahata@xxxxxxxxx wrote:
> From: Chao Gao <chao.gao@xxxxxxxxx>
>
> At init time, KVM does compatibility checks to ensure that all online
> CPUs support hardware virtualization and a common set of features. But
> KVM uses hotplugged CPUs without such compatibility checks. On Intel
> CPUs, this leads to #GP if the hotplugged CPU doesn't support VMX or
> vmentry failure if the hotplugged CPU doesn't meet minimal feature
> requirements.
>
> Do compatibility checks when onlining a CPU and abort the online process
> if the hotplugged CPU is incompatible with online CPUs.
>
> CPU hotplug is disabled during hardware_enable_all() to prevent the corner
> case as shown below. A hotplugged CPU marks itself online in
> cpu_online_mask (1) and enables interrupts (2) before invoking callbacks
> registered in ONLINE section (3). So, if hardware_enable_all() is invoked
> on another CPU right after (2), then on_each_cpu() in hardware_enable_all()
> invokes hardware_enable_nolock() on the hotplugged CPU before
> kvm_online_cpu() is called. The CPU thus escapes the compatibility
> checks, which can crash the system or break running VMs.
>
>	start_secondary { ...
>		set_cpu_online(smp_processor_id(), true); <- 1
>		...
>		local_irq_enable(); <- 2
>		...
>		cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
>	}
>
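To make the ordering above concrete, here is a minimal userspace sketch of it, not kernel code. All names (struct model_cpu, model_race() and friends) are hypothetical, and the effect of cpus_read_lock() is modeled only as "hardware_enable_all() cannot run until onlining has finished", which is a simplification of the real hotplug lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one hotplugged CPU; none of this is kernel API. */
struct model_cpu {
	bool online;		/* (1) bit set in cpu_online_mask */
	bool irqs_enabled;	/* (2) CPU can now take the IPI from on_each_cpu() */
	bool compat_checked;	/* set by the modeled kvm_online_cpu() */
	bool enabled_unchecked;	/* virtualization enabled before the check ran */
};

/* Models on_each_cpu(hardware_enable_nolock) reaching this CPU. */
static void model_hardware_enable(struct model_cpu *c)
{
	if (!c->online || !c->irqs_enabled)
		return;		/* CPU is not a target yet */
	if (!c->compat_checked)
		c->enabled_unchecked = true;
}

/* Models the ONLINE-section callback, step (3). */
static void model_online_callback(struct model_cpu *c)
{
	c->compat_checked = true;
}

/*
 * Returns true if virtualization was enabled on an unchecked CPU.
 * hotplug_excluded == true models hardware_enable_all() holding the
 * hotplug read lock, so it can only run once onlining has completed.
 */
static bool model_race(bool hotplug_excluded)
{
	struct model_cpu c = { 0 };

	c.online = true;			/* (1) */
	c.irqs_enabled = true;			/* (2) */
	if (!hotplug_excluded)
		model_hardware_enable(&c);	/* another CPU slips in here */
	model_online_callback(&c);		/* (3) */
	if (hotplug_excluded)
		model_hardware_enable(&c);	/* deferred until after (3) */
	return c.enabled_unchecked;
}

/* Usage: the window exists without exclusion and closes with it. */
static void model_race_demo(void)
{
	assert(model_race(false));
	assert(!model_race(true));
}
```

The model is single-threaded on purpose: it only fixes the interleaving described in the commit message so that both outcomes are deterministic.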
> Keep compatibility checks at KVM init time. They help find
> incompatibility issues earlier and refuse to load the arch KVM module
> (e.g., kvm-intel).
>
> Loosen the WARN_ON in kvm_arch_check_processor_compat so that it
> can be invoked from KVM's CPU hotplug callback (i.e., kvm_online_cpu).
>
> Opportunistically, add a pr_err() for setup_vmcs_config() path in
> vmx_check_processor_compatibility() so that each possible error path has
> its own error message. Convert printk(KERN_ERR ...) to pr_err() to
> satisfy checkpatch.pl.
>
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Link: https://lore.kernel.org/r/20220216031528.92558-7-chao.gao@xxxxxxxxx
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
Reviewed-by: Yuan Yao <yuan.yao@xxxxxxxxx>
> arch/x86/kvm/vmx/vmx.c | 10 ++++++----
> arch/x86/kvm/x86.c | 11 +++++++++--
> virt/kvm/kvm_main.c | 18 +++++++++++++++++-
> 3 files changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3cf7f18a4115..2a1ab6495299 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7421,20 +7421,22 @@ static int vmx_check_processor_compatibility(void)
>  {
>  	struct vmcs_config vmcs_conf;
>  	struct vmx_capability vmx_cap;
> +	int cpu = smp_processor_id();
>
>  	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
>  	    !this_cpu_has(X86_FEATURE_VMX)) {
> -		pr_err("kvm: VMX is disabled on CPU %d\n", smp_processor_id());
> +		pr_err("kvm: VMX is disabled on CPU %d\n", cpu);
>  		return -EIO;
>  	}
>
> -	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
> +	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0) {
> +		pr_err("kvm: failed to setup vmcs config on CPU %d\n", cpu);
>  		return -EIO;
> +	}
>  	if (nested)
>  		nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
>  	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
> -		printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
> -		       smp_processor_id());
> +		pr_err("kvm: CPU %d feature inconsistency!\n", cpu);
>  		return -EIO;
>  	}
>  	return 0;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 53c8ee677f16..68def7ca224a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12000,9 +12000,16 @@ void kvm_arch_hardware_unsetup(void)
>
>  int kvm_arch_check_processor_compat(void)
>  {
> -	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> +	int cpu = smp_processor_id();
> +	struct cpuinfo_x86 *c = &cpu_data(cpu);
>
> -	WARN_ON(!irqs_disabled());
> +	/*
> +	 * Compatibility checks are done when loading KVM or in KVM's CPU
> +	 * hotplug callback. It ensures all online CPUs are compatible to run
> +	 * vCPUs. For other cases, compatibility checks are unnecessary or
> +	 * even problematic. Try to detect improper usages here.
> +	 */
> +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
>
>  	if (__cr4_reserved_bits(cpu_has, c) !=
>  	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index db1303e2abc9..0ac00c711384 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5013,7 +5013,11 @@ static void hardware_enable_nolock(void *caller_name)
>
>  static int kvm_online_cpu(unsigned int cpu)
>  {
> -	int ret = 0;
> +	int ret;
> +
> +	ret = kvm_arch_check_processor_compat();
> +	if (ret)
> +		return ret;
>
>  	raw_spin_lock(&kvm_count_lock);
>  	/*
> @@ -5073,6 +5077,17 @@ static int hardware_enable_all(void)
>  {
>  	int r = 0;
>
> +	/*
> +	 * During onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
> +	 * is called. on_each_cpu() between them includes the CPU. As a result,
> +	 * hardware_enable_nolock() may get invoked before kvm_online_cpu().
> +	 * This would enable hardware virtualization on that cpu without
> +	 * compatibility checks, which can potentially crash system or break
> +	 * running VMs.
> +	 *
> +	 * Disable CPU hotplug to prevent this case from happening.
> +	 */
> +	cpus_read_lock();
>  	raw_spin_lock(&kvm_count_lock);
>
>  	kvm_usage_count++;
> @@ -5087,6 +5102,7 @@ static int hardware_enable_all(void)
>  	}
>
>  	raw_spin_unlock(&kvm_count_lock);
> +	cpus_read_unlock();
>
>  	return r;
>  }
> --
> 2.25.1
>
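
For readers skimming the thread, the check-then-enable ordering that the kvm_online_cpu() hunk introduces can be sketched in userspace roughly as follows. This is only an illustration under assumptions: struct model_config and its fields are hypothetical stand-ins for struct vmcs_config, and the kvm_usage_count handling and locking of the real callback are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct vmcs_config; the fields are illustrative. */
struct model_config {
	uint32_t pin_based_ctls;
	uint32_t cpu_based_ctls;
};

/* Stand-in for the memcmp() comparison in vmx_check_processor_compatibility(). */
static int model_check_compat(const struct model_config *baseline,
			      const struct model_config *cpu_conf)
{
	if (memcmp(baseline, cpu_conf, sizeof(*baseline)) != 0)
		return -EIO;
	return 0;
}

/*
 * Stand-in for kvm_online_cpu(): run the compatibility check first and
 * enable virtualization only if it passes. A non-zero return models the
 * hotplug core aborting the online process for this CPU.
 */
static int model_online_cpu(const struct model_config *baseline,
			    const struct model_config *cpu_conf,
			    bool *virt_enabled)
{
	int ret = model_check_compat(baseline, cpu_conf);

	if (ret)
		return ret;
	*virt_enabled = true;
	return 0;
}

/* Usage: a matching CPU comes online, a mismatched one is rejected. */
static void model_online_demo(void)
{
	struct model_config baseline = { 1, 2 };
	struct model_config matching = { 1, 2 };
	struct model_config mismatch = { 1, 3 };
	bool enabled = false;

	assert(model_online_cpu(&baseline, &matching, &enabled) == 0 && enabled);

	enabled = false;
	assert(model_online_cpu(&baseline, &mismatch, &enabled) == -EIO && !enabled);
}
```

The point of the sketch is just the early return: an incompatible CPU never reaches the enable step.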