RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

From: Vitaly Kuznetsov
Date: Tue May 23 2023 - 04:14:04 EST


"Michael Kelley (LINUX)" <mikelley@xxxxxxxxxxxxx> writes:

> From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Sent: Monday, May 22, 2023 1:56 AM
>>
>> Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:
>>
>
> [snip]
>
>> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> > index 0f1001d..696004a 100644
>> > --- a/include/linux/cpuhotplug.h
>> > +++ b/include/linux/cpuhotplug.h
>> > @@ -201,6 +201,7 @@ enum cpuhp_state {
>> > /* Online section invoked on the hotplugged CPU from the hotplug thread */
>> > CPUHP_AP_ONLINE_IDLE,
>> > CPUHP_AP_KVM_ONLINE,
>> > + CPUHP_AP_HYPERV_ONLINE,
>>
>> (Cc: KVM)
>>
>> I would very much prefer we swap the order with KVM here: hv_cpu_init()
>> allocates and sets vCPU's VP assist page which is used by KVM on
>> CPUHP_AP_KVM_ONLINE:
>>
>> kvm_online_cpu() -> __hardware_enable_nolock() ->
>> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>>
>> /*
>> * This can happen if we hot-added a CPU but failed to allocate
>> * VP assist page for it.
>> */
>> if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
>> return -EFAULT;
>>
>> With the change, this is likely broken.
>>
>> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
>> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
>> this happens upon KVM module load so CPU hotplug ordering should not
>> matter as this always happens on a CPU which is already online.
>>
>> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
>> init before KVM's (and vice versa on teardown) makes sense.
>>
>> > CPUHP_AP_SCHED_WAIT_EMPTY,
>> > CPUHP_AP_SMPBOOT_THREADS,
>> > CPUHP_AP_X86_VDSO_VMA_ONLINE,
>
> I have no objection to putting CPUHP_AP_HYPERV_ONLINE first. I did
> not give any consideration to a possible dependency between the two. :-(
> But note that in current code, hv_cpu_init() is running on the
> CPUHP_AP_ONLINE_DYN state, which is also after KVM. So this patch
> doesn't change the order w.r.t. KVM and the VP assist page. Are things
> already broken for KVM, or is something else happening that makes it
> work anyway?

This looks like a currently present bug indeed so I had to refresh my
memory.

KVM's CPUHP_AP_KVM_STARTIN is registered with
cpuhp_setup_state_nocalls() which means that kvm_starting_cpu() is not
called for all currently present CPUs. Moreover, kvm_init() is called
when KVM vendor module (e.g. kvm_intel) is loaded and as KVM is normally
built as module, this happens much later than Hyper-V's
hyperv_init(). vmx_hardware_enable() is actually called from
hardware_enable_all() which happens when the first KVM VM is created.

This all changes when a CPU is hotplugged. The order CPUHP_AP_* from
cpuhp_state is respected and KVM's kvm_starting_cpu() is called _before_
Hyper-V's hv_cpu_init() even before your patch. We don't see the bug
just because Hyper-V doesn't(?) support CPU hotplug. Just sending a CPU
offline with e.g. "echo 0 > /sys/devices/system/cpu/cpuX/online" is not
the same as once allocated, VP assist page persists for all non-root
Hyper-V partitions. I don't know if KVM is supported for Hyper-V root
partitions but in case it is, we may have a problem.

Let's put CPUHP_AP_HYPERV_ONLINE before KVM's CPUHP_AP_KVM_ONLINE
explicitly so CPU hotplug scenario is handled correctly, even if current
Hyper-V versions don't support it.

--
Vitaly