Re: [PATCH v2 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

From: Vitaly Kuznetsov
Date: Mon May 29 2023 - 04:06:07 EST


Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:

> These commits
>
> a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg")
> 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs")
>
> update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg
> because that memory will be correctly marked as decrypted or encrypted
> for all VM types (CoCo or normal). But problems ensue when CPUs in the
> VM go online or offline after virtual PCI devices have been configured.
>
> When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is
> initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN.
> But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which
> may call the virtual PCI driver and fault trying to use the as yet
> uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo
> VM if the MMIO read and write hypercalls are used from state
> CPUHP_AP_IRQ_AFFINITY_ONLINE.
>
> When a CPU is taken offline, IRQs may be reassigned in state
> CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to
> use the hyperv_pcpu_input_arg that has already been freed by a
> higher state.
>
> Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE
> immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE)
> and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for
> Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
> early enough.
>
> Fix the offlining problem by not freeing hyperv_pcpu_input_arg when
> a CPU goes offline. Retain the allocated memory, and reuse it if
> the CPU comes back online later.
>
> Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>

Very minor nitpick: I would've changed the Subject of the patch to
something a bit more meaningful, e.g.

"x86/hyperv: Avoid freeing per-CPU hyperv_pcpu_{input,output}_arg upon CPU offlining"

or something like that.

> ---
>
> Changes in v2:
> * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly
> Kuznetsov]
>
> arch/x86/hyperv/hv_init.c | 2 +-
> drivers/hv/hv_common.c | 48 +++++++++++++++++++++++-----------------------
> include/linux/cpuhotplug.h | 1 +
> 3 files changed, 26 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index a5f9474..6c04b52 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -416,7 +416,7 @@ void __init hyperv_init(void)
> goto free_vp_assist_page;
> }
>
> - cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
> + cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online",
> hv_cpu_init, hv_cpu_die);
> if (cpuhp < 0)
> goto free_ghcb_page;
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 64f9cec..542a1d5 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu)
> flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;
>
> inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> - *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
> - if (!(*inputarg))
> - return -ENOMEM;
>
> - if (hv_root_partition) {
> - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> - *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
> + /*
> + * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
> + * allocated if this CPU was previously online and then taken offline
> + */
> + if (!*inputarg) {
> + *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);

Nitpick: I see this was in the original code but we could probably
switch to kcalloc() here.

> + if (!(*inputarg))
> + return -ENOMEM;
> +
> + if (hv_root_partition) {
> + outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> + *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
> + }
> }
>
> msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);
> @@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu)
>
> int hv_common_cpu_die(unsigned int cpu)
> {
> - unsigned long flags;
> - void **inputarg, **outputarg;
> - void *mem;
> -
> - local_irq_save(flags);
> -
> - inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> - mem = *inputarg;
> - *inputarg = NULL;
> -
> - if (hv_root_partition) {
> - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> - *outputarg = NULL;
> - }
> -
> - local_irq_restore(flags);
> -
> - kfree(mem);
> + /*
> + * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory
> + * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg
> + * may be used by the Hyper-V vPCI driver in reassigning interrupts
> + * as part of the offlining process. The interrupt reassignment
> + * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and
> + * called this function.
> + *
> + * If a previously offlined CPU is brought back online again, the
> + * originally allocated memory is reused in hv_common_cpu_init().
> + */
>
> return 0;
> }
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 0f1001d..3ceb9df 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -200,6 +200,7 @@ enum cpuhp_state {
>
> /* Online section invoked on the hotplugged CPU from the hotplug thread */
> CPUHP_AP_ONLINE_IDLE,
> + CPUHP_AP_HYPERV_ONLINE,
> CPUHP_AP_KVM_ONLINE,
> CPUHP_AP_SCHED_WAIT_EMPTY,
> CPUHP_AP_SMPBOOT_THREADS,

Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

--
Vitaly