Re: [PATCH] Drivers: hv: vmbus: Allow cleanup of VMBUS_CONNECT_CPU if disconnected

From: Andrea Parri
Date: Tue Nov 10 2020 - 15:18:48 EST


On Tue, Nov 10, 2020 at 07:01:18PM +0000, Chris Co wrote:
> From: Chris Co <chrco@xxxxxxxxxxxxx>
>
> When invoking kexec() on a Linux guest running on a Hyper-V host, the
> kernel panics.
>
> RIP: 0010:cpuhp_issue_call+0x137/0x140
> Call Trace:
> __cpuhp_remove_state_cpuslocked+0x99/0x100
> __cpuhp_remove_state+0x1c/0x30
> hv_kexec_handler+0x23/0x30 [hv_vmbus]
> hv_machine_shutdown+0x1e/0x30
> machine_shutdown+0x10/0x20
> kernel_kexec+0x6d/0x96
> __do_sys_reboot+0x1ef/0x230
> __x64_sys_reboot+0x1d/0x20
> do_syscall_64+0x6b/0x3d8
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> This was due to the hv_synic_cleanup() callback returning -EBUSY to
> cpuhp_issue_call() when tearing down the VMBUS_CONNECT_CPU, even
> if vmbus_connection.conn_state == DISCONNECTED. hv_synic_cleanup()
> should succeed in the case where vmbus_connection.conn_state
> is DISCONNECTED.
>
> The fix is to add an extra condition to test for
> vmbus_connection.conn_state == CONNECTED on the VMBUS_CONNECT_CPU and
> only return early if true. This way the kexec() path can still shut
> everything down while preserving the initial behavior of preventing
> CPU offlining on the VMBUS_CONNECT_CPU while the VM is running.
>
> Fixes: 8a857c55420f29 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
> Signed-off-by: Chris Co <chrco@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx

Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@xxxxxxxxx>

Thanks,
Andrea


> ---
> drivers/hv/hv.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index 0cde10fe0e71..f202ac7f4b3d 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -244,9 +244,13 @@ int hv_synic_cleanup(unsigned int cpu)
>
> 	/*
> 	 * Hyper-V does not provide a way to change the connect CPU once
> -	 * it is set; we must prevent the connect CPU from going offline.
> +	 * it is set; we must prevent the connect CPU from going offline
> +	 * while the VM is running normally. But in the panic or kexec()
> +	 * path where the vmbus is already disconnected, the CPU must be
> +	 * allowed to shut down.
> 	 */
> -	if (cpu == VMBUS_CONNECT_CPU)
> +	if (cpu == VMBUS_CONNECT_CPU &&
> +	    vmbus_connection.conn_state == CONNECTED)
> 		return -EBUSY;
>
> 	/*
> --
> 2.17.1
>
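
For readers following the thread, the guard in the hunk above can be modelled outside the kernel. This is only a minimal userspace sketch, not the driver code: the enum, the vmbus_connection stand-in, the value of VMBUS_CONNECT_CPU and the helper name synic_cleanup_guard() are assumptions made for illustration, and only the condition itself is taken from the patch.

```c
#include <errno.h>

/* Illustrative stand-ins; the real definitions live in the kernel sources. */
#define VMBUS_CONNECT_CPU 0

enum vmbus_connect_state {
	DISCONNECTED,
	CONNECTING,
	CONNECTED,
	DISCONNECTING
};

/* Hypothetical reduction of the kernel's vmbus_connection to one field. */
static struct {
	enum vmbus_connect_state conn_state;
} vmbus_connection;

/*
 * Hypothetical helper mirroring only the check added by the patch:
 * refuse to tear down the connect CPU while the VMBus connection is
 * up, and let every other case proceed.
 */
static int synic_cleanup_guard(unsigned int cpu)
{
	if (cpu == VMBUS_CONNECT_CPU &&
	    vmbus_connection.conn_state == CONNECTED)
		return -EBUSY;

	return 0;
}
```

With conn_state set to CONNECTED, synic_cleanup_guard(VMBUS_CONNECT_CPU) returns -EBUSY, so offlining that CPU is still refused while the VM is running. Once the state is DISCONNECTED, as in the kexec() path described in the commit message, the same call returns 0 and the teardown can continue.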