RE: [PATCH v2 2/6] PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic

From: Michael Kelley (LINUX)
Date: Fri Apr 07 2023 - 12:05:22 EST


From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Monday, April 3, 2023 7:06 PM
>
> When the host tries to remove a PCI device, it first sends a PCI_EJECT
> message to the guest. The guest is expected to remove the PCI device
> gracefully and reply with a PCI_EJECTION_COMPLETE message. The host then
> sends the VMBus message CHANNELMSG_RESCIND_CHANNELOFFER (by the time the
> guest receives it, the device is already unassigned from the guest), and
> the guest can do some final cleanup work. If the guest fails to respond
> to the PCI_EJECT message within one minute, the host sends
> CHANNELMSG_RESCIND_CHANNELOFFER anyway and removes the PCI device
> forcibly.
>
> With fast device addition/removal, the PCI device driver may still be
> configuring MSI-X interrupts when the guest receives the PCI_EJECT
> message. The channel callback calls hv_pci_eject_device(), which sets
> hpdev->state to hv_pcichild_ejecting and schedules the work item
> hv_eject_device_work(). If the PCI device driver is in
> pci_alloc_irq_vectors() -> ... -> hv_compose_msi_msg() at that point, the
> updated hpdev->state makes hv_compose_msi_msg() break out of its while
> loop and leave data->chip_data at its default value of NULL. Later, when
> the PCI device driver calls request_irq() -> ... -> hv_irq_unmask(), the
> guest crashes in hv_arch_irq_unmask() because data->chip_data is NULL.
>
> Fix the issue by not testing hpdev->state in the while loop: when the
> guest receives PCI_EJECT, the device is still assigned to the guest, and
> the guest has one minute to finish the device removal gracefully. We don't
> really need to (and we should not) test hpdev->state in the loop.
>
> Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>
> v2:
> Removed the "debug code".
> No change to the patch body.
> Added Cc:stable
>
> drivers/pci/controller/pci-hyperv.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index b82c7cde19e66..1b11cf7391933 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -643,6 +643,11 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  	pbus = pdev->bus;
>  	hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
>  	int_desc = data->chip_data;
> +	if (!int_desc) {
> +		dev_warn(&hbus->hdev->device, "%s() can not unmask irq %u\n",
> +			 __func__, data->irq);
> +		return;
> +	}
>
>  	spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags);
>
> @@ -1911,12 +1916,6 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  		hv_pci_onchannelcallback(hbus);
>  		spin_unlock_irqrestore(&channel->sched_lock, flags);
>
> -		if (hpdev->state == hv_pcichild_ejecting) {
> -			dev_err_once(&hbus->hdev->device,
> -				     "the device is being ejected\n");
> -			goto enable_tasklet;
> -		}
> -
>  		udelay(100);
>  	}
>
> --
> 2.25.1
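
The failure mode described in the commit message can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
the driver code: every type and function name below (fake_dev, fake_irq_data,
compose_msg, unmask, run_scenario) is hypothetical, the host reply is modeled
as a simple poll counter, and only the warning text is borrowed from the
patch. It shows that leaving the wait loop early on the "ejecting" state
leaves chip_data NULL, and that waiting for the reply does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's state; not kernel types. */
enum child_state { CHILD_STARTED, CHILD_EJECTING };

struct fake_dev {
	enum child_state state;
	int polls_until_reply;	/* polls left before the "host" replies */
};

struct fake_irq_data {
	unsigned int irq;
	void *chip_data;	/* stays NULL until compose_msg() completes */
};

static int fake_descriptor;	/* placeholder for an interrupt descriptor */

/*
 * Models the wait loop. With test_state == true it mimics the old
 * behaviour (give up once the device is marked as ejecting); with
 * test_state == false it mimics the patched loop (keep polling).
 */
static void compose_msg(struct fake_dev *dev, struct fake_irq_data *data,
			bool test_state)
{
	assert(dev && data);

	for (;;) {
		if (dev->polls_until_reply == 0) {
			data->chip_data = &fake_descriptor;
			return;
		}
		dev->polls_until_reply--;

		if (test_state && dev->state == CHILD_EJECTING)
			return;	/* early exit: chip_data is still NULL */
	}
}

/* Models the added guard: refuse to unmask without a descriptor. */
static int unmask(const struct fake_irq_data *data)
{
	assert(data);

	if (!data->chip_data) {
		fprintf(stderr, "%s() can not unmask irq %u\n",
			__func__, data->irq);
		return -1;
	}
	return 0;
}

/* Eject arrives while the reply is still three polls away. */
int run_scenario(bool test_state)
{
	struct fake_dev dev = { .state = CHILD_EJECTING,
				.polls_until_reply = 3 };
	struct fake_irq_data data = { .irq = 24, .chip_data = NULL };

	compose_msg(&dev, &data, test_state);
	return unmask(&data);
}
```

In this model, run_scenario(true) takes the early exit and the guard in
unmask() rejects the NULL descriptor, while run_scenario(false) waits for the
reply and unmasks normally. The real driver differs in many ways (locking,
the tasklet, the VMBus channel callback), so this should be read as an
illustration of the reasoning only.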

Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>