RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

From: Long Li
Date: Tue Aug 24 2021 - 13:35:06 EST


> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> "Fix a bug ..." is not a very useful subject line. It doesn't say anything about what
> the patch *does*. It doesn't hint at a locking change.
>
> On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@xxxxxxxxxxxxxxxxx wrote:
> > From: Long Li <longli@xxxxxxxxxxxxx>
> >
> > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > pci_destroy_slot(), which takes a mutex.
>
> It's unfortunate that slots are not better integrated into the PCI core. I'm sorry
> your driver even has to worry about this.
> >
> > This is not safe for spinlock. Fix this by moving the children to be
> > deleted to a list on the stack, and removing them after spinlock is
> > released.
> >
> > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > device")
> >
> > Cc: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>
> > Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
> > Cc: Wei Liu <wei.liu@xxxxxxxxxx>
> > Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>
> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> > Cc: Rob Herring <robh@xxxxxxxxxx>
> > Cc: "Krzysztof Wilczyński" <kw@xxxxxxxxx>
> > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > Cc: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> > Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
>
> A lore link to Dan's report would be useful here.

I will address your comments and send v2.

Long

>
> > Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
> > ---
> > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index a53bd8728d0d..d4f3cce18957 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > struct hv_pci_dev *hpdev, *tmp;
> > unsigned long flags;
> > int ret;
> > + struct list_head removed;
> >
> > /*
> > * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> keep_devs)
> > return 0;
> >
> > if (!keep_devs) {
> > - /* Delete any children which might still exist. */
> > + INIT_LIST_HEAD(&removed);
> > +
> > + /* Move all present children to the list on stack */
> > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry) {
> > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry)
> > + list_move_tail(&hpdev->list_entry, &removed);
> > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > +
> > + /* Remove all children in the list */
> > + while (!list_empty(&removed)) {
> > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > + list_entry);
> > list_del(&hpdev->list_entry);
> > if (hpdev->pci_slot)
> > pci_destroy_slot(hpdev->pci_slot);
> > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > put_pcichild(hpdev);
> > put_pcichild(hpdev);
> > }
> > - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > }
> >
> > ret = hv_send_resources_released(hdev);
> > --
> > 2.25.1
> >