RE: [PATCH 2/2] pci-hyperv: properly handle device eject

From: Long Li
Date: Tue Sep 13 2016 - 13:48:53 EST




> -----Original Message-----
> From: Dexuan Cui
> Sent: Tuesday, September 13, 2016 2:51 AM
> To: Long Li <longli@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Bjorn Helgaas
> <bhelgaas@xxxxxxxxxx>
> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
>
> > From: devel [mailto:driverdev-devel-bounces@xxxxxxxxxxxxxxxxxxxxxx] On
> > Behalf Of Long Li
> > Sent: Tuesday, September 13, 2016 7:54 ...
> > A PCI_EJECT message can arrive at the same time we are calling
> > pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS
> > message. In this case we could potentially modify the bus from two places.
> > Properly lock the bus access.
> >
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct work_struct *work)
> >  	pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0,
> >  					   wslot);
> >  	if (pdev) {
> > -		pci_stop_and_remove_bus_device(pdev);
> > +		pci_stop_and_remove_bus_device_locked(pdev);
> >  		pci_dev_put(pdev);
> >  	}
>
> The _locked version tries to get the mutex pci_rescan_remove_lock.
>
> But it looks like pci_scan_child_bus() doesn't try to get the mutex(?), so
> how can this patch make sure the 2 code paths are not running simultaneously?

Thanks for the review.

The lock is to protect the following call to pci_scan_child_bus() in pci_devices_present_work():

	/*
	 * Tell the core to rescan bus
	 * because there may have been changes.
	 */
	pci_lock_rescan_remove();
	pci_scan_child_bus(hbus->pci_bus);
	pci_unlock_rescan_remove();

This race condition has shown up in the tests.
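
To illustrate why taking the same lock on both paths is enough, here is a rough user-space sketch (not the driver code). It assumes a pthread mutex as a stand-in for pci_rescan_remove_lock; scan_path(), eject_path(), touch_bus() and run_demo() are hypothetical names used only for this illustration.

```c
#include <pthread.h>

/* Stand-in for the kernel's pci_rescan_remove_lock (assumption: user-space model). */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Model of the bus: how many paths are modifying it now, and overlaps observed. */
static int writers_on_bus;
static int overlaps_seen;

/* Must be called with rescan_remove_lock held. */
static void touch_bus(void)
{
	if (++writers_on_bus > 1)
		overlaps_seen++;
	writers_on_bus--;
}

/* Models the scan in pci_devices_present_work(): scan under the lock. */
static void *scan_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&rescan_remove_lock);   /* pci_lock_rescan_remove() */
		touch_bus();                               /* pci_scan_child_bus() */
		pthread_mutex_unlock(&rescan_remove_lock); /* pci_unlock_rescan_remove() */
	}
	return NULL;
}

/* Models hv_eject_device_work(): the _locked remove takes the same lock. */
static void *eject_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&rescan_remove_lock);
		touch_bus();                               /* stop and remove the device */
		pthread_mutex_unlock(&rescan_remove_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns how often they were on the bus at once. */
int run_demo(void)
{
	pthread_t scan, eject;

	writers_on_bus = 0;
	overlaps_seen = 0;
	pthread_create(&scan, NULL, scan_path, NULL);
	pthread_create(&eject, NULL, eject_path, NULL);
	pthread_join(scan, NULL);
	pthread_join(eject, NULL);
	return overlaps_seen;
}
```

With both paths holding the lock, run_demo() should return 0. If either path skipped the lock, nothing would keep the two from overlapping, which is the window the patch is meant to close.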

You raised a valid concern about create_root_hv_pci_bus(). There might be another race condition there; I'll look into it.

>
> Thanks,
> -- Dexuan