Re: [PATCH 2/2] PCI: Do not wait for disconnected devices when resuming

From: Ilpo Järvinen
Date: Fri Feb 02 2024 - 12:05:07 EST


On Tue, 30 Jan 2024, Ilpo Järvinen wrote:

> On Mon, 29 Jan 2024, Bjorn Helgaas wrote:
>
> > On Mon, Jan 29, 2024 at 01:27:10PM +0200, Ilpo Järvinen wrote:
> > > On runtime resume, pci_dev_wait() is called:
> > > pci_pm_runtime_resume()
> > > pci_pm_bridge_power_up_actions()
> > > pci_bridge_wait_for_secondary_bus()
> > > pci_dev_wait()
> > >
> > > While a device is runtime suspended along with its PCIe hierarchy, the
> > > device could get disconnected. In such case, the link will not come up
> > > no matter how log pci_dev_wait() waits for it.
> >
> > s/PCIe/PCI/ (unless this is a PCIe-specific thing)
> > s/log/long/
> >
> > > Besides the above mentioned case, there could be other ways to get the
> > > device disconnected while pci_dev_wait() is waiting for the link to
> > > come up.
> > >
> > > Make pci_dev_wait() to exit if the device is already disconnected to
> > > avoid unnecessary delay. As disconnected device is not really even a
> > > failure in the same sense as link failing to come up for whatever
> > > reason, return 0 instead of errno.
> >
> > The device being disconnected is not the same as a link failure.
>
> This we agree and it's what I tried to write above.
>
> > Do
> > all the callers do the right thing if pci_dev_wait() returns success
> > when there's no device there?
>
> It's a bit complicated. I honestly don't know what is the best approach
> here so I'm very much open to your suggestion what would be preferrable.
>
> There are two main use cases (more than two callers but they seem boil
> down to two use cases).
>
> One use case is reset related functions and those would probably prefer to
> have error returned if the wait, and thus reset, failed.
>
> Then the another is wait for buses, that is,
> pci_bridge_wait_for_secondary_bus() which return 0 if there's no device
> (wait successful). For it, it would make sense to return 0 also when
> device is disconnected because it seems analoguous to the case where
> there's no device in the first place.
>
> Perhaps it would be better to return -ENOTTY from pci_dev_wait() and check
> for that disconnected condition inside
> pci_bridge_wait_for_secondary_bus()? To further complicate things,
> however, DPC also depends on the return value of
> pci_bridge_wait_for_secondary_bus() and from its perspective, returning
> error when there is a device that is disconnected might be the desirable
> alternative (but I'm not fully sure how everything in DPC works but I
> highly suspect I'm correct here).

Just to note here I intend to reverse the return to -ENOTTY in v2.
It is easier and doing anything more complicated than that felt
over-engineering because it would have just avoided marking same
disconnected devices disconnected for another time which is harmless.

--
i.