Re: [PATCH v1] PCI: Wait for device readiness after D3hot -> D0uninitialized transition

From: Bjorn Helgaas

Date: Fri May 15 2026 - 11:57:38 EST


On Fri, May 15, 2026 at 02:46:54PM +0200, Lukas Wunner wrote:
> On Thu, May 14, 2026 at 04:38:11PM -0500, Bjorn Helgaas wrote:
> > On Thu, May 14, 2026 at 08:06:25PM +0000, sashiko-bot@xxxxxxxxxx wrote:
> > > > +	if (!(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET))
> > > > +		pci_dev_wait(dev, "power up D3hot->D0uninitialized",
> > > > +			     PCIE_RESET_READY_POLL_MS);
> > >
> > > Is it safe to ignore the return value of pci_dev_wait() here?
> > >
> > > If the device times out or disconnects, pci_dev_wait() returns -ENOTTY. By
> > > ignoring this error, pci_power_up() continues, unconditionally sets
> > > dev->current_state = PCI_D0, and can return success (1 for need_restore).
> > >
> > > If pci_power_up() returns success, will callers like pci_back_from_sleep()
> > > execute pci_restore_state() against an unresponsive device, which could
> > > potentially trigger PCIe Master Aborts, AER events, or system crashes?
> > >
> > > Should pci_power_up() propagate the error from pci_dev_wait() to prevent
> > > this?
> >
> > Good point. I think it should look like this instead:
> >
> > 	if (state == PCI_D3hot) {
> > 		pci_dev_d3_sleep(dev);
> > 		if (!(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) {
> > 			ret = pci_dev_wait(dev, "power up D3hot->D0uninitialized",
> > 					   PCIE_RESET_READY_POLL_MS);
> > 			if (ret) {
> > 				pci_err(dev, "Not ready after soft reset\n");
> > 				dev->current_state = PCI_D3cold;
> > 				return -EIO;
> > 			}
>
> pci_dev_wait() already emits a warning message on timeout, so the
> additional pci_err() is probably not needed. Otherwise the user
> would see duplicate messages, i.e.:
>
> pci SSSS:BB:DD.F: not ready 60000ms after power up D3hot->D0uninitialized
> pci SSSS:BB:DD.F: Not ready after soft reset

True, it is redundant. My thinking was that the other message from
pci_power_up() is KERN_ERR, while the one from pci_dev_wait() is only
KERN_WARNING.

But maybe that message from pci_dev_wait() should be KERN_ERR instead
of KERN_WARNING? As far as that device is concerned, the lack of
response does seem like more than just a warning.
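
For readers following the thread, the error-propagation pattern under discussion can be modeled in userspace. This is only a rough sketch, not kernel code: every name prefixed with model_ or MODEL_ is hypothetical, the poll loop is a simplified stand-in for pci_dev_wait(), and the 60000 ms timeout is taken from the sample message quoted above. It assumes the wait returns -ENOTTY on timeout, as stated in the review comment, and that the caller maps that to -EIO and records D3cold, as in the proposed hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel power states; not the real pci_power_t. */
enum model_state { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

/* Assumed timeout, matching the "60000ms" in the quoted log message. */
#define MODEL_READY_POLL_MS 60000

struct model_dev {
	enum model_state current_state;
	bool no_soft_reset;	/* models PCI_PM_CTRL_NO_SOFT_RESET being set */
	int ready_after_ms;	/* when the simulated device responds; < 0 means never */
};

/*
 * Simplified model of a readiness poll with exponential backoff.
 * Time is simulated, so nothing actually sleeps.
 * Returns 0 once the device responds, -ENOTTY on timeout.
 */
static int model_dev_wait(const struct model_dev *dev, const char *reason,
			  int timeout_ms)
{
	int waited = 0, delay = 1;

	for (;;) {
		if (dev->ready_after_ms >= 0 && waited >= dev->ready_after_ms)
			return 0;
		if (waited >= timeout_ms) {
			fprintf(stderr, "not ready %dms after %s\n",
				waited, reason);
			return -ENOTTY;
		}
		waited += delay;
		delay *= 2;
	}
}

/*
 * Model of the proposed flow: propagate a failed wait instead of
 * unconditionally recording D0.
 * Returns 1 if the caller must restore state, 0 if not, -EIO on failure.
 */
static int model_power_up(struct model_dev *dev)
{
	bool need_restore = false;

	if (dev->current_state == MODEL_D0)
		return 0;

	if (dev->current_state == MODEL_D3HOT && !dev->no_soft_reset) {
		int ret = model_dev_wait(dev, "power up D3hot->D0uninitialized",
					 MODEL_READY_POLL_MS);
		if (ret) {
			/* Device never came back: treat it as inaccessible. */
			dev->current_state = MODEL_D3COLD;
			return -EIO;
		}
		need_restore = true;
	}

	dev->current_state = MODEL_D0;
	return need_restore;
}
```

In this model a caller that sees a negative return skips its restore step, which is the behavior the review comment asked for. Whether the timeout message is logged as a warning or an error does not affect the control flow.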