Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready

From: Bjorn Helgaas
Date: Wed Aug 28 2024 - 17:42:26 EST


On Wed, Aug 28, 2024 at 04:24:01PM -0500, Mario Limonciello wrote:
> On 8/27/2024 18:48, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> >
> > After a device reset, pci_dev_wait() waits for a device to become
> > completely ready by polling the PCI_COMMAND register. The spec envisions
> > that software would instead poll for the device to stop responding to
> > config reads with Completions with Request Retry Status (RRS).
> >
> > Polling PCI_COMMAND leads to hardware retries that are invisible to
> > software, so the backoff between software retries doesn't work correctly.
> >
> > Root Ports are not required to support the Configuration RRS Software
> > Visibility feature that prevents hardware retries and makes the RRS
> > Completions visible to software, so this series only uses it when available
> > and falls back to PCI_COMMAND polling when it's not.
> >
> > This is completely untested and posted for comments.
> >
> > Bjorn Helgaas (3):
> >   PCI: Wait for device readiness with Configuration RRS
> >   PCI: aardvark: Correct Configuration RRS checking
> >   PCI: Rename CRS Completion Status to RRS
> >
> >  drivers/bcma/driver_pci_host.c             | 10 ++--
> >  drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
> >  drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
> >  drivers/pci/controller/pci-xgene.c         |  6 +-
> >  drivers/pci/controller/pcie-iproc.c        | 18 +++---
> >  drivers/pci/pci-bridge-emul.c              |  4 +-
> >  drivers/pci/pci.c                          | 41 +++++++++-----
> >  drivers/pci/pci.h                          | 11 +++-
> >  drivers/pci/probe.c                        | 33 +++++------
> >  include/linux/bcma/bcma_driver_pci.h       |  2 +-
> >  include/linux/pci.h                        |  1 +
> >  include/uapi/linux/pci_regs.h              |  6 +-
> >  12 files changed, 117 insertions(+), 97 deletions(-)
>
> Although this looks like a useful series, I'm sorry to say that it doesn't
> solve the issue that Gary and I raised. We double-checked today and found
> that reading the Vendor ID works just fine at this time.

Thanks for testing that.

> I think that we're still better off polling PCI_PM_CTRL to "wait" for D0
> after the state change from D3cold.

Is there some spec justification for polling PCI_PM_CTRL? I'm dubious
about doing that just because "it works" in this situation, unless we
have some better understanding about *why* it works and whether all
devices are supposed to work that way.
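
For reference, the Configuration RRS polling described in the cover
letter quoted above can be sketched roughly as below. This is a
standalone userspace simulation, not code from the series: struct
fake_dev, fake_read_vendor_id() and wait_for_ready() are hypothetical
names, the device is modelled as simply becoming ready after a fixed
time, and sleeping is simulated by advancing a counter. The one thing
taken from the spec is that, with RRS Software Visibility enabled, a
Vendor ID read completed with RRS returns 0x0001.

```c
#include <stdint.h>

/*
 * Vendor ID value software sees when a config read of the Vendor ID is
 * completed with RRS and RRS Software Visibility is enabled.
 */
#define VENDOR_ID_RRS 0x0001u

/* Hypothetical device model: answers with RRS until ready_at_ms. */
struct fake_dev {
	int ready_at_ms;	/* simulated time at which the device is ready */
	int now_ms;		/* simulated clock */
	uint16_t vendor_id;	/* real Vendor ID once ready */
};

/* Stand-in for a config read of the Vendor ID register. */
static uint16_t fake_read_vendor_id(const struct fake_dev *d)
{
	return d->now_ms >= d->ready_at_ms ? d->vendor_id : VENDOR_ID_RRS;
}

/*
 * Poll the Vendor ID with exponential backoff (1, 2, 4, ... ms).
 * Because each RRS completion is visible to software, every wait
 * between reads is a real software delay.
 *
 * Returns the simulated elapsed time in ms once the device answers
 * with something other than RRS, or -1 if the next delay would
 * exceed timeout_ms.
 */
static int wait_for_ready(struct fake_dev *d, int timeout_ms)
{
	int delay = 1;

	for (;;) {
		if (fake_read_vendor_id(d) != VENDOR_ID_RRS)
			return d->now_ms;

		if (delay > timeout_ms)
			return -1;

		d->now_ms += delay;	/* simulated sleep of 'delay' ms */
		delay *= 2;
	}
}
```

With a device that becomes ready at 100 ms and a 1000 ms timeout, the
reads happen at 0, 1, 3, 7, 15, 31, 63 and 127 ms, so the sketch
returns 127. When RRS Software Visibility is not available, the
value 0x0001 never reaches software, which is why the series falls
back to PCI_COMMAND polling in that case.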

Bjorn