Re: [PATCH v11 4/9] PCI: Add additional checks for flr reset

From: Bjorn Helgaas

Date: Tue Mar 24 2026 - 18:54:01 EST


On Mon, Mar 16, 2026 at 12:15:39PM -0700, Farhan Ali wrote:
> If a device is in an error state, then any reads of device registers can
> return error value. Add additional checks to validate if a device is in an
> error state before doing an flr reset.

s/flr/FLR/ (also in subject)

Also please extend the subject to say something specific about the
"additional checks". E.g.,

  PCI: Fail FLR when config space inaccessible

> Cc: stable@xxxxxxxxxxxxxxx
> Reviewed-by: Benjamin Block <bblock@xxxxxxxxxxxxx>
> Reviewed-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> ---
> drivers/pci/pci.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 373421f4b9d8..8e4d924f4e88 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4371,12 +4371,19 @@ EXPORT_SYMBOL_GPL(pcie_flr);
>   */
>  int pcie_reset_flr(struct pci_dev *dev, bool probe)
>  {
> +	u32 reg;
> +
>  	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
>  		return -ENOTTY;
>
>  	if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
>  		return -ENOTTY;
>
> +	if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &reg)) {
> +		pci_warn(dev, "Device unable to do an FLR\n");
> +		return -ENOTTY;
> +	}

I guess the point of this is to detect devices that are inaccessible?
The same sort of thing as in pci_dev_save_and_disable() from patch
3/9? But we use "dev->error_state == pci_channel_io_perm_failure"
instead?

No matter what we do, this has the same race as in patch 3/9. And I
think using dev->error_state also depends on AER being enabled, which
cuts out many PCIe devices.

I think using the same exact code as in pci_dev_save_and_disable()
would be more straightforward. And also the same sort of wording in
the message, e.g., "Device config space inaccessible; unable to FLR"
or similar. I foresee many of these messages in my future, and it
will be helpful to have a specific clue about why FLR failed :)
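
To make the suggestion concrete, here is a rough user-space sketch of the
shape I have in mind. It is not the code from patch 3/9 (which isn't quoted
here); it assumes the accessibility check is a config read compared against
the all-ones error response. struct mock_pci_dev, mock_read_devcap() and
mock_reset_flr() are hypothetical stand-ins, and the race mentioned above is
still present.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* User-space stand-ins for kernel definitions; not the kernel headers. */
#define PCI_ERROR_RESPONSE_32	0xFFFFFFFFu	/* all ones on a failed read */
#define PCI_EXP_DEVCAP_FLR	0x10000000u	/* Function Level Reset capable */

/* Hypothetical mock device: just enough state to drive the check. */
struct mock_pci_dev {
	uint32_t devcap;	/* Device Capabilities cached at enumeration */
	bool accessible;	/* false => config reads return all ones */
};

/* Assumed behaviour: an inaccessible device reads back as all ones. */
static uint32_t mock_read_devcap(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->accessible ? dev->devcap : PCI_ERROR_RESPONSE_32;
}

/* Mirrors the control flow of pcie_reset_flr() with the reworded message. */
static int mock_reset_flr(const struct mock_pci_dev *dev, bool probe)
{
	if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
		return -ENOTTY;

	/* Fail early, and say why, if config space cannot be read. */
	if (mock_read_devcap(dev) == PCI_ERROR_RESPONSE_32) {
		fprintf(stderr,
			"Device config space inaccessible; unable to FLR\n");
		return -ENOTTY;
	}

	if (probe)
		return 0;

	/* A real implementation would initiate the FLR here. */
	return 0;
}
```

The only point is that the check and the message name the actual cause
(config space inaccessible) rather than a generic "unable to do an FLR".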

>  	if (probe)
>  		return 0;
>
> --
> 2.43.0
>