Re: [PATCH v11 3/9] PCI: Avoid saving config space state if inaccessible

From: Bjorn Helgaas

Date: Tue Mar 24 2026 - 17:41:14 EST


On Mon, Mar 16, 2026 at 12:15:38PM -0700, Farhan Ali wrote:
> The current reset process saves the device's config space state before
> reset and restores it afterward. However errors may occur unexpectedly and
> it may then be impossible to save config space because the device may be
> inaccessible (e.g. DPC) or config space may be corrupted. This results in
> saving corrupted values that get written back to the device during state
> restoration.

This patch only addresses the "inaccessible" part, so I'd drop the
"config space may be corrupted" because we aren't checking for that.

> With a reset we want to recover/restore the device into a functional state.
> So avoid saving the state of the config space when the device config space
> is inaccessible.
>
> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> Reviewed-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>

Reviewed-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

> ---
> drivers/pci/pci.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a93084053537..373421f4b9d8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5014,6 +5014,7 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
> {
> const struct pci_error_handlers *err_handler =
> dev->driver ? dev->driver->err_handler : NULL;
> + u32 val;
>
> /*
> * dev->driver->err_handler->reset_prepare() is protected against
> @@ -5033,6 +5034,19 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
> */
> pci_set_power_state(dev, PCI_D0);
>
> + /*
> + * If device's config space is inaccessible it can return ~0 for
> + * any reads. Since VFs can also return ~0 for Device and Vendor ID
> + * check Command and Status registers. At the very least we should
> + * avoid restoring config space for device with error bits set in
> + * Status register.
> + */
> + pci_read_config_dword(dev, PCI_COMMAND, &val);
> + if (PCI_POSSIBLE_ERROR(val)) {

Obviously this is still racy because the device may become
inaccessible partway through saving the state, and it might be worth
acknowledging that in the comment. But I think this is an improvement
over what we do now.

Sashiko complains about this, but I think it's mainly because of that
last sentence of the comment that says "error bits set in Status
register". This patch has to do with *saving*, not restoring, so I'd
just drop that last sentence. FWIW, Sashiko said:

The comment indicates that we should avoid restoring config space
for a device with error bits set in the Status register, but the
code only uses PCI_POSSIBLE_ERROR(val).

Since PCI_POSSIBLE_ERROR() only evaluates whether the entire 32-bit
value is exactly 0xFFFFFFFF (indicating complete device
inaccessibility), does this actually check for individual error
flags in the Status register?

If a device logs an error but is still accessible, val will reflect
those bits but will not equal 0xFFFFFFFF, causing the check to
evaluate to false. Should this code also check specific bits in the
upper 16 bits of val (such as PCI_STATUS_SIG_SYSTEM_ERROR or
PCI_STATUS_DETECTED_PARITY) to match the stated intention in the
comment?

> + pci_warn(dev, "Device config space inaccessible\n");
> + return;
> + }
> +
> pci_save_state(dev);
> /*
> * Disable the device by clearing the Command register, except for
> --
> 2.43.0
>