Re: [PATCH v4 01/10] PCI: Avoid saving error values for config space

From: Farhan Ali

Date: Mon Oct 06 2025 - 13:55:02 EST



On 10/4/2025 7:54 AM, Lukas Wunner wrote:
On Wed, Oct 01, 2025 at 10:12:03AM -0700, Farhan Ali wrote:
AFAIU if the state_saved flag was set to true then any state that we have
saved should be valid and should be okay to be restored from. We just want
to avoid saving any invalid data.
The state_saved flag is used by the PCI core to detect whether a driver
has called pci_save_state() in one of its suspend callbacks. If it did,
the PCI core assumes that the driver has taken on the responsibility to
put the device into a low power state. The PCI core will thus not put
the device into a low power state itself and it won't (again) call
pci_save_state().

Hence state_saved is cleared before the driver suspend callbacks are
invoked and it is checked afterwards.

Clearing the state_saved flag in pci_restore_state() merely serves the
purpose of ensuring that the flag is cleared ahead of the next suspend
and resume cycle.

It is a fallacy to think that state_saved indicates validity of the
saved state.

Hi Lukas,

Thanks for the detailed explanation, this was very helpful for me.

Unfortunately pci_restore_state() was amended by c82f63e411f1 to
bail out if state_saved is false. This has arguably caused more
problems than it solved, so I have prepared this development branch
which essentially reverts the commit and undoes most of the awful
workarounds that it necessitated:

https://github.com/l1k/linux/commits/aer_reset_v1

I intend to submit this after the merge window has closed.

The motivation of c82f63e411f1 was to prevent restoring state if
pci_save_state() hasn't been called before. I am solving that by
calling pci_save_state() on device addition, hence error
recoverability is ensured at all times.

I believe this also makes patch [01/10] in your series unnecessary.

I tested your patches + patches 2-10 of this series. It unfortunately didn't completely help with the s390x use case. We still need the check to in pci_save_state() from this patch to make sure we are not saving error values, which can be written back to the device in pci_restore_state().

As part of the error recovery userspace can use the VFIO_DEVICE_RESET to reset the device (pci_try_reset_function()). The function call for this is:

pci_dev_save_and_disable <https://elixir.bootlin.com/linux/v6.17.1/C/ident/pci_dev_save_and_disable>();

__pci_reset_function_locked <https://elixir.bootlin.com/linux/v6.17.1/C/ident/__pci_reset_function_locked>();

pci_dev_restore <https://elixir.bootlin.com/linux/v6.17.1/C/ident/pci_dev_restore>();

So we can end up overwriting the initial saved state (added by you in pci_bus_add_device()). Do we need to update the pci_dev_save_and_disable() not to save the state?

Thanks

Farhan



A lot of drivers call pci_save_state() in their probe hook and
that continues to be correct if they modified Config Space
vis-a-vis what was saved on device addition.

Thanks,

Lukas