Re: [PATCH] PCI/IOV: Fix out-of-bounds access in sriov_restore_vf_rebar_state()

From: Lukas Wunner

Date: Fri Apr 17 2026 - 01:02:56 EST


On Thu, Apr 16, 2026 at 05:57:45PM -0500, Bjorn Helgaas wrote:
> And we have this weird retry loop in pci_restore_config_dword():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pci.c?id=v7.0#n1766,
> which was originally added by
> https://git.kernel.org/linus/26f41062f28d ("PCI: check for pci bar
> restore completion and retry") to fix an actual problem:
>
> On some OEM systems, pci_restore_state() is called while FLR has not
> yet completed. As a result, PCI BAR register restore is not
> successful. This fix reads back the restored value and compares it
> with saved value and re-tries 10 times before giving up.
>
> This just gives me the heebie-jeebies. If we still need this retry
> loop, it means all the previous state restoration (PCIe, LTR, ASPM,
> IOV, PRI, ATS, DPC, etc.) probably failed, and we end up with a device
> where the BARs got restored but none of the previous stuff. That
> sounds like a mess.

Nowadays we wait for devices to re-appear after reset by polling the
Vendor ID register, see the call to pci_dev_wait() in pcie_flr().
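
To illustrate the polling idea, here is a minimal userspace sketch. It is
not the kernel's pci_dev_wait(); struct fake_dev, fake_read_vendor_id() and
fake_dev_wait() are hypothetical names, the device is simulated, and the
doubling delay is only accounted for, never actually slept:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a device coming out of reset: reads of the
 * Vendor ID dword return all-ones until 'reads_until_ready' reads have
 * been issued.
 */
struct fake_dev {
	unsigned int reads_until_ready;
	uint32_t vendor_device;	/* value at offset 0 once ready */
};

static uint32_t fake_read_vendor_id(struct fake_dev *dev)
{
	if (dev->reads_until_ready > 0) {
		dev->reads_until_ready--;
		return 0xffffffff;	/* device not responding yet */
	}
	return dev->vendor_device;
}

/*
 * Poll until the device answers or the time budget is exceeded.
 * Returns the total simulated wait in msec, or -1 on timeout.
 * The delay doubles after each miss (an assumption of this sketch).
 */
static int fake_dev_wait(struct fake_dev *dev, int timeout_ms)
{
	int delay = 1, waited = 0;

	assert(dev);

	while (fake_read_vendor_id(dev) == 0xffffffff) {
		if (delay > timeout_ms)
			return -1;
		waited += delay;	/* real code would sleep here */
		delay *= 2;
	}
	return waited;
}
```

The point is that config space is only touched again once the device has
demonstrably started responding, rather than after a fixed delay.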

It seems we didn't do that back in the day when 26f41062f28d introduced
the loop. The commit went into v3.4 and back then, pcie_flr() only
waited for a fixed 100 msec:

https://elixir.bootlin.com/linux/v3.4/source/drivers/pci/pci.c#L3052

And indeed pci_reset_function() immediately restored config space
afterwards:

https://elixir.bootlin.com/linux/v3.4/source/drivers/pci/pci.c#L3285
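
For reference, the read-back-and-retry scheme described in the quoted
commit message can be sketched in userspace roughly as follows. This is
not the kernel code; struct slow_dev and the slow_*() helpers are
hypothetical and simulate a device that drops writes while its reset is
still in progress:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESTORE_RETRIES 10	/* retry count from the commit message */

/* Hypothetical device that ignores the first 'writes_dropped' writes. */
struct slow_dev {
	unsigned int writes_dropped;
	uint32_t reg;
};

static void slow_write(struct slow_dev *dev, uint32_t val)
{
	if (dev->writes_dropped > 0) {
		dev->writes_dropped--;	/* reset not finished, write is lost */
		return;
	}
	dev->reg = val;
}

static uint32_t slow_read(const struct slow_dev *dev)
{
	return dev->reg;
}

/*
 * Write the saved value, read it back, and retry up to RESTORE_RETRIES
 * times. Returns true once the read-back matches the saved value.
 */
static bool restore_dword(struct slow_dev *dev, uint32_t saved)
{
	int retry;

	assert(dev);

	if (slow_read(dev) == saved)
		return true;	/* nothing to restore */

	for (retry = 0; retry < RESTORE_RETRIES; retry++) {
		slow_write(dev, saved);
		if (slow_read(dev) == saved)
			return true;
		/* real code would delay briefly before the next attempt */
	}
	return false;
}
```

Note that the sketch only retries the one register it is asked to
restore; anything written before the device became responsive stays lost.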

So I strongly suspect that the loop no longer has a valid raison d'être.
Maybe remove it early in the next cycle to get linux-next coverage for
8 weeks and see if anything breaks (which I doubt)?

As to the validity of cached config space state in general, see this
discussion with Ilpo yesterday, in response to a regression fix
I submitted:

https://lore.kernel.org/all/aeDXktnNLEtmYsbh@xxxxxxxxx/

Thanks,

Lukas