Re: [PATCH] PCI/IOV: Fix out-of-bounds access in sriov_restore_vf_rebar_state()
From: Bjorn Helgaas
Date: Thu Apr 16 2026 - 18:58:01 EST
[+cc Rafael, Eric, Alex, Lukas, generic pci_restore_state() question]
On Wed, Apr 08, 2026 at 06:39:22PM +0200, Marco Nenciarini wrote:
> sriov_restore_vf_rebar_state() extracts bar_idx from the VF Resizable
> BAR control register using a 3-bit field (PCI_VF_REBAR_CTRL_BAR_IDX,
> bits 0-2), which yields values in the range 0-7. This value is then
> used to index into dev->sriov->barsz[], which has PCI_SRIOV_NUM_BARS
> (6) entries.
>
> If the PCI config space read returns garbage data (e.g. 0xffffffff when
> the device is no longer accessible on the bus), bar_idx is 7, causing
> an out-of-bounds array access. UBSAN reports this as:
>
> UBSAN: array-index-out-of-bounds in drivers/pci/iov.c:948:51
> index 7 is out of range for type 'resource_size_t [6]'
>
> This was observed on an NVIDIA RTX PRO 1000 GPU (GB207GLM) that fell
> off the PCIe bus during a failed GC6 power state exit. The subsequent
> pci_restore_state() call triggered the UBSAN splat in
> sriov_restore_vf_rebar_state() since all config space reads returned
> 0xffffffff.
I think all of pci_restore_state() is problematic for all devices, not
just this GPU. If these config reads fail, all the previous config
writes probably failed (silently) as well.
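To make that concern concrete, here is a minimal userspace sketch, not kernel code. It assumes a simulated device whose config reads return 0xffffffff and whose config writes are silently dropped once it is off the bus. struct fake_dev, cfg_read(), cfg_write() and restore_regs() are hypothetical names used only for illustration. Register 0 stands in for the Vendor/Device ID dword, which should never read as all ones on a device that is present:

```c
#include <stdbool.h>
#include <stdint.h>

#define NREGS 4

/* Hypothetical stand-in for a PCI function's config space. */
struct fake_dev {
	bool present;		/* false: device fell off the bus */
	uint32_t regs[NREGS];
};

/* Reads from an absent device come back as all ones. */
static uint32_t cfg_read(const struct fake_dev *d, unsigned int reg)
{
	return d->present ? d->regs[reg] : 0xffffffffu;
}

/* Writes to an absent device are dropped without any error. */
static void cfg_write(struct fake_dev *d, unsigned int reg, uint32_t val)
{
	if (d->present)
		d->regs[reg] = val;
}

/*
 * Restore n saved registers, but only if the device answers at all.
 * Returns false without writing anything when register 0 reads as
 * all ones, since every write would be lost anyway.
 */
static bool restore_regs(struct fake_dev *d, const uint32_t *saved,
			 unsigned int n)
{
	unsigned int i;

	if (n > NREGS || cfg_read(d, 0) == 0xffffffffu)
		return false;

	for (i = 0; i < n; i++)
		cfg_write(d, i, saved[i]);

	return true;
}
```

Whether one up-front check like this would be enough for the real pci_restore_state() is an open question; a device can also disappear partway through the restore.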
And we have this weird retry loop in pci_restore_config_dword():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pci.c?id=v7.0#n1766,
which was originally added by
https://git.kernel.org/linus/26f41062f28d ("PCI: check for pci bar
restore completion and retry") to fix an actual problem:

  On some OEM systems, pci_restore_state() is called while FLR has not
  yet completed. As a result, PCI BAR register restore is not
  successful. This fix reads back the restored value and compares it
  with saved value and re-tries 10 times before giving up.

This just gives me the heebie-jeebies. If we still need this retry
loop, it means all the previous state restoration (PCIe, LTR, ASPM,
IOV, PRI, ATS, DPC, etc.) probably failed, and we end up with a device
where the BARs got restored but none of the previous stuff. That
sounds like a mess.
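
For illustration, the write, read back, compare and retry pattern that the commit log describes can be sketched in userspace C. This is not the code in drivers/pci/pci.c. struct slow_dev, dev_write(), dev_read() and restore_with_retry() are hypothetical names, and the "device" simply drops a fixed number of writes to mimic a reset that has not finished yet:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device that ignores writes while it is still resetting. */
struct slow_dev {
	unsigned int busy_writes;	/* writes that will still be dropped */
	uint32_t reg;			/* the one register being restored */
};

static void dev_write(struct slow_dev *d, uint32_t val)
{
	if (d->busy_writes > 0) {
		d->busy_writes--;	/* dropped silently, no error reported */
		return;
	}
	d->reg = val;
}

static uint32_t dev_read(const struct slow_dev *d)
{
	return d->reg;
}

/*
 * Write the saved value, read it back, and retry until it sticks or
 * max_tries is exhausted. Real code would delay between attempts.
 */
static bool restore_with_retry(struct slow_dev *d, uint32_t saved,
			       int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		dev_write(d, saved);
		if (dev_read(d) == saved)
			return true;
	}
	return false;
}
```

Note that the loop only protects the register it is applied to. Anything written before the device started accepting writes was dropped just as silently, which is the concern raised above.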
> Add a bounds check on bar_idx before using it as an array index to
> prevent the out-of-bounds access.
>
> Fixes: 5a8f77e24a30 ("PCI/IOV: Restore VF resizable BAR state after reset")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Marco Nenciarini <mnencia@xxxxxxxx>
> ---
> Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
> Cc: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
>
> drivers/pci/iov.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 00784a60b..521f2cb64 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -946,6 +946,8 @@ static void sriov_restore_vf_rebar_state(struct pci_dev *dev)
>
>  		pci_read_config_dword(dev, pos + PCI_VF_REBAR_CTRL, &ctrl);
>  		bar_idx = FIELD_GET(PCI_VF_REBAR_CTRL_BAR_IDX, ctrl);
> +		if (bar_idx >= PCI_SRIOV_NUM_BARS)
> +			continue;
>  		size = pci_rebar_bytes_to_size(dev->sriov->barsz[bar_idx]);
>  		ctrl &= ~PCI_VF_REBAR_CTRL_BAR_SIZE;
>  		ctrl |= FIELD_PREP(PCI_VF_REBAR_CTRL_BAR_SIZE, size);
> --
> 2.47.3
>
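
As a standalone illustration of the check in the hunk above, the following userspace sketch shows why a 3-bit index field needs validating before it indexes a 6-entry array. It is only a sketch: NUM_BARS and CTRL_BAR_IDX_MASK are local stand-ins for PCI_SRIOV_NUM_BARS (6) and PCI_VF_REBAR_CTRL_BAR_IDX (bits 0-2) as described in the commit log, and rebar_bar_idx() is a hypothetical helper, not a kernel function:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_BARS		6	/* stand-in for PCI_SRIOV_NUM_BARS */
#define CTRL_BAR_IDX_MASK	0x7u	/* bits 0-2 of the control register */

/*
 * Extract the BAR index from a control register value.
 * Returns false and leaves *bar_idx untouched when the index cannot
 * be used on a NUM_BARS-entry array (values 6 and 7).
 */
static bool rebar_bar_idx(uint32_t ctrl, unsigned int *bar_idx)
{
	unsigned int idx = ctrl & CTRL_BAR_IDX_MASK;

	if (idx >= NUM_BARS)
		return false;

	*bar_idx = idx;
	return true;
}
```

With ctrl == 0xffffffff, as read from a device that is gone, the masked index is 7 and the helper rejects it, which corresponds to the "continue" added by the patch.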