Re: [PATCH v18 3/4] vfio/pci: Add a reset_done callback for vfio-pci driver

From: Alex Williamson

Date: Tue Jun 09 2026 - 15:24:57 EST


On Mon, 8 Jun 2026 12:26:36 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
> On 6/4/2026 12:57 PM, Alex Williamson wrote:
> > For things like MSI/X, SR-IOV, RE-BAR, etc. we're actually restoring
> > from the kernel internal state rather than the save buffer state, so
> > this is a no-op. However, one thing in that list stands out, TPH.
> >
> > We don't yet support enabling TPH, but there are series on the list
> > that propose to add this. The TPH buffer space in the saved state is
> > allocated just by the capability being present. On open TPH is
> > disabled and the saved state is untouched, zeros. If TPH is then
> > enabled and the device reset, the pre-reset save state populates the
> > TPH save buffer and we restore that state post-reset. With the change
> > here, reset_done would then push the open saved state. The live TPH
> > state is enabled, therefore the restore pushes the original open state,
> > zeros.
> >
> > This would result in a visible user change and maybe more importantly
> > shows that we're relying on ad-hoc behavior, without really any specific
> > policy to have this work reliably. It actually seems like only in the
> > close function, where we've disabled anything the user might have
> > enabled, is it really valid to restore the original state. Thanks,
> >
> > Alex
>
> I was trying to see if we can remove the callback and still restore the
> device. The original reason why we wanted the callback, was to restore
> the device state into something sane[1]. Since commit a2f1e22390ac
> ("PCI/ERR: Ensure error recoverability at all times"), which removed the
> saved_state check from pci_restore_state(), we will always restore from
> the last saved state. However, the last saved state in vfio can have the
> Command register Memory bit disabled (for example could be done through
> vfio_pci_pre_reset() in QEMU). This becomes problematic when we try to
> restore MSI-X from in kernel data and when MSI-X is enabled. AFAICT the
> msix restore path doesn't check if the Memory bit is enabled before
> writing the MSI-X message, which could cause an Unsupported Request
> error from the device. From my experiments, enabling Memory bit before
> restoring MSI-X state was sufficient to get the device in a sane state
> without this patch.
>
> So we could remove this patch. But maybe there is a gap in MSI-X
> restoration path?

I'd say yes, that's a gap in __pci_restore_msix_state() that it's
writing to device MMIO space under the assumption that the device
entered save/reset/restore with memory enabled. It should really force
memory enable to be set around the access and restored after. Thanks,

Alex