Re: [PATCH v19 4/4] PCI/MSI: Enable memory decoding before restoring MSI-X messages
From: Niklas Schnelle
Date: Tue Jun 16 2026 - 07:19:05 EST
On Mon, 2026-06-15 at 15:17 -0700, Farhan Ali wrote:
> On 6/15/2026 11:59 AM, Niklas Schnelle wrote:
> > On Mon, 2026-06-15 at 11:35 -0700, Farhan Ali wrote:
> > > The current MSI-X restoration path assumes the Command register Memory bit
> > > is enabled when writing MSI-X messages. But its possible the last saved and
> > ^ it's
> >
> > > restored state of device may not have the Memory bit enabled, even if a
> > ^a
> >
> > > device driver later enables Memory bit and MSI-X. Attempting to access
> > > Memory space without Memory bit enabled can lead to Unsupported Request
> > > (UR) from the device. Fix this by enabling Memory bit if we write MSI-X
> > > messages, and restore it afterwards.
> > Don't we have the same issue in __pci_restore_msi_state()?
>
> I think for MSI it's done through config space registers and so it
> doesn't need to access MMIO.
Ah you're right, I was confused by the call to
__pci_restore_msi_state() in __pci_restore_msi_state() but looking at
the code paths it does seem like for MSI's it won't do MMIO.
>
> >
> > > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > > ---
> > > drivers/pci/msi/msi.c | 9 +++++++++
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> > > index 81d24a270a79..d8d3c8a911ac 100644
> > > --- a/drivers/pci/msi/msi.c
> > > +++ b/drivers/pci/msi/msi.c
> > > @@ -874,6 +874,7 @@ void __pci_restore_msix_state(struct pci_dev *dev)
> > > {
> > > struct msi_desc *entry;
> > > bool write_msg;
> > > + u16 cmd;
> > >
> > > if (!dev->msix_enabled)
> > > return;
> > > @@ -884,6 +885,11 @@ void __pci_restore_msix_state(struct pci_dev *dev)
> > > PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
> > >
> > > write_msg = arch_restore_msi_irqs(dev);
> > > + if (write_msg) {
> > > + pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > Sashiko complains that the read cmd may be all 0xFF…F in case of
> > transient errors. Though I don't see how we'd handle that here without
> > a larger refactor.
> >
> > > + pci_write_config_word(dev, PCI_COMMAND,
> > > + cmd | PCI_COMMAND_MEMORY);
> > > + }
> > >
> > > scoped_guard (msi_descs_lock, &dev->dev) {
> > > msi_for_each_desc(entry, &dev->dev, MSI_DESC_ALL) {
> > > @@ -893,6 +899,9 @@ void __pci_restore_msix_state(struct pci_dev *dev)
> > > }
> > > }
> > >
> > > + if (write_msg)
> > > + pci_write_config_word(dev, PCI_COMMAND, cmd);
> > I wonder if it may be safer to have the Command register reads/writes
> > within the msic_descs_lock critical section in case this gets executed
> > concurrently and so the restore above from one thread could slip
> > between the Memory bit enable and the beginning of the critical
> > section.
>
> Hmm I am not sure if its necessary, AFAIU the msi_desc_lock is used to
> protect the descriptor list. Other config space changes in this function
> is also done outside of this critical section.
>
> OTOH Sashiko does mention correctly pci_msix_write_vector_ctrl() can
> happen even when write_msg is false and which I missed. But this got me
> curious, as to why would we write the vector control for an MSI-X entry
> if we are not writing the message (Address and Data). AFAICT the change
> to write message based on arch_restore_msi_irq() was originally
> introduced with 76ccc297018d ("x86/PCI: Expand the x86_msi_ops to have a
> restore MSIs."). Prior to that we always restored the message and vector
> control mask for an entry. So I am not sure if write_msg is false and we
> don't write a message, why would we want to write the vector control
> mask? If we have a valid reason to write the vector control, even though
> we don't write the message then we should we enable the Memory bit
> unconditionally? Thanks
>
> Farhan
It seems to me that the only place where arch_restore_msi_irqs() even
returns false is in xen_initdom_restore_msi() and from what I can tell
from the Xen code this then ends up doing the MSI write message in the
Xen hypervisor so for MSI-X also does MMIO. And yes I'd think that if
write_msg is false the pci_msix_write_vector_ctrl() would potentially
also cause an UR if the Memory bit remains off. So from my basic
understanding I'd think it should be enabled unconditionally. As a
bonus it would make the code easier to read too.
Thanks,
Niklas