[PATCH] xen: xen-pciback: Reset MSI-X state when exposing a device

From: Chao Gao
Date: Tue Dec 04 2018 - 21:15:43 EST


Some pass-through devices stop working after a guest reboot. Assigning
such a device to another guest hits the same issue, and the only way to
make it work again is to unbind it from pciback and bind it again.
This issue was reported a year ago [1]; more detail can be found in
[2].

The root cause is that Xen's internal MSI-X state isn't reset properly
during reboot or re-assignment. In the above case, Xen set the maskall
bit to mask all MSI-X interrupts after it detected a potential security
issue. Even after a device reset, Xen didn't reset its internal maskall
bit. As a result, the maskall bit is set again on the next write to the
MSI-X message control register.
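
As a rough illustration of this failure mode, the following userspace C
sketch models a device whose MSI-X message control register is filtered
by a hypervisor that keeps its own maskall flag. It is a toy model, not
Xen code: every toy_* name and the struct layout are made up for this
example, and only the maskall bit position (bit 14 of message control)
is taken from the PCI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSIX_FLAGS_MASKALL 0x4000u /* bit 14 of MSI-X message control */

/* Toy device: only the MSI-X message control register is modelled. */
struct toy_dev {
	uint16_t msix_ctrl;     /* value held by the hardware register */
	bool hv_shadow_maskall; /* hypothetical hypervisor-side flag */
};

/* The hypervisor decides the device is unsafe and masks all vectors. */
static void toy_hv_force_maskall(struct toy_dev *d)
{
	assert(d);
	d->hv_shadow_maskall = true;
	d->msix_ctrl |= MSIX_FLAGS_MASKALL;
}

/* A device reset clears the hardware register but not the shadow flag. */
static void toy_device_reset(struct toy_dev *d)
{
	assert(d);
	d->msix_ctrl = 0;
}

/* Config write as filtered by the hypervisor: the shadow flag is re-applied. */
static void toy_cfg_write(struct toy_dev *d, uint16_t val)
{
	assert(d);
	if (d->hv_shadow_maskall)
		val |= MSIX_FLAGS_MASKALL;
	d->msix_ctrl = val;
}

static uint16_t toy_cfg_read(const struct toy_dev *d)
{
	assert(d);
	return d->msix_ctrl;
}

/*
 * Write the current value back so the hypervisor's shadow state reaches
 * the register, then report whether maskall ended up set.
 */
static bool toy_maskall_is_stale(struct toy_dev *d)
{
	toy_cfg_write(d, toy_cfg_read(d));
	return (toy_cfg_read(d) & MSIX_FLAGS_MASKALL) != 0;
}
```

In this model, a device that went through toy_hv_force_maskall() and
then toy_device_reset() reads back with maskall clear, yet
toy_maskall_is_stale() still reports true, because the write-back
re-applies the hypervisor's flag.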

Given that PHYSDEVOP_prepare_msix also makes Xen reset its internal
MSI-X state for a device, we use it to fix this issue rather than
introducing another dedicated sub-hypercall.

Note that PHYSDEVOP_release_msix will fail if a mapping between the
device's MSI-X vectors and pirqs has been created. This limitation
prevents us from calling it when detaching a device from a guest during
guest shutdown. Thus it is called right before PHYSDEVOP_prepare_msix,
when the device is exposed to a guest.
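
The ordering constraint can be sketched with a small userspace C model.
This is only an illustration under stated assumptions: the toy_* names,
the struct fields, and the choice of -EBUSY as the failure code are all
hypothetical and are not taken from Xen's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-device MSI-X bookkeeping kept by the hypervisor. */
struct toy_msix_state {
	bool prepared;    /* hypervisor currently tracks MSI-X for the device */
	bool maskall;     /* stale internal maskall flag */
	int mapped_pirqs; /* vector-to-pirq mappings that still exist */
};

/* Release fails while any mapping exists (error code is an assumption). */
static int toy_release_msix(struct toy_msix_state *s)
{
	assert(s);
	if (s->mapped_pirqs > 0)
		return -EBUSY;
	s->prepared = false;
	return 0;
}

/* Prepare starts from fresh internal state, dropping the stale flag. */
static int toy_prepare_msix(struct toy_msix_state *s)
{
	assert(s);
	s->prepared = true;
	s->maskall = false;
	return 0;
}

/* Release then prepare, but only when maskall is found to be set. */
static int toy_msix_reset(struct toy_msix_state *s)
{
	int err;

	assert(s);
	if (!s->maskall)
		return 0;

	err = toy_release_msix(s);
	if (err)
		return err;

	return toy_prepare_msix(s);
}
```

In the model, toy_msix_reset() fails while mappings are still present
(as they would be during guest shutdown) and succeeds once they are
gone, which is why the reset is attempted when the device is exposed
rather than when it is detached.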

[1]: https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html
[2]: https://lists.xen.org/archives/html/xen-devel/2018-11/msg01616.html

Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
---
 drivers/xen/xen-pciback/pci_stub.c | 49 ++++++++++++++++++++++++++++++++++++++
 drivers/xen/xen-pciback/pciback.h  |  1 +
 drivers/xen/xen-pciback/xenbus.c   | 10 ++++++++
 3 files changed, 60 insertions(+)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 59661db..f8623d0 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -87,6 +87,55 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 	return psdev;
 }

+/*
+ * Reset Xen internal MSI-X state by invoking PHYSDEVOP_{release, prepare}_msix.
+ */
+int pcistub_msix_reset(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCI_MSI
+	if (dev->msix_cap) {
+		struct physdev_pci_device ppdev = {
+			.seg = pci_domain_nr(dev->bus),
+			.bus = dev->bus->number,
+			.devfn = dev->devfn
+		};
+		int err;
+		u16 val;
+
+		/*
+		 * Do a write first to flush Xen's internal state to hardware
+		 * such that the following read can infer whether MSI-X maskall
+		 * bit is set by Xen.
+		 */
+		pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &val);
+		pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, val);
+
+		pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &val);
+		if (!(val & PCI_MSIX_FLAGS_MASKALL))
+			return 0;
+
+		pr_info("Reset MSI-X state for device %04x:%02x:%02x.%d\n",
+			ppdev.seg, ppdev.bus, PCI_SLOT(ppdev.devfn),
+			PCI_FUNC(ppdev.devfn));
+
+		err = HYPERVISOR_physdev_op(PHYSDEVOP_release_msix, &ppdev);
+		if (err) {
+			dev_warn(&dev->dev, "MSI-X release failed (%d)\n",
+				 err);
+			return err;
+		}
+
+		err = HYPERVISOR_physdev_op(PHYSDEVOP_prepare_msix, &ppdev);
+		if (err) {
+			dev_err(&dev->dev, "MSI-X preparation failed (%d)\n",
+				err);
+			return err;
+		}
+	}
+#endif
+	return 0;
+}
+
 /* Don't call this directly as it's called by pcistub_device_put */
 static void pcistub_device_release(struct kref *kref)
 {
diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h
index 263c059..9046154 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -66,6 +66,7 @@ struct pci_dev *pcistub_get_pci_dev_by_slot(struct xen_pcibk_device *pdev,
 struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 				    struct pci_dev *dev);
 void pcistub_put_pci_dev(struct pci_dev *dev);
+int pcistub_msix_reset(struct pci_dev *dev);

/* Ensure a device is turned off or reset */
void xen_pcibk_reset_device(struct pci_dev *pdev);
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 581c4e1..2f71f26 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -243,6 +243,16 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 		goto out;
 	}

+	/*
+	 * Reset Xen's internal MSI-X state before exposing a device.
+	 *
+	 * In some cases, Xen's internal MSI-X state is not clean, which would
+	 * prevent the new guest from receiving MSIs.
+	 */
+	err = pcistub_msix_reset(dev);
+	if (err)
+		goto out;
+
 	err = xen_pcibk_add_pci_dev(pdev, dev, devid,
 				    xen_pcibk_publish_pci_dev);
 	if (err)
--
1.8.3.1