On Thu, Nov 12 2020 at 20:15, Thomas Gleixner wrote:
The recent changes to store the MSI irqdomain pointer in struct device
missed that Intel DMAR does not register virtual function devices. Due to
that a VF device gets the plain PCI-MSI domain assigned and then issues
compat MSI messages which get caught by the interrupt remapping unit.
Cure that by inheriting the irq domain from the physical function
device.
That's a temporary workaround. The correct fix is to inherit the irq domain
from the bus, but that's a larger effort which needs quite some other
changes to the way how x86 manages PCI and MSI domains.
Bah, that's not really going to work with the way how irq remapping
works on x86 because at least Intel/DMAR can have more than one DMAR
unit on a bus.
So the alternative solution would be to assign the domain per device,
but the current ordering creates a hen and egg problem. Looking the
domain up in pci_set_msi_domain() does not work because at that point
the device is not registered in the IOMMU. That happens from
device_add().
Marc, is there any problem to reorder the calls in pci_device_add():
device_add();
pci_set_msi_domain();
That would allow to add a irq_find_matching_fwspec() based lookup to
pci_msi_get_device_domain().
Though I'm not yet convinced that the outcome would be less horrible
than the hack in the DMAR driver when I'm taking all the other horrors
of x86 (including XEN) into account :)