Re: [PATCH 2/3] iommu/vt-d: Add PCI segment and vendor:device ID to DMAR fault logs

From: Pranjal Shrivastava

Date: Thu May 07 2026 - 15:21:57 EST


On Wed, May 06, 2026 at 03:05:38PM +0000, Yigit Oguz wrote:
> Include the full SSSS:BB:DD.F address with PCI segment and
> vendor:device ID (VVVV:DDDD) in DMAR fault messages. Uses
> iommu->segment for the PCI domain and pci_get_domain_bus_and_slot
> to look up the pci_dev. Falls back to segment:BDF without
> vendor:device if the device is not found.
>
> This brings Intel IOMMU fault logging in line with the ARM SMMUv3
> event decoding, making it easier to identify faulting devices
> (e.g. after FLR) without cross-referencing lspci.
>
> Before:
> DMAR: [DMA Write NO_PASID] Request device [86:00.0] fault addr 0xe0000000
> [fault reason 0x05] PTE Write access is not set
>
> After:
> DMAR: [DMA Write NO_PASID] Request device [0000:86:00.0 8086:1533] fault addr 0xe0000000
> [fault reason 0x05] PTE Write access is not set
>
> Signed-off-by: Yigit Oguz <yigitogu@xxxxxxxxx>
> Signed-off-by: Lilit Janpoladyan <lilitj@xxxxxxxxxx>
> Assisted-by: Claude:claude-4.6-opus
> ---
> drivers/iommu/intel/dmar.c | 33 +++++++++++++++++++++------------
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index d33c119a935e..225fa498d714 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1890,30 +1890,39 @@ static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
> {
> const char *reason;
> int fault_type;
> + u8 bus = source_id >> 8;
> + u8 devfn = source_id & 0xFF;
> + struct pci_dev *pdev;
> + char devid[48];

Why not have a #define for this like you have for AMD and Arm?

>
> reason = dmar_get_fault_reason(fault_reason, &fault_type);
>
> + pdev = pci_get_domain_bus_and_slot(iommu->segment, bus, devfn);

Not an Intel iommu expert, but I have concerns about using
pci_get_domain_bus_and_slot() in this path.

AFAICT, dmar_fault_do_one() runs in IRQ context, and the pci_get_*
family of functions iterates the global PCI klist. It eventually calls
bus_to_subsys(), which takes a plain spin_lock(&bus_kset->list_lock) [1],
which isn't IRQ-safe. The same applies to klist_put() [2], called from
klist_iter_exit().

> + if (pdev) {
> + snprintf(devid, sizeof(devid), "%04x:%02x:%02x.%d %04x:%04x",
> + iommu->segment, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
> + pdev->vendor, pdev->device);
> + pci_dev_put(pdev);

Same here: pci_dev_put() calls put_device(), which might sleep [3] and
hence shouldn't be called in hard IRQ context.

> + } else {
> + snprintf(devid, sizeof(devid), "%04x:%02x:%02x.%d",
> + iommu->segment, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> + }
> +
> if (fault_type == INTR_REMAP) {
> - pr_err("[INTR-REMAP] Request device [%02x:%02x.%d] fault index 0x%llx [fault reason 0x%02x] %s\n",
> - source_id >> 8, PCI_SLOT(source_id & 0xFF),
> - PCI_FUNC(source_id & 0xFF), addr >> 48,
> - fault_reason, reason);
> + pr_err("[INTR-REMAP] Request device [%s] fault index 0x%llx [fault reason 0x%02x] %s\n",
> + devid, addr >> 48, fault_reason, reason);
>
> return 0;
> }
>

[-------------- >8 -------------------]

Thanks,
Praan

[1] https://elixir.bootlin.com/linux/v7.0.1/source/drivers/base/bus.c#L60
[2] https://elixir.bootlin.com/linux/v7.0.1/source/lib/klist.c#L209
[3] https://elixir.bootlin.com/linux/v7.0.1/source/drivers/base/core.c#L3794