Re: [PATCH v4 0/8] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF

From: Zhangfei Gao
Date: Fri Mar 14 2025 - 00:05:13 EST


On Thu, 13 Mar 2025 at 19:37, Zhangfei Gao <zhangfei.gao@xxxxxxxxxx> wrote:
>
> On Thu, 13 Mar 2025 at 19:20, Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
> >
> > On 2025/3/13 18:57, Zhangfei Gao wrote:
> > > On Thu, 13 Mar 2025 at 17:51, Zhangfei Gao<zhangfei.gao@xxxxxxxxxx> wrote:
> > >> Hi, Baolu
> > >>
> > >> On Thu, 13 Mar 2025 at 13:19, Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx> wrote:
> > > >>> The new method for driver fault reporting support relies on the domain
> > > >>> to specify an iopf_handler. The driver should detect this and set up the
> > > >>> HW when fault capable domains are attached.
> > >>>
> > >>> Move SMMUv3 to use this method and have VT-D validate support during
> > >>> attach so that all three fault capable drivers have a no-op FEAT_SVA and
> > >>> _IOPF. Then remove them.
> > >>>
> > >>> This was initiated by Jason. I'm following up to remove FEAT_IOPF and
> > >>> further clean up.
> > >>>
> > >>> The whole series is also available at github:
> > >>> https://github.com/LuBaolu/intel-iommu/commits/iommu_no_feat-v4
> > >> I got an issue on this branch.
> > >>
> > >> Linux 6.14-rc4 + iommu_no_feat-v2
> > >> drivers/pci/quirks.c
> > >> quirk_huawei_pcie_sva will set dma-can-stall first
> > >> arm_smmu_probe_device will check dma-can-stall and set stall_enabled
> > >> accordingly.
> > > In the working branch, arm_smmu_probe_device is called from
> > > pci_bus_add_device, so pci_fixup_device is called first.
> > >
> > > [ 1121.314405] arm_smmu_probe_device+0x48/0x450
> > > [ 1121.314410] __iommu_probe_device+0xc4/0x3c8
> > > [ 1121.314412] iommu_probe_device+0x40/0x90
> > > [ 1121.314414] acpi_dma_configure_id+0xb4/0x100
> > > [ 1121.314417] pci_dma_configure+0xf8/0x108
> > > [ 1121.314421] really_probe+0x78/0x278
> > > [ 1121.314425] __driver_probe_device+0x80/0x140
> > > [ 1121.314427] driver_probe_device+0x48/0x130
> > > [ 1121.314430] __device_attach_driver+0xc0/0x108
> > > [ 1121.314432] bus_for_each_drv+0x8c/0xf8
> > > [ 1121.314435] __device_attach+0x104/0x1a0
> > > [ 1121.314437] device_attach+0x1c/0x30
> > > [ 1121.314440] pci_bus_add_device+0xb8/0x1f0
> > > [ 1121.314442] pci_iov_add_virtfn+0x2ac/0x300
> > > [ 1121.314446] sriov_enable+0x204/0x468
> > > [ 1121.314447] pci_enable_sriov+0x20/0x40
> > >
> > >
> > >> This branch
> > >> arm_smmu_probe_device happens first, when dma-can-stall = 0, so
> > >> stall_enabled =0.
> > >> Then drivers/pci/quirks.c: quirk_xxx happens
> > > In the non-working branch (Linux 6.14-rc6 + iommu_no_feat-v4),
> > > arm_smmu_probe_device is called from pci_device_add, and only then does
> > > pci_bus_add_device -> pci_fixup_device run.
> > >
> > > [ 215.072859] arm_smmu_probe_device+0x48/0x450
> > > [ 215.072871] __iommu_probe_device+0xc0/0x468
> > > [ 215.072875] iommu_probe_device+0x40/0x90
> > > [ 215.072877] iommu_bus_notifier+0x38/0x68
> > > [ 215.072879] notifier_call_chain+0x80/0x148
> > > [ 215.072886] blocking_notifier_call_chain+0x50/0x80
> > > [ 215.072889] bus_notify+0x44/0x68
> > > [ 215.072896] device_add+0x580/0x768
> > > [ 215.072898] pci_device_add+0x1e8/0x568
> > > [ 215.072906] pci_iov_add_virtfn+0x198/0x300
> > > [ 215.072910] sriov_enable+0x204/0x468
> > > [ 215.072912] pci_enable_sriov+0x20/0x40
> > >
> > > pci_iov_add_virtfn:
> > > pci_device_add(virtfn, virtfn->bus);
> > > pci_bus_add_device(virtfn); -> pci_fixup_device(pci_fixup_final, dev);
> >
> > This probably is not caused by this patch series. Can you please have a
> > try with the next branch of iommu tree? Or the latest linux-next tree?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git
>
>
> Ok, will try.
>
> Though I have not yet found which patch caused the problem, I will
> investigate further.
>
> The issue is that the point where arm_smmu_probe_device runs has moved
> from
> pci_bus_add_device(virtfn); -> pci_fixup_device(pci_fixup_final, dev);
> to
> pci_device_add(virtfn, virtfn->bus) -> pci_fixup_device(pci_fixup_header, dev);

Update:

The ordering change is caused by commit
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path").

That commit moves the IOMMU probe earlier, so the quirk has to run earlier
as well:
DECLARE_PCI_FIXUP_FINAL -> DECLARE_PCI_FIXUP_HEADER
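
To make the ordering concrete, here is a rough userspace model of the VF
add path. It is only a sketch: every toy_* name is made up for
illustration and none of this is kernel code. It assumes the probe samples
dma-can-stall exactly once, from inside pci_device_add (via device_add and
the bus notifier, as in the second trace), that header fixups run before
that point, and that final fixups run later in pci_bus_add_device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers are kernel APIs. */
enum toy_fixup_pass { TOY_FIXUP_HEADER, TOY_FIXUP_FINAL };

struct toy_dev {
	bool dma_can_stall;	/* stands in for the "dma-can-stall" property */
	bool stall_enabled;	/* stands in for the SMMU stall_enabled flag */
};

/* Stands in for quirk_huawei_pcie_sva(): sets the property. */
static void toy_quirk(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->dma_can_stall = true;
}

/* Stands in for arm_smmu_probe_device(): samples the property once. */
static void toy_iommu_probe(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->stall_enabled = dev->dma_can_stall;
}

/*
 * Stands in for pci_device_add(): header fixups run first, then
 * device_add(), whose bus notifier triggers the IOMMU probe.
 */
static void toy_pci_device_add(struct toy_dev *dev, enum toy_fixup_pass pass)
{
	if (pass == TOY_FIXUP_HEADER)
		toy_quirk(dev);
	toy_iommu_probe(dev);
}

/* Stands in for pci_bus_add_device(): final fixups run here. */
static void toy_pci_bus_add_device(struct toy_dev *dev, enum toy_fixup_pass pass)
{
	if (pass == TOY_FIXUP_FINAL)
		toy_quirk(dev);
}

/*
 * Stands in for pci_iov_add_virtfn(). Returns the stall_enabled value
 * the probe ended up with when the quirk is registered in the given pass.
 */
static bool toy_add_virtfn(enum toy_fixup_pass pass)
{
	struct toy_dev dev = { false, false };

	toy_pci_device_add(&dev, pass);
	toy_pci_bus_add_device(&dev, pass);
	return dev.stall_enabled;
}
```

In this model toy_add_virtfn(TOY_FIXUP_FINAL) leaves stall_enabled false,
because the quirk runs after the probe has already sampled the property,
while toy_add_virtfn(TOY_FIXUP_HEADER) leaves it true.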

Thanks

>
> And the issue can be solved by
> -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
>
> Thanks