Re: [PATCH 00/12] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF

From: Zhangfei Gao
Date: Tue Feb 18 2025 - 01:14:11 EST


On Tue, 18 Feb 2025 at 11:00, Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
>
> On 2/15/25 19:35, Zhangfei Gao wrote:
> > On Sat, 15 Feb 2025 at 18:09, Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
> >> On 2/15/25 16:11, Zhangfei Gao wrote:
> >>> It is not related to multiple devices; it also happens with a single
> >>> device when a user page fault triggers.
> >>>
> >>> iopf_queue_remove_device is called.
> >>> rcu_assign_pointer(param->fault_param, NULL);
> >>>
> >>> call trace
> >>> [ 304.961312] Call trace:
> >>> [ 304.961314] show_stack+0x20/0x38 (C)
> >>> [ 304.961319] dump_stack_lvl+0xc0/0xd0
> >>> [ 304.961324] dump_stack+0x18/0x28
> >>> [ 304.961327] iopf_queue_remove_device+0xb0/0x1f0
> >>> [ 304.961331] arm_smmu_remove_master_domain+0x204/0x250
> >>> [ 304.961336] arm_smmu_attach_commit+0x64/0x100
> >>> [ 304.961338] arm_smmu_attach_dev_nested+0x104/0x1a8
> >>> [ 304.961340] __iommu_attach_device+0x2c/0x110
> >>> [ 304.961343] __iommu_device_set_domain.isra.0+0x78/0xe0
> >>> [ 304.961345] __iommu_group_set_domain_internal+0x78/0x160
> >>> [ 304.961347] iommu_replace_group_handle+0x9c/0x150
> >>> [ 304.961350] iommufd_fault_domain_replace_dev+0x88/0x120
> >>> [ 304.961353] iommufd_device_do_replace+0x190/0x3c0
> >>> [ 304.961355] iommufd_device_change_pt+0x270/0x688
> >>> [ 304.961357] iommufd_device_replace+0x20/0x38
> >>> [ 304.961359] vfio_iommufd_physical_attach_ioas+0x30/0x78
> >>> [ 304.961363] vfio_df_ioctl_attach_pt+0xa8/0x188
> >>> [ 304.961366] vfio_device_fops_unl_ioctl+0x310/0x990
> >>>
> >>>
> >>> When page fault triggers:
> >>>
> >>> [ 1016.383578] ------------[ cut here ]-----------
> >>> [ 1016.388184] WARNING: CPU: 35 PID: 717 at
> >>> drivers/iommu/io-pgfault.c:231 iommu_report_device_fault+0x2c8/0x470
> >> It's likely that iopf_queue_add_device() was not called for this device.
> > iopf_queue_add_device is called, but quickly iopf_queue_remove_device
> > is called during guest bootup.
> > Then fault_param is set to NULL.
> >
> > arm_smmu_attach_commit
> >   arm_smmu_remove_master_domain
> >     // newly added in the first patch
> >     if (master_domain) {
> >         if (master_domain->using_iopf)
>
> It seems the above check is incorrect. We only need to disable iopf when
> an iopf-capable domain is about to be removed. Will the following
> additional change make any difference?
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 28e67a9e3861..9b9ef738d070 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2822,7 +2822,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
>  	if (master_domain) {
> -		if (master_domain->using_iopf)
> +		if (domain->iopf_handler)
>  			arm_smmu_disable_iopf(master);
>  		kfree(master_domain);
>  	}

Yes, good idea. Checking domain->iopf_handler instead solves the issue.
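
For anyone following along, here is a rough userspace model of the check
suggested above. It is only a sketch and not the driver code: the toy_*
names are made up, the refcount is an assumption of the model, and the
order (enable for the incoming domain first, then remove the outgoing one)
is assumed from the add-then-remove sequence described earlier in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct toy_domain {
	bool has_iopf_handler;	/* models domain->iopf_handler != NULL */
};

struct toy_master {
	int iopf_refcount;	/* assumption: > 0 models fault_param != NULL */
};

/* Models the effect of iopf_queue_add_device(): faults can be reported. */
void toy_enable_iopf(struct toy_master *m)
{
	m->iopf_refcount++;
}

/* Models the effect of iopf_queue_remove_device(): fault_param goes away. */
void toy_disable_iopf(struct toy_master *m)
{
	if (m->iopf_refcount > 0)
		m->iopf_refcount--;
}

/* The suggested check: disable only if the outgoing domain handles faults. */
void toy_remove_master_domain(struct toy_master *m,
			      const struct toy_domain *outgoing)
{
	if (outgoing->has_iopf_handler)
		toy_disable_iopf(m);
}

/* Models a domain replace: set up the incoming domain, drop the outgoing. */
void toy_replace(struct toy_master *m, const struct toy_domain *outgoing,
		 const struct toy_domain *incoming)
{
	assert(m != NULL && outgoing != NULL && incoming != NULL);
	if (incoming->has_iopf_handler)
		toy_enable_iopf(m);
	toy_remove_master_domain(m, outgoing);
}

/* Models fault reporting: 0 if the fault can be queued, -1 otherwise. */
int toy_report_fault(const struct toy_master *m)
{
	return m->iopf_refcount > 0 ? 0 : -1;
}
```

In this model, replacing a domain without a fault handler by one with a
handler leaves the device able to report faults, because the outgoing
domain no longer triggers the disable path. Replacing in the other
direction turns fault reporting off again.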

Thanks