[PATCH rc v8 7/8] iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset
From: Nicolin Chen
Date: Fri Apr 24 2026 - 21:16:27 EST
In __iommu_group_set_domain_internal(), concurrent domain attachments are
rejected with -EBUSY when any device in the group is recovering. This is
necessary to fence concurrent attachments to a multi-device group where
devices might share the same RID due to PCI DMA alias quirks, but the
rejection triggers the WARN_ON in __iommu_group_set_domain_nofail().

Other IOMMU_SET_DOMAIN_MUST_SUCCEED callers in detach/teardown paths, such
as __iommu_group_set_core_domain() and __iommu_release_dma_ownership(),
should not be rejected either. These nofail paths free the domain anyway,
so a rejected call leaves group->domain pointing to it, and
pci_dev_reset_iommu_done() could then trigger a UAF when re-attaching
group->domain.

Honor the IOMMU_SET_DOMAIN_MUST_SUCCEED flag by letting such callers
through the group->recovery_cnt fence, so that the group->domain pointer
gets updated. In place of the fence, add a gdev->blocked check in the
device iteration loop, to prevent any concurrent per-device detachment of
a device that is under recovery.
Fixes: c279e83953d9 ("iommu: Introduce pci_dev_reset_iommu_prepare/done()")
Cc: stable@xxxxxxxxxxxxxxx
Closes: https://sashiko.dev/#/patchset/20260407194644.171304-1-nicolinc%40nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
Reviewed-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
---
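For reviewers, the resulting control flow can be sketched as a small
userspace model. This is only an illustration under simplifying
assumptions, not the kernel code: the model_* types and function and the
MODEL_SET_DOMAIN_MUST_SUCCEED flag are hypothetical stand-ins, domains are
plain integers, and the per-device attach is assumed to always succeed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IOMMU_SET_DOMAIN_MUST_SUCCEED. */
#define MODEL_SET_DOMAIN_MUST_SUCCEED (1u << 0)

/* Simplified per-device state: "blocked" mirrors gdev->blocked. */
struct model_gdev {
	bool blocked;	/* device is under recovery */
	int attached;	/* id of the domain the device is attached to */
};

/* Simplified group state: mirrors recovery_cnt and group->domain. */
struct model_group {
	unsigned int recovery_cnt;
	int domain;
	struct model_gdev *devs;
	size_t ndevs;
};

static int model_set_domain(struct model_group *group, int new_domain,
			    unsigned int flags)
{
	size_t i;

	/* Fence ordinary attachments during recovery; let nofail through. */
	if (group->recovery_cnt && !(flags & MODEL_SET_DOMAIN_MUST_SUCCEED))
		return -EBUSY;

	for (i = 0; i < group->ndevs; i++) {
		/* Leave a recovering device on its blocking domain. */
		if (group->devs[i].blocked)
			continue;
		group->devs[i].attached = new_domain;
	}

	/* The group pointer is updated even if some devices were skipped. */
	group->domain = new_domain;
	return 0;
}
```

In this model, a call without the flag fails with -EBUSY while
recovery_cnt is non-zero, whereas a call with the flag updates the group
domain and every non-blocked device, and leaves blocked devices alone.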
drivers/iommu/iommu.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d0f32bd954a72..f21d352a67f70 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2469,9 +2469,10 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
/*
* This is a concurrent attach during device recovery. Reject it until
- * pci_dev_reset_iommu_done() attaches the device to group->domain.
+ * pci_dev_reset_iommu_done() attaches the device to group->domain, if
+ * IOMMU_SET_DOMAIN_MUST_SUCCEED is not set.
*/
- if (group->recovery_cnt)
+ if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED))
return -EBUSY;
/*
@@ -2482,6 +2483,13 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
*/
result = 0;
for_each_group_device(group, gdev) {
+ /*
+ * Device under recovery is attached to group->blocking_domain.
+ * Don't change that. pci_dev_reset_iommu_done() will re-attach
+ * its domain to the updated group->domain, after the recovery.
+ */
+ if (gdev->blocked)
+ continue;
ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
group->domain, flags);
if (ret) {
--
2.43.0