[PATCH rc v2] iommu: Fix nested pci_dev_reset_iommu_prepare/done()

From: Nicolin Chen

Date: Thu Mar 19 2026 - 00:35:00 EST


Shuai found that cxl_reset_bus_function() calls pci_reset_bus_function()
internally, and that both call pci_dev_reset_iommu_prepare/done().

As pci_dev_reset_iommu_prepare() doesn't support re-entry, the inner call
triggers a WARN_ON and returns -EBUSY, which fails the entire device
reset.

On the other hand, removing the outer calls in the PCI callers is unsafe.
As pointed out by Kevin, device-specific quirks like reset_hinic_vf_dev()
execute custom firmware waits after their inner pcie_flr() completes. If
the IOMMU protection relies solely on the inner reset, the IOMMU will be
unblocked prematurely while the device is still resetting.

Instead, fix this by making pci_dev_reset_iommu_prepare/done() reentrant.

Introduce a 'reset_cnt' in struct iommu_group. Increment it on each nested
prepare() call and decrement it on each done() call, so that the IOMMU
domains are restored only when the outermost reset completes.

Fixes: c279e83953d9 ("iommu: Introduce pci_dev_reset_iommu_prepare/done()")
Cc: stable@xxxxxxxxxxxxxxx
Reported-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
Closes: https://lore.kernel.org/all/absKsk7qQOwzhpzv@Asurada-Nvidia/
Suggested-by: Kevin Tian <kevin.tian@xxxxxxxxx>
Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
---
Changelog
v2:
* Fix in the helpers by allowing re-entry
v1:
https://lore.kernel.org/all/20260318220028.1146905-1-nicolinc@xxxxxxxxxx/
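
Illustration only, not part of the patch: a minimal user-space sketch of
the nesting behavior the counter is meant to provide. The struct and the
model_*() helpers are hypothetical stand-ins, and 'blocked' merely models
group->resetting_domain being non-NULL. Locking and domain attachment are
left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant fields of struct iommu_group. */
struct group_model {
	bool blocked;		/* models group->resetting_domain != NULL */
	unsigned int reset_cnt;	/* nesting depth of prepare() calls */
};

/* Models pci_dev_reset_iommu_prepare(): only the outermost call blocks. */
static int model_prepare(struct group_model *g)
{
	if (g->blocked) {
		g->reset_cnt++;
		return 0;
	}
	g->blocked = true;
	g->reset_cnt = 1;
	return 0;
}

/* Models pci_dev_reset_iommu_done(): only the outermost call unblocks. */
static void model_done(struct group_model *g)
{
	if (!g->blocked)
		return;
	if (g->reset_cnt == 0)	/* unbalanced done(); the patch WARNs here */
		return;
	if (--g->reset_cnt > 0)
		return;
	g->blocked = false;
}

/*
 * Nested sequence as in cxl_reset_bus_function() calling
 * pci_reset_bus_function(). Returns whether the group was still
 * blocked after the inner done().
 */
static bool model_nested_reset_stays_blocked(void)
{
	struct group_model g = { 0 };
	bool blocked_after_inner;

	model_prepare(&g);	/* outer prepare */
	model_prepare(&g);	/* inner prepare, no longer -EBUSY */
	model_done(&g);		/* inner done */
	blocked_after_inner = g.blocked;
	model_done(&g);		/* outer done */
	assert(!g.blocked && g.reset_cnt == 0);
	return blocked_after_inner;
}
```

In this model the group stays blocked between the inner and the outer
done(), which is the window in which a quirk such as
reset_hinic_vf_dev() performs its firmware wait.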

drivers/iommu/iommu.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 35db51780954..16155097b27c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -68,6 +68,7 @@ struct iommu_group {
 	struct iommu_domain *resetting_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	unsigned int reset_cnt;
 	unsigned int owner_cnt;
 	void *owner;
 };
@@ -3961,9 +3962,10 @@ int pci_dev_reset_iommu_prepare(struct pci_dev *pdev)

 	guard(mutex)(&group->mutex);

-	/* Re-entry is not allowed */
-	if (WARN_ON(group->resetting_domain))
-		return -EBUSY;
+	if (group->resetting_domain) {
+		group->reset_cnt++;
+		return 0;
+	}

 	ret = __iommu_group_alloc_blocking_domain(group);
 	if (ret)
@@ -3988,6 +3990,7 @@ int pci_dev_reset_iommu_prepare(struct pci_dev *pdev)
pasid_array_entry_to_domain(entry));

 	group->resetting_domain = group->blocking_domain;
+	group->reset_cnt = 1;
 	return ret;
 }
 EXPORT_SYMBOL_GPL(pci_dev_reset_iommu_prepare);
@@ -4021,6 +4024,12 @@ void pci_dev_reset_iommu_done(struct pci_dev *pdev)
 	if (!group->resetting_domain)
 		return;

+	/* Unbalanced done() calls that would underflow the counter */
+	if (WARN_ON(group->reset_cnt == 0))
+		return;
+	if (--group->reset_cnt > 0)
+		return;
+
 	/* pci_dev_reset_iommu_prepare() was not successfully called */
 	if (WARN_ON(!group->blocking_domain))
 		return;
--
2.34.1