Hi Robin,
On Fri, Feb 28, 2025 at 03:46:33PM +0000, Robin Murphy wrote:
In hindsight, there were some crucial subtleties overlooked when moving
{of,acpi}_dma_configure() to driver probe time to allow waiting for
IOMMU drivers with -EPROBE_DEFER, and these have become an
ever-increasing source of problems. The IOMMU API has some fundamental
assumptions that iommu_probe_device() is called for every device added
to the system, in the order in which they are added. Calling it in a
random order or not at all dependent on driver binding leads to
malformed groups, a potential lack of isolation for devices with no
driver, and all manner of unexpected concurrency and race conditions.
We've attempted to mitigate the latter with point-fix bodges like
iommu_probe_device_lock, but it's a losing battle and the time has come
to bite the bullet and address the true source of the problem instead.
@@ -426,6 +438,12 @@ static int iommu_init_device(struct device *dev)
ret = -ENODEV;
goto err_free;
}
+ /*
+ * And if we do now see any replay calls, they would indicate someone
+ * misusing the dma_configure path outside bus code.
+ */
+ if (dev->driver)
+ dev_WARN(dev, "late IOMMU probe at driver bind, something fishy here!\n");
if (!try_module_get(ops->owner)) {
ret = -EINVAL;
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e10a68b5ffde..6b989a62def2 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -155,7 +155,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
dev_iommu_free(dev);
mutex_unlock(&iommu_probe_device_lock);
- if (!err && dev->bus)
+ /*
+ * If we're not on the iommu_probe_device() path (as indicated by the
+ * initial dev->iommu) then try to simulate it. This should no longer
+ * happen unless of_dma_configure() is being misused outside bus code.
+ */
This assumption does not hold as there is nothing preventing iommu
driver probe from racing with a client driver probe.
+ if (!err && dev->bus && !dev_iommu_present)
err = iommu_probe_device(dev);
if (err && err != -EPROBE_DEFER)
I hit the (now moved) dev_WARN() on the ThinkPad T14s where the GPU SMMU
is probed late due to a clock dependency and can end up probing in
parallel with the GPU driver.
[ 3.805282] arm-smmu 3da0000.iommu: probing hardware configuration...
[ 3.806007] arm-smmu 3da0000.iommu: SMMUv2 with:
[ 3.806843] arm-smmu 3da0000.iommu: stage 1 translation
[ 3.807562] arm-smmu 3da0000.iommu: coherent table walk
[ 3.808253] arm-smmu 3da0000.iommu: stream matching with 24 register groups
[ 3.808957] arm-smmu 3da0000.iommu: 22 context banks (0 stage-2 only)
[ 3.809651] arm-smmu 3da0000.iommu: Supported page sizes: 0x61311000
[ 3.810339] arm-smmu 3da0000.iommu: Stage-1: 48-bit VA -> 40-bit IPA
[ 3.811130] arm-smmu 3da0000.iommu: preserved 0 boot mappings
[ 3.829042] platform 3d6a000.gmu: Adding to iommu group 8
[ 3.992050] ------------[ cut here ]------------
[ 3.993045] adreno 3d00000.gpu: late IOMMU probe at driver bind, something fishy here!
[ 3.994058] WARNING: CPU: 9 PID: 343 at drivers/iommu/iommu.c:579 __iommu_probe_device+0x2b0/0x4ac
[ 4.003272] CPU: 9 UID: 0 PID: 343 Comm: kworker/u50:2 Not tainted 6.15.0-rc1 #109 PREEMPT
[ 4.003276] Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET83W (2.13 ) 10/04/2024
[ 4.025943] Call trace:
[ 4.025945] __iommu_probe_device+0x2b0/0x4ac (P)
[ 4.030453] iommu_probe_device+0x38/0x7c
[ 4.030455] of_iommu_configure+0x188/0x26c
[ 4.030457] of_dma_configure_id+0xcc/0x300
[ 4.030460] platform_dma_configure+0x74/0xac
[ 4.030462] really_probe+0x74/0x38c
[ 4.030464] __driver_probe_device+0x7c/0x160
[ 4.030465] driver_probe_device+0x40/0x110
[ 4.030467] __device_attach_driver+0xbc/0x158
[ 4.030468] bus_for_each_drv+0x84/0xe0
[ 4.030470] __device_attach+0xa8/0x1d4
[ 4.030472] device_initial_probe+0x14/0x20
[ 4.030473] bus_probe_device+0xb0/0xb4
[ 4.030476] deferred_probe_work_func+0xa0/0xf4
[ 4.030501] ---[ end trace 0000000000000000 ]---
[ 4.031269] adreno 3d00000.gpu: Adding to iommu group 9
Johan