Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices

From: Tudor Ambarus

Date: Thu Apr 02 2026 - 07:36:25 EST


Hi, Jason,

On 3/23/26 7:31 PM, Jason Gunthorpe wrote:
> On Mon, Mar 23, 2026 at 06:46:39PM +0200, Tudor Ambarus wrote:
>
>> Downstream we have a display controller that's using:
>> iommus = <&sysmmu_19840000>, <&sysmmu_19c40000>;
>>
>> These are 2 distinct platform devices, they probe independently, they
>> each call iommu_device_register() independently.
>
> Sure, I guessed that is what you ment..
>
> Do you have an example of this in an upstream DTS file?

Yes, Exynos multimedia blocks use this upstream. For example, in
arch/arm64/boot/dts/exynos/exynos5433.dtsi, the `decon` and `decon_tv`
nodes route through multiple sysmmus:
iommus = <&sysmmu_decon0x>, <&sysmmu_decon1x>;

Looking at the upstream exynos-iommu.c driver, it doesn't return
-EPROBE_DEFER when not all of the instances listed in iommus exist.

It seems to survive the race, but only because of the core_initcall
ordering. Downstream, the IOMMU driver is forced to be a module, which
exposes this gap.

>
>> If I understood you correctly, the downstream driver shall model its
>> architecture and call iommu_device_register() only once after both
>> devices are configured.
>
> No.. I'm not being so perscriptive, I'm just saying that once
> iommu->ops->probe_device() returns then the device is fully setup and
> dev->iommu will operate all of the iommus described in iommus=<..>
>
> probe_device() cannot return some half setup device with only some of
> the iommu instances working.
>
> We don't have any core idea of a half setup result from
> probe_device() today.
>
>> If the core's intent is to strictly enforce a single IOMMU instance,
>> shouldn't iommu_fwspec_init() be checking
>> fwspec->iommu_fwnode == iommu_fwnode
>> instead of matching the ops? Because the core currently matches on
>> ops, it permits aggregating multiple physical instances with the
>> same ops into one fwspec.
>
> The driver is responsible to handle this, not the core. It has to hide
> this mess under its covers, not rely on multiple calls to of_xlate or
> however it has been hacked up.
>
> Probably it means something like of_xlate/probe_device has to
> EPROBE_DEFER if all the instances listed in iommus don't exist.
>
I can probably track whether all instances are ready, and defer if any
is not ready, but then I'll force the iommu clients to use the sketchy
replay path, which seems like a bad idea, according to Robin's feedback.
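
To make the "track and defer" idea concrete, here is a rough userspace
model of the check, not driver code. struct model_sysmmu and
model_client_probe() are made-up names for illustration, and
EPROBE_DEFER is redefined locally with the kernel's value (517) because
userspace headers don't provide it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Kernel-internal errno value from include/linux/errno.h, redefined
 * here because userspace <errno.h> does not provide it.
 */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for one sysmmu instance listed in iommus=<...>. */
struct model_sysmmu {
	bool registered;	/* set once this instance's own probe finished */
};

/*
 * Hypothetical helper: succeed only when every instance the client
 * references is registered, otherwise ask for the probe to be retried.
 */
int model_client_probe(const struct model_sysmmu *const *refs, size_t n)
{
	assert(refs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!refs[i]->registered)
			return -EPROBE_DEFER;
	}
	return 0;
}
```

In this model a client with two references keeps getting -EPROBE_DEFER
until both instances are marked registered, which is exactly what pushes
it onto the replay path.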

I haven't seen functional problems with the races, just the "something
fishy" dev_WARN. Maybe we should downgrade that to dev_info.

Thanks!
ta