Re: arm64 iommu groups issue

From: John Garry
Date: Thu Sep 19 2019 - 10:35:29 EST


On 19/09/2019 14:25, Robin Murphy wrote:
When the port eventually probes it gets a new, separate group.

This all seems to be as the built-in module init ordering is as
follows: pcieport drv, smmu drv, mlx5 drv

I notice that if I build the mlx5 drv as a ko and insert after boot,
all functions + pcieport are in the same group:

[ 11.530046] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
[ 17.301093] hns_dsaf HISI00B2:00: Adding to iommu group 1
[ 18.743600] ehci-platform PNP0D20:00: Adding to iommu group 2
[ 20.212284] pcieport 0002:f8:00.0: Adding to iommu group 3
[ 20.356303] pcieport 0004:88:00.0: Adding to iommu group 4
[ 20.493337] pcieport 0005:78:00.0: Adding to iommu group 5
[ 20.702999] pcieport 000a:10:00.0: Adding to iommu group 6
[ 20.859183] pcieport 000c:20:00.0: Adding to iommu group 7
[ 20.996140] pcieport 000d:30:00.0: Adding to iommu group 8
[ 21.152637] serial 0002:f9:00.0: Adding to iommu group 3
[ 21.346991] serial 0002:f9:00.1: Adding to iommu group 3
[ 100.754306] mlx5_core 000a:11:00.0: Adding to iommu group 6
[ 101.420156] mlx5_core 000a:11:00.1: Adding to iommu group 6
[ 292.481714] mlx5_core 000a:11:00.2: Adding to iommu group 6
[ 293.281061] mlx5_core 000a:11:00.3: Adding to iommu group 6

This does seem like a problem for arm64 platforms which don't support
ACS, yet enable an SMMU. Maybe also a problem even if they do support
ACS.

Opinion?


Hi Robin,

Yeah, this is less than ideal.

For sure. Our production D05 boards don't ship with the SMMU enabled in BIOS, but it would be slightly concerning in this regard if they did.

> One way to bodge it might be to make
> pci_device_group() also walk downwards to see if any non-ACS-isolated
> children already have a group, rather than assuming that groups get
> allocated in hierarchical order, but that's far from ideal.

Agree.
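
For illustration only, here is a rough userspace sketch of the "walk downwards too" idea quoted above. It is not kernel code and not a patch against pci_device_group(): struct toy_dev, the toy_* helpers and the acs_isolated flag are all made up for the example, and ACS is reduced to a single per-device boolean. The sketch climbs to the top of the region that is not split by ACS, then searches that whole region (so both above and below the device) for an existing group before allocating a new one.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 8
#define TOY_NO_GROUP (-1)

/* Hypothetical stand-in for a PCI device; NOT the kernel's struct pci_dev. */
struct toy_dev {
	struct toy_dev *parent;
	struct toy_dev *children[TOY_MAX_CHILDREN];
	size_t nr_children;
	int acs_isolated;	/* nonzero: ACS isolates this device from its parent */
	int group;		/* TOY_NO_GROUP until a group is assigned */
};

static int toy_next_group;	/* next unused group number */

void toy_dev_init(struct toy_dev *dev, int acs_isolated)
{
	assert(dev != NULL);
	dev->parent = NULL;
	dev->nr_children = 0;
	dev->acs_isolated = acs_isolated;
	dev->group = TOY_NO_GROUP;
}

void toy_add_child(struct toy_dev *parent, struct toy_dev *child)
{
	assert(parent->nr_children < TOY_MAX_CHILDREN);
	child->parent = parent;
	parent->children[parent->nr_children++] = child;
}

/* Return any group already assigned in dev's subtree, never crossing an
 * ACS boundary on the way down. */
int toy_find_group_in_subtree(const struct toy_dev *dev)
{
	size_t i;

	if (dev->group != TOY_NO_GROUP)
		return dev->group;

	for (i = 0; i < dev->nr_children; i++) {
		int g;

		if (dev->children[i]->acs_isolated)
			continue;
		g = toy_find_group_in_subtree(dev->children[i]);
		if (g != TOY_NO_GROUP)
			return g;
	}
	return TOY_NO_GROUP;
}

/* Assign a group to dev, reusing one found anywhere in the same
 * non-isolated region regardless of the order devices were handled in. */
int toy_device_group(struct toy_dev *dev)
{
	struct toy_dev *top = dev;
	int g;

	/* Climb to the topmost ancestor not separated from dev by ACS. */
	while (!top->acs_isolated && top->parent)
		top = top->parent;

	g = toy_find_group_in_subtree(top);
	if (g == TOY_NO_GROUP)
		g = toy_next_group++;

	dev->group = g;
	return g;
}
```

With this model, calling toy_device_group() on the two functions first and on their port last still leaves all three in one group, which is the out-of-order case from the log. The obvious cost is a subtree search on every allocation.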

My own workaround was to hack the mentioned iort code to defer the PF probe if the parent port had also yet to probe.
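
To make the ordering dependency concrete, here is a toy userspace model of that kind of deferral. It is not the actual iort change: struct toy_pdev, toy_probe(), toy_probe_all() and the TOY_PROBE_DEFER code are invented for this sketch, and group allocation is reduced to "inherit the parent's group if it has one". A device whose upstream port has not probed yet is pushed back and retried on a later pass.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PDEV_NO_GROUP (-1)
#define TOY_PROBE_DEFER 1	/* hypothetical "try again later" code */

/* Hypothetical device model; not a kernel structure. */
struct toy_pdev {
	struct toy_pdev *parent;	/* upstream port, or NULL */
	int probed;
	int group;
};

static int toy_group_counter;	/* next unused group number */

/* Probe one device. Allocation is upward-only: inherit the parent's
 * group if there is one, otherwise take a fresh group. */
int toy_probe(struct toy_pdev *dev)
{
	assert(dev != NULL);

	/* The workaround: do not probe before the parent port has. */
	if (dev->parent && !dev->parent->probed)
		return -TOY_PROBE_DEFER;

	if (dev->parent && dev->parent->group != TOY_PDEV_NO_GROUP)
		dev->group = dev->parent->group;
	else
		dev->group = toy_group_counter++;

	dev->probed = 1;
	return 0;
}

/* Retry deferred devices until a pass makes no progress.
 * Returns the number of devices still unprobed. */
size_t toy_probe_all(struct toy_pdev **devs, size_t n)
{
	size_t pending = n, before, i;

	do {
		before = pending;
		pending = 0;
		for (i = 0; i < n; i++) {
			if (!devs[i]->probed && toy_probe(devs[i]) != 0)
				pending++;
		}
	} while (pending && pending < before);

	return pending;
}
```

If the list is ordered with the two functions ahead of their port, the first pass defers both functions and probes the port, and the second pass puts the functions into the port's group.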


The underlying issue is that, for historical reasons, OF/IORT-based
IOMMU drivers have ended up with group allocation being tied to endpoint
driver probing via the dma_configure() mechanism (long story short,
driver probe is the only thing which can be delayed in order to wait for
a specific IOMMU instance to be ready). However, in the meantime, the
IOMMU API internals have evolved sufficiently that I think there's a way
to really put things right - I have the spark of an idea which I'll try
to sketch out ASAP...


OK, great.

Thanks,
John

Robin.