Re: [PATCH] of/device: add blacklist for iommu dma_ops

From: Robin Murphy
Date: Mon Jun 03 2019 - 07:18:22 EST

On 03/06/2019 11:47, Rob Clark wrote:
On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:

On Mon, Jun 3, 2019 at 4:40 AM Rob Clark <robdclark@xxxxxxxxx> wrote:

On Fri, May 10, 2019 at 7:35 AM Rob Clark <robdclark@xxxxxxxxx> wrote:

On Tue, Dec 4, 2018 at 2:29 PM Rob Herring <robh+dt@xxxxxxxxxx> wrote:

On Sat, Dec 1, 2018 at 10:54 AM Rob Clark <robdclark@xxxxxxxxx> wrote:

This solves a problem we see with drm/msm, caused by getting
iommu_dma_ops while we attach our own domain and manage it directly at
the iommu API level:

[0000000000000038] user address but active_mm is swapper
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: G W 4.19.3 #90
Hardware name: xxx (DT)
Workqueue: events deferred_probe_work_func
pstate: 80c00009 (Nzcv daif +PAN +UAO)
pc : iommu_dma_map_sg+0x7c/0x2c8
lr : iommu_dma_map_sg+0x40/0x2c8
sp : ffffff80095eb4f0
x29: ffffff80095eb4f0 x28: 0000000000000000
x27: ffffffc0f9431578 x26: 0000000000000000
x25: 00000000ffffffff x24: 0000000000000003
x23: 0000000000000001 x22: ffffffc0fa9ac010
x21: 0000000000000000 x20: ffffffc0fab40980
x19: ffffffc0fab40980 x18: 0000000000000003
x17: 00000000000001c4 x16: 0000000000000007
x15: 000000000000000e x14: ffffffffffffffff
x13: ffff000000000000 x12: 0000000000000028
x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
x9 : 0000000000000000 x8 : ffffffc0fab409a0
x7 : 0000000000000000 x6 : 0000000000000002
x5 : 0000000100000000 x4 : 0000000000000000
x3 : 0000000000000001 x2 : 0000000000000002
x1 : ffffffc0f9431578 x0 : 0000000000000000
Process kworker/7:1 (pid: 70, stack limit = 0x0000000017d08ffb)
Call trace:
Code: d2800004 91000725 6b17039f 5400048a (f9401f40)
---[ end trace f22dda57f3648e2c ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x0,22802a18
Memory Limit: none

The problem is that when drm/msm does it's own iommu_attach_device(),
now the domain returned by iommu_get_domain_for_dev() is drm/msm's
domain, and it doesn't have domain->iova_cookie.

We kind of avoided this problem prior to sdm845/dpu because the iommu
was attached to the mdp node in dt, which is a child of the toplevel
mdss node (which corresponds to the dev passed in dma_map_sg()). But
with sdm845, now the iommu is attached at the mdss level so we hit the
iommu_dma_ops in dma_map_sg().

But auto allocating/attaching a domain before the driver is probed was
already a blocking problem for enabling per-context pagetables for the
GPU. This problem is also now solved with this patch.

Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in of_dma_configure
Tested-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
Signed-off-by: Rob Clark <robdclark@xxxxxxxxx>
This is an alternative/replacement for [1]. What it lacks in elegance
it makes up for in practicality ;-)


drivers/of/device.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index 5957cd4fa262..15ffee00fb22 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
return device_add(&ofdev->dev);

+static const struct of_device_id iommu_blacklist[] = {
+ { .compatible = "qcom,mdp4" },
+ { .compatible = "qcom,mdss" },
+ { .compatible = "qcom,sdm845-mdss" },
+ { .compatible = "qcom,adreno" },
+ {}

Not completely clear to whether this is still needed or not, but this
really won't scale. Why can't the driver for these devices override
whatever has been setup by default?

fwiw, at the moment it is not needed, but it will become needed again
to implement per-context pagetables (although I suppose for this we
only need to blacklist qcom,adreno and not also the display nodes).

So, another case I've come across, on the display side.. I'm working
on handling the case where bootloader enables display (and takes iommu
out of reset).. as soon as DMA domain gets attached we get iommu
faults, because bootloader has already configured display for scanout.
Unfortunately this all happens before actual driver is probed and has
a chance to intervene.

It's rather unfortunate that we tried to be clever rather than just
making drivers call some function to opt-in to the hookup of dma iommu
ops :-(

I think it still works for the 90% of cases and if 10% needs some
explicit work in the drivers, that's better than requiring 100% of the
drivers to do things manually.

Right, it's not about "being clever", it's about not adding opt-in code to the hundreds and hundreds and hundreds of drivers which *might* ever find themselves used on a system where they would need the IOMMU's help for DMA.

Adding Marek who had the same problem on Exynos.

I do wonder how many drivers need to iommu_map in their ->probe()?
I'm thinking moving the auto-hookup to after a successful probe(),
with some function a driver could call if they need mapping in probe,
might be a way to eventually get rid of the blacklist. But I've no
idea how to find the subset of drivers that would be broken without a
dma_setup_iommu_stuff() call in their probe.

Wouldn't help much. That particular issue is nothing to do with DMA ops really, it's about IOMMU initialisation. On something like SMMUv3 there is literally no way to turn the thing on without blocking unknown traffic - it *has* to have stream table entries programmed before it can even allow stuff to bypass.

The answer is either to either pay attention to the "Quiesce all DMA capable devices" part of the boot protocol (which has been there since pretty much forever), or to come up with some robust way of communicating "live" boot-time mappings to IOMMU drivers so that they can program themselves appropriately during probe.