Re: [PATCH v6 4/5] iommu/mediatek: Add mt8173 IOMMU driver

From: Yong Wu
Date: Wed Dec 16 2015 - 00:59:56 EST


On Tue, 2015-12-15 at 12:37 +0000, Robin Murphy wrote:
> On 15/12/15 03:28, Yong Wu wrote:
> > On Mon, 2015-12-14 at 15:16 +0100, Joerg Roedel wrote:
> >> On Tue, Dec 08, 2015 at 05:49:12PM +0800, Yong Wu wrote:
> >>> +static int mtk_iommu_attach_device(struct iommu_domain *domain,
> >>> + struct device *dev)
> >>> +{
> >>> + struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >>> + struct mtk_iommu_client_priv *priv = dev->archdata.iommu;
> >>> + struct mtk_iommu_data *data;
> >>> + int ret;
> >>> +
> >>> + if (!priv)
> >>> + return -ENODEV;
> >>> +
> >>> + data = dev_get_drvdata(priv->m4udev);
> >>> + if (!data) {
> >>> + /*
> >>> + * The DMA core will run earlier than this probe, and it will
> >>> + * create a default iommu domain for each a iommu device.
> >>> + * But here there is only one domain called the m4u domain
> >>> + * which all the multimedia HW share.
> >>> + * The default domain isn't needed here.
> >>> + */
> >>
> >> The iommu core creates one domain per iommu-group. In your case this
> >> means one default domain per iommu in the system.
> >
> > Yes. The iommu core will create one domain per iommu-group.
> > see the next "if" here.
> >
> > But the domain here is created by the current DMA64. It's from this
> > function do_iommu_attach which will be called too early and will help
> > create a default domain for each a iommu device.(my codebase is
> > v4.4-rc1).
>
> I still don't really understand the problem here. The M4U device is
> created early in mtk_iommu_init_fn, so it should be probed and ready to
> go long before anything else. Indeed, for your mtk_iommu_device_group()
> callback to work then everything must already be happening in the right
> order - add_device will only call iommu_group_get_for_dev() if of_xlate
> has populated dev->archdata.iommu, and of_xlate will only do that if the
> M4U was up and running before the client device started probing.
> Furthermore, if mtk_iommu_device_group() *does* work, then the IOMMU
> core will go ahead and allocate the default domain there and then, which
> the arch code should find and use later.

Thanks. This is very helpful.

I understand your confuse right now and your expectant flow.

Our IOMMU probe was PROBE_DEFER by our SMI device, so currently it probe
was delayed, then have to add the workaround code.

Following your comment above, I test as below. Then the flows seems meet
the "best case" that the iommu core will help create default DMA domain.

@@ -664,19 +636,41 @@ static int mtk_iommu_probe(struct platform_device
*pdev)
for (i = 0; i < larb_nr; i++) {
struct device_node *larbnode;
struct platform_device *plarbdev;

larbnode = of_parse_phandle(dev->of_node, "mediatek,larbs", i);
if (!larbnode)
return -EINVAL;

plarbdev = of_find_device_by_node(larbnode);
of_node_put(larbnode);
- if (!plarbdev)
- return -EPROBE_DEFER;
+ if (!plarbdev) {
+ plarbdev = of_platform_device_create(larbnode,
NULL, platform_bus_type.dev_root);
+ if (IS_ERR(pdev))
+ return -EPROBE_DEFER;
+ }
}

I only add of_platform_device_create for the SMI local arbiter devices
here.

This is a big improvement for us. If this is ok, I will send a quick
next version for this.

>
> The potential issue I *do* see, looking more closely, is that
> iommu_group_get_for_dev() is setting group->domain but not calling the
> attach_dev callback, which looks wrong...

This is the backtrace,

(151216_09:58:05.207)Call trace:
(151216_09:58:05.207)[<ffffffc000400668>] mtk_iommu_attach_device
+0xb8/0x178
(151216_09:58:05.207)[<ffffffc0003fc55c>] iommu_group_add_device
+0x1d8/0x31c
(151216_09:58:05.207)[<ffffffc0003fc988>] iommu_group_get_for_dev
+0x88/0x108
(151216_09:58:05.207)[<ffffffc0003ffcfc>] mtk_iommu_add_device+0x14/0x34
(151216_09:58:05.207)[<ffffffc0003fb280>] add_iommu_group+0x20/0x44
(151216_09:58:05.207)[<ffffffc000406cec>] bus_for_each_dev+0x58/0x98
(151216_09:58:05.207)[<ffffffc0003fbe8c>] bus_set_iommu+0x9c/0xf8

If I change like above, I will delete the workaround code..

>
> >
> > //=====the next "if"===========
> > } else if (!data->m4u_dom) {
> > /*
> > * While a device is added into a iommu group, the iommu core
> > * will create a default domain for each a iommu group.
> > * This default domain is reserved as the m4u domain and is
> > * initiated here.
> > */
> > data->m4u_dom = dom;
> > if (domain->type == IOMMU_DOMAIN_DMA) {
> > ret = iommu_dma_init_domain(domain, 0,
> > DMA_BIT_MASK(32));
> > if (ret)
> > goto err_uninit_dom;
> > }
> >
> > ret = mtk_iommu_domain_finalise(data);
> > if (ret)
> > goto err_uninit_dom;
> > }
> > //======================
> >
> >>
> >>> + iommu_domain_free(domain);
> >>
> >> This function is not supposed to free the domain passed to it.
> >
> > As above this domain is created in the do_iommu_attach which will help
> > create a default domain for each a iommu device.
> > We don't need this default domain!
> >
> > If we don't free it here, there will be a memory leak.
> >
> > From Robin's comment, He will improve the sequence of the
> > __iommu_setup_dma_ops in the future.
>
> That already happened. The final version of the arm64 code which was
> merged makes sure that the IOMMU driver always sees the callbacks in the
> desired of_xlate -> add_device -> attach_dev order. The whole point of
> the comment below is that the driver itself *doesn't* have to care about
> the awkward way in which that is currently achieved.
>
> > /*
> > * TODO: Right now __iommu_setup_dma_ops() gets called too early to do
> > * everything it needs to - the device is only partially created and the
> > * IOMMU driver hasn't seen it yet, so it can't have a group. Thus we
> > * need this delayed attachment dance. Once IOMMU probe ordering is
> > sorted
> > * to move the arch_setup_dma_ops() call later, all the notifier bits
> > below
> > * become unnecessary, and will go away.
> > */
> >
> > /*
> > * Best case: The device is either part of a group which was
> > * already attached to a domain in a previous call, or it's
> > * been put in a default DMA domain by the IOMMU core.
> > */
>
> That was before Joerg made the device_group changes which enabled proper
> default domains for platform devices - with those, we should be now be
> hitting the "best case" behaviour every time. In fact I think the "fake
> default domain" workaround shouldn't be needed at all any more, so I
> will add removing it to my giant list of things to do.
>
> > But there is no this patch currently, so I add iommu_domain_free
> > here.
> >
> > "free the domain" here looks really not good. Then I delete the
> > iommu_domain_free here(allow this memory leak right now), is it ok?
> > (It will also works after Robin's change in the future.)
> >
> >>
> >>> +static int mtk_iommu_add_device(struct device *dev)
> >>> +{
> >>> + struct iommu_group *group;
> >>> +
> >>> + if (!dev->archdata.iommu) /* Not a iommu client device */
> >>> + return -ENODEV;
> >>> +
> >>> + group = iommu_group_get_for_dev(dev);
> >>> + if (IS_ERR(group))
> >>> + return PTR_ERR(group);
> >>> +
> >>> + iommu_group_put(group);
> >>> + return 0;
> >>> +}
> >>
> >> [...]
> >>
> >>> +static struct iommu_group *mtk_iommu_device_group(struct device *dev)
> >>> +{
> >>> + struct mtk_iommu_data *data;
> >>> + struct mtk_iommu_client_priv *priv;
> >>> +
> >>> + priv = dev->archdata.iommu;
> >>> + if (!priv)
> >>> + return ERR_PTR(-ENODEV);
> >>> +
> >>> + /* All the client devices are in the same m4u iommu-group */
> >>> + data = dev_get_drvdata(priv->m4udev);
> >>> + if (!data->m4u_group) {
> >>> + data->m4u_group = iommu_group_alloc();
> >>> + if (IS_ERR(data->m4u_group))
> >>> + dev_err(dev, "Failed to allocate M4U IOMMU group\n");
> >>> + }
> >>> + return data->m4u_group;
> >>> +}
>
> As long as this works as expected, then AFAICS you shouldn't have to
> have *any* special-case behaviour or tracking of domains at all.

We only need one iommu-group, one iommu domain here.

What's the special-case behavior, how can we track of domains.
Could you help give me a example?

>
> Robin.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/