Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

From: Thierry Reding
Date: Thu Oct 01 2020 - 06:46:21 EST


On Wed, Sep 30, 2020 at 01:36:18PM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 05:31:31PM +0200, Thierry Reding wrote:
> > On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> > > Previously the driver relied on bus_set_iommu() in .probe() to call
> > > the .probe_device() function so that each client could poll the iommus
> > > property in the DTB and configure fwspec via tegra_smmu_configure().
> > > According to the comments in .probe(), this is a bit of a hack, and it
> > > doesn't work for a client that doesn't exist in the DTB, such as a PCI
> > > device.
> > >
> > > In fact, when a device/client gets probed, of_iommu_configure() calls
> > > the .probe_device() function again, this time with a prepared fwspec:
> > > of_iommu_configure() reads the SWGROUP id from the DTB just as the
> > > tegra-smmu driver does.
> > >
> > > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > > introduced, there's no need to poll the iommus property in order to
> > > get mc->smmu pointers or SWGROUP id.
> > >
> > > This patch reworks .probe_device() and .attach_dev() by doing:
> > > 1) Using fwspec to get swgroup id in .attach_dev/.detach_dev()
> > > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > > 4) Also dropping the hack in .probe() that's no longer needed.
> > >
> > > Signed-off-by: Nicolin Chen <nicoleotsuka@xxxxxxxxx>
> [...]
> > > static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
> > > {
> > > - struct device_node *np = dev->of_node;
> > > - struct tegra_smmu *smmu = NULL;
> > > - struct of_phandle_args args;
> > > - unsigned int index = 0;
> > > - int err;
> > > -
> > > - while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > > - &args) == 0) {
> > > - smmu = tegra_smmu_find(args.np);
> > > - if (smmu) {
> > > - err = tegra_smmu_configure(smmu, dev, &args);
> > > - of_node_put(args.np);
> > > -
> > > - if (err < 0)
> > > - return ERR_PTR(err);
> > > -
> > > - /*
> > > - * Only a single IOMMU master interface is currently
> > > - * supported by the Linux kernel, so abort after the
> > > - * first match.
> > > - */
> > > - dev_iommu_priv_set(dev, smmu);
> > > -
> > > - break;
> > > - }
> > > + struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> > > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> >
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> >
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> >
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
>
> I agree that it'd work. But I was hoping to simplify the code
> here if it's possible. Looks like we have an argument on this
> so I will choose to go with your suggestion above for now.
>
> > > - of_node_put(args.np);
> > > - index++;
> > > - }
> > > + /* An invalid mc pointer means mc and smmu drivers are not ready */
> > > + if (IS_ERR(mc))
> > > + return ERR_PTR(-EPROBE_DEFER);
> > >
> > > - if (!smmu)
> > > + /*
> > > + * IOMMU core allows -ENODEV return to carry on. So bypass any call
> > > + * from bus_set_iommu() during tegra_smmu_probe(), as a device will
> > > + * call in again via of_iommu_configure when fwspec is prepared.
> > > + */
> > > + if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
> > > return ERR_PTR(-ENODEV);
> > >
> > > - return &smmu->iommu;
> > > + dev_iommu_priv_set(dev, mc->smmu);
> > > +
> > > + return &mc->smmu->iommu;
> > > }
> > >
> > > static void tegra_smmu_release_device(struct device *dev)
> > > @@ -1089,16 +1027,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> > > if (!smmu)
> > > return ERR_PTR(-ENOMEM);
> > >
> > > - /*
> > > - * This is a bit of a hack. Ideally we'd want to simply return this
> > > - * value. However the IOMMU registration process will attempt to add
> > > - * all devices to the IOMMU when bus_set_iommu() is called. In order
> > > - * not to rely on global variables to track the IOMMU instance, we
> > > - * set it here so that it can be looked up from the .probe_device()
> > > - * callback via the IOMMU device's .drvdata field.
> > > - */
> > > - mc->smmu = smmu;
> >
> > I don't think this is going to work. I distinctly remember putting this
> > here because we needed access to this before ->probe_device() had been
> > called for any of the devices.
>
> Do you remember which exact part of code needs to access mc->smmu
> before ->probe_device() is called?
>
> What I understood is that IOMMU core didn't allow ERR_PTR(-ENODEV)
> return value from ->probe_device(), previously ->add_device(), to
> carry on when you added this code/driver:
> commit 8918465163171322c77a19d5258a95f56d89d2e4
> Author: Thierry Reding <treding@xxxxxxxxxx>
> Date: Wed Apr 16 09:24:44 2014 +0200
> memory: Add NVIDIA Tegra memory controller support
>
> ..until the core had a change one year later:
> commit 38667f18900afe172a4fe44279b132b4140f920f
> Author: Joerg Roedel <jroedel@xxxxxxx>
> Date: Mon Jun 29 10:16:08 2015 +0200
> iommu: Ignore -ENODEV errors from add_device call-back
>
> As my commit message of this change states, ->probe_device() will
> be called from both bus_set_iommu() and really_probe() of each
> device through of_iommu_configure() -- the latter one initializes
> an fwspec by polling the iommus property in the IOMMU core, same
> as what we do here in tegra-smmu. If this works, we can probably
> drop the hack here and get rid of tegra_smmu_configure().

Looking at this a bit more, I notice that tegra_smmu_configure() does a
lot of what's already done during of_iommu_configure(), so it'd indeed
be nice if we could somehow get rid of that. However, like I said, I do
recall that for DMA/IOMMU we need this prior to ->probe_device(), so it
isn't clear to me if we can do that.

So I think in order to make progress we need to check that dropping this
does indeed still work when we enable DMA/IOMMU (and the preliminary
patches to pass 1:1 mappings via reserved-memory regions). If so, I
think it should be safe to remove this.
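
For reference, the return-value logic of the reworked .probe_device()
quoted above can be sketched in userspace roughly as follows. This is only
an illustration under stated assumptions, not the driver code: the structs
are empty stand-ins for the kernel types, ERR_PTR()/PTR_ERR()/IS_ERR() are
simplified re-implementations, EPROBE_DEFER is defined locally with the
kernel's value, and probe_device_sketch() is a hypothetical name that takes
mc and fwspec as arguments instead of looking them up from the device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Not present in userspace <errno.h>; value copied from the kernel. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Empty stand-ins for the kernel types (assumption: only the fields
 * touched by the quoted hunk are modelled). */
struct iommu_ops { int unused; };
struct iommu_device { int unused; };
struct tegra_smmu { struct iommu_device iommu; };
struct tegra_mc { struct tegra_smmu *smmu; };
struct iommu_fwspec { const struct iommu_ops *ops; };

static const struct iommu_ops tegra_smmu_ops = { 0 };

/*
 * Hypothetical helper mirroring the checks in the quoted hunk:
 *  - an ERR_PTR mc means the mc/smmu drivers are not ready -> defer
 *  - no smmu, no fwspec, or foreign ops -> -ENODEV, which the IOMMU
 *    core tolerates, so the device can call in again later
 */
struct iommu_device *probe_device_sketch(struct tegra_mc *mc,
					 struct iommu_fwspec *fwspec)
{
	if (IS_ERR(mc))
		return ERR_PTR(-EPROBE_DEFER);

	if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
		return ERR_PTR(-ENODEV);

	return &mc->smmu->iommu;
}
```

With this model, the early call from bus_set_iommu() (no fwspec yet)
yields -ENODEV, and the later call via of_iommu_configure() with a
prepared fwspec returns the SMMU's iommu_device. It says nothing about
whether anything needs mc->smmu before ->probe_device() runs, which is
the open question here.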

Thierry
