Re: [Freedreno] [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

From: Jordan Crouse
Date: Fri Feb 23 2018 - 10:41:02 EST


On Fri, Feb 23, 2018 at 04:06:39PM +0530, Vivek Gautam wrote:
> On Fri, Feb 23, 2018 at 5:22 AM, Jordan Crouse <jcrouse@xxxxxxxxxxxxxx> wrote:
> > On Wed, Feb 07, 2018 at 04:01:19PM +0530, Vivek Gautam wrote:
> >> From: Sricharan R <sricharan@xxxxxxxxxxxxxx>
> >>
> >> The smmu device probe/remove and add/remove master device callbacks
> >> get called when the smmu is not linked to its master, that is, without
> >> the context of the master device. So call the runtime PM APIs in those
> >> places separately.
> >>
> >> Signed-off-by: Sricharan R <sricharan@xxxxxxxxxxxxxx>
> >> [vivek: Cleanup pm runtime calls]
> >> Signed-off-by: Vivek Gautam <vivek.gautam@xxxxxxxxxxxxxx>
> >> ---
> >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++----
> >> 1 file changed, 38 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >> index 9e2f917e16c2..c024f69c1682 100644
> >> --- a/drivers/iommu/arm-smmu.c
> >> +++ b/drivers/iommu/arm-smmu.c
> >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
> >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >> struct arm_smmu_device *smmu = smmu_domain->smmu;
> >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> >> - int irq;
> >> + int ret, irq;
> >>
> >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
> >> return;
> >>
> >> + ret = pm_runtime_get_sync(smmu->dev);
> >> + if (ret)
> >> + return;
> >> +
> >> /*
> >> * Disable the context bank and free the page tables before freeing
> >> * it.
> >> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
> >>
> >> free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> >> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> >> +
> >> + pm_runtime_put_sync(smmu->dev);
> >> }
> >>
> >> static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> >> @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev)
> >> while (i--)
> >> cfg->smendx[i] = INVALID_SMENDX;
> >>
> >> - ret = arm_smmu_master_alloc_smes(dev);
> >> + ret = pm_runtime_get_sync(smmu->dev);
> >> if (ret)
> >> goto out_cfg_free;
> >
> > Hey Vivek, I just hit a problem with this on sdm845. It turns out that
> > pm_runtime_get_sync() returns a positive 1 if the device is already active.
> >
> > I hit this in the GPU code. The a6xx has two platform devices that each use a
> > different sid on the iommu. The GPU is probed normally from a platform driver
> > and it in turn initializes the GMU device by way of a phandle.
> >
> > Because the GMU isn't probed with a platform driver we need to call
> > of_dma_configure() on the device to set up the IOMMU for the device which ends
> > up calling through this path and we discover that the smmu->dev is already
> > powered (pm_runtime_get_sync returns 1).
> >
> > I'm not immediately sure if this is a bug on sdm845 or not because a cursory
> > inspection says that the SMMU device shouldn't be powered at this time but there
> > might be a connection that I'm not seeing. Obviously if the SMMU was left
> > powered that's a bad thing. But putting that aside it is obvious that this
> > code should be accommodating of the possibility that the device is already
> > powered, and so this should be
> >
> > if (ret < 0)
> > goto out_cfg_free;
>
> Right, as Tomasz also pointed out, we should surely check for a negative
> return value from pm_runtime_get_sync().

Sorry, I didn't notice that Tomasz had pointed it out as well. I wanted to
quickly get it on the mailing list so you could catch it in your time zone.

> From your description, it may be that the GPU has turned on the smmu, and
> then once it goes and probes the GMU, the GMU device also wants to turn on
> the same smmu device. But that's already active. So pm_runtime_get_sync()
> returns 1.
> Am I making sense?

My concern is that this is happening during the probe and we shouldn't be
energizing the GPU at this point. But it is entirely possible that the
bus is on for other reasons. I'll do a bit of digging today and see exactly
which device is at fault.
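
For reference, here is a minimal userspace sketch of the return-value
convention discussed above. It is not the driver code: struct mock_dev,
mock_pm_runtime_get_sync(), mock_pm_runtime_put_noidle() and the two
add_device_*() helpers are hypothetical stand-ins that only imitate how
pm_runtime_get_sync() reports its result (negative errno on failure, 0 if the
device was resumed by this call, 1 if it was already active) and the fact that
the usage count is incremented even when the resume fails.

```c
#include <errno.h>

/* Hypothetical stand-in for the runtime PM state of a struct device. */
struct mock_dev {
	int usage_count;	/* imitates the runtime PM usage counter */
	int active;		/* non-zero if the device is already resumed */
	int fail_resume;	/* non-zero to force a resume failure */
};

/*
 * Imitates the pm_runtime_get_sync() convention: the usage count is
 * always incremented; returns 1 if already active, 0 if resumed by this
 * call, or a negative errno on failure.
 */
static int mock_pm_runtime_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	if (dev->active)
		return 1;
	if (dev->fail_resume)
		return -EIO;
	dev->active = 1;
	return 0;
}

/* Imitates pm_runtime_put_noidle(): drop the count without idling. */
static void mock_pm_runtime_put_noidle(struct mock_dev *dev)
{
	dev->usage_count--;
}

/* Check as posted in the patch: a return of 1 is treated as an error. */
static int add_device_buggy(struct mock_dev *smmu)
{
	int ret = mock_pm_runtime_get_sync(smmu);

	if (ret)
		return ret;
	return 0;
}

/* Suggested check: only negative values are errors. */
static int add_device_fixed(struct mock_dev *smmu)
{
	int ret = mock_pm_runtime_get_sync(smmu);

	if (ret < 0) {
		/* Balance the increment done by the failed get. */
		mock_pm_runtime_put_noidle(smmu);
		return ret;
	}
	return 0;
}
```

With a device that is already active, add_device_buggy() bails out with 1
while add_device_fixed() carries on and returns 0. Dropping the usage count on
the error path is an addition of this sketch and is not something the posted
patch does.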


Jordan
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project