Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

From: Will Deacon
Date: Fri Jul 14 2017 - 13:09:22 EST


On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricharan@xxxxxxxxxxxxxx> wrote:
> > Hi,
> >
> > On 7/13/2017 5:20 PM, Rob Clark wrote:
> >> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricharan@xxxxxxxxxxxxxx> wrote:
> >>> Hi Vivek,
> >>>
> >>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
> >>>> Hi Stephen,
> >>>>
> >>>>
> >>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
> >>>>> On 07/06, Vivek Gautam wrote:
> >>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
> >>>>>> static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
> >>>>>> size_t size)
> >>>>>> {
> >>>>>> - struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> >>>>>> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >>>>>> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> >>>>>> + size_t ret;
> >>>>>> if (!ops)
> >>>>>> return 0;
> >>>>>> - return ops->unmap(ops, iova, size);
> >>>>>> + pm_runtime_get_sync(smmu_domain->smmu->dev);
> >>>>> Can these map/unmap ops be called from an atomic context? I seem
> >>>>> to recall that being a problem before.
> >>>>
> >>>> That's something which was dropped in the following patch merged in master:
> >>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
> >>>>
> >>>> Looks like we don't need locks here anymore?
> >>>
> >>> Apart from the locking, I wonder why an explicit pm_runtime call is
> >>> needed from unmap. It looks like some path in the master using it
> >>> should already have enabled PM?
> >>>
> >>
> >> Yes, there are a bunch of scenarios where unmap can happen with
> >> disabled master (but not in atomic context). On the gpu side we
> >> opportunistically keep a buffer mapping until the buffer is freed
> >> (which can happen after gpu is disabled). Likewise, v4l2 won't unmap
> >> an exported dmabuf while some other driver holds a reference to it
> >> (which can be dropped when the v4l2 device is suspended).
> >>
> >> Since unmap triggers a tlb flush which touches iommu regs, the iommu
> >> driver *definitely* needs a pm_runtime_get_sync().
> >
> > Ok, with that being the case, there are two things here,
> >
> > 1) If the device links are still intact at these places where unmap is called,
> > then pm_runtime from the master would set up all the clocks. That would
> > avoid reintroducing the locking indirectly here.
> >
> > 2) If not, then doing it here is the only way. But for both cases, since
> > unmap can be called from atomic context, the resume handler here should
> > avoid calling clk_prepare_enable() and instead move the clk_prepare() to init.
> >
>
> I do kinda like the approach Marek suggested.. of deferring the tlb
> flush until resume. I'm wondering if we could combine that with
> putting the mmu in a stalled state when we suspend (and not resume the
> mmu until after the pending tlb flush)?

I'm not sure that a stalled state is what we're after here, because we need
to take care to prevent any table walks if we've freed the underlying pages.
What we could try to do is disable the SMMU (put into global bypass) and
invalidate the TLB when performing a suspend operation, then we just ignore
invalidation whilst the clocks are stopped and, on resume, enable the SMMU
again.
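
To make the ordering concrete, here is a rough userspace model of that
sequence. It is only a sketch: it is not driver code, it touches no real
registers, and every name in it (model_smmu, model_suspend and so on) is
made up for illustration and does not exist in arm-smmu.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the SMMU state; not taken from arm-smmu.c. */
struct model_smmu {
	bool clocks_on;            /* runtime-PM state of the SMMU */
	bool translating;          /* false means global bypass */
	unsigned int tlb_entries;  /* stand-in for cached translations */
	unsigned int skipped_invs; /* invalidations ignored while suspended */
};

/* A translation caches one entry, but only if the SMMU is on and enabled. */
static void model_translate(struct model_smmu *smmu)
{
	if (smmu->clocks_on && smmu->translating)
		smmu->tlb_entries++;
}

/* Invalidation is a no-op while the clocks are stopped. */
static void model_tlb_inv_all(struct model_smmu *smmu)
{
	if (!smmu->clocks_on) {
		/* The TLB was already emptied in model_suspend(). */
		smmu->skipped_invs++;
		return;
	}
	smmu->tlb_entries = 0;
}

/* Suspend: bypass first, then invalidate, then stop the clocks. */
static void model_suspend(struct model_smmu *smmu)
{
	smmu->translating = false; /* no further table walks */
	smmu->tlb_entries = 0;     /* invalidate while still clocked */
	smmu->clocks_on = false;
}

/* Resume: clocks back on, then re-enable translation with an empty TLB. */
static void model_resume(struct model_smmu *smmu)
{
	smmu->clocks_on = true;
	smmu->translating = true;
}

/* Walk through one suspend/unmap/resume cycle; returns 0 on success. */
static int model_demo(void)
{
	struct model_smmu smmu = { .clocks_on = true, .translating = true };

	model_translate(&smmu);
	model_translate(&smmu);
	assert(smmu.tlb_entries == 2);

	model_suspend(&smmu);
	model_tlb_inv_all(&smmu);  /* unmap arriving while suspended */
	assert(smmu.skipped_invs == 1 && smmu.tlb_entries == 0);

	model_resume(&smmu);
	assert(smmu.translating && smmu.tlb_entries == 0);
	return 0;
}
```

The model is single-threaded and only shows why an invalidation skipped
during suspend leaves nothing stale behind on resume.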

That said, I don't think we can tolerate suspend/resume racing with
map/unmap, and it's not clear to me how we avoid that without penalising
the fastpath.

Will