RE: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for large scatter-gather list

From: Krishna Reddy
Date: Mon Jun 14 2021 - 13:48:42 EST


> Right but we won't know until we profile the specific usecases or try them in
> generic workload to see if they affect the performance. Sure, over invalidation is
> a concern where multiple buffers can be mapped to same context and the cache
> is not usable at the time for lookup and such but we don't do it for small buffers
> and only for large buffers which means thousands of TLB entry mappings in
> which case TLBIASID is preferred (note: I mentioned the HW team
> recommendation to use it for anything greater than 128 TLB entries) in my
> earlier reply. And also note that we do this only for partial walk flush, we are not
> arbitrarily changing all the TLBIs to ASID based.

Most of the heavy-bandwidth use cases do involve processing larger buffers.
When physical memory is allocated discontiguously at page_size (let's use 4KB here)
granularity, unmapping each aligned 2MB chunk of IOVA would involve performing a TLBIASID,
as 2MB is not a leaf. Essentially, it happens all the time during large buffer unmaps and
can potentially impact active traffic on other large buffers. Depending on how much
latency the HW engines can absorb, the overflow/underflow issues for ISO engines can be
sporadic and vendor specific.
Performing TLBIASID by default for all SoCs is not a safe operation.
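
To put a rough number on the 4KB/2MB case above, here is a small user-space
sketch (not kernel code) that counts how many aligned 2MB chunks a given IOVA
range fully covers. Assuming every one of those chunks is backed by a table of
4KB pages and that the proposed change turns each partial walk flush into a
TLBIASID, this count is the number of TLBIASIDs one unmap would issue. The
names PAGE_SZ, BLOCK_SZ and count_full_blocks() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ   (4096ULL)      /* 4KB granule, as in the example above */
#define BLOCK_SZ  (2ULL << 20)   /* 2MB: span of one last-level table */

/* PTEs in one last-level table: 2MB / 4KB = 512. */
static uint64_t ptes_per_block(void)
{
    return BLOCK_SZ / PAGE_SZ;
}

/*
 * Number of aligned 2MB chunks fully covered by [iova, iova + size).
 * Under the assumptions stated above, each such chunk corresponds to
 * one table (non-leaf) unmap and therefore one partial walk flush.
 */
static uint64_t count_full_blocks(uint64_t iova, uint64_t size)
{
    uint64_t end = iova + size;
    uint64_t first, last;

    assert(end >= iova);                               /* no wrap-around */

    first = (iova + BLOCK_SZ - 1) & ~(BLOCK_SZ - 1);   /* round start up */
    last = end & ~(BLOCK_SZ - 1);                      /* round end down */

    return last > first ? (last - first) / BLOCK_SZ : 0;
}
```

For a 2MB-aligned 64MB buffer this gives 32 chunks, i.e. 32 ASID-wide
invalidations for one unmap, each of which also drops the TLB entries of any
other buffer mapped in the same context.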


> I am no camera expert but from what the camera team mentioned is that there
> is a thread which frees memory(large unused memory buffers) periodically which
> ends up taking around 100+ms and causing some camera test failures with
> frame drops. Parallel efforts are already being made to optimize this usage of
> thread but as I mentioned previously, this is *not a camera specific*, lets say
> someone else invokes such large unmaps, it's going to face the same issue.

From the above, it doesn't look like the root cause of the frame drops is fully understood.
Why is a 100+ms delay causing camera frame drops? Is the same thread submitting the buffers
to the camera after the unmap is complete? If not, how is the unmap latency causing an issue here?


> > If unmap is queued and performed on a back ground thread, would it
> > resolve the frame drops?
>
> Not sure I understand what you mean by queuing on background thread but with
> that or not, we still do the same number of TLBIs and hop through
> iommu->io-pgtable->arm-smmu to perform the unmap, so how will that
> help?

I mean adding the unmap requests to a queue and processing them from a different thread.
The point is not to reduce the number of TLBIs, but to avoid blocking subsequent buffer
allocation and IOVA map requests if they are issued from the same thread that is performing
the unmap. If the unmap is already performed from a different thread, then the issue still
needs to be root caused to understand it fully. Check for any serialization issues.
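
As a rough illustration of the queuing idea, here is a minimal user-space
sketch using pthreads (not kernel code, and not a proposed patch). The
submitting thread only enqueues the request and returns; a worker thread
performs the slow part. All names (unmap_queue, unmap_queue_submit(),
do_unmap()) are hypothetical, and do_unmap() is a placeholder that just
accumulates the size where the real unmap and TLB invalidation would run.
It also assumes the IOVA range is not reused until the worker has processed
the request.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 64

struct unmap_req { uint64_t iova; uint64_t size; };

struct unmap_queue {
    struct unmap_req ring[QUEUE_DEPTH];
    size_t head, tail, count;
    bool stopping;
    uint64_t bytes_unmapped;   /* stands in for the real unmap work */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_t worker;
};

/* Hypothetical placeholder for the slow unmap + TLB invalidation. */
static void do_unmap(struct unmap_queue *q, const struct unmap_req *r)
{
    q->bytes_unmapped += r->size;
}

static void *unmap_worker(void *arg)
{
    struct unmap_queue *q = arg;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->stopping)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->count == 0)     /* stopping and fully drained */
            break;
        struct unmap_req r = q->ring[q->head];
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        do_unmap(q, &r);       /* slow path runs without the lock held */
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

static int unmap_queue_start(struct unmap_queue *q)
{
    q->head = q->tail = q->count = 0;
    q->stopping = false;
    q->bytes_unmapped = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    return pthread_create(&q->worker, NULL, unmap_worker, q);
}

/* Returns false if the queue is full; the caller could then unmap inline. */
static bool unmap_queue_submit(struct unmap_queue *q, uint64_t iova,
                               uint64_t size)
{
    bool ok = false;

    assert(size != 0);
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_DEPTH) {
        q->ring[q->tail] = (struct unmap_req){ iova, size };
        q->tail = (q->tail + 1) % QUEUE_DEPTH;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Drains whatever is still queued, then joins the worker. */
static void unmap_queue_stop(struct unmap_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->stopping = true;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->worker, NULL);
}
```

With this shape, unmap_queue_submit() returns as soon as the request is
queued, so allocation and map calls on the submitting thread are not held up
behind a 100+ms unmap. The total number of TLBIs is unchanged.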


-KR