Re: [PATCHv4] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

From: Sai Prakash Ranjan
Date: Wed Aug 11 2021 - 06:53:40 EST

On 2021-08-11 16:00, Will Deacon wrote:
On Wed, Aug 11, 2021 at 11:37:25AM +0530, Sai Prakash Ranjan wrote:
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index f7da8953afbe..3904b598e0f9 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -327,9 +327,16 @@ static void arm_smmu_tlb_inv_range_s2(unsigned long iova, size_t size,
static void arm_smmu_tlb_inv_walk_s1(unsigned long iova, size_t size,
size_t granule, void *cookie)
- arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
- arm_smmu_tlb_sync_context(cookie);
+ struct arm_smmu_domain *smmu_domain = cookie;
+ struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+ if (cfg->flush_walk_prefer_tlbiasid) {
+ arm_smmu_tlb_inv_context_s1(cookie);

Hmm, this introduces an unconditional wmb() if tlbiasid is preferred. I
think that should be predicated on ARM_SMMU_FEAT_COHERENT_WALK like it is
for the by-VA ops. Worth doing as a separate patch.

Ok I will keep this as-is for now then.

+ } else {
+ arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
+ arm_smmu_tlb_sync_context(cookie);
+ }

static void arm_smmu_tlb_add_page_s1(struct iommu_iotlb_gather *gather,
@@ -765,8 +772,10 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
.iommu_dev = smmu->dev,

- if (!iommu_get_dma_strict(domain))
+ if (!iommu_get_dma_strict(domain)) {
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+ cfg->flush_walk_prefer_tlbiasid = true;

This is going to interact badly with Robin's series to allow dynamic
transition to non-strict mode, as we don't have a mechanism to switch
over to the by-ASID behaviour. Yes, it should _work_, but it's ugly having
different TLBI behaviour just because of the how the domain became

Robin -- I think this originated from your idea at [1]. Any idea how to make
it work with your other series, or shall we drop this part for now and leave
the TLB invalidation behaviour the same for now?



Right, I think we can drop this non-strict change for now because it also makes
it a pain to backport it to 5.4/5.10 kernels because of large number of changes
in dma apis in recent kernels. I will let you and Robin decide if it's ok to
drop this change and introduce it later with a different patch.


QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation