Re: [PATCHv4] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

From: Will Deacon
Date: Wed Aug 11 2021 - 06:30:21 EST


On Wed, Aug 11, 2021 at 11:37:25AM +0530, Sai Prakash Ranjan wrote:
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index f7da8953afbe..3904b598e0f9 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -327,9 +327,16 @@ static void arm_smmu_tlb_inv_range_s2(unsigned long iova, size_t size,
>  static void arm_smmu_tlb_inv_walk_s1(unsigned long iova, size_t size,
>  				     size_t granule, void *cookie)
>  {
> -	arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
> -				  ARM_SMMU_CB_S1_TLBIVA);
> -	arm_smmu_tlb_sync_context(cookie);
> +	struct arm_smmu_domain *smmu_domain = cookie;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +
> +	if (cfg->flush_walk_prefer_tlbiasid) {
> +		arm_smmu_tlb_inv_context_s1(cookie);

Hmm, this introduces an unconditional wmb() if tlbiasid is preferred. I
think that should be predicated on ARM_SMMU_FEAT_COHERENT_WALK like it is
for the by-VA ops. Worth doing as a separate patch.
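
To illustrate the predicate being asked for, here is a minimal user-space
model. It is a sketch only, not driver code: every model_* name and the
MODEL_FEAT_COHERENT_WALK bit are hypothetical stand-ins, and the barrier is
modelled as a counter rather than a real wmb().

```c
/* User-space model only; none of these names exist in the driver. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ARM_SMMU_FEAT_COHERENT_WALK. */
#define MODEL_FEAT_COHERENT_WALK (1u << 0)

struct model_smmu {
	unsigned int features;	/* feature bits of the modelled SMMU */
	unsigned int barriers;	/* number of modelled wmb() calls */
	unsigned int tlbiasid;	/* number of modelled TLBIASID writes */
};

/* Stands in for wmb(); only counts how often it would be issued. */
static void model_wmb(struct model_smmu *smmu)
{
	smmu->barriers++;
}

/*
 * Model of a by-ASID invalidation whose barrier is predicated on the
 * coherent-walk feature, mirroring what the by-VA path is said to do.
 */
static void model_tlb_inv_context_s1(struct model_smmu *smmu)
{
	assert(smmu != NULL);

	if (smmu->features & MODEL_FEAT_COHERENT_WALK)
		model_wmb(smmu);

	smmu->tlbiasid++;	/* the invalidation itself is unconditional */
}
```

In this model the invalidation always happens, while the barrier is issued
only when the coherent-walk bit is set.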

> +	} else {
> +		arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
> +					  ARM_SMMU_CB_S1_TLBIVA);
> +		arm_smmu_tlb_sync_context(cookie);
> +	}
>  }
>
>  static void arm_smmu_tlb_add_page_s1(struct iommu_iotlb_gather *gather,
> @@ -765,8 +772,10 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  		.iommu_dev = smmu->dev,
>  	};
>
> -	if (!iommu_get_dma_strict(domain))
> +	if (!iommu_get_dma_strict(domain)) {
>  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> +		cfg->flush_walk_prefer_tlbiasid = true;

This is going to interact badly with Robin's series to allow dynamic
transition to non-strict mode, as we don't have a mechanism to switch
over to the by-ASID behaviour. Yes, it should _work_, but it's ugly having
different TLBI behaviour just because of how the domain became
non-strict.

Robin -- I think this originated from your idea at [1]. Any idea how to make
it work with your other series, or shall we drop this part for now and leave
the TLB invalidation behaviour unchanged?

Will

[1] https://lore.kernel.org/r/da62ff1c-9b49-34d3-69a1-1a674e4a30f7@xxxxxxx