Re: [PATCHv2 1/3] iommu/io-pgtable: Add a quirk to use tlb_flush_all() for partial walk flush

From: Robin Murphy
Date: Tue Jun 22 2021 - 08:11:26 EST


On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
Hi Robin,

On 2021-06-21 21:15, Robin Murphy wrote:
On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire context
with tlb_flush_all() callback in partial walk flush to improve unmap
performance on select few platforms where the cost of over-invalidation
is less than the unmap latency.

I still think this doesn't belong anywhere near io-pgtable at all.
It's a driver-internal decision how exactly it implements a non-leaf
invalidation, and that may be more complex than a predetermined
boolean decision. For example, I've just realised for SMMUv3 we can't
invalidate multiple levels of table at once with a range command,
since if we assume the whole thing is mapped at worst-case page
granularity we may fail to invalidate any parts which are mapped as
intermediate-level blocks. If invalidating a 1GB region (with 4KB
granule) means having to fall back to 256K non-range commands, we may
not want to invalidate by VA then, even though doing so for a 2MB
region is still optimal.

It's also quite feasible that drivers might want to do this for leaf
invalidations too - if you don't like issuing 512 commands to
invalidate 2MB, do you like issuing 511 commands to invalidate 2044KB?
- and at that point the logic really has to be in the driver anyway.


Ok I will move this to tlb_flush_walk() functions in the drivers. In the previous
v1 thread, you suggested to make the choice in iommu_get_dma_strict() test,
I assume you meant the test in iommu_dma_init_domain() with a flag or was it
the leaf driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in init_domain?

Yes, I meant literally inside the same condition where we currently set "pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in arm_smmu_init_domain_context().

I am still a bit confused on where this flag would be? Should this be a part
of struct iommu_domain?

Well, if you were to rewrite the config with an alternative set of flush_ops at that point it would be implicit. For a flag, probably either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be less useful than generalising straight to a "maximum number of by-VA invalidations it's worth sending individually" threshold value? It's clear to me what overall shape and separation of responsibility is most logical, but beyond that I don't have a particularly strong opinion on the exact implementation; I've just been chucking ideas around :)

Cheers,
Robin.