Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache

From: isaacm
Date: Thu Jan 07 2021 - 11:58:45 EST

On 2021-01-06 03:56, Will Deacon wrote:
On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
the memory type setting required for the non-coherent masters to use
system cache. Now that system cache support for GPU is added, we will
need to mark the memory as normal sys-cached for GPU to use system cache.
Without this, the system cache lines are not allocated for GPU. We use
the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page protection
flag as the flag cannot be exposed via DMA api because of no in-tree

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@xxxxxxxxxxxxxx>
drivers/iommu/io-pgtable-arm.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 7c9ea9d7874a..3fb7de8304a2 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
else if (prot & IOMMU_CACHE)
+ else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)

While enabling the system cache globally for both page tables and other buffers
works for the GPU use case, it isn't ideal for other clients that use the system cache. For example,
video clients only want to cache a subset of their buffers in the system cache, because they are
limited in how much of the system cache they can use. So it would be ideal to have
a way of requesting system cache use on a per-buffer basis. Additionally,
our video clients use the DMA layer, and since caching in the system cache needs
to be a per-buffer attribute, it seems we would need a DMA attribute to express
this.
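
To make the per-buffer idea concrete, here is a rough userspace sketch of how a buffer attribute could feed into the memory type choice. Every name below (the EXAMPLE_* constants and example_* functions) is hypothetical and is not an existing kernel symbol; only the ordering of the IOMMU_CACHE check before the system cache check is taken from the patch above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not real kernel symbols. */
#define EXAMPLE_DMA_ATTR_SYS_CACHE  (1ul << 0)  /* per-buffer request from a DMA client */

#define EXAMPLE_PROT_CACHE          (1u << 0)   /* plays the role of IOMMU_CACHE */
#define EXAMPLE_PROT_SYS_CACHE      (1u << 1)   /* per-buffer system cache request */

enum example_memattr {
	EXAMPLE_ATTR_NONCACHED,
	EXAMPLE_ATTR_CACHED,
	EXAMPLE_ATTR_SYS_CACHED,  /* memory type the patch selects via the quirk */
};

/* Translate per-buffer DMA attributes into mapping protection bits. */
static unsigned int example_dma_attrs_to_prot(unsigned long attrs, bool coherent)
{
	unsigned int prot = 0;

	if (coherent)
		prot |= EXAMPLE_PROT_CACHE;
	if (attrs & EXAMPLE_DMA_ATTR_SYS_CACHE)
		prot |= EXAMPLE_PROT_SYS_CACHE;
	return prot;
}

/*
 * Pick a memory type for one mapping. The global quirk marks every
 * non-coherent mapping as system-cached; the per-buffer bit marks
 * only the buffers that asked for it.
 */
static enum example_memattr example_prot_to_memattr(unsigned int prot,
						    bool global_quirk)
{
	if (prot & EXAMPLE_PROT_CACHE)
		return EXAMPLE_ATTR_CACHED;
	if ((prot & EXAMPLE_PROT_SYS_CACHE) || global_quirk)
		return EXAMPLE_ATTR_SYS_CACHED;
	return EXAMPLE_ATTR_NONCACHED;
}
```

With the quirk off, a video client could set the hypothetical attribute on only the buffers it wants in the system cache, and everything else would stay non-cached.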


drivers/iommu/io-pgtable.c currently documents this quirk as applying only
to the page-table walker. Given that we only have one user at the moment,
I think it's ok to change that, but please update the comment.

We also need to decide on whether we want to allow the quirk to be passed
if the coherency of the page-table walker differs from the DMA device, since
we have these combinations:

0: N 0 0
1: N 0 1
2: N 1 0
3: N 1 1
4: Y 0 0
5: Y 0 1
6: Y 1 0
7: Y 1 1

Some of them are obviously bogus, such as (7), but I don't know what to
do about cases such as (3) and (5).
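
For reference, the eight rows above are just the three bits of the row index. The sketch below decodes them, assuming the columns are (coherent walker, IOMMU_CACHE, IO_PGTABLE_QUIRK_ARM_OUTER_WBWA); the table carries no header here, so that column order is an assumption, and all example_* names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed meaning of the three columns; the table itself has no header. */
struct example_combo {
	bool coherent_walker;   /* Y/N column */
	bool iommu_cache;       /* assumed second column */
	bool outer_wbwa_quirk;  /* assumed third column */
};

/* Row index bits: bit 2 = walker, bit 1 = IOMMU_CACHE, bit 0 = quirk. */
static struct example_combo example_combo_from_index(unsigned int idx)
{
	struct example_combo c = {
		.coherent_walker  = (idx >> 2) & 1u,
		.iommu_cache      = (idx >> 1) & 1u,
		.outer_wbwa_quirk = idx & 1u,
	};
	return c;
}

/* Format one row in the same layout as the table above. */
static int example_format_row(unsigned int idx, char *buf, size_t len)
{
	struct example_combo c = example_combo_from_index(idx);

	return snprintf(buf, len, "%u: %c %d %d", idx,
			c.coherent_walker ? 'Y' : 'N',
			c.iommu_cache, c.outer_wbwa_quirk);
}

/* True when walker coherency and the mapping's cacheability disagree. */
static bool example_walker_differs_from_device(unsigned int idx)
{
	struct example_combo c = example_combo_from_index(idx);

	return c.coherent_walker != c.iommu_cache;
}
```

Under that assumed column order, rows (3) and (5) are among those where the walker and the mapping disagree, while (7) is not.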


