Re: [RFC PATCH] drm/ttm: force cached mappings for system RAM on ARM

From: Koenig, Christian
Date: Thu Jan 10 2019 - 03:37:01 EST


Hi Ard,

thanks a lot for this! At least there is now somebody who can explain
why this doesn't work as expected.

The problem is that, in a couple of cases, the hardware actually needs a
few pages to be mapped uncached to work correctly. So we could still run
into issues with that solution.
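
To make the concern concrete, here is a minimal user-space sketch of the
decision with the proposed short-circuit. It is not the kernel code: the
MODEL_* names, the enum and the force_cached_sysram parameter (standing in
for the ARM #if) are all made up for illustration, and the per-architecture
handling of write combining is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TTM placement and caching flags. */
#define MODEL_FLAG_SYSTEM   (1u << 0)
#define MODEL_FLAG_CACHED   (1u << 16)
#define MODEL_FLAG_UNCACHED (1u << 17)
#define MODEL_FLAG_WC       (1u << 18)

/* Hypothetical stand-in for pgprot_t: the attribute a mapping ends up with. */
enum model_prot { MODEL_PROT_CACHED, MODEL_PROT_WC, MODEL_PROT_UNCACHED };

/*
 * Simplified model of the caching decision. 'force_cached_sysram'
 * plays the role of the proposed ARM/arm64 short-circuit.
 */
static enum model_prot model_io_prot(uint32_t flags, bool force_cached_sysram)
{
	/* The caller is expected to request some caching mode. */
	assert(flags & (MODEL_FLAG_CACHED | MODEL_FLAG_UNCACHED | MODEL_FLAG_WC));

	if (flags & MODEL_FLAG_CACHED)
		return MODEL_PROT_CACHED;

	/* Proposed change: system RAM is always mapped cached. */
	if (force_cached_sysram && (flags & MODEL_FLAG_SYSTEM))
		return MODEL_PROT_CACHED;

	if (flags & MODEL_FLAG_WC)
		return MODEL_PROT_WC;

	return MODEL_PROT_UNCACHED;
}
```

In this model, a request for MODEL_FLAG_SYSTEM | MODEL_FLAG_UNCACHED comes
back as MODEL_PROT_CACHED once the short-circuit is active, which is exactly
the case where the hardware would not get the uncached pages it asked for.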

For now we have blocked userspace and the kernel from making extensive
use of write-combined mappings by adjusting drm_arch_can_wc_memory(), but
that is probably degrading performance quite a bit.
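
Roughly, the gating works like the sketch below. This is only an
illustration under assumptions: MODEL_CREATE_USWC and
model_filter_create_flags() are hypothetical names, and arch_can_wc stands
in for the result of drm_arch_can_wc_memory().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: the caller asks for write-combined CPU access. */
#define MODEL_CREATE_USWC (1u << 0)

/*
 * Drop the write-combine request when the architecture predicate says
 * that write-combined mappings of system memory are not usable. The
 * buffer then falls back to a cached mapping.
 */
static uint32_t model_filter_create_flags(uint32_t flags, bool arch_can_wc)
{
	if (!arch_can_wc)
		flags &= ~MODEL_CREATE_USWC;

	return flags;
}
```

The fallback is safe, but every buffer that wanted write combining now goes
through the CPU caches, which is where I expect the performance loss.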

What can be done to improve the solution? At least on x86 we solve this
by marking the write-combined pages as uncached in the linear mapping as
well. Would this be doable on ARM, too?
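
As a toy model of what I mean (all names here are hypothetical, this is not
kernel code): each page has a linear-map alias plus an optional extra
mapping, and the x86 approach corresponds to updating the linear alias
before the extra mapping is created, so the attributes never disagree.

```c
#include <stdbool.h>

enum model_attr { MODEL_ATTR_CACHED, MODEL_ATTR_WC, MODEL_ATTR_UNCACHED };

struct model_page {
	enum model_attr linear_attr; /* attribute of the linear-map alias */
	enum model_attr extra_attr;  /* attribute of the additional mapping */
	bool has_extra;              /* true once the extra mapping exists */
};

/*
 * Create an additional mapping with attribute 'want'. With 'sync_linear'
 * set, the linear alias is switched to the same attribute first.
 */
static void model_map(struct model_page *p, enum model_attr want,
		      bool sync_linear)
{
	if (sync_linear)
		p->linear_attr = want;

	p->extra_attr = want;
	p->has_extra = true;
}

/* True if the page has two aliases with different memory attributes. */
static bool model_has_mismatch(const struct model_page *p)
{
	return p->has_extra && p->linear_attr != p->extra_attr;
}
```

Without sync_linear the page ends up with a cached linear alias and a
write-combined extra mapping, which is the mismatch you describe. With it,
both aliases agree.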

Thanks,
Christian.

On 10.01.19 at 08:28, Ard Biesheuvel wrote:
> ARM systems do not permit the use of anything other than cached
> mappings for system memory, since that memory may be mapped in the
> linear region as well, and the architecture does not permit aliases
> with mismatched attributes.
>
> So short-circuit the evaluation in ttm_io_prot() if the flags include
> TTM_PL_SYSTEM when running on ARM or arm64, and just return cached
> attributes immediately.
>
> This fixes the radeon and amdgpu [TBC] drivers when running on arm64.
> Without this change, amdgpu does not start at all, and radeon only
> produces corrupt display output.
>
> Cc: Christian Koenig <christian.koenig@xxxxxxx>
> Cc: Huang Rui <ray.huang@xxxxxxx>
> Cc: Junwei Zhang <Jerry.Zhang@xxxxxxx>
> Cc: David Airlie <airlied@xxxxxxxx>
> Reported-by: Carsten Haitzler <Carsten.Haitzler@xxxxxxx>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 046a6dda690a..0c1eef5f7ae3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -530,6 +530,11 @@ pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
>  	if (caching_flags & TTM_PL_FLAG_CACHED)
>  		return tmp;
>
> +#if defined(__arm__) || defined(__aarch64__)
> +	/* ARM only permits cached mappings of system memory */
> +	if (caching_flags & TTM_PL_SYSTEM)
> +		return tmp;
> +#endif
>  #if defined(__i386__) || defined(__x86_64__)
>  	if (caching_flags & TTM_PL_FLAG_WC)
>  		tmp = pgprot_writecombine(tmp);