Re: [PATCH v2] loongarch/mm: disable WUC for pgprot_writecombine as same as ioremap_wc

From: Xi Ruoyao
Date: Tue Oct 10 2023 - 08:27:16 EST


On Tue, 2023-10-10 at 11:02 +0800, Sui Jingfeng wrote:

>
> On LoongArch, cached and uncached mappings are DMA-coherent, which is guaranteed by
> the hardware, while WC mappings are *NOT* DMA-coherent when a 3D GPU is involved. Therefore,
> in the downstream kernel, we disable write-combine (WC) mappings on the DRM driver side.

Why is it only an issue when a 3D GPU is involved? What's the difference
between 3D GPUs and other devices? Is it possible that other
devices (say, neural accelerators) start to perform DMA accesses in a
similar way and then suddenly break?

> - For buffers in VRAM (device memory), we replace the WC mappings with uncached mappings.
> - For buffers residing in RAM, we replace the WC mappings with cached mappings.
>
> This way, we were able to minimize the side effects and meet the usability requirements
> of all the GPU drivers.

AFAIK there has been a clear NAK from the DRM maintainers towards this
"approach", so it cannot be applied upstream.
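
For anyone following the thread, the downstream policy quoted above boils
down to roughly the following. This is only a userspace sketch to make the
rule concrete: the enum and function names are hypothetical, are not kernel
or DRM APIs, and do not reflect how the downstream drivers actually
implement it.

```c
/* Hypothetical model of the quoted downstream policy; not kernel code. */
enum buf_location { BUF_IN_VRAM, BUF_IN_RAM };
enum map_type { MAP_CACHED, MAP_UNCACHED, MAP_WC };

/*
 * Return the mapping type actually used for a buffer.
 * WC requests are never honoured: VRAM falls back to uncached,
 * system RAM falls back to cached. Other requests pass through.
 */
static enum map_type downstream_map_type(enum buf_location loc,
                                         enum map_type requested)
{
    if (requested != MAP_WC)
        return requested;

    return loc == BUF_IN_VRAM ? MAP_UNCACHED : MAP_CACHED;
}
```

The point of spelling it out is that the decision lives in each driver,
which is what the DRM side objected to.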

> For DMA non-coherent buffers, we should try to implement arch-specific dma_map_ops that
> invalidate the CPU cache and flush the CPU write buffer before the device does DMA, instead
> of pretending to be DMA-coherent for all buffers. A kernel cmdline is not a system-level
> solution for all GPU drivers and OS releases.

IIUC this is a hardware bug in the 7A1000 and 7A2000, so the proper location
for the workaround is the bridge chip driver. Or am I
misunderstanding something?

--
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University