Re: [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86

From: Koenig, Christian
Date: Thu Jan 24 2019 - 04:45:17 EST


Am 24.01.19 um 10:28 schrieb Ard Biesheuvel:
> On Thu, 24 Jan 2019 at 10:25, Koenig, Christian
> <Christian.Koenig@xxxxxxx> wrote:
>> Am 24.01.19 um 10:13 schrieb Christoph Hellwig:
>>> On Wed, Jan 23, 2019 at 05:52:50PM +0100, Ard Biesheuvel wrote:
>>>> But my concern is that it seems likely that non-cache coherent
>>>> implementations are relying on this hack as well. There must be a
>>>> reason that this hack is only disabled for PowerPC platforms if they
>>>> are cache coherent, for instance, and I suspect that that reason is
>>>> that the hack is the only thing ensuring that the CPU mapping
>>>> attributes match the device ones used for these buffers (the vmap()ed
>>>> ones), whereas the rings and other consistent data structures are
>>>> using the DMA API as intended, and thus getting uncached attributes in
>>>> the correct way.
>>> Dave, who added that commit is on Cc together with just about everyone
>>> involved in the review chain. Based on the previous explanation
>>> that idea what we might want an uncached mapping for some non-coherent
>>> architectures for this to work at all makes sense, but then again
>>> the way to create those mappings is entirely architecture specific,
>>> and also need a cache flushing before creating the mapping to work
>>> properly. So my working theory is that this code never properly
>>> worked on architectures without DMA coherent for PCIe at all, but
>>> I'd love to be corrected by concrete examples including an explanation
>>> of how it actually ends up working.
>> Cache coherency is mandatory for modern GPU operation.
>>
>> Otherwise you can't implement a bunch of the requirements of the
>> userspace APIs.
>>
>> In other words the applications doesn't inform the driver that the GPU
>> or the CPU is accessing data, it just does it and assumes that it works.
>>
> Wonderful!
>
> In that case, do you have any objections to the patch proposed by
> Christoph above?

Yeah, the patch of Christoph actually goes way to far cause we have
reports that this works on a bunch of other architectures.

E.g. X86 64bit, PowerPC (under some conditions) and some MIPS.

The only problematic here actually seems to be ARM, so you should
probably just add an "#ifdef .._ARM return false;".

Regards,
Christian.