Re: [RFC PATCH] drm/ttm: force cached mappings for system RAM on ARM

From: Ard Biesheuvel
Date: Wed Jan 16 2019 - 02:47:56 EST


On Wed, 16 Jan 2019 at 08:36, Koenig, Christian
<Christian.Koenig@xxxxxxx> wrote:
>
> Am 16.01.19 um 01:33 schrieb Benjamin Herrenschmidt:
> > On Tue, 2019-01-15 at 22:31 +1100, Michael Ellerman wrote:
> >>>> As far as I know Power doesn't really support un-cached memory at all,
> >>>> except for a very very old and odd configuration with AGP.
> >>> Hopefully Michael/Ben can elaborate here, but I was under the (possibly
> >>> mistaken) impression that mismatched attributes could cause a machine-check
> >>> on Power.
> >> That's what I've always been told, but I can't actually find where it's
> >> documented, I'll keep searching.
> >>
> >> But you're right that mixing cached / uncached is not really supported,
> >> and probably results in a machine check or worse.
> > .. or worse :) It could checkstop.
>
> Not sure if that would be so bad, it would at least give us a clear
> indicator that something is wrong instead of silently corrupting data.
>
> > It's also my understanding that on ARM v7 and above, it's technically
> > forbidden to map the same physical page with both cached and non-cached
> > mappings, since the cached one could prefetch (or speculatively load),
> > thus creating collisions and inconsistencies. Am I wrong here ?
>
> No, but you are answering the wrong question.
>
> See, we don't want to have different mappings of cached and non-cached on
> the CPU, but rather want to know if a snooped DMA from the PCIe counts
> as cached access as well.
>
> As far as I know on x86 it doesn't, so when you have an un-cached page
> you can still access it with a snooping DMA read/write operation and
> don't cause trouble.
>

I think it is the other way around. The question is, on an otherwise
cache coherent device, whether the NoSnoop attribute set by the GPU
propagates all the way to the bus so that it bypasses the caches.

On x86, we can tolerate it if this is not the case, since uncached memory
accesses by the CPU snoop the caches as well.

On other architectures, uncached accesses go straight to main memory,
so if the device wrote anything to the caches we won't see it.

So to use this optimization, you have to either be 100% sure that
NoSnoop is implemented correctly, or have an x86 CPU.
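
To make the four combinations above concrete, here is a toy model in
plain C. It is only a sketch of the reasoning, not kernel code: the
struct, the function names and the single-cache-line "system" are all
made up for illustration, and real hardware is of course far more
involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy system: one CPU cache line backed by one word of main memory.
 * All names here are hypothetical and exist only for this sketch. */
struct toy_system {
	int  memory;           /* contents of main memory */
	int  cache;            /* contents of the CPU cache line */
	bool cache_valid;      /* cache line currently holds data */
	bool nosnoop_honored;  /* NoSnoop really bypasses the caches */
	bool uncached_snoops;  /* x86-like: uncached CPU reads snoop the caches */
};

/* Device write that the GPU tagged as NoSnoop. */
static void toy_device_write_nosnoop(struct toy_system *s, int value)
{
	if (s->nosnoop_honored) {
		s->memory = value;       /* goes straight to main memory */
	} else {
		s->cache = value;        /* attribute was dropped: data lands in the cache */
		s->cache_valid = true;
	}
}

/* CPU read through an uncached mapping. */
static int toy_cpu_read_uncached(const struct toy_system *s)
{
	if (s->uncached_snoops && s->cache_valid)
		return s->cache;         /* x86-like behaviour */
	return s->memory;                /* other architectures: main memory only */
}

/* Returns true if the CPU observes the value the device wrote. */
static bool toy_cpu_sees_device_write(bool nosnoop_honored, bool uncached_snoops)
{
	struct toy_system s = {
		.memory = 0,
		.cache = 0,
		.cache_valid = false,
		.nosnoop_honored = nosnoop_honored,
		.uncached_snoops = uncached_snoops,
	};

	toy_device_write_nosnoop(&s, 42);
	return toy_cpu_read_uncached(&s) == 42;
}

/* Walks the four combinations discussed above. */
void toy_check_all_cases(void)
{
	assert(toy_cpu_sees_device_write(true, true));    /* NoSnoop works, x86-like CPU */
	assert(toy_cpu_sees_device_write(true, false));   /* NoSnoop works, other CPU */
	assert(toy_cpu_sees_device_write(false, true));   /* NoSnoop broken, x86-like CPU */
	assert(!toy_cpu_sees_device_write(false, false)); /* NoSnoop broken, other CPU: stale read */
}
```

Only the last combination goes wrong in this model, which is the case
the optimization has to rule out.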

> > The old hack of using non-cached mapping to avoid snoop cost in AGP and
> > others is just that ... an ugly and horrible hack that should have
> > never eventuated, when the search for performance pushes HW people into
> > utter insanity :)
>
> Well I agree that un-cached system memory makes things much more
> complicated for a questionable gain.
>
> But fact is we now have to deal with the mess, so no point in
> complaining about it too much :)
>

Indeed. I wonder if we should just disable it altogether unless CONFIG_X86=y.
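
Roughly what I have in mind, as a standalone sketch rather than a patch
against TTM: the enum, the macro and the helper below are hypothetical
names invented for this mail, and the only real symbol is CONFIG_X86.
The idea is simply that a request for a non-cached system RAM mapping
gets downgraded to cached unless we are building for x86.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical caching modes, named for this sketch only. */
enum toy_caching { TOY_CACHED, TOY_WRITE_COMBINED, TOY_UNCACHED };

/* Compile-time policy: non-cached system RAM only when CONFIG_X86 is set. */
#ifdef CONFIG_X86
#define TOY_ALLOW_NONCACHED_SYSRAM true
#else
#define TOY_ALLOW_NONCACHED_SYSRAM false
#endif

/* Pick the caching mode actually used for a system RAM mapping.
 * The policy is passed in as a parameter so both outcomes can be exercised. */
static enum toy_caching toy_sysram_caching(enum toy_caching requested,
					   bool allow_noncached)
{
	assert(requested >= TOY_CACHED && requested <= TOY_UNCACHED);

	return allow_noncached ? requested : TOY_CACHED;
}

/* Convenience wrapper that applies the compile-time policy. */
enum toy_caching toy_sysram_caching_default(enum toy_caching requested)
{
	return toy_sysram_caching(requested, TOY_ALLOW_NONCACHED_SYSRAM);
}
```

Whether write-combined should be downgraded along with uncached is an
assumption of this sketch, not something settled in this thread.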