[SNIP]
>>> Userspace components can then of course tell the exporter what the
>>> importer needs, but validation that this is correct and doesn't
>>> crash the system must happen in the kernel.
>> What exactly do you mean by "scanout requires non-coherent memory"?
>> Does the scanout requestor always set the no-snoop PCI flag, so you get
>> garbage if some writes to memory are still stuck in the caches, or is
>> it some other requirement?
>
> Snooping the CPU caches introduces some extra latency, so what can
> happen is that the response to the PCIe read comes too late for the
> scanout. The result is an underflow and flickering whenever something is
> in the cache which needs to be flushed first.

Okay, that confirms my theory on why this is needed. So things don't
totally explode if you don't do it, but in order to guarantee access
latency you need to take the no-snoop path, which means your device
effectively gets dma-noncoherent.

> On the other hand, when we don't snoop the CPU caches we at least get
> garbage/stale data on the screen. That wouldn't be that bad, but the
> big problem is that we have also seen machine check exceptions when
> we don't snoop and the cache is dirty.

If you attach to the dma-buf with a struct device which is non-coherent,
it's the exporter's job to flush any dirty caches. Unfortunately the
caching of the dma-buf attachments in the DRM framework will get a bit
in the way here, so a DRM-specific flush might be needed. :/ Maybe
moving the whole buffer to an uncached sysmem location on first attach of
a non-coherent importer would be enough?

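
As a rough illustration of that last idea, here is a userspace toy model,
not kernel code. Every name in it (model_device, model_buffer,
model_flush_cpu_cache, model_attach) is made up for this sketch and does
not correspond to the real dma-buf or DRM API. It only assumes that the
exporter can tell at attach time whether the importing device is
coherent:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct model_device {
	const char *name;
	bool dma_coherent;	/* true if the device snoops the CPU caches */
};

enum model_placement { PLACEMENT_CACHED, PLACEMENT_UNCACHED };

/* Hypothetical stand-in for an exported buffer. */
struct model_buffer {
	enum model_placement placement;
	bool cpu_cache_dirty;		/* CPU writes not yet written back */
	unsigned int flush_count;	/* how often a flush was really needed */
};

/* Models writing back dirty CPU cache lines that cover the buffer. */
void model_flush_cpu_cache(struct model_buffer *buf)
{
	assert(buf != NULL);
	if (buf->cpu_cache_dirty) {
		buf->cpu_cache_dirty = false;
		buf->flush_count++;
	}
}

/*
 * Exporter-side attach hook in this model: the first non-coherent
 * importer makes the exporter flush the caches and move the buffer to
 * an uncached placement. Coherent importers change nothing, and later
 * non-coherent importers find the buffer already uncached.
 */
void model_attach(struct model_buffer *buf, const struct model_device *dev)
{
	assert(buf != NULL && dev != NULL);
	if (dev->dma_coherent || buf->placement == PLACEMENT_UNCACHED)
		return;

	model_flush_cpu_cache(buf);
	buf->placement = PLACEMENT_UNCACHED;
}
```

In this model, attaching a coherent device leaves a dirty, cached buffer
untouched, while the first attach of a non-coherent scanout device
flushes once and flips the placement. Whether a real exporter can do the
migration at that point is exactly the open question above.
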
> So this had better be coherent or you can crash the box. ARM seems to
> be really susceptible to this, x86 is fortunately much more graceful
> and I'm not sure about other architectures.

ARM really dislikes pagetable setups with different attributes pointing
to the same physical page, however you should be fine as long as all
cached aliases are properly flushed from the cache before access via a
different alias.

Regards,
Lucas