Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

From: Oleksandr Andrushchenko
Date: Tue Jan 22 2019 - 05:28:09 EST


Hello, Julien!

On 1/21/19 7:09 PM, Julien Grall wrote:
> Hello,
>
> On 21/01/2019 12:43, Oleksandr Andrushchenko wrote:
>> On 1/18/19 1:43 PM, Julien Grall wrote:
>>> On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:
>>>> On 1/17/19 11:18 AM, Christoph Hellwig wrote:
>>>>> On Wed, Jan 16, 2019 at 06:43:29AM +0000, Oleksandr Andrushchenko wrote:
>>>>>>> This whole issue keeps getting more and more confusing.
>>>>>> Well, I don't really do DMA here, but instead the buffers in
>>>>>> question are shared with another Xen domain, so effectively it
>>>>>> could be thought of as some sort of DMA here, where the "device" is
>>>>>> that remote domain. If the buffers are not flushed then the
>>>>>> remote part sees some inconsistency, which in my case results
>>>>>> in artifacts on screen while displaying the buffers.
>>>>>> When buffers are allocated via the DMA API there are no artifacts;
>>>>>> if buffers are allocated with shmem + DMA mapping there are no
>>>>>> artifacts either.
>>>>>> The only offending use-case is when I use shmem backed buffers,
>>>>>> but do not flush them.
>>>>> The right answer would be to implement cache maintenance hooks for
>>>>> this case in the Xen arch code. These would basically look the same
>>>>> as the low-level cache maintenance used by the DMA ops, but without
>>>>> going through the DMA mapping layer; in fact they should probably
>>>>> reuse the same low-level assembly routines.
>>>>>
>>>>> I don't think this is the first usage of such Xen buffer sharing, so
>>>>> what do the other users do?
>>>> I'll have to get even deeper into it. Initially I
>>>> looked at the code, but didn't find anything useful.
>>>> Or maybe I have just overlooked obvious things there.
>>> From Xen on Arm ABI:
>>>
>>> "All memory which is shared with other entities in the system
>>> (including the hypervisor and other guests) must reside in memory
>>> which is mapped as Normal Inner Write-Back Outer Write-Back
>>> Inner-Shareable.
>>> This applies to:
>>>    - hypercall arguments passed via a pointer to guest memory.
>>>    - memory shared via the grant table mechanism (including PV I/O
>>>      rings etc).
>>>    - memory shared with the hypervisor (struct shared_info, struct
>>>      vcpu_info, the grant table, etc).
>>> "
>>>
>>> So you should not need any cache maintenance here. Can you provide
>>> more details on the memory attribute you use for memory shared in both
>>> the backend and frontend?
>>>
>> It takes quite some time to collect this (because many components are
>> involved in the use-case), but for now the pages in the guest domain are:
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> So that's the attribute for the page mapped in the frontend, right?
> How about the backend side?
Please see below
>
> Also, could that page be handed to the graphic card correctly?
Yes, if we use zero-copying. But please see below
> If so, is your graphic card coherent?
Yes, it is
>
> If one of your components is mapping with non-cacheable attributes
> then you are already not following the Xen Arm ABI. In that case, we
> would need to discuss how to extend the ABI.
>
> Cheers,
>
Well, I didn't get the attributes of pages at the backend side, but IMO
those do not matter in my use-case (for simplicity I am not using
zero-copying at the backend side):

1. Frontend device allocates display buffer pages which come from shmem
and have these attributes:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)

2. Frontend grants references to these pages and shares those with the
backend.

3. Backend is a user-space application (Weston client), so it uses the
gntdev kernel driver to mmap the pages. The pages which are used by
gntdev are those coming from the Xen balloon driver, and I believe they
are all normal memory and shouldn't be non-cached.

4. Once the frontend starts displaying, it flips the buffers and the
backend does *memcpy* from the frontend-backend shared buffer into
Weston's buffer. This means no HW at the backend side touches the shared
buffer.

5. I can see a distorted picture.
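
For reference, the attribute set listed in step 1 can be decoded with a
small sketch like the one below. The bit positions follow the arm64
definitions in the Linux kernel (arch/arm64/include/asm/pgtable-hwdef.h).
The value MT_NORMAL = 4 is an assumption based on kernels of roughly this
period; it is a kernel-internal MAIR_EL1 index and differs between
versions. The helper names (pte_attrindx, describe_pte) are hypothetical.
Note that the index only selects a MAIR_EL1 slot: whether that slot is
Write-Back cacheable depends on how the kernel programs MAIR_EL1.

```python
# Bit layout of an arm64 stage-1 PTE, following the Linux definitions in
# arch/arm64/include/asm/pgtable-hwdef.h.
PTE_RDONLY = 1 << 7    # AP[2]: read-only when set
PTE_SHARED = 3 << 8    # SH[1:0] = 0b11, Inner Shareable
PTE_AF = 1 << 10       # Access Flag
PTE_PXN = 1 << 53      # Privileged eXecute-Never
PTE_UXN = 1 << 54      # Unprivileged eXecute-Never

# ASSUMPTION: MAIR_EL1 index used for MT_NORMAL; kernel-version dependent.
MT_NORMAL = 4

SHAREABILITY = {
    0b00: "non-shareable",
    0b10: "outer-shareable",
    0b11: "inner-shareable",
}


def pte_attrindx(index):
    """Mirror of the kernel's PTE_ATTRINDX(): AttrIndx lives in bits [4:2]."""
    return index << 2


def describe_pte(pte):
    """Decode the attribute bits discussed in this thread (hypothetical helper)."""
    return {
        "writable": not (pte & PTE_RDONLY),
        "shareability": SHAREABILITY.get((pte >> 8) & 0b11, "reserved"),
        "accessed": bool(pte & PTE_AF),
        "attr_index": (pte >> 2) & 0b111,
        "pxn": bool(pte & PTE_PXN),
        "uxn": bool(pte & PTE_UXN),
    }


# !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN + PTE_ATTRINDX(MT_NORMAL)
frontend_pte = PTE_PXN | PTE_SHARED | PTE_AF | PTE_UXN | pte_attrindx(MT_NORMAL)
info = describe_pte(frontend_pte)
print(info)
```

If the MAIR_EL1 slot selected by attr_index is programmed as Normal
Write-Back, this combination would match what the Xen on Arm ABI quoted
above asks for on the frontend side.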

Previously I used a setup with zero-copying, where the picture becomes
more complicated in terms of buffers and how those are used by the
backend, but anyway it seems that even the very basic scenario with
memory copying doesn't work for me.

Using the DMA API on the frontend's side does help - no artifacts are seen.
This is why I'm thinking that this is related to the frontend/kernel side
rather than to the backend side. This is also why I'm thinking this is
related to caches and trying to figure out what can be done here instead
of using the DMA API.
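
The suspicion above can be illustrated with a toy user-space model. This
is only a sketch of the hypothesis, not a claim about what the hardware
actually does: with both sides mapped Normal Write-Back Inner-Shareable,
as the Xen on Arm ABI requires, the hardware is expected to keep the
views coherent without any explicit maintenance. The class and function
names (ToyWriteBackCache, observer_copy) are hypothetical.

```python
class ToyWriteBackCache:
    """Toy model: writes stay in a private cache until cleaned to memory."""

    def __init__(self, memory):
        self.memory = memory   # shared backing store (bytearray)
        self.dirty = {}        # offset -> byte value not yet written back

    def write(self, offset, data):
        # Writes land in the cache only; memory is untouched for now.
        for i, value in enumerate(data):
            self.dirty[offset + i] = value

    def clean(self):
        # Write all dirty bytes back to memory (a "flush" in this thread's terms).
        for offset, value in self.dirty.items():
            self.memory[offset] = value
        self.dirty.clear()


def observer_copy(memory, offset, length):
    """An observer that reads memory directly, bypassing the writer's cache."""
    return bytes(memory[offset:offset + length])


memory = bytearray(8)                    # stands in for the shared display buffer
frontend = ToyWriteBackCache(memory)

frontend.write(0, b"FRAME001")           # frontend renders a frame
stale = observer_copy(memory, 0, 8)      # copy taken before any clean
frontend.clean()                         # analogous to what the DMA API path does
fresh = observer_copy(memory, 0, 8)      # copy taken after the clean

print(stale, fresh)
```

In this model the first copy still holds the old buffer contents and only
the second one holds the rendered frame, which is the kind of behavior
that would explain artifacts disappearing once the DMA API is used.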

Thank you,
Oleksandr