Re: [PATCH v4 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
From: Mikhail Rudenko
Date: Wed Mar 05 2025 - 02:54:01 EST
Hi Nicolas,
On 2025-03-03 at 10:24 -05, Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> Hi Mikhail,
>
> Le lundi 03 mars 2025 à 14:40 +0300, Mikhail Rudenko a écrit :
>> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
>> commit 129134e5415d ("media: media/v4l2: remove
>> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
>> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
>> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
>> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
>> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
>> no-ops, making cache maintenance for non-coherent dmabufs allocated
>> by
>> dma-contig impossible.
>>
>> Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
>> {flush,invalidate}_kernel_vmap_range calls to
>> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
>> buffers.
>>
>> Fixes: c0acf9cfeee0 ("media: videobuf2: handle
>> V4L2_MEMORY_FLAG_NON_COHERENT flag")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Mikhail Rudenko <mike.rudenko@xxxxxxxxx>
>> ---
>> .../media/common/videobuf2/videobuf2-dma-contig.c | 22
>> ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index
>> a13ec569c82f6da2d977222b94af32e74c6c6c82..d41095fe5bd21faf815d6b035d7
>> bc888a84a95d5 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -427,6 +427,17 @@ static int
>> vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>> enum dma_data_direction
>> direction)
>> {
>> + struct vb2_dc_buf *buf = dbuf->priv;
>> + struct sg_table *sgt = buf->dma_sgt;
>> +
>> + if (!buf->non_coherent_mem)
>> + return 0;
>> +
>> + if (buf->vaddr)
>> + invalidate_kernel_vmap_range(buf->vaddr, buf->size);
>
> What would make me a lot more confortable with this change is if you
> enable kernel mappings for one test. This will ensure you cover the
> call to "invalidate" in your testing. I'd like to know about the
> performance impact. With this implementation it should be identical to
> the VB2 one.
I'll enable kernel mappings and rerun my tests later this week.
> What I was trying to say in previous comments, is that my impression is
> that we can skip this for CPU read access, since we don't guaranty
> concurrent access anyway. Both address space can keep their cache in
> that case. Though, I see RKISP does not use kernel mapping plus I'm not
> reporting a bug, but checking if we should leave a comment for possible
> users of kernel mapping in the future ?
I trust Tomasz here, I'd wait for his comment on v4.
>> +
>> + dma_sync_sgtable_for_cpu(buf->dev, sgt, direction);
>> +
>> return 0;
>> }
>>
>> @@ -434,6 +445,17 @@ static int
>> vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>> enum dma_data_direction direction)
>> {
>> + struct vb2_dc_buf *buf = dbuf->priv;
>> + struct sg_table *sgt = buf->dma_sgt;
>> +
>> + if (!buf->non_coherent_mem)
>> + return 0;
>> +
>> + if (buf->vaddr)
>> + flush_kernel_vmap_range(buf->vaddr, buf->size);
>> +
>> + dma_sync_sgtable_for_device(buf->dev, sgt, direction);
>> +
>> return 0;
>> }
>>
>>
--
Best regards,
Mikhail Rudenko