Re: [PATCH 01/12] dma-buf: add dynamic caching of sg_table

From: Daniel Vetter
Date: Wed May 22 2019 - 14:33:56 EST


On Wed, May 22, 2019 at 7:28 PM Christian KÃnig
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 22.05.19 um 18:17 schrieb Sumit Semwal:
> > Hi Christian,
> >
> > On Sat, 27 Apr 2019 at 05:31, Liam Mark <lmark@xxxxxxxxxxxxxx> wrote:
> >> On Tue, 16 Apr 2019, Christian KÃnig wrote:
> >>
> >>> To allow a smooth transition from pinning buffer objects to dynamic
> >>> invalidation we first start to cache the sg_table for an attachment
> >>> unless the driver explicitly says to not do so.
> >>>
> >>> ---
> >>> drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++
> >>> include/linux/dma-buf.h | 11 +++++++++++
> >>> 2 files changed, 35 insertions(+)
> >>>
> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>> index 7c858020d14b..65161a82d4d5 100644
> >>> --- a/drivers/dma-buf/dma-buf.c
> >>> +++ b/drivers/dma-buf/dma-buf.c
> >>> @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >>> list_add(&attach->node, &dmabuf->attachments);
> >>>
> >>> mutex_unlock(&dmabuf->lock);
> >>> +
> >>> + if (!dmabuf->ops->dynamic_sgt_mapping) {
> >>> + struct sg_table *sgt;
> >>> +
> >>> + sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> >>> + if (!sgt)
> >>> + sgt = ERR_PTR(-ENOMEM);
> >>> + if (IS_ERR(sgt)) {
> >>> + dma_buf_detach(dmabuf, attach);
> >>> + return ERR_CAST(sgt);
> >>> + }
> >>> + attach->sgt = sgt;
> >>> + }
> >>> +
> >>> return attach;
> >>>
> >>> err_attach:
> >>> @@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
> >>> if (WARN_ON(!dmabuf || !attach))
> >>> return;
> >>>
> >>> + if (attach->sgt)
> >>> + dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
> >>> + DMA_BIDIRECTIONAL);
> >>> +
> >>> mutex_lock(&dmabuf->lock);
> >>> list_del(&attach->node);
> >>> if (dmabuf->ops->detach)
> >>> @@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> >>> if (WARN_ON(!attach || !attach->dmabuf))
> >>> return ERR_PTR(-EINVAL);
> >>>
> >>> + if (attach->sgt)
> >>> + return attach->sgt;
> >>> +
> >> I am concerned by this change to make caching the sg_table the default
> >> behavior as this will result in the exporter's map_dma_buf/unmap_dma_buf
> >> calls are no longer being called in
> >> dma_buf_map_attachment/dma_buf_unmap_attachment.
> > Probably this concern from Liam got lost between versions of your
> > patches; could we please request a reply to these points here?
>
> Sorry I indeed never got this mail, but this is actually not an issue
> because Daniel had similar concerns and we didn't made this the default
> in the final version.
>
> >> This seems concerning to me as it appears to ignore the cache maintenance
> >> aspect of the map_dma_buf/unmap_dma_buf calls.
> >> For example won't this potentially cause issues for clients of ION.
> >>
> >> If we had the following
> >> - #1 dma_buf_attach coherent_device
> >> - #2 dma_buf attach non_coherent_device
> >> - #3 dma_buf_map_attachment non_coherent_device
> >> - #4 non_coherent_device writes to buffer
> >> - #5 dma_buf_unmap_attachment non_coherent_device
> >> - #6 dma_buf_map_attachment coherent_device
> >> - #7 coherent_device reads buffer
> >> - #8 dma_buf_unmap_attachment coherent_device
> >>
> >> There wouldn't be any CMO at step #5 anymore (specifically no invalidate)
> >> so now at step #7 the coherent_device could read a stale cache line.
> >>
> >> Also, now by default dma_buf_unmap_attachment no longer removes the
> >> mappings from the iommu, so now by default dma_buf_unmap_attachment is not
> >> doing what I would expect and clients are losing the potential sandboxing
> >> benefits of removing the mappings.
> >> Shouldn't this caching behavior be something that clients opt into instead
> >> of being the default?
>
> Well, it seems you are making incorrect assumptions about the cache
> maintenance of DMA-buf here.
>
> At least for all DRM devices I'm aware of mapping/unmapping an
> attachment does *NOT* have any cache maintenance implications.
>
> E.g. the use case you describe above would certainly fail with amdgpu,
> radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the
> exporter from reading/writing to that buffer (just the opposite actually).
>
> All of them assume perfectly coherent access to the underlying memory.
> As far as I know there is no documented cache maintenance requirements
> for DMA-buf.

I think it is documented. It's just that on x86, we ignore that
because the dma-api pretends there's never a need for cache flushing
on x86, and that everything snoops the cpu caches. Which isn't true
since over 20 ago when AGP happened. The actual rules for x86 dma-buf
are very much ad-hoc (and we occasionally reapply some duct-tape when
cacheline noise shows up somewhere).

I've just filed this away as another instance of the dma-api not
fitting gpus, and I think giving recent discussions that won't improve
anytime soon. So we're stuck with essentially undefined dma-buf
behaviour.
-Daniel

> The IOMMU concern on the other hand is certainly valid and I perfectly
> agree that keeping the mapping time as short as possible is desirable.
>
> Regards,
> Christian.
>
> >> Liam
> >>
> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> >> a Linux Foundation Collaborative Project
> >>
> > Best,
> > Sumit.
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch