Re: [PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent device

From: Lucas Stach
Date: Wed Jun 21 2023 - 13:55:44 EST


On Thursday, 22.06.2023 at 01:31 +0800, Sui Jingfeng wrote:
> Hi,
>
> On 2023/6/22 00:07, Lucas Stach wrote:
> > And as the HW guarantees it on your platform, your platform
> > implementation makes this function effectively a no-op. Skipping the
> > call to this function is breaking the DMA API abstraction, as now the
> > driver is second guessing the DMA API implementation. I really see no
> > reason to do this.
>
> It is the same reason you chose the word 'effectively', not 'definitely'.
>
> We don't want to waste the CPU's time running the
> dma_sync_sg_for_cpu() function:
>
>
> ```
>
> void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>             int nelems, enum dma_data_direction dir)
> {
>     const struct dma_map_ops *ops = get_dma_ops(dev);
>
>     BUG_ON(!valid_dma_direction(dir));
>     if (dma_map_direct(dev, ops))
>         dma_direct_sync_sg_for_cpu(dev, sg, nelems, dir);
>     else if (ops->sync_sg_for_cpu)
>         ops->sync_sg_for_cpu(dev, sg, nelems, dir);
>     debug_dma_sync_sg_for_cpu(dev, sg, nelems, dir);
> }
>
> ```
>
>
> compared to running this:
>
>
> ```
>
> int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
> {
>     struct drm_device *dev = obj->dev;
>     struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
>     struct etnaviv_drm_private *priv = dev->dev_private;
>
>     if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
>         /* fini without a prep is almost certainly a userspace error */
>         WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
>         dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
>             etnaviv_op_to_dma_dir(etnaviv_obj->last_cpu_prep_op));
>         etnaviv_obj->last_cpu_prep_op = 0;
>     }
>
>     return 0;
> }
>
> ```
>
My judgment as the maintainer of this driver is that the small CPU
overhead of calling this function is very well worth it, if the
alternative is breaking the DMA API abstractions.
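
To illustrate the layering, here is a minimal userspace model, not kernel
code. Every name in it (model_device, model_dma_sync_for_cpu,
model_driver_cpu_prep) is hypothetical and only stands in for the real DMA
API and driver. The assumption it encodes is the one discussed above: the
DMA layer, not the driver, decides whether cache maintenance is needed, so
on a coherent device the unconditional call costs little more than a branch.

```c
#include <stddef.h>

/* Hypothetical userspace model, not kernel code. */
enum model_dir { MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

struct model_device {
    int coherent;            /* 1 if hardware keeps CPU caches and DMA coherent */
    unsigned long cache_ops; /* counts simulated cache maintenance operations */
};

/* Stand-in for the DMA API: it alone decides whether any work is needed. */
static void model_dma_sync_for_cpu(struct model_device *dev, size_t nents,
                                   enum model_dir dir)
{
    (void)dir;               /* direction is irrelevant in this model */
    if (dev->coherent)
        return;              /* coherent device: nothing to do */
    dev->cache_ops += nents; /* one simulated cache operation per entry */
}

/* Stand-in for the driver: it calls the sync unconditionally and does
 * not look at dev->coherent itself. */
static void model_driver_cpu_prep(struct model_device *dev, size_t nents)
{
    model_dma_sync_for_cpu(dev, nents, MODEL_FROM_DEVICE);
}
```

Calling model_driver_cpu_prep() on a coherent model device leaves cache_ops
untouched, while a non-coherent one accumulates one operation per entry, and
the driver code is identical in both cases.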

>
> But this is acceptable, because we can drop the GEM_CPU_PREP and
> GEM_CPU_FINI ioctls entirely in userspace for cached buffers, as they
> are not needed at all for cached mappings on our platform.
>
And that statement isn't true either. The CPU_PREP/FINI ioctls also
provide fence synchronization between CPU and GPU. There are a few very
specific cases where skipping those ioctls is acceptable (mostly when
the userspace driver explicitly wants unsynchronized access), but in
most cases they are required for correctness.
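
As a rough sketch of why the ioctl still matters on coherent hardware, the
following userspace model separates the two jobs of a CPU_PREP-style call.
It is not the etnaviv implementation: model_bo, model_wait_fence,
model_sync_for_cpu and model_cpu_prep are hypothetical names, and the fence
is reduced to a single flag.

```c
#include <stdbool.h>

/* Hypothetical userspace model, not the etnaviv implementation. */
struct model_bo {
    bool gpu_busy;    /* an unsignaled GPU fence is attached to the buffer */
    bool cached;      /* the CPU mapping is cached */
    int fence_waits;  /* counts simulated fence waits */
    int cache_syncs;  /* counts simulated cache maintenance operations */
};

/* Job 1: wait until the GPU is done with the buffer. */
static void model_wait_fence(struct model_bo *bo)
{
    if (bo->gpu_busy) {
        bo->fence_waits++;
        bo->gpu_busy = false; /* pretend the GPU has finished */
    }
}

/* Job 2: cache maintenance, which the DMA layer skips when coherent. */
static void model_sync_for_cpu(struct model_bo *bo, bool dev_coherent)
{
    if (dev_coherent)
        return;
    bo->cache_syncs++;
}

/* CPU_PREP-style entry point: the fence wait happens regardless of
 * whether the device is coherent. */
static void model_cpu_prep(struct model_bo *bo, bool dev_coherent)
{
    model_wait_fence(bo);
    if (bo->cached)
        model_sync_for_cpu(bo, dev_coherent);
}
```

In this model a busy, cached buffer on a coherent device still records one
fence wait and zero cache syncs, so skipping the call would remove the only
synchronization point between CPU and GPU.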

Regards,
Lucas