Re: [PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent device

From: Sui Jingfeng
Date: Wed Jun 21 2023 - 11:55:00 EST


Hi,

On 2023/6/21 23:33, Lucas Stach wrote:
On Wednesday, 21.06.2023 at 23:00 +0800, Sui Jingfeng wrote:
On 2023/6/21 18:00, Lucas Stach wrote:
static inline enum dma_data_direction etnaviv_op_to_dma_dir(u32 op)
@@ -369,6 +381,7 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
{
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct drm_device *dev = obj->dev;
+ struct etnaviv_drm_private *priv = dev->dev_private;
bool write = !!(op & ETNA_PREP_WRITE);
int ret;
@@ -395,7 +408,7 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
return ret == 0 ? -ETIMEDOUT : ret;
}
- if (etnaviv_obj->flags & ETNA_BO_CACHED) {
+ if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
Why do you need this? Isn't dma_sync_sgtable_for_cpu a no-op on your
platform when the device is coherent?

I need this to show that our hardware is truly dma-coherent!

I have tested that the driver still works like a charm without adding
the '!priv->dma_coherent' check.


But I'm expressing the idea that a truly dma-coherent device just doesn't
need this.

I don't care if it is a no-op.

It is now, but it may not be in the future.
And that's exactly the point. If it ever turns into something more than
a no-op on your platform, then that's probably for a good reason and a
driver should not assume that it knows better than the DMA API
implementation what is or is not required on a specific platform to
make DMA work.
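
To illustrate the point, here is a minimal userspace sketch, not kernel
code. Every mock_* name is hypothetical and only models the idea: the
driver always calls the sync helper for cached BOs, and the DMA layer
decides whether that means cache maintenance, a bounce-buffer copy, or
nothing at all.

```c
/* Userspace model only, not kernel code. Every mock_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct mock_device {
	bool dma_coherent;  /* device snoops CPU caches */
	bool needs_bounce;  /* stands in for SWIOTLB-style bounce buffering */
	int cache_ops;      /* simulated cache maintenance operations */
	int bounce_copies;  /* simulated bounce-buffer copies */
};

/*
 * Plays the role of a sync-for-CPU call: the DMA layer decides what work
 * the platform needs. The driver does not.
 */
static void mock_dma_sync_for_cpu(struct mock_device *dev)
{
	if (dev->needs_bounce)
		dev->bounce_copies++;   /* copy data back from the bounce buffer */
	if (!dev->dma_coherent)
		dev->cache_ops++;       /* invalidate CPU caches */
}

/* Driver-side prep: cached BOs are always synced, with no coherency check. */
static void mock_cpu_prep(struct mock_device *dev, bool bo_cached)
{
	if (bo_cached)
		mock_dma_sync_for_cpu(dev);
}

/* Walks through a coherent device and a coherent device that still bounces. */
static void mock_demo(void)
{
	struct mock_device coherent = { .dma_coherent = true };
	struct mock_device bounced = { .dma_coherent = true, .needs_bounce = true };

	/* Coherent, no bounce buffer: the sync call does nothing. */
	mock_cpu_prep(&coherent, true);
	assert(coherent.cache_ops == 0 && coherent.bounce_copies == 0);

	/* Coherent but bounced: skipping the sync would lose the copy-back. */
	mock_cpu_prep(&bounced, true);
	assert(bounced.cache_ops == 0 && bounced.bounce_copies == 1);
}
```

In this model a driver-side '!dma_coherent' shortcut would skip the
copy-back in the second case, which is the kind of platform detail the
driver cannot see from where it sits.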

Even if it is, the overhead of the function call itself is still incurred.

cpu_prep/fini aren't total fast paths, you already synchronized with
the GPU here, potentially waiting for jobs to finish, etc. If your
platform no-ops this then the function call will be in the noise.
Also, we want to try flushing the write buffer manually with the CPU.


Currently, we want absolute correctness in concept, not only in the
rendering results.
And if you want absolute correctness then calling dma_sync_sgtable_* is
the right thing to do, as it can do much more than just manage caches.

For our hardware, cached mappings don't need to call dma_sync_sgtable_*.

This is the right thing to do. The hardware already guarantees it for us.


We may only want to call it for WC-mapped BOs; please don't tangle all of this together.

We simply want to do the right thing.

Right now it also provides SWIOTLB translation if needed.

SWIOTLB introduces a bounce buffer, which slows down performance.

We don't need it. It should be avoided.

 I know you know everything. No sugar-coated bullets please.


Regards,
Lucas

--
Jingfeng