Re: [PATCH 3/3] drm: omapdrm: Do no allocate non-scanout GEMs through DMM/TILER

From: Ivaylo Dimitrov
Date: Mon Oct 31 2022 - 03:05:18 EST


HI Nikolaus,

On 31.10.22 г. 0:08 ч., H. Nikolaus Schaller wrote:
Hi Ivaylo,

it took a while until I found time to test newer kernels (mainline + Letux additions)
on the OMAP5 Pyra but unfortunately I did not get screen display for v6.1. Even worse,
the console was flooded by

Could you elaborate on that - do you have anything on the display (during boot or dunno). Do you have simplefb enabled, so boot log to be visible on the display? Is that wayland you are trying to run? Do you have PVR driver enabled? Did you try to boot vanilla kernel?


[ 39.419846] WARNING: CPU: 0 PID: 3673 at drivers/bus/omap_l3_noc.c:139 l3_interrupt_handler+0x23c/0x330
[ 39.429914] 44000000.l3-noc:L3 Custom Error: MASTER MPU TARGET GPMC (Idle): Data Access in Supervisor mode during Functional access
...


I have no idea what that error is supposed to mean. @Tony?

making the system unuseable.

After doing some manual bisect by installing different kernel versions on the boot SD card,
I was able to identify that it crept in between v5.18 and v5.19-rc1. A git bisect on this
range (adding Letux patches on top of each bisect base) did reveal this patch as the first bad one.

After reverting it seems as if I can use any v5.19 .. v6.1-rc2 kernel without issues.

Now I wonder why this patch breaks my system?


A wild guess - omap5 has some cache issues (as is visible from 7cb0d6c17b96b8bf3c25de2dfde4fdeb9191f4c3), which lead to the above. Before the patch *all* access to the BO backing memory was done through TILER/DMM, mitigating the issue. After the patch, whoever tries to render to non-scanout buffer is doing it directly to the memory, causing the issue.

Another possibility - someone assumes that memory is always linear, which is true when it is accessed through DMM, but it is not after the patch. Do you have my "drm: pvrsgx: dmabuf import - Do not assume scatterlist memory is contiguous" patch in your PVR driver? Maybe there is another driver that lacks similar patch.

Regards,
Ivo

BR and thanks,
Nikolaus


Am 19.01.2022 um 11:23 schrieb Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>:

On devices with DMM, all allocations are done through either DMM or TILER.
DMM/TILER being a limited resource means that such allocations will start
to fail before actual free memory is exhausted. What is even worse is that
with time DMM/TILER space gets fragmented to the point that even if we have
enough free DMM/TILER space and free memory, allocation fails because there
is no big enough free block in DMM/TILER space.

Such failures can be easily observed with OMAP xorg DDX, for example -
starting few GUI applications (so buffers for their windows are allocated)
and then rotating landscape<->portrait while closing and opening new
windows soon results in allocation failures.

Fix that by mapping buffers through DMM/TILER only when really needed,
like, for scanout buffers.

Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>
---
drivers/gpu/drm/omapdrm/omap_gem.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
index 41c1a6d..cf57179 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -821,10 +821,12 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr)
if (ret)
goto fail;

- if (priv->has_dmm) {
- ret = omap_gem_pin_tiler(obj);
- if (ret)
- goto fail;
+ if (omap_obj->flags & OMAP_BO_SCANOUT) {
+ if (priv->has_dmm) {
+ ret = omap_gem_pin_tiler(obj);
+ if (ret)
+ goto fail;
+ }
}
} else {
refcount_inc(&omap_obj->pin_cnt);
@@ -861,6 +863,8 @@ static void omap_gem_unpin_locked(struct drm_gem_object *obj)
kfree(omap_obj->sgt);
omap_obj->sgt = NULL;
}
+ if (!(omap_obj->flags & OMAP_BO_SCANOUT))
+ return;
if (priv->has_dmm) {
ret = tiler_unpin(omap_obj->block);
if (ret) {
--
1.9.1