Re: [PATCH 3/3] drm: omapdrm: Do no allocate non-scanout GEMs through DMM/TILER

From: H. Nikolaus Schaller
Date: Mon Oct 31 2022 - 03:44:36 EST


Hi Ivaylo,

> Am 31.10.2022 um 08:05 schrieb Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>:
>
> HI Nikolaus,
>
> On 31.10.22 г. 0:08 ч., H. Nikolaus Schaller wrote:
>> Hi Ivaylo,
>> it took a while until I found time to test newer kernels (mainline + Letux additions)
>> on the OMAP5 Pyra but unfortunately I did not get screen display for v6.1. Even worse,
>> the console was flooded by
>
> Could you elaborate on that - do you have anything on the display (during boot or dunno). Do you have simplefb enabled, so boot log to be visible on the display?

No bootlog enabled but I have some printk in the panel driver. It is initially enabled, then disabled and enabled again. Then the issues start...

> Is that wayland you are trying to run? Do you have PVR driver enabled? Did you try to boot vanilla kernel?

I have tested with Debian Stretch with standard Xorg with "omap" driver. PVR is not enabled. And without your patch everything is fine on all kernels since 4.something.

Vanilla kernel can not be booted on that machine - there is not even a device tree...

>
>> [ 39.419846] WARNING: CPU: 0 PID: 3673 at drivers/bus/omap_l3_noc.c:139 l3_interrupt_handler+0x23c/0x330
>> [ 39.429914] 44000000.l3-noc:L3 Custom Error: MASTER MPU TARGET GPMC (Idle): Data Access in Supervisor mode during Functional access
>> ...
>
> I have no idea what that error is supposed to mean. @Tony?

A coincidence is that the display is sometimes showing some artistic patterns. So maybe DMA accesses non-existing memory?

>
>> making the system unuseable.
>> After doing some manual bisect by installing different kernel versions on the boot SD card,
>> I was able to identify that it crept in between v5.18 and v5.19-rc1. A git bisect on this
>> range (adding Letux patches on top of each bisect base) did reveal this patch as the first bad one.
>> After reverting it seems as if I can use any v5.19 .. v6.1-rc2 kernel without issues.
>> Now I wonder why this patch breaks my system?
>
> A wild guess - omap5 has some cache issues (as is visible from 7cb0d6c17b96b8bf3c25de2dfde4fdeb9191f4c3), which lead to the above. Before the patch *all* access to the BO backing memory was done through TILER/DMM, mitigating the issue. After the patch, whoever tries to render to non-scanout buffer is doing it directly to the memory, causing the issue.
>
> Another possibility - someone assumes that memory is always linear, which is true when it is accessed through DMM, but it is not after the patch. Do you have my "drm: pvrsgx: dmabuf import - Do not assume scatterlist memory is contiguous" patch in your PVR driver? Maybe there is another driver that lacks similar patch.

Yes, it is included.

Best regards,
Nikolaus

>
> Regards,
> Ivo
>
>> BR and thanks,
>> Nikolaus
>>> Am 19.01.2022 um 11:23 schrieb Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>:
>>>
>>> On devices with DMM, all allocations are done through either DMM or TILER.
>>> DMM/TILER being a limited resource means that such allocations will start
>>> to fail before actual free memory is exhausted. What is even worse is that
>>> with time DMM/TILER space gets fragmented to the point that even if we have
>>> enough free DMM/TILER space and free memory, allocation fails because there
>>> is no big enough free block in DMM/TILER space.
>>>
>>> Such failures can be easily observed with OMAP xorg DDX, for example -
>>> starting few GUI applications (so buffers for their windows are allocated)
>>> and then rotating landscape<->portrait while closing and opening new
>>> windows soon results in allocation failures.
>>>
>>> Fix that by mapping buffers through DMM/TILER only when really needed,
>>> like, for scanout buffers.
>>>
>>> Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>
>>> ---
>>> drivers/gpu/drm/omapdrm/omap_gem.c | 12 ++++++++----
>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
>>> index 41c1a6d..cf57179 100644
>>> --- a/drivers/gpu/drm/omapdrm/omap_gem.c
>>> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c
>>> @@ -821,10 +821,12 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr)
>>> if (ret)
>>> goto fail;
>>>
>>> - if (priv->has_dmm) {
>>> - ret = omap_gem_pin_tiler(obj);
>>> - if (ret)
>>> - goto fail;
>>> + if (omap_obj->flags & OMAP_BO_SCANOUT) {
>>> + if (priv->has_dmm) {
>>> + ret = omap_gem_pin_tiler(obj);
>>> + if (ret)
>>> + goto fail;
>>> + }
>>> }
>>> } else {
>>> refcount_inc(&omap_obj->pin_cnt);
>>> @@ -861,6 +863,8 @@ static void omap_gem_unpin_locked(struct drm_gem_object *obj)
>>> kfree(omap_obj->sgt);
>>> omap_obj->sgt = NULL;
>>> }
>>> + if (!(omap_obj->flags & OMAP_BO_SCANOUT))
>>> + return;
>>> if (priv->has_dmm) {
>>> ret = tiler_unpin(omap_obj->block);
>>> if (ret) {
>>> --
>>> 1.9.1
>>>