Re: [PATCH 3/3] drm: omapdrm: Do no allocate non-scanout GEMs through DMM/TILER

From: H. Nikolaus Schaller
Date: Mon Oct 31 2022 - 03:58:05 EST




> Am 31.10.2022 um 08:44 schrieb H. Nikolaus Schaller <hns@xxxxxxxxxxxxx>:
>
> Hi Ivaylo,
>
>> Am 31.10.2022 um 08:05 schrieb Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>:
>>
>> HI Nikolaus,
>>
>> On 31.10.22 г. 0:08 ч., H. Nikolaus Schaller wrote:
>>> Hi Ivaylo,
>>> it took a while until I found time to test newer kernels (mainline + Letux additions)
>>> on the OMAP5 Pyra but unfortunately I did not get screen display for v6.1. Even worse,
>>> the console was flooded by
>>
>> Could you elaborate on that - do you have anything on the display (during boot or dunno). Do you have simplefb enabled, so boot log to be visible on the display?
>
> No bootlog enabled but I have some printk in the panel driver. It is initially enabled, then disabled and enabled again. Then the issues start...
>
>> Is that wayland you are trying to run? Do you have PVR driver enabled? Did you try to boot vanilla kernel?
>
> I have tested with Debian Stretch with standard Xorg with "omap" driver. PVR is not enabled.

Have cross-checked: my setup uses the fbdev driver.

> And without your patch everything is fine on all kernels since 4.something.
>
> Vanilla kernel can not be booted on that machine - there is not even a device tree...
>
>>
>>> [ 39.419846] WARNING: CPU: 0 PID: 3673 at drivers/bus/omap_l3_noc.c:139 l3_interrupt_handler+0x23c/0x330
>>> [ 39.429914] 44000000.l3-noc:L3 Custom Error: MASTER MPU TARGET GPMC (Idle): Data Access in Supervisor mode during Functional access
>>> ...
>>
>> I have no idea what that error is supposed to mean. @Tony?
>
> A coincidence is that the display is sometimes showing some artistic patterns. So maybe DMA accesses non-existing memory?
>
>>
>>> making the system unuseable.
>>> After doing some manual bisect by installing different kernel versions on the boot SD card,
>>> I was able to identify that it crept in between v5.18 and v5.19-rc1. A git bisect on this
>>> range (adding Letux patches on top of each bisect base) did reveal this patch as the first bad one.
>>> After reverting it seems as if I can use any v5.19 .. v6.1-rc2 kernel without issues.
>>> Now I wonder why this patch breaks my system?
>>
>> A wild guess - omap5 has some cache issues (as is visible from 7cb0d6c17b96b8bf3c25de2dfde4fdeb9191f4c3), which lead to the above. Before the patch *all* access to the BO backing memory was done through TILER/DMM, mitigating the issue. After the patch, whoever tries to render to non-scanout buffer is doing it directly to the memory, causing the issue.
>>
>> Another possibility - someone assumes that memory is always linear, which is true when it is accessed through DMM, but it is not after the patch. Do you have my "drm: pvrsgx: dmabuf import - Do not assume scatterlist memory is contiguous" patch in your PVR driver? Maybe there is another driver that lacks similar patch.
>
> Yes, it is included.
>
> Best regards,
> Nikolaus
>
>>
>> Regards,
>> Ivo
>>
>>> BR and thanks,
>>> Nikolaus
>>>> Am 19.01.2022 um 11:23 schrieb Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>:
>>>>
>>>> On devices with DMM, all allocations are done through either DMM or TILER.
>>>> DMM/TILER being a limited resource means that such allocations will start
>>>> to fail before actual free memory is exhausted. What is even worse is that
>>>> with time DMM/TILER space gets fragmented to the point that even if we have
>>>> enough free DMM/TILER space and free memory, allocation fails because there
>>>> is no big enough free block in DMM/TILER space.
>>>>
>>>> Such failures can be easily observed with OMAP xorg DDX, for example -
>>>> starting few GUI applications (so buffers for their windows are allocated)
>>>> and then rotating landscape<->portrait while closing and opening new
>>>> windows soon results in allocation failures.
>>>>
>>>> Fix that by mapping buffers through DMM/TILER only when really needed,
>>>> like, for scanout buffers.
>>>>
>>>> Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@xxxxxxxxx>
>>>> ---
>>>> drivers/gpu/drm/omapdrm/omap_gem.c | 12 ++++++++----
>>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
>>>> index 41c1a6d..cf57179 100644
>>>> --- a/drivers/gpu/drm/omapdrm/omap_gem.c
>>>> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c
>>>> @@ -821,10 +821,12 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr)
>>>> if (ret)
>>>> goto fail;
>>>>
>>>> - if (priv->has_dmm) {
>>>> - ret = omap_gem_pin_tiler(obj);
>>>> - if (ret)
>>>> - goto fail;
>>>> + if (omap_obj->flags & OMAP_BO_SCANOUT) {
>>>> + if (priv->has_dmm) {
>>>> + ret = omap_gem_pin_tiler(obj);
>>>> + if (ret)
>>>> + goto fail;
>>>> + }
>>>> }
>>>> } else {
>>>> refcount_inc(&omap_obj->pin_cnt);
>>>> @@ -861,6 +863,8 @@ static void omap_gem_unpin_locked(struct drm_gem_object *obj)
>>>> kfree(omap_obj->sgt);
>>>> omap_obj->sgt = NULL;
>>>> }
>>>> + if (!(omap_obj->flags & OMAP_BO_SCANOUT))
>>>> + return;
>>>> if (priv->has_dmm) {
>>>> ret = tiler_unpin(omap_obj->block);
>>>> if (ret) {
>>>> --
>>>> 1.9.1
>>>>
>