Re: Change eats memory on my server

From: Thomas Zimmermann
Date: Mon Jan 18 2021 - 04:47:32 EST


Hi

Am 18.01.21 um 10:13 schrieb Eli Cohen:
On Mon, Jan 18, 2021 at 08:54:07AM +0100, Thomas Zimmermann wrote:
Hi

Am 18.01.21 um 08:43 schrieb Christian König:
Hi Eli,

have you already tried using kmemleak?

This sounds like a leak of memory allocated using kmalloc(), so kmemleak
should be able to catch it.

I have an idea what happens here. When the refcount is 0 in kmap, a new page
mapping for the BO is being established. But VRAM helpers unmap the previous
pages only on BO moves or frees; not in kunmap. So the old mapping might
still be around. I'll send out a test patch later today.


Great! Looking forward to test it.

Here's the patch against the latest DRM tree. v5.11-rc3 should work as well.

I was able to reproduce the memory leak locally and found that the patch fixes it. Please give it a try.

Best regards
Thomas


Best regards
Thomas


Regards,
Christian.

Am 17.01.21 um 06:08 schrieb Eli Cohen:
On Fri, Jan 15, 2021 at 10:03:50AM +0100, Thomas Zimmermann wrote:
Could you please double-check that 3fb91f56aea4 ("drm/udl: Retrieve USB
device from struct drm_device.dev") works correctly
Checked again, it does not seem to leak.

and that 823efa922102
("drm/cma-helper: Remove empty drm_gem_cma_prime_vunmap()") is broken?

Yes, this one leaks, as does the one preceding it:

1086db71a1db ("drm/vram-helper: Remove invariant parameters from
internal kmap function")
For one of the broken commits, could you please send us the output of

   dmesg | grep -i drm

after most of the memory got leaked?

I ran the following script in the shell:

while true; do cat /proc/meminfo | grep MemFree:; sleep 5; done

and this is what I saw before I got disconnected from the shell:

MemFree:          148208 kB
MemFree:          148304 kB
MemFree:          146660 kB
Connection to nps-server-24 closed by remote host.
Connection to nps-server-24 closed.


I also mointored the output of dmesg | grep -i drm
The last output I was able to save on disk is this:

[   46.140720] ast 0000:03:00.0: [drm] Using P2A bridge for configuration
[   46.140737] ast 0000:03:00.0: [drm] AST 2500 detected
[   46.140754] ast 0000:03:00.0: [drm] Analog VGA only
[   46.140772] ast 0000:03:00.0: [drm] dram MCLK=800 Mhz type=7
bus_width=16
[   46.153553] [drm] Initialized ast 0.1.0 20120228 for 0000:03:00.0
on minor 0
[   46.165097] fbcon: astdrmfb (fb0) is primary device
[   46.391381] ast 0000:03:00.0: [drm] fb0: astdrmfb frame buffer device
[   56.097697] systemd[1]: Starting Load Kernel Module drm...
[   56.343556] systemd[1]: modprobe@drm.service: Succeeded.
[   56.350382] systemd[1]: Finished Load Kernel Module drm.
[13319.469462] [   2683] 70889  2683    55586        0    73728
138             0 tdrm
[13320.658386] [   2683] 70889  2683    55586        0    73728
138             0 tdrm
[13321.800970] [   2683] 70889  2683    55586        0    73728
138             0 tdrm

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer





--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
From e8462600662621db47bccf8174bf683513aa7102 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann@xxxxxxx>
Date: Mon, 18 Jan 2021 09:58:07 +0100
Subject: [PATCH] drm/vram-helper: Reuse existing page mappings in vmap

For performance, BO page mappings can stay in place even if the
map counter has returned to 0. In these cases, the existing page
mapping has to be reused by the next vmap operation. Otherwise
a new mapping would be installed and the old mapping's pages leak.

Fix the issue by reusing existing page mappings for vmap operations.

Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
Reported-by: Eli Cohen <elic@xxxxxxxxxx>
---
drivers/gpu/drm/drm_gem_vram_helper.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index 02ca22e90290..a57790b0d985 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -387,9 +387,16 @@ static int drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo,
if (gbo->vmap_use_count > 0)
goto out;

- ret = ttm_bo_vmap(&gbo->bo, &gbo->map);
- if (ret)
- return ret;
+ /*
+ * VRAM helpers unmap the BO only on demand. So the previous
+ * page mapping might still be arround. Only vmap if the there's
+ * no mapping present.
+ */
+ if (dma_buf_map_is_null(&gbo->map)) {
+ ret = ttm_bo_vmap(&gbo->bo, &gbo->map);
+ if (ret)
+ return ret;
+ }

out:
++gbo->vmap_use_count;
@@ -577,6 +584,7 @@ static void drm_gem_vram_bo_driver_move_notify(struct drm_gem_vram_object *gbo,
return;

ttm_bo_vunmap(bo, &gbo->map);
+ dma_buf_map_clear(&gbo->map); /* explicitly clear mapping for next vmap call */
}

static int drm_gem_vram_bo_driver_move(struct drm_gem_vram_object *gbo,
--
2.29.2

Attachment: OpenPGP_signature
Description: OpenPGP digital signature