Re: [PATCH v2] drm/atomic: Unref duplicated drm_atomic_state in drm_atomic_helper_resume()

From: Maarten Lankhorst
Date: Mon Oct 09 2017 - 07:57:12 EST


Op 09-10-17 om 08:46 schreef Jeffy Chen:
> Kmemleak reported memory leak after suspend and resume:
> unreferenced object 0xffffffc0e31d8880 (size 128):
> comm "bash", pid 181, jiffies 4294763583 (age 24.694s)
> hex dump (first 32 bytes):
> 01 00 00 00 00 00 00 00 00 20 a2 eb c0 ff ff ff ......... ......
> 01 00 00 00 00 00 00 00 80 87 1d e3 c0 ff ff ff ................
> backtrace:
> [<ffffffc00034bb64>] __save_stack_trace+0x48/0x6c
> [<ffffffc00034c244>] create_object+0x138/0x254
> [<ffffffc0009dd218>] kmemleak_alloc+0x58/0x8c
> [<ffffffc000346de4>] kmem_cache_alloc_trace+0x188/0x254
> [<ffffffc0005af4c0>] drm_atomic_state_alloc+0x3c/0x88
> [<ffffffc000591f0c>] drm_atomic_helper_duplicate_state+0x28/0x158
> [<ffffffc000592098>] drm_atomic_helper_suspend+0x5c/0xf0
>
> Problem here is that we are duplicating the drm_atomic_state in
> drm_atomic_helper_suspend(), but not unreference it in the resume path.
>
> Fixes: 1494276000db ("drm/atomic-helper: Implement subsystem-level suspend/resume")
> Signed-off-by: Jeffy Chen <jeffy.chen@xxxxxxxxxxxxxx>
> ---
>
> Changes in v2:
> Unref duplicated drm_atomic_state in drm_atomic_helper_resume() instead
> of specific drivers.
>
> drivers/gpu/drm/drm_atomic_helper.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 01c34bc5b5b0..4a262380c631 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -3052,6 +3052,7 @@ int drm_atomic_helper_resume(struct drm_device *dev,
> drm_modeset_backoff(&ctx);
> }
>
> + drm_atomic_state_put(state);
> drm_modeset_drop_locks(&ctx);
> drm_modeset_acquire_fini(&ctx);
>

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
Fixes: 0853695c3ba4 ("drm: Add reference counting to drm_atomic_state")
Cc: <stable@xxxxxxxxxxxxxxx> # v4.10+

and pushed, thanks for finding it. :)

The bug is probably older than that commit, but only happened on failure paths before. If resume fails we probably have bigger issues than leaking some memory.