Re: [PATCH v2] drm/amdgpu: fix fence reference leak in amdgpu_gfx_run_cleaner_shader_job

From: Christian König

Date: Wed Jun 17 2026 - 04:44:45 EST




On 6/16/26 17:35, Wentao Liang wrote:
> In amdgpu_gfx_run_cleaner_shader_job(), amdgpu_job_submit() returns a
> dma_fence with an elevated reference count. The function correctly
> releases this reference on the success path after dma_fence_wait().
> However, if dma_fence_wait() fails (though with infinite timeout and
> non-interruptible it never does), the code jumps to the error label
> without calling dma_fence_put(), resulting in a reference leak.
>
> Fix the potential leak by adding dma_fence_put(f) before the goto err
> when dma_fence_wait() returns an error.
>
> Fixes: 559a285816af ("drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Wentao Liang <vulab@xxxxxxxxxxx>
> ---
> v2: Also cleanup the scheduler entity and simplify error handling paths.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index b8ca876694ff..be13ce6ce377 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -1658,7 +1658,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
> &sched, 1, NULL);
> if (r) {
> dev_err(adev->dev, "Failed setting up GFX kernel entity.\n");
> - goto err;
> + return r;
> }
>
> /*
> @@ -1686,16 +1686,13 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
> f = amdgpu_job_submit(job);
>
> r = dma_fence_wait(f, false);

I would just drop the error check here. A non-interruptible dma_fence_wait() can never fail.

> - if (r)
> - goto err;
> + goto err;

That goto does not look correct to me. You are now always skipping the dma_fence_put(f) below.

Regards,
Christian.

>
> dma_fence_put(f);
>
> +err:
> /* Clean up the scheduler entity */
> drm_sched_entity_destroy(&entity);
> - return 0;
> -
> -err:
> return r;
> }
>
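
For illustration only, here is a minimal userspace sketch of the control-flow problem raised in the review. It is not kernel code: struct fake_fence, fake_job_submit(), fake_fence_wait() and fake_fence_put() are hypothetical stand-ins for struct dma_fence, amdgpu_job_submit(), dma_fence_wait(f, false) and dma_fence_put(), and the entity_destroyed flag stands in for drm_sched_entity_destroy(). The second function is one possible reading of the review comments, not the final patch.

```c
#include <assert.h>

/* Hypothetical stand-in for struct dma_fence: only the refcount is modelled. */
struct fake_fence {
	int refcount;
};

/* Stand-in for dma_fence_put(): drops one reference. */
static void fake_fence_put(struct fake_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Stand-in for amdgpu_job_submit(): the caller ends up owning one reference. */
static struct fake_fence *fake_job_submit(struct fake_fence *f)
{
	f->refcount = 1;
	return f;
}

/* Stand-in for a non-interruptible dma_fence_wait(): modelled as always 0. */
static int fake_fence_wait(struct fake_fence *f)
{
	(void)f;
	return 0;
}

/* Tail of the function as posted in v2: the unconditional goto skips the put. */
static int run_job_v2_as_posted(struct fake_fence *f, int *entity_destroyed)
{
	int r;

	f = fake_job_submit(f);
	r = fake_fence_wait(f);
	goto err;

	fake_fence_put(f);	/* never reached, so the reference leaks */

err:
	*entity_destroyed = 1;	/* models drm_sched_entity_destroy() */
	return r;
}

/* Assumed shape after the review: no error check, no goto, always put. */
static int run_job_suggested(struct fake_fence *f, int *entity_destroyed)
{
	f = fake_job_submit(f);
	fake_fence_wait(f);	/* result ignored: this wait cannot fail */
	fake_fence_put(f);
	*entity_destroyed = 1;	/* models drm_sched_entity_destroy() */
	return 0;
}
```

Calling run_job_v2_as_posted() on a fresh fake_fence leaves its refcount at 1, while run_job_suggested() leaves it at 0. Both variants reach the entity cleanup.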