Re: [PATCH v3] drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb

From: Pierre-Eric Pelloux-Prayer

Date: Tue Nov 04 2025 - 10:37:02 EST




On 04/11/2025 at 16:30, Philipp Stanner wrote:
On Tue, 2025-11-04 at 16:24 +0100, Pierre-Eric Pelloux-Prayer wrote:


On 04/11/2025 at 13:43, Philipp Stanner wrote:


Some things I have unfortunately overlooked below.


Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")

We should +Cc stable. It's a deadlock after all.

OK.


Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>
Suggested-by: Christian König <christian.koenig@xxxxxxx>
Reviewed-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@xxxxxxx>
---
  drivers/gpu/drm/scheduler/sched_entity.c | 34 +++++++++++++-----------
  1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index c8e949f4a568..fe174a4857be 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -173,26 +173,15 @@ int drm_sched_entity_error(struct drm_sched_entity *entity)
  }
  EXPORT_SYMBOL(drm_sched_entity_error);
+static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
+   struct dma_fence_cb *cb);

It's far better to move the function up instead. Can you do that?

Since drm_sched_entity_kill_jobs_cb uses drm_sched_entity_kill_jobs and vice
versa, I'll have to forward-declare one of the two functions anyway.
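
To illustrate why moving the function up cannot remove the prototype, here
is a minimal standalone sketch of two static functions that reference each
other. The names (struct job, kill_jobs, kill_jobs_cb, count_killed) are
hypothetical stand-ins and not the scheduler's real code; in particular, the
real functions are tied together through dma_fence callbacks, which this
sketch does not model. Whichever function is defined first needs the other
one declared beforehand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued job; not the scheduler's real type. */
struct job {
	struct job *next;
	int killed;
};

/*
 * Forward declaration: kill_jobs() below references kill_jobs_cb()
 * before its definition is visible. Swapping the two definitions
 * would only move the problem, since kill_jobs_cb() also needs
 * kill_jobs().
 */
static void kill_jobs_cb(struct job *job);

/* Hands the first job of the list (if any) to the callback. */
static void kill_jobs(struct job *head)
{
	if (head)
		kill_jobs_cb(head);
}

/* Marks one job as killed, then continues with the rest of the list. */
static void kill_jobs_cb(struct job *job)
{
	assert(job != NULL);
	job->killed = 1;
	kill_jobs(job->next);
}

/* Counts how many jobs in the list have been marked as killed. */
static int count_killed(const struct job *head)
{
	int n = 0;

	for (; head; head = head->next)
		n += head->killed;
	return n;
}
```

Calling kill_jobs() on a three-element list marks all three jobs, with
control alternating between the two functions. Deleting the prototype makes
the call inside kill_jobs() refer to an undeclared function, which modern
compilers reject.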

Ah, right.
OK then.

I can push this and +Cc stable in the commit message if you want.


Would be great, thanks!

Pierre-Eric