[PATCH 2/6] drm/msm: Recover HW before retire hung submit

From: Akhil P Oommen

Date: Thu Jun 04 2026 - 16:10:37 EST


From: Jie Zhang <jie.zhang@xxxxxxxxxxxxxxxx>

During recovery, it is not safe to retire the hung submit before we
recover the GPU. Retiring the submit frees its BOs, which can result in
GPU pagefaults because the hung GPU may still be actively accessing
those BOs.

To fix this, retire the submits only after GPU recovery is complete in
recover_worker().

Fixes: 1a370be9ac51 ("drm/msm: restart queued submits after hang")
Signed-off-by: Jie Zhang <jie.zhang@xxxxxxxxxxxxxxxx>
Signed-off-by: Akhil P Oommen <akhilpo@xxxxxxxxxxxxxxxx>
---
drivers/gpu/drm/msm/msm_gpu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
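
The ordering issue described above can be illustrated with a small
user-space toy model. This is only a sketch under stated assumptions:
struct toy_gpu and the toy_*() helpers are hypothetical stand-ins and are
not part of the msm driver. toy_recover() loosely models
gpu->funcs->recover() and toy_retire_submits() loosely models
retire_submits(); a "pagefault" is counted when the BO is freed while the
modelled hardware is still busy.

```c
#include <stdbool.h>

/* Hypothetical stand-in; not the msm driver's real struct msm_gpu. */
struct toy_gpu {
	bool hw_busy;     /* hung hardware may still be touching the BO */
	bool bo_mapped;   /* the hung submit's BO is still allocated */
	int pagefaults;   /* faults observed by this toy model */
};

/* Models gpu->funcs->recover(): after reset the HW stops accessing BOs. */
static void toy_recover(struct toy_gpu *gpu)
{
	gpu->hw_busy = false;
}

/* Models retire_submits(): retiring the submit frees its BO. */
static void toy_retire_submits(struct toy_gpu *gpu)
{
	gpu->bo_mapped = false;
	if (gpu->hw_busy)
		gpu->pagefaults++;   /* HW touched a BO that was just freed */
}

/* Runs one toy recovery in the chosen order and returns the fault count. */
static int toy_recover_worker(bool recover_first)
{
	struct toy_gpu gpu = {
		.hw_busy = true,
		.bo_mapped = true,
		.pagefaults = 0,
	};

	if (recover_first) {
		toy_recover(&gpu);          /* order after this patch */
		toy_retire_submits(&gpu);
	} else {
		toy_retire_submits(&gpu);   /* order before this patch */
		toy_recover(&gpu);
	}

	return gpu.pagefaults;
}
```

In this model, toy_recover_worker(false) counts one fault and
toy_recover_worker(true) counts none. The model deliberately ignores
fences, locking and multiple rings; it only shows why the BO free has to
come after the hardware has been reset.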

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 18ed00e5f143..9ac7740a87f0 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -552,11 +552,11 @@ static void recover_worker(struct kthread_work *work)
msm_update_fence(ring->fctx, fence);
}

+ gpu->funcs->recover(gpu);
+
/* retire completed submits, plus the one that hung: */
retire_submits(gpu);

- gpu->funcs->recover(gpu);
-
/*
* Replay all remaining submits starting with highest priority
* ring

--
2.51.0