[PATCH v2 1/2] drm/panfrost: Handle resetting on timeout better

From: Steven Price
Date: Wed Oct 09 2019 - 05:45:04 EST

Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
and on a timeout has to stop all the schedulers to safely perform a
reset. However more than one scheduler can trigger a timeout at the same
time. This race condition results in jobs being freed while they are
still in use.

When stopping other slots use cancel_delayed_work_sync() to ensure that
any timeout started for that slot has completed. Also use
mutex_trylock() to obtain reset_lock. This means that only one thread
attempts the reset, the other threads will simply complete without doing
anything (the first thread will wait for this in the call to

While we're here and since the function is already dependent on
sched_job not being NULL, let's remove the unnecessary checks.

Fixes: aa20236784ab ("drm/panfrost: Prevent concurrent resets")
Tested-by: Neil Armstrong <narmstrong@xxxxxxxxxxxx>
Signed-off-by: Steven Price <steven.price@xxxxxxx>
* Added fixes and tested-by tags
* Moved cleanup of panfrost_core_dump() comment to separate patch

drivers/gpu/drm/panfrost/panfrost_job.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index a58551668d9a..21f34d44aac2 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -381,13 +381,19 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
job_read(pfdev, JS_TAIL_LO(js)),

- mutex_lock(&pfdev->reset_lock);
+ if (!mutex_trylock(&pfdev->reset_lock))
+ return;

- for (i = 0; i < NUM_JOB_SLOTS; i++)
- drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
+ for (i = 0; i < NUM_JOB_SLOTS; i++) {
+ struct drm_gpu_scheduler *sched = &pfdev->js->queue[i].sched;
+ drm_sched_stop(sched, sched_job);
+ if (js != i)
+ /* Ensure any timeouts on other slots have finished */
+ cancel_delayed_work_sync(&sched->work_tdr);
+ }

- if (sched_job)
- drm_sched_increase_karma(sched_job);
+ drm_sched_increase_karma(sched_job);

spin_lock_irqsave(&pfdev->js->job_lock, flags);
for (i = 0; i < NUM_JOB_SLOTS; i++) {