[PATCH 1/2] drm/sched: Guard sched->ready with READ_ONCE()/WRITE_ONCE()

From: Philipp Stanner

Date: Mon Jun 29 2026 - 06:41:38 EST


Commit faf6e1a87e07 ("drm/sched: Add boolean to mark if sched is ready
to work v5") moved tracking of the hardware ring's state from the driver
(amdgpu in that case) into drm_sched. To do so, it added a 'ready' flag
to the scheduler.

This flag is currently accessed through drm_sched_wqueue_ready() and,
even worse, directly through the scheduler struct. Since drm_sched does
not have a consistent locking design, all these plain accesses are
potential data races: the compiler is free to tear, fuse, or repeat the
loads and stores.

Make the code base more robust by guarding access to the 'ready' flag
with READ_ONCE() and WRITE_ONCE().
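
For illustration only, the pattern can be sketched in userspace C. The
DEMO_READ_ONCE()/DEMO_WRITE_ONCE() macros, struct demo_sched and the
demo_*() helpers below are hypothetical stand-ins, not the kernel's
definitions; the macros are simplified volatile casts and rely on the
GCC/Clang __typeof__ extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified, hypothetical stand-ins for the kernel's READ_ONCE() and
 * WRITE_ONCE(): a volatile access makes the compiler emit each marked
 * load or store as written instead of fusing or repeating it.
 */
#define DEMO_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of a scheduler carrying a 'ready' flag. */
struct demo_sched {
	const char *name;
	bool ready;
};

/* Marked read, analogous in spirit to drm_sched_wqueue_ready(). */
static bool demo_sched_ready(const struct demo_sched *sched)
{
	assert(sched != NULL);
	return DEMO_READ_ONCE(sched->ready);
}

/* Marked write, analogous in spirit to the store in drm_sched_fini(). */
static void demo_sched_stop(struct demo_sched *sched)
{
	assert(sched != NULL);
	DEMO_WRITE_ONCE(sched->ready, false);
}

/* Count the schedulers that report ready, skipping the others. */
static size_t demo_count_ready(struct demo_sched *const *list, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; ++i) {
		if (!demo_sched_ready(list[i]))
			continue;
		count++;
	}
	return count;
}
```

Marking the accesses this way only constrains the compiler. It adds no
memory ordering and does not replace a lock where several fields must
change together.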

Signed-off-by: Philipp Stanner <phasta@xxxxxxxxxx>
---
drivers/gpu/drm/scheduler/sched_main.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d2ca01b31ee4..c4e4ac436a86 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -929,7 +929,7 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
for (i = 0; i < num_sched_list; ++i) {
sched = sched_list[i];

- if (!sched->ready) {
+ if (!READ_ONCE(sched->ready)) {
DRM_WARN("scheduler %s is not ready, skipping",
sched->name);
continue;
@@ -1143,7 +1143,18 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)

if (sched->own_submit_wq)
destroy_workqueue(sched->submit_wq);
- sched->ready = false;
+
+ /* The 'ready' flag only exists in drm_sched because amdgpu uses it to
+ * represent the state of its hardware rings. This problem is related to
+ * the fundamental issue of drm_sched not having a solid, consistent
+ * locking design.
+ *
+ * Obviously, it does not make sense at all to set this flag to false
+ * here, but since it's unclear whether it can ever be removed from
+ * amdgpu's point of view, we guard it here with WRITE_ONCE() to make it
+ * slightly less broken.
+ */
+ WRITE_ONCE(sched->ready, false);

if (!list_empty(&sched->pending_list))
dev_warn(sched->dev, "Tearing down scheduler while jobs are pending!\n");
@@ -1195,7 +1206,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
*/
bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched)
{
- return sched->ready;
+ return READ_ONCE(sched->ready);
}
EXPORT_SYMBOL(drm_sched_wqueue_ready);


base-commit: 6648301c5bb2ef23f0fb15bcb01d21ff66f36799
--
2.54.0