Re: [PATCH] sched_ext: Fix NULL pointer deref and warnings during scx teardown
From: Tejun Heo
Date: Mon Feb 02 2026 - 15:56:26 EST
On Mon, Feb 02, 2026 at 07:54:50PM +0100, Andrea Righi wrote:
> I'm able to reproduce the NULL pointer dereference in set_cpus_allowed_scx()
> quite easily running `stress-ng --race-sched 0` with an scx scheduler that
> is intentionally starving tasks, triggering a stall => disable.
>
> I think this is what's happening:
>
>        CPU0                                 CPU1
>        ----                                 ----
>  __sched_setscheduler()
>    task_rq_lock(p)
>
>    next_class = __setscheduler_class()
>    // next_class is ext_sched_class
>                                        scx_disable_workfn()
>                                          scx_set_enable_state(SCX_DISABLING)
>
>                                          scx_task_iter_start()
>                                          while ((p = next())) {
>                                            ...
>                                            p->sched_class = fair_sched_class
>                                            ...
>                                          }
>                                          scx_task_iter_stop()
>
>                                          synchronize_rcu()
>                                          RCU_INIT_POINTER(scx_root, NULL)
>
>    scoped_guard(sched_change, ...) {
>      p->sched_class = next_class;
>      // next_class is still ext_sched_class,
>      // overwriting fair_sched_class!
>    }
>    // Guard ends, calls sched_change_end()
>    // switching_to_scx() called
>    // scx_root == NULL => returns early
>
>    task_rq_unlock(p)
>
>  sched_setaffinity(p)
>    set_cpus_allowed_scx()
>      sch = scx_root; // scx_root == NULL => BUG!
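
The interleaving above can be replayed in a few lines of userspace C. This is only a single-threaded toy that follows the order of events in the diagram as posted; the toy_* names are hypothetical stand-ins, not kernel symbols, and no locking or real scheduling is modelled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for fair_sched_class and ext_sched_class. */
enum toy_class { TOY_FAIR, TOY_EXT };

struct toy_task {
	enum toy_class sched_class;
};

/*
 * Replay the diagram's ordering step by step. Returns 1 if the task ends
 * up in the ext class while the root pointer is already NULL, which is
 * the state that the later affinity call would trip over.
 */
static int replay_race(void)
{
	int root_storage = 0;
	int *toy_scx_root = &root_storage;	/* non-NULL while "enabled" */
	struct toy_task p = { .sched_class = TOY_EXT };
	enum toy_class next_class;

	/* CPU0: the class is picked while the scheduler still looks enabled. */
	next_class = TOY_EXT;

	/* CPU1: the disable path moves the task out and clears the root. */
	p.sched_class = TOY_FAIR;
	toy_scx_root = NULL;

	/* CPU0: the stale pick is committed, overwriting TOY_FAIR. */
	p.sched_class = next_class;

	return p.sched_class == TOY_EXT && toy_scx_root == NULL;
}
```

In this toy, replay_race() returns 1: the task is left in the ext class after the root has been cleared.
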
Does the following patch fix the issue?
Thanks.
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 136b01950a62..1fc2b358a175 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4234,7 +4234,13 @@ static void scx_disable_workfn(struct kthread_work *work)
 	 * Here, every runnable task is guaranteed to make forward progress and
 	 * we can safely use blocking synchronization constructs. Actually
 	 * disable ops.
+	 *
+	 * Wait for all CPUs to observe %SCX_DISABLING. Otherwise,
+	 * task_should_scx() can see %SCX_ENABLED and __sched_setscheduler() put
+	 * a task into sched_ext while we're migrating tasks out below.
 	 */
+	synchronize_rcu();
+
 	mutex_lock(&scx_enable_mutex);
 	static_branch_disable(&__scx_switched_all);
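
The ordering the added wait is meant to enforce can be sketched as a deterministic userspace toy. It assumes the window between picking a class and committing it is something the wait covers; the toy_* names and the reader_in_flight flag are hypothetical, and the flag only stands in for "a path that sampled the old state is still running". It does not model how RCU tracks readers.

```c
#include <stdbool.h>

enum toy_state { TOY_ENABLED, TOY_DISABLING };
enum toy_class { TOY_FAIR, TOY_EXT };

struct toy_sim {
	enum toy_state state;
	enum toy_class task_class;
	enum toy_class picked;		/* class chosen by the in-flight path */
	bool reader_in_flight;		/* picked but not yet committed */
};

/* First half of the setscheduler-like path: sample state, pick a class. */
static void setsched_pick(struct toy_sim *s)
{
	s->reader_in_flight = true;
	s->picked = (s->state == TOY_ENABLED) ? TOY_EXT : TOY_FAIR;
}

/* Second half: commit the earlier pick and leave the section. */
static void setsched_commit(struct toy_sim *s)
{
	s->task_class = s->picked;
	s->reader_in_flight = false;
}

/* Run the disable path and return the class the task ends up in. */
static enum toy_class run_disable(bool wait_for_readers)
{
	struct toy_sim s = {
		.state = TOY_ENABLED,
		.task_class = TOY_EXT,
		.picked = TOY_EXT,
		.reader_in_flight = false,
	};

	setsched_pick(&s);		/* CPU0 samples TOY_ENABLED */
	s.state = TOY_DISABLING;	/* CPU1 starts disabling */

	/* Stand-in for the added wait: in-flight paths finish first. */
	if (wait_for_readers && s.reader_in_flight)
		setsched_commit(&s);

	s.task_class = TOY_FAIR;	/* CPU1 migrates the task out */

	/* Without the wait, the stale pick lands after the migration. */
	if (s.reader_in_flight)
		setsched_commit(&s);

	return s.task_class;
}
```

In this toy, run_disable(false) returns TOY_EXT (the stale pick wins) and run_disable(true) returns TOY_FAIR (the migration is the last write).
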