Re: [PATCH] sched_ext: Fix NULL pointer deref and warnings during scx teardown

From: Andrea Righi

Date: Tue Feb 03 2026 - 09:02:40 EST


On Mon, Feb 02, 2026 at 11:50:05PM +0100, Andrea Righi wrote:
> On Mon, Feb 02, 2026 at 10:52:04AM -1000, Tejun Heo wrote:
> > On Mon, Feb 02, 2026 at 07:54:50PM +0100, Andrea Righi wrote:
> > > I'm able to reproduce the NULL pointer dereference in set_cpus_allowed_scx()
> > > quite easily running `stress-ng --race-sched 0` with an scx scheduler that
> > > is intentionally starving tasks, triggering a stall => disable.
> > >
> > > I think this is what's happening:
> > >
> > >   CPU0                                  CPU1
> > >   ----                                  ----
> > >   __sched_setscheduler()
> > >     task_rq_lock(p)
> > >
> > >     next_class = __setscheduler_class()
> > >     // next_class is ext_sched_class
> > >                                         scx_disable_workfn()
> > >                                           scx_set_enable_state(SCX_DISABLING)
> > >
> > >                                           scx_task_iter_start()
> > >                                           while ((p = next())) {
> > >                                             ...
> > >                                             p->sched_class = fair_sched_class
> > >                                             ...
> > >                                           }
> > >                                           scx_task_iter_stop()
> > >
> > >                                           synchronize_rcu()
> > >                                           RCU_INIT_POINTER(scx_root, NULL)
> > >
> > >     scoped_guard(sched_change, ...) {
> > >       p->sched_class = next_class;
> > >       // next_class is still ext_sched_class,
> > >       // overwriting fair_sched_class!
> > >     }
> > >     // Guard ends, calls sched_change_end()
> > >     //   switching_to_scx() called
> > >     //     scx_root == NULL => returns early
> > >
> > >     task_rq_unlock(p)
> > >
> > >   sched_setaffinity(p)
> > >     set_cpus_allowed_scx()
> > >       sch = scx_root; // scx_root == NULL => BUG!
> >
> > Does the following patch fix the issue?
>
> Nope, I can still trigger this (with the same modified scx_simple +
> stress-ng --race-sched 0):

A quick reproducer:
https://github.com/sched-ext/scx/tree/scx-bug

$ make
$ vng -vr -- "stress-ng --race-sched 0 & ./build/scheds/c/scx_bug"
...
[ 3.375119] BUG: kernel NULL pointer dereference, address: 00000000000001c0
[ 3.375836] RIP: 0010:set_cpus_allowed_scx+0x1a/0xa0

It happens almost immediately.
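
For illustration only, the interleaving from the quoted trace can be replayed
deterministically in userspace. This is a minimal sketch, not kernel code: the
struct layouts, fair_class, ext_class, root_sched and replay_race() are
hypothetical stand-ins, and the assumption that the task starts out in the ext
class is mine. It only shows that committing a class chosen before the disable
path ran leaves the task in the ext class with scx_root == NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's real types. */
struct sched_class { const char *name; };
struct scx_sched { int dummy; };
struct task { const struct sched_class *sched_class; };

static const struct sched_class fair_class = { "fair" };
static const struct sched_class ext_class  = { "ext" };

static struct scx_sched root_sched;
static struct scx_sched *scx_root = &root_sched;

/*
 * Replay the hypothesized interleaving step by step on a single thread.
 * Returns true if the task ends up in the ext class while scx_root is NULL,
 * i.e. the state in which an ext-class callback would dereference NULL.
 */
static bool replay_race(struct task *p)
{
	const struct sched_class *next_class;

	/* Initial state (assumption): scheduler loaded, task in ext class. */
	scx_root = &root_sched;
	p->sched_class = &ext_class;

	/* CPU0: the target class is picked before the disable path runs. */
	next_class = &ext_class;

	/* CPU1: disable path moves the task to fair, then clears the root. */
	p->sched_class = &fair_class;
	scx_root = NULL;

	/* CPU0: the stale choice is committed, overwriting fair. */
	p->sched_class = next_class;

	/* Later: an ext-class callback would read scx_root at this point. */
	return p->sched_class == &ext_class && scx_root == NULL;
}
```

Calling replay_race() on a zero-initialized struct task returns true, which
corresponds to the state right before the sched_setaffinity() call in the
trace.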

-Andrea