Re: [PATCH 1/3] sched_ext: fix NULL deref in bpf_scx_unreg() due to ops->priv race

From: Tejun Heo

Date: Wed Mar 25 2026 - 22:51:03 EST


On Thu, Mar 26, 2026 at 10:28:25AM +0800, zhidao su wrote:
> The reload_loop selftest triggers a KASAN null-ptr-deref at
> scx_claim_exit+0x83 when two threads concurrently attach and
> destroy BPF schedulers using the same ops map.
>
> The race occurs between bpf_scx_unreg() and a concurrent reg():
>
> 1. Thread A's bpf_scx_unreg() calls scx_disable() then
> kthread_flush_work(), which blocks until disable completes
> and transitions state back to SCX_DISABLED.
>
> 2. With state SCX_DISABLED, a concurrent reg() allocates a
> new sch_B and sets ops->priv = sch_B under scx_enable_mutex.
>
> 3. Thread A's bpf_scx_unreg() then executes
> RCU_INIT_POINTER(ops->priv, NULL), overwriting sch_B.
>
> 4. When Thread B's link is destroyed, bpf_scx_unreg() reads
> ops->priv == NULL and passes it to scx_disable(), which
> calls scx_claim_exit(NULL), crashing at NULL+0x310.
>
> Fix by adding a NULL guard for the case where ops->priv was
> never set, and by acquiring scx_enable_mutex before clearing
> ops->priv so that the check-and-clear is atomic with respect
> to reg() which also sets ops->priv under scx_enable_mutex.
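
For reference, the interleaving claimed above can be modelled in userspace.
This is only a sketch under stated assumptions: it takes for granted that
step 2 (a second reg() on the same ops) is reachable, and it assumes the
"check-and-clear" means "clear ops->priv only if it still points at the
scheduler being torn down". The struct layouts, the model_* helpers and
replay() are hypothetical stand-ins, not the kernel code or the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the scheduler and the ops map. */
struct sched { int id; };
struct ops { struct sched *priv; };

/* Stand-in for scx_enable_mutex. */
static pthread_mutex_t enable_mutex = PTHREAD_MUTEX_INITIALIZER;

/* reg(): publish a scheduler in ops->priv under the mutex. */
static void model_reg(struct ops *ops, struct sched *sch)
{
	pthread_mutex_lock(&enable_mutex);
	ops->priv = sch;
	pthread_mutex_unlock(&enable_mutex);
}

/* Tail of unreg as described in step 3: unconditional, unlocked clear. */
static void model_unreg_clear_racy(struct ops *ops)
{
	ops->priv = NULL;
}

/*
 * Assumed shape of the fix: take the mutex and clear only if ops->priv
 * still points at the scheduler this unreg owns.
 */
static void model_unreg_clear_guarded(struct ops *ops, struct sched *mine)
{
	pthread_mutex_lock(&enable_mutex);
	if (ops->priv == mine)
		ops->priv = NULL;
	pthread_mutex_unlock(&enable_mutex);
}

/*
 * Replay steps 1-3 in the claimed order on a single thread and return
 * the id Thread B would find in ops->priv at step 4 (0 means NULL).
 */
static int replay(int guarded)
{
	struct sched sch_a = { 1 };
	struct sched sch_b = { 2 };
	struct ops ops = { NULL };

	model_reg(&ops, &sch_a);	/* Thread A attached earlier */
	assert(ops.priv == &sch_a);

	/* Step 1: A's disable completes (not modelled). */
	model_reg(&ops, &sch_b);	/* Step 2: concurrent reg() */

	if (guarded)			/* Step 3: A clears ops->priv */
		model_unreg_clear_guarded(&ops, &sch_a);
	else
		model_unreg_clear_racy(&ops);

	return ops.priv ? ops.priv->id : 0;
}
```

In this model replay(0) leaves ops->priv NULL, which is the state step 4
would dereference, while replay(1) leaves sch_B in place. Whether the
kernel can actually reach step 2 is a separate question that the model
does not answer.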

Can you reproduce this? How do you trigger enable on the same ops that has
already been enabled?

Thanks.

--
tejun