Re: [PATCH v2 sched_ext/for-7.1-fixes] sched_ext: Fix deadlock between scx_root_disable() and concurrent forks

From: Tejun Heo

Date: Sun May 17 2026 - 15:08:55 EST


Hello,

On Sun, May 17, 2026 at 08:47:31PM +0200, Andrea Righi wrote:
...
> Yeah, this is much better than my comment (that was quite confusing).
>
> To make sure I understand: what fixes the deadlock is checking scx_switching_all
> before DISABLING in task_should_scx(), because in this way the sched_ext_helper
> kthread goes to scx (not fair), runs, the enable path completes, releases the
> mutex and the disable path moves forward.
>
> When I wrote my comment I was looking at the ordering of [__]scx_switched_all in
> scx_root_disable():
>
> static_branch_disable(&__scx_switched_all);
> WRITE_ONCE(scx_switching_all, false);
>
> And I was wondering, if we invert those we'd have a similar issue: a small
> window where __scx_switched_all == ON and scx_switching_all == false. But the
> current order is already the safe one, so no change needed.

Yeah, and even if we created that window between __scx_switched_all and
scx_switching_all, it would be transient. Let's say a task slips into eevdf
between the two. The task has no way of preventing disable from completing
the __scx_switched_all transition, and the condition would unwind. The
problem with the DISABLING transition was that it could make a racing enable
path wait for kthread creation to finish while holding enable_mutex. Because
the disable path needs the same mutex to turn off __scx_switched_all and the
stalled task needs __scx_switched_all to be turned off to progress, we end
up in a deadlock.
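
For illustration only, here is a toy user-space model of the check ordering
discussed above. It is a sketch, not the kernel's task_should_scx(): every
model_* name is made up, the scx_enabled() test is left out, and the "old"
ordering is an assumption inferred from the description of the fix in this
thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
enum model_state { MODEL_ENABLED, MODEL_DISABLING };

struct model_scx {
	bool switching_all;		/* models scx_switching_all */
	enum model_state state;		/* models the enable state */
};

/*
 * Ordering described as the fix: switching_all is consulted before the
 * DISABLING test, so a task forked during disable still goes to scx while
 * switching_all is set.
 */
static bool model_should_scx_fixed(const struct model_scx *s, bool policy_is_ext)
{
	if (s->switching_all)
		return true;
	if (s->state == MODEL_DISABLING)
		return false;
	return policy_is_ext;
}

/*
 * Assumed pre-fix ordering: DISABLING wins, so the same task is sent to the
 * fair class even though switching_all is still set.
 */
static bool model_should_scx_old(const struct model_scx *s, bool policy_is_ext)
{
	if (s->state == MODEL_DISABLING)
		return false;
	if (s->switching_all)
		return true;
	return policy_is_ext;
}
```

With switching_all set and the state at DISABLING, the fixed ordering returns
true for a non-SCHED_EXT task such as the helper kthread, while the assumed
old ordering returns false. The second result is the case that could leave
the enable path stalled on kthread creation while holding enable_mutex.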

Thanks.

--
tejun