Re: [PATCH sched_ext/for-7.1-fixes] sched_ext: Fix deadlock between scx_root_disable() and concurrent forks
From: Tejun Heo
Date: Sun May 17 2026 - 13:25:54 EST

Hello,

On Sun, May 17, 2026 at 12:56:34PM +0200, Andrea Righi wrote:
> > + * Must come after scx_switching_all test. While both are set, we must
> > + * return true via the branch above: [__]scx_switching_all are cleared
> > + * together under scx_enable_mutex, and a fork routed to fair while
> > + * __scx_switched_all is still on would stall because
> > + * next_active_class() skips fair.
>
> Just being extra picky: [__]scx_switching_all are cleared together sequentially,
> but not atomically (in fact the order is what matters). To make it more clear,
> how about rephrasing the comment block above like this:
>
> * Must come after the scx_switching_all test. scx_root_disable()
> * clears __scx_switched_all before scx_switching_all (both under
> * scx_enable_mutex), so while scx_switching_all is observed as true,
> * __scx_switched_all may still be on. A fork routed to fair in that
> * window would stall because next_active_class() skips fair.

Hmm... I don't think the ordering between scx_switching_all and
__scx_switched_all matters here. The stall is caused by the gap between the
earlier DISABLING transition and __scx_switched_all being turned off, which
is tested here through scx_switching_all. As the mutex is already held at
this point, it wouldn't matter even if you swapped scx_switching_all's
position with __scx_switched_all's. It's just kinda confusing because what's
actually involved in the stall and deadlock is __scx_switched_all but we're
testing it via scx_switching_all. I'll update the comment so that it just
mentions __scx_switched_all. I'm not even sure we actually need
scx_switching_all.
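
To make the point concrete, here is a minimal user-space sketch of the check
order. It is not the actual kernel code: all names (model_scx,
model_fork_goes_to_ext, the MODEL_* states) are made up for illustration, and
the branch layout is an assumption inferred from the quoted comment. A single
flag stands in for __scx_switched_all, and it is tested before the DISABLING
check so that a fork is never routed to fair while fair is still being skipped.

```c
#include <stdbool.h>

/* Hypothetical model of the enable state; not the kernel's enum. */
enum model_state { MODEL_ENABLED, MODEL_DISABLING, MODEL_DISABLED };

struct model_scx {
	enum model_state state;	/* stands in for the scx enable state */
	bool switched_all;	/* stands in for __scx_switched_all */
};

/*
 * Returns true if a forking task should be routed to the ext class,
 * false if it should go to fair.
 */
static bool model_fork_goes_to_ext(const struct model_scx *s,
				   bool task_policy_is_ext)
{
	if (s->state == MODEL_DISABLED)
		return false;

	/*
	 * Tested first: while "all tasks switched" is still on, fair is
	 * assumed to be skipped when picking the next task, so a fork
	 * routed to fair here would never run.
	 */
	if (s->switched_all)
		return true;

	/* Only once the flag is off is it safe to route forks to fair. */
	if (s->state == MODEL_DISABLING)
		return false;

	return task_policy_is_ext;
}
```

In this model, a fork that happens after the DISABLING transition but before
the flag is cleared stays on ext. Once the flag is off, the same fork falls
back to fair.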
Thanks.

--
tejun