Re: [PATCHSET sched_ext/for-7.1] Fix sub-sched locking issues
From: Andrea Righi
Date: Tue Mar 10 2026 - 02:51:39 EST
On Mon, Mar 09, 2026 at 03:16:48PM -1000, Tejun Heo wrote:
> Hello,
>
> Cheng-Yang reported a lockdep circular dependency between scx_sched_lock and
> rq->__lock. scx_bypass() and sysrq_handle_sched_ext_dump() take
> scx_sched_lock -> rq lock, while scx_claim_exit() (reachable from many paths
> with rq lock held) takes rq -> scx_sched_lock. In addition, scx_disable()
> directly calling kthread_queue_work() under scx_sched_lock creates another
> chain through worker->lock -> pi_lock -> rq->__lock.
>
> This patchset fixes these issues:
>
> 1. Fix wrong sub_detach op check.
> 2. Add scx_dump_lock and dump_disabled to decouple dump from scx_sched_lock.
> 3. Always bounce scx_disable() through irq_work to avoid lock nesting.
> 4. Flip scx_bypass() lock order and drop scx_sched_lock from sysrq dump.
> 5. Reject sub-sched attachment to a disabled parent.
>
> Tested on three machines (16-CPU QEMU, 192-CPU dual-socket EPYC, AMD Ryzen)
> with lockdep trigger tests and an 11-test stress suite covering
> attach/detach, nesting, reverse teardown, rapid cycling, error injection,
> SysRq-D/S dump/exit, and combined stress. Lockdep triggered on baseline,
> clean after patches.

With the comment from Cheng-Yang about fixing the link in patch 4/5 addressed:

Reviewed-by: Andrea Righi <arighi@xxxxxxxxxx>

Thanks,
-Andrea