Re: [PATCH sched_ext/for-6.12] sched_ext: TASK_DEAD tasks must be switched out of SCX on ops_disable

From: Tejun Heo
Date: Wed Sep 04 2024 - 16:23:21 EST


On Fri, Aug 30, 2024 at 01:44:40PM -1000, Tejun Heo wrote:
> scx_ops_disable_workfn() only switches !TASK_DEAD tasks out of SCX while
> calling scx_ops_exit_task() on all tasks including dead ones. This can leave
> a dead task on SCX but with SCX_TASK_NONE state, which is inconsistent.
>
> If another task was in the process of changing the TASK_DEAD task's
> scheduling class and grabs the rq lock after scx_ops_disable_workfn() is
> done with the task, that other task ends up calling scx_ops_disable_task()
> on the dead task, which is in an inconsistent state, triggering a warning:
>
> WARNING: CPU: 6 PID: 3316 at kernel/sched/ext.c:3411 scx_ops_disable_task+0x12c/0x160
> ...
> RIP: 0010:scx_ops_disable_task+0x12c/0x160
> ...
> Call Trace:
> <TASK>
> check_class_changed+0x2c/0x70
> __sched_setscheduler+0x8a0/0xa50
> do_sched_setscheduler+0x104/0x1c0
> __x64_sys_sched_setscheduler+0x18/0x30
> do_syscall_64+0x7b/0x140
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f140d70ea5b
>
> There is no reason to leave dead tasks on SCX when unloading the BPF
> scheduler. Fix by making scx_ops_disable_workfn() eject all tasks including
> the dead ones from SCX.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
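
For illustration only, the inconsistency described above can be modeled in
user space. This is a minimal sketch, not the kernel code: every model_*
name, the struct layout and the two-value state enum are hypothetical
stand-ins, and locking and the concurrent sched_setscheduler() path are
left out entirely. It only shows why skipping the class switch for dead
tasks while still exiting them leaves a task "on SCX" with a NONE state,
and that switching out every task avoids it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real kernel definitions. */
enum model_scx_state { MODEL_SCX_TASK_NONE, MODEL_SCX_TASK_ENABLED };

struct model_task {
	bool dead;                   /* models TASK_DEAD */
	bool on_scx;                 /* models "scheduling class is SCX" */
	enum model_scx_state state;  /* models the per-task SCX state */
};

/* A task that is still on SCX must not be in the NONE state. */
bool model_task_consistent(const struct model_task *p)
{
	return !(p->on_scx && p->state == MODEL_SCX_TASK_NONE);
}

/* Models the old behavior: dead tasks skip the class switch, but every
 * task, dead or not, still goes through the exit step. */
void model_disable_old(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].dead)
			tasks[i].on_scx = false;
		tasks[i].state = MODEL_SCX_TASK_NONE;
	}
}

/* Models the fix: every task is switched out, including dead ones. */
void model_disable_fixed(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		tasks[i].on_scx = false;
		tasks[i].state = MODEL_SCX_TASK_NONE;
	}
}

size_t model_count_inconsistent(const struct model_task *tasks, size_t n)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++)
		if (!model_task_consistent(&tasks[i]))
			bad++;
	return bad;
}

/* Runs both variants on one live and one dead task; returns 0 on success. */
int model_demo(void)
{
	struct model_task old_run[2] = {
		{ .dead = false, .on_scx = true, .state = MODEL_SCX_TASK_ENABLED },
		{ .dead = true,  .on_scx = true, .state = MODEL_SCX_TASK_ENABLED },
	};
	struct model_task fixed_run[2] = { old_run[0], old_run[1] };

	model_disable_old(old_run, 2);
	model_disable_fixed(fixed_run, 2);

	/* Only the dead task is left inconsistent by the old behavior. */
	assert(model_count_inconsistent(old_run, 2) == 1);
	assert(model_count_inconsistent(fixed_run, 2) == 0);
	return 0;
}
```

Calling model_demo() from a main() exercises both variants. In this model,
the warning in the trace corresponds to a later class change finding the
dead task in the state that model_task_consistent() rejects.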

Applied to sched_ext/for-6.12.

Thanks.

--
tejun