Re: [PATCHSET sched_ext/for-6.12-fixes] sched_ext: Fix RCU and other stalls while iterating tasks during enable/disable

From: Tejun Heo
Date: Thu Oct 10 2024 - 17:43:38 EST


On Wed, Oct 09, 2024 at 11:40:56AM -1000, Tejun Heo wrote:
> The enable/disable paths walk all tasks a couple of times in bypass mode.
> There are a couple of problems:
>
> - Bypass mode incorrectly depends on ops.select_cpu(), which must not be
>   trusted in bypass mode.
>
> - scx_tasks_lock is held while walking all tasks. This can lead to RCU and
>   other stalls on a large, heavily contended system with many tasks.
>
> Fix the former by always using the default select_cpu() in bypass mode and
> the latter by periodically dropping scx_tasks_lock while iterating tasks.
>
> This patchset contains the following patches:
>
> 0001-Revert-sched_ext-Use-shorter-slice-while-bypassing.patch
> 0002-sched_ext-Start-schedulers-with-consistent-p-scx.sli.patch
> 0003-sched_ext-Move-scx_buildin_idle_enabled-check-to-scx.patch
> 0004-sched_ext-bypass-mode-shouldn-t-depend-on-ops.select.patch
> 0005-sched_ext-Move-scx_tasks_lock-handling-into-scx_task.patch
> 0006-sched_ext-Don-t-hold-scx_tasks_lock-for-too-long.patch
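
For anyone following along, the "periodically dropping scx_tasks_lock while
iterating" idea can be illustrated with a minimal userspace sketch. This is
not the kernel code from the patches. struct task, struct task_table,
walk_tasks_batched() and the batch parameter are all hypothetical names, a
pthread mutex stands in for scx_tasks_lock, and the sketch assumes the task
array is not resized during the walk, so a plain index can serve as the
saved cursor.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
struct task {
	int visited;
};

struct task_table {
	pthread_mutex_t lock;	/* plays the role of scx_tasks_lock */
	struct task *tasks;
	size_t nr_tasks;
};

/*
 * Visit every task while holding the lock for at most @batch tasks at a
 * time. Returns how many times the lock was dropped and retaken mid-walk.
 */
static size_t walk_tasks_batched(struct task_table *tbl, size_t batch)
{
	size_t breaks = 0;

	assert(batch > 0);

	pthread_mutex_lock(&tbl->lock);
	for (size_t i = 0; i < tbl->nr_tasks; i++) {
		if (i > 0 && i % batch == 0) {
			/*
			 * Take a breather so that waiters can make progress.
			 * The index i is the cursor that survives the unlock
			 * (assumption: the array is stable across the gap).
			 */
			pthread_mutex_unlock(&tbl->lock);
			breaks++;
			pthread_mutex_lock(&tbl->lock);
		}
		tbl->tasks[i].visited++;
	}
	pthread_mutex_unlock(&tbl->lock);

	return breaks;
}
```

The trade-off in a sketch like this is that the walk is no longer atomic
with respect to other lock holders, so whatever is used as the cursor has to
stay valid while the lock is released.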

Applied to sched_ext/for-6.12-fixes.

Thanks.

--
tejun