Re: [PATCHSET sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED
From: Andrea Righi
Date: Sat Mar 07 2026 - 17:37:14 EST
Hi Tejun,
On Fri, Mar 06, 2026 at 02:28:14PM -1000, Tejun Heo wrote:
> Hello,
>
> SCX_ENQ_IMMED makes enqueue to local DSQs succeed only if the task can
> start running immediately - the current task is done and no other tasks are
> waiting. If the condition isn't met, the task is re-enqueued through
> ops.enqueue(). This gives the BPF scheduler tighter control over when tasks
> actually land on a CPU.
This looks interesting, but I'm trying to understand the typical use case
for this feature.
I agree that we need some kernel support to "atomically" determine when a
CPU is available (it can't be done fully in BPF). Initially I thought the
main target for ENQ_IMMED was to improve latency-sensitive workloads, but
the additional re-enqueue cost actually hurts latency. In that case it
might be better to be "less perfect" and not use ENQ_IMMED.
So I'm wondering if this feature is aimed more at the multiple
sub-scheduler scenario, to prevent a single scheduler from filling local
DSQs (effectively monopolizing a CPU while tasks sit in line). With
ENQ_IMMED, instead, we can put a task on a CPU only when it can run
*right now*. So the benefit is more in terms of fairness and isolation
between schedulers, rather than raw latency or throughput.
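To make my reading of the semantics concrete, here is a toy userspace
model of the acceptance check as I understand it from the cover letter.
This is only a sketch, not the code in the patch: struct toy_cpu,
toy_enq_local() and their fields are hypothetical names, and "current
task is done" is modeled simply as a zero remaining slice.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names here are hypothetical. */
struct toy_cpu {
	unsigned long curr_slice;	/* remaining slice of the running task, 0 == done */
	unsigned int nr_queued;		/* tasks already waiting in the local DSQ */
};

/*
 * Returns true if the task lands on the CPU's local DSQ, false if it
 * would be bounced back to the BPF scheduler through ops.enqueue().
 */
static bool toy_enq_local(struct toy_cpu *cpu, bool immed)
{
	/* IMMED: accept only if the current task is done and nothing is waiting */
	if (immed && (cpu->curr_slice || cpu->nr_queued))
		return false;

	cpu->nr_queued++;
	return true;
}
```

In this model a plain enqueue always queues behind whatever is already
there, while an IMMED enqueue is refused as soon as the CPU is busy or has
a backlog, which is what would stop one scheduler from stacking tasks on a
local DSQ.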
Am I understanding this correctly? If that's the case, it might be useful
to clarify this or describe some use cases that you have in mind.
Thanks,
-Andrea
>
> - Patch 1 disallows setting slice to zero via scx_bpf_task_set_slice() as
> zero slice is used by ENQ_IMMED to detect whether the current task is
> done.
>
> - Patch 2 implements SCX_ENQ_IMMED with reenqueue support and loop
> detection.
>
> - Patch 3 adds SCX_OPS_ALWAYS_ENQ_IMMED ops flag to automatically apply
> IMMED to all local DSQ enqueues.
>
> This patchset depends on:
>
> - "sched_ext: Overhaul DSQ reenqueue infrastructure"
> http://lkml.kernel.org/r/20260306190623.1076074-1-tj@xxxxxxxxxx
>
> Based on sched_ext/for-7.1 (4f8b122848db) + scx-reenq (a41719e6ae12).
>
> 0001-sched_ext-Disallow-setting-slice-to-zero-via-scx_bpf.patch
> 0002-sched_ext-Implement-SCX_ENQ_IMMED.patch
> 0003-sched_ext-Add-SCX_OPS_ALWAYS_ENQ_IMMED-ops-flag.patch
>
> Git tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git scx-enq-immed
>
> include/linux/sched/ext.h | 3 +
> kernel/sched/ext.c | 208 +++++++++++++++++++++++++++++++----
> kernel/sched/ext_internal.h | 43 ++++++++
> kernel/sched/sched.h | 2 +
> tools/sched_ext/include/scx/compat.h | 1 +
> tools/sched_ext/scx_qmap.bpf.c | 7 +-
> tools/sched_ext/scx_qmap.c | 9 +-
> 7 files changed, 250 insertions(+), 23 deletions(-)
>
> --
> tejun