[PATCHSET v3 sched_ext/for-6.20] sched_ext: Fix ops.dequeue() semantics

From: Andrea Righi

Date: Mon Jan 26 2026 - 03:43:19 EST


The callback ops.dequeue() is provided to let BPF schedulers observe when a
task leaves the scheduler, either because it is dispatched or due to a task
property change. However, this callback is currently unreliable and not
invoked systematically, which can result in missed ops.dequeue() events.

In particular, once a task is removed from the scheduler (whether for
dispatch or due to a property change), the BPF scheduler loses visibility
of the task, and the sched_ext core may not always trigger ops.dequeue().

This breaks accurate accounting (e.g., per-DSQ queued runtime sums) and
prevents reliable tracking of task lifecycle transitions.

This patch set fixes the semantics of ops.dequeue(), ensuring that every
ops.enqueue() is balanced by a corresponding ops.dequeue() invocation. In
addition, ops.dequeue() is now properly invoked when tasks are removed from
the sched_ext class, such as on task property changes.

To distinguish between a "regular" dequeue and a property change dequeue, a
new dequeue flag is introduced: %SCX_DEQ_SCHED_CHANGE. The flag is unset for
regular dispatch dequeues and set for property change dequeues, so BPF
schedulers can tell the two cases apart.

Together, these changes allow BPF schedulers to reliably track task
ownership and maintain accurate accounting.
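
As a rough illustration of the accounting this enables, here is a minimal
userspace model in plain C. It is not BPF code and not taken from the patches:
the model_* names, the struct layouts and the MODEL_DEQ_SCHED_CHANGE bit value
are all hypothetical stand-ins. It only assumes what is described above, that
every enqueue is paired with exactly one dequeue and that a flag bit marks
property change dequeues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration; the real SCX_DEQ_SCHED_CHANGE value
 * is defined by the patch and is not reproduced here. */
#define MODEL_DEQ_SCHED_CHANGE (1ULL << 0)

struct model_task {
	uint64_t slice_ns;	/* runtime the task contributes while queued */
	bool queued;		/* true between enqueue and matching dequeue */
};

struct model_stats {
	uint64_t queued_runtime_ns;	/* sum of slice_ns over queued tasks */
	uint64_t nr_dispatch_deq;	/* dequeues seen with the flag unset */
	uint64_t nr_change_deq;		/* dequeues seen with the flag set */
};

/* Models ops.enqueue(): the scheduler takes ownership of the task. */
static void model_enqueue(struct model_stats *st, struct model_task *t)
{
	assert(!t->queued);	/* a second enqueue without a dequeue is a bug */
	t->queued = true;
	st->queued_runtime_ns += t->slice_ns;
}

/* Models ops.dequeue(): the scheduler gives up ownership of the task. */
static void model_dequeue(struct model_stats *st, struct model_task *t,
			  uint64_t deq_flags)
{
	assert(t->queued);	/* every dequeue must match a prior enqueue */
	t->queued = false;
	st->queued_runtime_ns -= t->slice_ns;

	if (deq_flags & MODEL_DEQ_SCHED_CHANGE)
		st->nr_change_deq++;	/* property change dequeue */
	else
		st->nr_dispatch_deq++;	/* regular dispatch dequeue */
}
```

With balanced callbacks the queued runtime sum returns to zero once every
task has been dequeued. A single missed dequeue would leave the sum
permanently inflated, which is the failure mode described earlier.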

Changes in v3:
- Rename SCX_DEQ_ASYNC to SCX_DEQ_SCHED_CHANGE
- Handle core-sched dequeues (Kuba)
- Link to v2: https://lore.kernel.org/all/20260121123118.964704-1-arighi@xxxxxxxxxx/

Changes in v2:
- Distinguish between "dispatch" dequeues and "property change" dequeues
  (flag SCX_DEQ_ASYNC)
- Link to v1: https://lore.kernel.org/all/20251219224450.2537941-1-arighi@xxxxxxxxxx

Andrea Righi (2):
  sched_ext: Fix ops.dequeue() semantics
  selftests/sched_ext: Add test to validate ops.dequeue() semantics

 Documentation/scheduler/sched-ext.rst           |  33 ++++
 include/linux/sched/ext.h                       |  11 ++
 kernel/sched/ext.c                              |  89 +++++++++-
 kernel/sched/ext_internal.h                     |   7 +
 tools/sched_ext/include/scx/enum_defs.autogen.h |   2 +
 tools/sched_ext/include/scx/enums.autogen.bpf.h |   2 +
 tools/sched_ext/include/scx/enums.autogen.h     |   1 +
 tools/testing/selftests/sched_ext/Makefile      |   1 +
 tools/testing/selftests/sched_ext/dequeue.bpf.c | 209 ++++++++++++++++++++++++
 tools/testing/selftests/sched_ext/dequeue.c     | 182 +++++++++++++++++++++
 10 files changed, 534 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/sched_ext/dequeue.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/dequeue.c