[PATCH v2 sched_ext/for-7.1-fixes] sched_ext: Reset dsq_vtime and slice when a task leaves SCX
From: Andrea Righi
Date: Mon Jun 08 2026 - 13:07:29 EST

When a task leaves the BPF scheduler's control, p->scx.dsq_vtime and
p->scx.slice keep whatever values they last held. The slice value is
core-managed and is refilled on the next enqueue, but dsq_vtime is owned
by the BPF scheduler and is never cleared by the core, so a task that
leaves SCX and later returns carries a stale dsq_vtime across the
round-trip.

The stale values are also visible to other SCX schedulers that inspect
the scx fields of non-SCX tasks.

Fix this by resetting both dsq_vtime and slice in scx_disable_task(),
after ops.disable(), so the BPF scheduler can still observe the task's
final values and non-SCX tasks do not retain stale SCX state.

Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
---
Changes in v2:
- Reset SCX values in scx_disable_task() (Sashiko)
- Link to v1: https://lore.kernel.org/all/20260608134908.3232097-1-arighi@xxxxxxxxxx/
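
For illustration only, the round-trip described above can be modelled in
plain userspace C. This is a rough sketch, not kernel code: struct
model_scx_entity and the model_*() helpers are hypothetical stand-ins
for p->scx and scx_disable_task(), and the starting values are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two p->scx fields touched by the patch. */
struct model_scx_entity {
	uint64_t dsq_vtime;
	uint64_t slice;
};

/*
 * Models the tail of scx_disable_task(). With @reset == 0 the fields are
 * left alone (behaviour before the patch); with @reset != 0 they are
 * cleared (behaviour after the patch).
 */
static void model_disable_task(struct model_scx_entity *scx, int reset)
{
	/* ops.disable() would run here and still see the final values. */
	if (reset) {
		scx->dsq_vtime = 0;
		scx->slice = 0;
	}
}

/*
 * Simulate a task leaving SCX with @last_vtime set by the BPF scheduler
 * and return the dsq_vtime it would carry back on re-entry.
 */
static uint64_t model_round_trip(uint64_t last_vtime, int reset)
{
	struct model_scx_entity scx = { .dsq_vtime = last_vtime, .slice = 5 };

	model_disable_task(&scx, reset);
	return scx.dsq_vtime;
}

/* Checks both behaviours of the model. */
static void model_self_check(void)
{
	assert(model_round_trip(1000, 0) == 1000);	/* stale value survives */
	assert(model_round_trip(1000, 1) == 0);		/* reset clears it */
}
```

Calling model_self_check() from a test harness exercises both paths. The
model only captures the field reset, not task state transitions or the
ops.disable() callback itself.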
kernel/sched/ext.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 8e88a25bc602f..01b4f29de073c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3659,6 +3659,13 @@ static void scx_disable_task(struct scx_sched *sch, struct task_struct *p)
 	SCX_CALL_OP_TASK(sch, disable, rq, p);
 	scx_set_task_state(p, SCX_TASK_READY);
 
+	/*
+	 * Reset the SCX-managed fields when @p leaves the BPF scheduler's
+	 * control, after ops.disable() has observed their final values.
+	 */
+	p->scx.dsq_vtime = 0;
+	p->scx.slice = 0;
+
 	/*
 	 * Verify the task is not in BPF scheduler's custody. If flag
 	 * transitions are consistent, the flag should always be clear
--
2.54.0