[PATCH v4 1/2] sched/fair: Rebuild load weight when switching to fair

From: Zicheng Qu

Date: Wed Jun 24 2026 - 05:45:20 EST


Tasks that run outside fair may not keep p->se.load in sync with their
current scheduling policy and static priority. sched_ext, for example,
uses p->scx.weight as the active scheduling weight, so p->se.load can be
stale when a task moves back to fair.

The fair_sched_class expects the sched_entity load weight to be valid
before the task is enqueued. Rebuild it from fair's switching_to hook,
which runs after the class has been changed to fair and before enqueue,
so both sched_ext disable and SCHED_EXT to SCHED_NORMAL transitions get
a native fair load weight.

Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Zicheng Qu <quzicheng@xxxxxxxxxx>
Acked-by: Tejun Heo <tj@xxxxxxxxxx>
---
Changes in v4:
- No code changes.
- Add Tejun's Acked-by.
- Split the PELT load_avg synchronization into patch 2.

Changes in v3:
- Move the rebuild into fair's switching_to hook, as suggested by Peter.
  This lets fair prepare its own state before enqueue and avoids adding a
  sched_ext/fair-specific fixup to the generic sched_change_end() path.

Changes in v2:
- Move the fix from scx_root_disable() to the class switch path so it also
  covers partial-mode SCHED_EXT to SCHED_NORMAL transitions through
  sched_setscheduler(). Andrea identified this missing case in the v1
  discussion.
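
For illustration only (not part of the patch), the stale-weight case
described in the commit message can be sketched in user-space C. The
struct model_task, its fields and the model_* helpers are hypothetical
stand-ins, not kernel APIs. The table is meant to mirror the kernel's
sched_prio_to_weight[] values, and the assumption that a nice change
outside fair leaves se_load untouched is a simplification of the
behaviour described above.

```c
#include <assert.h>

/*
 * User-space model only. Nothing here is kernel code; all names are
 * hypothetical and chosen for this sketch.
 */

/* Weights for nice -20..19, intended to mirror sched_prio_to_weight[]. */
static const unsigned long nice_to_weight[40] = {
	/* -20 */ 88761, 71755, 56483, 46273, 36291,
	/* -15 */ 29154, 23254, 18705, 14949, 11916,
	/* -10 */  9548,  7620,  6100,  4904,  3906,
	/*  -5 */  3121,  2501,  1991,  1586,  1277,
	/*   0 */  1024,   820,   655,   526,   423,
	/*   5 */   335,   272,   215,   172,   137,
	/*  10 */   110,    87,    70,    56,    45,
	/*  15 */    36,    29,    23,    18,    15,
};

struct model_task {
	int nice;                 /* static priority as a nice value */
	unsigned long se_load;    /* stands in for p->se.load.weight */
	unsigned long ext_weight; /* stands in for the other class's weight */
};

static unsigned long fair_weight_for_nice(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return nice_to_weight[nice + 20];
}

/*
 * Assumption of this model: while the task is outside fair, a nice
 * change only updates the other class's weight and leaves se_load alone.
 */
static void model_renice_outside_fair(struct model_task *p, int nice)
{
	p->nice = nice;
	p->ext_weight = fair_weight_for_nice(nice);
	/* p->se_load is deliberately not touched here. */
}

/* Models the new hook: rebuild se_load from the current nice value. */
static void model_switching_to_fair(struct model_task *p)
{
	p->se_load = fair_weight_for_nice(p->nice);
}
```

In this model, a task that starts at nice 0 and is reniced to 5 while
outside fair keeps se_load at 1024 until model_switching_to_fair()
recomputes it to 335.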

 kernel/sched/fair.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d78467ec6ee1..edb11065a9fc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -14998,6 +14998,15 @@ static void switching_from_fair(struct rq *rq, struct task_struct *p)
 	dequeue_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_DELAYED | DEQUEUE_NOCLOCK);
 }
 
+static void switching_to_fair(struct rq *rq, struct task_struct *p)
+{
+	/*
+	 * Tasks may come from classes that don't keep se.load up to date.
+	 * Rebuild it before the task is enqueued.
+	 */
+	set_load_weight(p, false);
+}
+
 static void switched_from_fair(struct rq *rq, struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
@@ -15379,6 +15388,7 @@ DEFINE_SCHED_CLASS(fair) = {
 	.reweight_task		= reweight_task_fair,
 	.prio_changed		= prio_changed_fair,
 	.switching_from		= switching_from_fair,
+	.switching_to		= switching_to_fair,
 	.switched_from		= switched_from_fair,
 	.switched_to		= switched_to_fair,

--
2.53.0