[PATCH] sched: fair: Prevent negative lag increase during delayed dequeue

From: Vincent Guittot

Date: Fri Mar 27 2026 - 14:03:01 EST


The delayed dequeue feature aims to reduce the negative lag of a dequeued
task while it sleeps, but it can happen that newly enqueued tasks move the
avg vruntime backward and increase its negative lag.
When the delayed dequeued task wakes up, it then has more negative lag than
if it had been dequeued immediately, or than other tasks (delayed or not)
that were dequeued just before these new enqueues.

Ensure that the negative lag of a delayed dequeued task doesn't increase
during its delayed dequeue phase while it waits for its negative lag to
disappear. Similarly, remove any positive lag that the delayed dequeued
task could have gained during this period.

Short slice tasks are particularly impacted on overloaded systems.

Test on snapdragon rb5:

hackbench -T -p -l 16000000 -g 2 1> /dev/null &
cyclictest -t 1 -i 2777 -D 333 --policy=fair --mlock -h 20000 -q

The scheduling latency of cyclictest is:

                          tip/sched/core   tip/sched/core   +this patch
cyclictest slice (ms)     (default) 2.8    8                8
hackbench slice (ms)      (default) 2.8    20               20
Total Samples          |  115632           119733           119806
Average (us)           |  364              64 (-82%)        61 (- 5%)
Median (P50) (us)      |  60               56 (- 7%)        56 (  0%)
90th Percentile (us)   |  1166             62 (-95%)        62 (  0%)
99th Percentile (us)   |  4192             73 (-98%)        72 (- 1%)
99.9th Percentile (us) |  8528             2707 (-68%)      1300 (-52%)
Maximum (us)           |  17735            14273 (-20%)     13525 (- 5%)

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
kernel/sched/fair.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 226509231e67..efa9dfa8c583 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
if (sched_feat(DELAY_DEQUEUE) && delay &&
!entity_eligible(cfs_rq, se)) {
update_load_avg(cfs_rq, se, 0);
+ update_entity_lag(cfs_rq, se);
set_delayed(se);
return false;
}
@@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
WARN_ON_ONCE(!se->on_rq);

if (sched_feat(DELAY_ZERO)) {
+ s64 vlag, prev_vlag = se->vlag;
update_entity_lag(cfs_rq, se);
- if (se->vlag > 0) {
+ /* prev_vlag < 0 otherwise se would not be delayed */
+ vlag = clamp(se->vlag, prev_vlag, 0);
+
+ if (vlag != se->vlag) {
cfs_rq->nr_queued--;
if (se != cfs_rq->curr)
__dequeue_entity(cfs_rq, se);
- se->vlag = 0;
+ se->vlag = vlag;
place_entity(cfs_rq, se, 0);
if (se != cfs_rq->curr)
__enqueue_entity(cfs_rq, se);
--
2.43.0