Re: [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue

From: Peter Zijlstra

Date: Mon Mar 30 2026 - 10:18:15 EST


On Fri, Mar 27, 2026 at 07:02:19PM +0100, Vincent Guittot wrote:
> Delayed dequeue feature aims to reduce the negative lag of a dequeued task
> while sleeping but it can happens that newly enqueued tasks will move
> backward the avg vruntime and increase its negative lag.
> When the delayed dequeued task wakes up, it has more neg lag compared to
> being dequeued immediately or to other delayed or not tasks that have been
> dequeued just before theses new enqueues.
>
> Ensure that the negative lag of a delayed dequeued task doesn't increase
> during its delayed dequeued phase while waiting for its neg lag to
> diseappear. Similarly, we remove any positive lag that the delayed
> dequeued task could have gain during thsi period.

*groan*, indeed!


> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 226509231e67..efa9dfa8c583 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> if (sched_feat(DELAY_DEQUEUE) && delay &&
> !entity_eligible(cfs_rq, se)) {
> update_load_avg(cfs_rq, se, 0);
> + update_entity_lag(cfs_rq, se);
> set_delayed(se);
> return false;
> }
> @@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
> WARN_ON_ONCE(!se->on_rq);
>
> if (sched_feat(DELAY_ZERO)) {
> + s64 vlag, prev_vlag = se->vlag;
> update_entity_lag(cfs_rq, se);
> - if (se->vlag > 0) {
> + /* prev_vlag < 0 otherwise se would not be delayed */
> + vlag = clamp(se->vlag, prev_vlag, 0);
> +
> + if (vlag != se->vlag) {
> cfs_rq->nr_queued--;
> if (se != cfs_rq->curr)
> __dequeue_entity(cfs_rq, se);
> - se->vlag = 0;
> + se->vlag = vlag;
> place_entity(cfs_rq, se, 0);
> if (se != cfs_rq->curr)
> __enqueue_entity(cfs_rq, se);

So I think this should apply irrespective of DELAY_ZERO -- although
since we default to that, it is the most relevant.

Also, I'm thinking we might want a comment with this. We're bound to
forget the details at some point.

Also, there is one other site that has DLAY_ZERO/clear_delayed(), does
that want updating too?