Re: [PATCH v2] sched: fair: Prevent negative lag increase during delayed dequeue

From: Vincent Guittot

Date: Thu Apr 23 2026 - 06:36:16 EST


On Thu, 23 Apr 2026 at 11:41, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 23, 2026 at 09:28:22AM +0200, Vincent Guittot wrote:
> > On Thu, 23 Apr 2026 at 00:20, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 22, 2026 at 04:28:28PM +0200, Peter Zijlstra wrote:
> > >
> > > > Let me ponder this a bit...
> > >
> > > Like this? Or am I still making a mess of things? AFAICT this is the
> > > exact same as your initial version.
> > >
> > > ---
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 69361c63353a..24e8c78b110a 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -847,13 +847,13 @@ static s64 entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se, u64 avrunt
> > >   * Similarly, check that the entity didn't gain positive lag when DELAY_ZERO
> > >   * is set.
> > >   *
> > > - * Return true if the lag has been adjusted.
> > > + * Return true if the lag of a delayed entity has been adjusted.
> > >   */
> > >  static __always_inline
> > >  bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > >  {
> > > -	s64 vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
> > > +	s64 vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
> > > -	bool ret;
> > > +	bool ret = false;
> > >
> > >  	WARN_ON_ONCE(!se->on_rq);
> > >
> > > @@ -862,8 +862,9 @@ bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > >  		vlag = max(vlag, se->vlag);
> > >  		if (sched_feat(DELAY_ZERO))
> > >  			vlag = min(vlag, 0);
> > > +
> > > +		ret = (vlag != se->vlag);
> >
> > No, this is not enough.
>
> Argh yes. I think I finally see. How about this then?

Yes, this looks good.

I have also launched more tests in case I missed something.

>
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 69361c63353a..f4d1457d1837 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -847,13 +847,19 @@ static s64 entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se, u64 avrunt
>   * Similarly, check that the entity didn't gain positive lag when DELAY_ZERO
>   * is set.
>   *
> - * Return true if the lag has been adjusted.
> + * Return true if the vlag has been modified. Specifically:
> + *
> + *   se->vlag != avg_vruntime() - se->vruntime
> + *
> + * This can be due to clamping in entity_lag() or clamping due to
> + * sched_delayed. Either way, when vlag is modified and the entity is
> + * retained, the tree needs to be adjusted.
>   */
>  static __always_inline
>  bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	s64 vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
> -	bool ret;
> +	u64 avruntime = avg_vruntime(cfs_rq);
> +	s64 vlag = entity_lag(cfs_rq, se, avruntime);
>
>  	WARN_ON_ONCE(!se->on_rq);
>
> @@ -863,10 +869,9 @@ bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  		if (sched_feat(DELAY_ZERO))
>  			vlag = min(vlag, 0);
>  	}
> -	ret = (vlag == se->vlag);
>  	se->vlag = vlag;
>
> -	return ret;
> +	return avruntime - vlag != se->vruntime;
>  }
>
>  /*