Re: [PATCH v2 6/7] sched/fair: Revert 6d71a9c61604 ("sched/fair: Fix EEVDF entity placement bug causing scheduling lag")

From: Peter Zijlstra

Date: Tue Apr 07 2026 - 09:56:04 EST


On Tue, Mar 24, 2026 at 10:01:26AM +0000, William Montaz wrote:
> Hi,
>
> > Zicheng Qu reported that, because avg_vruntime() always includes
> > cfs_rq->curr, when ->on_rq, place_entity() doesn't work right.
>
> > Specifically, the lag scaling in place_entity() relies on
> > avg_vruntime() being the state *before* placement of the new entity.
> > However in this case avg_vruntime() will actually already include the
> > entity, which breaks things.
>
> This has proven to be harmful on our production cluster using kernel version 6.18.19.
>
> I tested the following versions:
> * LTS 5.10.252, 5.15.202, 6.1.166, 6.6.129, 6.12.77 --> no issue
> * LTS 6.18.19 has the issue
> * Stable 6.19.9 has the issue
> * Mainline 7.0-rc5 has the issue
> * Tip 7.0.0-rc5+ no issue
>
> Finally, I applied the patch to 6.18.19 LTS, which solves the issue. However, we do not benefit from previous patches
> such as [PATCH v2 5/7] sched/fair: Increase weight bits for avg_vruntime.
>
> Thus I would prefer to let you decide how you want to address the backport to 6.18.
>
> If you want I can share my patch file, let me know.

I've (finally!) had a look at stable-6.18.y and yes, I think this can be
backported without too much issue.

Feel free to submit a backport to stable for this.