Re: [PATCH] sched/fair: Sync load_sum with load_avg after dequeue

From: Sachin Sant
Date: Fri Jul 02 2021 - 02:22:54 EST

> On 01-Jul-2021, at 10:48 PM, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>
> commit 9e077b52d86a ("sched/pelt: Check that *_avg are null when *_sum are")
> reported some inconsistencies between *_avg and *_sum.
>
> commit 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")
> fixed some of them, but one remains when dequeuing load.
>
> Sync the cfs_rq's load_sum with its load_avg after dequeuing the load of a
> sched_entity.
>
> Fixes: 9e077b52d86a ("sched/pelt: Check that *_avg are null when *_sum are")
> Reported-by: Sachin Sant <sachinp@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
>
> I have been able to trigger a WARN on my system even with the patch
> listed above. This patch fixes it.
> Sachin, could you test that it also fixes yours?
>

I ran various LTP stress tests, scheduler tests, and a kernel compile for about 5 hours.
I haven't seen the warning during the testing.

Tested-by: Sachin Sant <sachinp@xxxxxxxxxxxxxxxxxx>

I have left the tests running and will let them run for a few more hours.

Thanks
-Sachin

> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 11d22943753f..48fc7dfc2f66 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3037,8 +3037,9 @@ enqueue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  static inline void
>  dequeue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> +	u32 divider = get_pelt_divider(&se->avg);
>  	sub_positive(&cfs_rq->avg.load_avg, se->avg.load_avg);
> -	sub_positive(&cfs_rq->avg.load_sum, se_weight(se) * se->avg.load_sum);
> +	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
>  }
>  #else
>  static inline void
> --
> 2.17.1
>
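
For anyone following the thread, here is a minimal user-space sketch of the
difference between the old and the patched dequeue_load_avg(). It is not kernel
code: struct toy_avg, the toy_* helpers and all numbers used with them are
made up for illustration, and the divider is just a value passed in rather than
the result of get_pelt_divider(). It only shows that clamped, independent
subtraction can leave load_sum at zero while load_avg is still non-zero, whereas
re-deriving load_sum from load_avg cannot.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the two sched_avg fields touched by the patch (hypothetical). */
struct toy_avg {
	uint64_t load_sum;
	uint64_t load_avg;
};

/* Subtract with a floor at zero, mirroring the intent of sub_positive(). */
static inline void toy_sub_positive(uint64_t *val, uint64_t sub)
{
	*val = (*val > sub) ? *val - sub : 0;
}

/* Old behaviour: load_avg and load_sum are decremented independently. */
static inline void toy_dequeue_independent(struct toy_avg *rq,
					   uint64_t se_load_avg,
					   uint64_t se_weighted_sum)
{
	toy_sub_positive(&rq->load_avg, se_load_avg);
	toy_sub_positive(&rq->load_sum, se_weighted_sum);
}

/* Patched behaviour: load_sum is recomputed from the new load_avg. */
static inline void toy_dequeue_synced(struct toy_avg *rq,
				      uint64_t se_load_avg,
				      uint32_t divider)
{
	toy_sub_positive(&rq->load_avg, se_load_avg);
	rq->load_sum = rq->load_avg * divider;
}

/* Invariant of interest here: load_sum == 0 implies load_avg == 0. */
static inline int toy_consistent(const struct toy_avg *rq)
{
	return rq->load_sum != 0 || rq->load_avg == 0;
}
```

With a starting state of load_sum = 100000 and load_avg = 5 (arbitrary values),
removing an entity with load_avg = 3 and a weighted sum of 100000 through
toy_dequeue_independent() leaves load_sum = 0 and load_avg = 2, which breaks the
invariant. The same removal through toy_dequeue_synced() with a non-zero
divider keeps the two fields in step.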