Re: [PATCH 1/3] sched/fair: Add tg_load_contrib cfs_rq decay checking
From: Vincent Guittot
Date: Tue May 25 2021 - 05:58:58 EST
On Tue, 18 May 2021 at 14:54, Odin Ugedal <odin@xxxxxxx> wrote:
>
> Make sure cfs_rq does not contribute to task group load avg when
> checking if it is decayed. Due to how the pelt tracking works,
> the divider can result in a situation where:
>
> cfs_rq->avg.load_sum = 0
> cfs_rq->avg.load_avg = 4
Could you give more details about how cfs_rq->avg.load_avg can be 4 while
cfs_rq->avg.load_sum is 0?
cfs_rq->avg.load_sum is decayed, and can reach zero, when crossing a
period, which implies an update of cfs_rq->avg.load_avg. This means
that your case is generated by something outside the pelt formula,
maybe the propagation of load in the tree. If this is the case,
we should find the error and fix it.
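To make the period-crossing argument concrete, here is a minimal userspace
sketch. It is a toy model, not the kernel code: struct toy_avg, toy_decay()
and toy_update() are hypothetical names, the per-period decay is approximated
as 1002/1024, and the divider is assumed to follow
LOAD_AVG_MAX - 1024 + period_contrib.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PERIOD       1024u   /* one PELT period, in toy time units */
#define TOY_LOAD_AVG_MAX 47742u  /* assumed to match the kernel's LOAD_AVG_MAX */

/* Hypothetical stand-in for the load part of struct sched_avg. */
struct toy_avg {
	uint64_t load_sum;
	unsigned long load_avg;
	uint32_t period_contrib;  /* time already accumulated in the current period */
};

/* Approximate decay: multiply by ~0.9785 (1002/1024) once per period. */
static uint64_t toy_decay(uint64_t sum, uint32_t periods)
{
	while (periods--)
		sum = sum * 1002 / 1024;
	return sum;
}

/*
 * Advance an idle toy_avg (no new load) by delta time units.
 * Returns the number of period boundaries crossed. load_sum is decayed
 * and load_avg recalculated only when that number is non-zero.
 */
static uint32_t toy_update(struct toy_avg *sa, uint32_t delta)
{
	uint32_t total, periods;

	assert(sa);
	total = sa->period_contrib + delta;
	periods = total / TOY_PERIOD;
	sa->period_contrib = total % TOY_PERIOD;

	if (!periods)
		return 0;  /* no boundary crossed: sum and avg are both left alone */

	sa->load_sum = toy_decay(sa->load_sum, periods);
	sa->load_avg = sa->load_sum /
		       (TOY_LOAD_AVG_MAX - TOY_PERIOD + sa->period_contrib);
	return periods;
}
```

In this model, decay alone always refreshes both fields together. If
something other than toy_update() clears load_sum, load_avg keeps its old
value until the next period boundary is crossed.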
> cfs_rq->tg_load_avg_contrib = 4
>
> If pelt tracking in this case does not cross a period, there is no
> "change" in load_sum, and therefore load_avg is not recalculated, and
> keeps its value.
>
> If this cfs_rq is then removed from the leaf list, it results in a
> situation where the load is never removed from the tg. If that happens,
> the fairness is permanently skewed.
>
> Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
> Signed-off-by: Odin Ugedal <odin@xxxxxxx>
> ---
> kernel/sched/fair.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3248e24a90b0..ceda53c2a87a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8004,6 +8004,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
>  	if (cfs_rq->avg.runnable_sum)
>  		return false;
>
> +	if (cfs_rq->tg_load_avg_contrib)
> +		return false;
> +
>  	return true;
>  }
>
> --
> 2.31.1
>
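For reference, a small userspace sketch of what the extra check changes.
The struct and function names are hypothetical stand-ins, and only the
fields tested by cfs_rq_is_decayed() are modelled, so this is an
illustration under those assumptions rather than the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for the fields the check looks at. */
struct toy_cfs_rq {
	unsigned long weight;               /* stands in for load.weight */
	uint64_t load_sum;                  /* stands in for avg.load_sum */
	uint64_t util_sum;                  /* stands in for avg.util_sum */
	uint64_t runnable_sum;              /* stands in for avg.runnable_sum */
	unsigned long tg_load_avg_contrib;  /* load last published to the tg */
};

/* Check without the patch: only the weight and the PELT sums matter. */
static bool toy_is_decayed_old(const struct toy_cfs_rq *rq)
{
	if (rq->weight)
		return false;
	if (rq->load_sum)
		return false;
	if (rq->util_sum)
		return false;
	if (rq->runnable_sum)
		return false;
	return true;
}

/* Check with the patch: a non-zero published contribution keeps it alive. */
static bool toy_is_decayed_new(const struct toy_cfs_rq *rq)
{
	if (!toy_is_decayed_old(rq))
		return false;
	if (rq->tg_load_avg_contrib)
		return false;
	return true;
}
```

With all sums at zero and tg_load_avg_contrib = 4, the old check reports
the cfs_rq as decayed, so it can leave the leaf list with its contribution
still counted in the tg, while the new check keeps it on the list.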