Re: [PATCH v2] sched/fair: fix sgc->{min,max}_capacity miscalculate

From: Dietmar Eggemann
Date: Mon Jan 06 2020 - 04:25:54 EST


On 04/01/2020 14:08, Peng Liu wrote:

Could you add a hint that this is about the SD_OVERLAP path? Something
like 'Fix sgc->{min,max}_capacity calculation for SD_OVERLAP'

> commit bf475ce0a3dd ("sched/fair: Add per-CPU min capacity to
> sched_group_capacity") introduced per-cpu min_capacity.
>
> commit e3d6d0cb66f2 ("sched/fair: Add sched_group per-CPU max capacity")
> introduced per-cpu max_capacity.
>
> Here, capacity is the accumulated sum of (maybe) many CPUs' capacity.
> Compare with capacity to get {min,max}_capacity makes no sense. Instead,
> we should compare one by one in each iteration to get
> sgc->{min,max}_capacity of the group.
>
> Also, the only CPU in rq->sd->groups should be rq's CPU. Thus,
> capacity_of(cpu_of(rq)) should be equal to rq->sd->groups->sgc->capacity.
> Code can be simplified by removing the if/else.

Could we improve the description of the issue and the change a little
bit? Something like:

In the SD_OVERLAP case, the local variable 'capacity' represents the sum
of CPU capacity of all CPUs in the first sched group (sg) of the sched
domain (sd).

It is erroneously used to calculate sg's min and max CPU capacity.
To fix this use capacity_of(cpu) instead of 'capacity'.

The code which achieves this via cpu_rq(cpu)->sd->groups->sgc->capacity
(for rq->sd != NULL) can be removed since it delivers the same value as
capacity_of(cpu) which is currently only used for the (!rq->sd) case
(see update_cpu_capacity()).
A sg of the lowest sd (rq->sd or sd->child == NULL) represents a single
CPU (and hence sg->sgc->capacity == capacity_of(cpu)).

> Signed-off-by: Peng Liu <iwtbavbm@xxxxxxxxx>
> Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> ---
> v1: https://lkml.org/lkml/2019/12/30/502
>
> kernel/sched/fair.c | 26 ++++----------------------
> 1 file changed, 4 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2d170b5da0e3..e14698a8ee38 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7793,29 +7793,11 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
> */
>
> for_each_cpu(cpu, sched_group_span(sdg)) {
> - struct sched_group_capacity *sgc;
> - struct rq *rq = cpu_rq(cpu);
> + unsigned long cpu_cap = capacity_of(cpu);
>
> - /*
> - * build_sched_domains() -> init_sched_groups_capacity()
> - * gets here before we've attached the domains to the
> - * runqueues.
> - *
> - * Use capacity_of(), which is set irrespective of domains
> - * in update_cpu_capacity().
> - *
> - * This avoids capacity from being 0 and
> - * causing divide-by-zero issues on boot.
> - */
> - if (unlikely(!rq->sd)) {
> - capacity += capacity_of(cpu);
> - } else {
> - sgc = rq->sd->groups->sgc;
> - capacity += sgc->capacity;
> - }
> -
> - min_capacity = min(capacity, min_capacity);
> - max_capacity = max(capacity, max_capacity);
> + min_capacity = min(cpu_cap, min_capacity);
> + max_capacity = max(cpu_cap, max_capacity);
> + capacity += cpu_cap;

Nit: Why not

+ capacity += cpu_cap;
+ min_capacity = min(cpu_cap, min_capacity);
+ max_capacity = max(cpu_cap, max_capacity);

like in the !SD_OVERLAP path?

> }
> } else {
> /*
>