Re: [PATCH v2] sched/fair: fix sgc->{min,max}_capacity miscalculate

From: Peter Zijlstra
Date: Mon Jan 06 2020 - 05:28:36 EST


On Mon, Jan 06, 2020 at 10:25:49AM +0100, Dietmar Eggemann wrote:
> On 04/01/2020 14:08, Peng Liu wrote:
>
> Could you add a hint that this is about the SD_OVERLAP path? Something
> like 'Fix sgc->{min,max}_capacity calculation for SD_OVERLAP'
>
> > commit bf475ce0a3dd ("sched/fair: Add per-CPU min capacity to
> > sched_group_capacity") introduced per-cpu min_capacity.
> >
> > commit e3d6d0cb66f2 ("sched/fair: Add sched_group per-CPU max capacity")
> > introduced per-cpu max_capacity.
> >
> > Here, capacity is the accumulated sum of (maybe) many CPUs' capacity.
> > Compare with capacity to get {min,max}_capacity makes no sense. Instead,
> > we should compare one by one in each iteration to get
> > sgc->{min,max}_capacity of the group.
> >
> > Also, the only CPU in rq->sd->groups should be rq's CPU. Thus,
> > capacity_of(cpu_of(rq)) should be equal to rq->sd->groups->sgc->capacity.
> > Code can be simplified by removing the if/else.
>
> Could we improve the description of the issue and the change a little
> bit? Something like:
>
> In the SD_OVERLAP case, the local variable 'capacity' represents the sum
> of CPU capacity of all CPUs in the first sched group (sg) of the sched
> domain (sd).
>
> It is erroneously used to calculate sg's min and max CPU capacity.
> To fix this use capacity_of(cpu) instead of 'capacity'.
>
> The code which achieves this via cpu_rq(cpu)->sd->groups->sgc->capacity
> (for rq->sd != NULL) can be removed since it delivers the same value as
> capacity_of(cpu) which is currently only used for the (!rq->sd) case
> (see update_cpu_capacity()).
> A sg of the lowest sd (rq->sd or sd->child == NULL) represents a single
> CPU (and hence sg->sgc->capacity == capacity_of(cpu)).
>
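
For anyone following along, the difference between the two loops can be reproduced outside the kernel. This is only a minimal userspace sketch: struct group_cap, group_cap_cumulative() and group_cap_per_cpu() are made-up names, and the caps[] array stands in for capacity_of(cpu); none of this is kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the three values kept per sched group. */
struct group_cap {
	unsigned long capacity;		/* sum over all CPUs in the group */
	unsigned long min_capacity;	/* smallest single-CPU capacity */
	unsigned long max_capacity;	/* largest single-CPU capacity */
};

/*
 * Old behaviour: min/max are compared against the running sum, so they
 * track partial sums instead of individual CPU capacities.
 */
static struct group_cap group_cap_cumulative(const unsigned long *caps, size_t n)
{
	struct group_cap gc = { 0, ULONG_MAX, 0 };
	size_t cpu;

	assert(n > 0);
	for (cpu = 0; cpu < n; cpu++) {
		gc.capacity += caps[cpu];
		if (gc.capacity < gc.min_capacity)
			gc.min_capacity = gc.capacity;
		if (gc.capacity > gc.max_capacity)
			gc.max_capacity = gc.capacity;
	}
	return gc;
}

/*
 * Fixed behaviour: accumulate the sum, but compare min/max against the
 * capacity of the CPU visited in this iteration.
 */
static struct group_cap group_cap_per_cpu(const unsigned long *caps, size_t n)
{
	struct group_cap gc = { 0, ULONG_MAX, 0 };
	size_t cpu;

	assert(n > 0);
	for (cpu = 0; cpu < n; cpu++) {
		unsigned long cpu_cap = caps[cpu];

		gc.capacity += cpu_cap;
		if (cpu_cap < gc.min_capacity)
			gc.min_capacity = cpu_cap;
		if (cpu_cap > gc.max_capacity)
			gc.max_capacity = cpu_cap;
	}
	return gc;
}
```

With made-up capacities { 1024, 1024, 446, 446 }, both variants report a sum of 2940. The cumulative variant reports min 1024 and max 2940, while the per-CPU variant reports min 446 and max 1024, which is what a group of two big and two little CPUs should show.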

I've made it like so.

---
Subject: sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP
From: Peng Liu <iwtbavbm@xxxxxxxxx>
Date: Sat, 4 Jan 2020 21:08:28 +0800

commit bf475ce0a3dd ("sched/fair: Add per-CPU min capacity to
sched_group_capacity") introduced per-cpu min_capacity.

commit e3d6d0cb66f2 ("sched/fair: Add sched_group per-CPU max capacity")
introduced per-cpu max_capacity.

In the SD_OVERLAP case, the local variable 'capacity' represents the sum
of CPU capacity of all CPUs in the first sched group (sg) of the sched
domain (sd).

It is erroneously used to calculate sg's min and max CPU capacity.
To fix this, use capacity_of(cpu) instead of 'capacity'.

The code which achieves this via cpu_rq(cpu)->sd->groups->sgc->capacity
(for rq->sd != NULL) can be removed, since it delivers the same value as
capacity_of(cpu), which is currently only used for the (!rq->sd) case
(see update_cpu_capacity()).
An sg of the lowest sd (rq->sd or sd->child == NULL) represents a single
CPU (and hence sg->sgc->capacity == capacity_of(cpu)).

Signed-off-by: Peng Liu <iwtbavbm@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Link: https://lkml.kernel.org/r/20200104130828.GA7718@iZj6chx1xj0e0buvshuecpZ
---
kernel/sched/fair.c | 26 ++++----------------------
1 file changed, 4 insertions(+), 22 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7855,29 +7855,11 @@ void update_group_capacity(struct sched_
 		 */

 		for_each_cpu(cpu, sched_group_span(sdg)) {
-			struct sched_group_capacity *sgc;
-			struct rq *rq = cpu_rq(cpu);
+			unsigned long cpu_cap = capacity_of(cpu);

-			/*
-			 * build_sched_domains() -> init_sched_groups_capacity()
-			 * gets here before we've attached the domains to the
-			 * runqueues.
-			 *
-			 * Use capacity_of(), which is set irrespective of domains
-			 * in update_cpu_capacity().
-			 *
-			 * This avoids capacity from being 0 and
-			 * causing divide-by-zero issues on boot.
-			 */
-			if (unlikely(!rq->sd)) {
-				capacity += capacity_of(cpu);
-			} else {
-				sgc = rq->sd->groups->sgc;
-				capacity += sgc->capacity;
-			}
-
-			min_capacity = min(capacity, min_capacity);
-			max_capacity = max(capacity, max_capacity);
+			capacity += cpu_cap;
+			min_capacity = min(cpu_cap, min_capacity);
+			max_capacity = max(cpu_cap, max_capacity);
 		}
 	} else {
 		/*