Re: [PATCH] sched/fair: fix sgc->{min,max}_capacity miscalculate

From: Valentin Schneider
Date: Wed Jan 01 2020 - 00:57:50 EST


Hi Peng,

On 31/12/2019 03:51, Peng Liu wrote:
> commit bf475ce0a3dd ("sched/fair: Add per-CPU min capacity to
> sched_group_capacity") introduced per-cpu min_capacity.
>
> commit e3d6d0cb66f2 ("sched/fair: Add sched_group per-CPU max capacity")
> introduced per-cpu max_capacity.
>
> sgc->capacity is the *SUM* of all CPU's capacity in the group.
> sgc->{min,max}_capacity are the sg per-cpu variables. Compare with
> sgc->capacity to get sgc->{min,max}_capacity makes no sense. Instead,
> we should compare one by one in each iteration to get
> sgc->{min,max}_capacity of the group.
>

Worth noting this only affects the SD_OVERLAP case, the other case is fine
(I checked again just to be sure).

Now, on the bright side of things I don't think this currently causes any
harm. The {min,max}_capacity values are used in bits of code that only
gets run on topologies with asymmetric CPU Âarchs (SD_ASYM_CPUCAPACITY), and
I know of no such system that is also NUMA, i.e. end up with SD_OVERLAP
(here's hoping nobody gets any funny idea).

Still, nice find!

> Signed-off-by: Peng Liu <iwtbavbm@xxxxxxxxx>
> ---
> kernel/sched/fair.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2d170b5da0e3..97b164fcda93 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7795,6 +7795,7 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
> for_each_cpu(cpu, sched_group_span(sdg)) {
> struct sched_group_capacity *sgc;
> struct rq *rq = cpu_rq(cpu);
> + unsigned long cap;
>
> /*
> * build_sched_domains() -> init_sched_groups_capacity()
> @@ -7808,14 +7809,16 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
> * causing divide-by-zero issues on boot.
> */
> if (unlikely(!rq->sd)) {
> - capacity += capacity_of(cpu);
> + cap = capacity_of(cpu);
> + capacity += cap;
> + min_capacity = min(cap, min_capacity);
> + max_capacity = max(cap, max_capacity);
> } else {
> sgc = rq->sd->groups->sgc;
> capacity += sgc->capacity;
> + min_capacity = min(sgc->min_capacity, min_capacity);
> + max_capacity = max(sgc->max_capacity, max_capacity);
> }
> -
> - min_capacity = min(capacity, min_capacity);
> - max_capacity = max(capacity, max_capacity);
> }
> } else {
> /*
>

All that could be shortened like the below, no?

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 08a233e97a01..9f6c015639ef 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7773,8 +7773,8 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
*/

for_each_cpu(cpu, sched_group_span(sdg)) {
- struct sched_group_capacity *sgc;
struct rq *rq = cpu_rq(cpu);
+ unsigned long cpu_cap;

/*
* build_sched_domains() -> init_sched_groups_capacity()
@@ -7787,15 +7787,15 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
* This avoids capacity from being 0 and
* causing divide-by-zero issues on boot.
*/
- if (unlikely(!rq->sd)) {
- capacity += capacity_of(cpu);
- } else {
- sgc = rq->sd->groups->sgc;
- capacity += sgc->capacity;
- }
+ if (unlikely(!rq->sd))
+ cpu_cap = capacity_of(cpu);
+ else
+ cpu_cap = rq->sd->groups->sgc->capacity;
+
+ min_capacity = min(cpu_cap, min_capacity);
+ max_capacity = max(cpu_cap, max_capacity);

- min_capacity = min(capacity, min_capacity);
- max_capacity = max(capacity, max_capacity);
+ capacity += cpu_cap;
}
} else {
/*