Re: [PATCH 1/2] sched/topology: build_sched_groups: Skip duplicate group rewrites

From: Qais Yousef
Date: Wed Apr 10 2019 - 05:27:32 EST


On 04/09/19 18:35, Valentin Schneider wrote:
> While staring at build_sched_domains(), I realized that get_group()
> does several duplicate (thus useless) writes.
>
> If you take the Arm Juno r0 (LITTLEs = [0, 3, 4, 5], bigs = [1, 2]), the
> sched_group build flow would look like this:
>
> ('MC[cpu]->sg' means 'per_cpu_ptr(&tl->data->sg, cpu)' with 'tl == MC')
>
> build_sched_groups(MC[CPU0]->sd, CPU0)
> get_group(0) -> MC[CPU0]->sg
> get_group(3) -> MC[CPU3]->sg
> get_group(4) -> MC[CPU4]->sg
> get_group(5) -> MC[CPU5]->sg
>
> build_sched_groups(DIE[CPU0]->sd, CPU0)
> get_group(0) -> DIE[CPU0]->sg
> get_group(1) -> DIE[CPU1]->sg <-----------------+
> |
> build_sched_groups(MC[CPU1]->sd, CPU1) |
> get_group(1) -> MC[CPU1]->sg |
> get_group(2) -> MC[CPU2]->sg |
> |
> build_sched_groups(DIE[CPU1]->sd, CPU1) ^
> get_group(1) -> DIE[CPU1]->sg } We've set up these two up here!
> get_group(3) -> DIE[CPU0]->sg }
>
> From this point on, we will only use sched_groups that have been
> previously visited & initialized. The only new operation will
> be which group pointer we affect to sd->groups.
>
> On the Juno r0 we get 32 get_group() calls, every single one of them
> writing to a sched_group->cpumask. However, all of the data structures
> we need are set up after 8 visits (see above).
>
> Return early from get_group() if we've already visited (and thus
> initialized) the sched_group we're looking at. Overlapping domains
> are not affected as they do not use build_sched_groups().
>
> Tested on a Juno and a 2 * (Xeon E5-2690) system.
>
> Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> ---
> FWIW I initially checked the refs for both sg && sg->sgc, but figured if
> they weren't both 0 or > 1 then something must have gone wrong, so I
> threw in a WARN_ON().
> ---
> kernel/sched/topology.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 64bec54ded3e..6c0b7326f66e 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1059,6 +1059,7 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
> struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
> struct sched_domain *child = sd->child;
> struct sched_group *sg;
> + bool already_visited;
>
> if (child)
> cpu = cpumask_first(sched_domain_span(child));
> @@ -1066,9 +1067,14 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
> sg = *per_cpu_ptr(sdd->sg, cpu);
> sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
>
> - /* For claim_allocations: */
> - atomic_inc(&sg->ref);
> - atomic_inc(&sg->sgc->ref);
> + /* Increase refcounts for claim_allocations: */
> + already_visited = atomic_inc_return(&sg->ref) > 1;
> + /* sgc visits should follow a similar trend as sg */
> + WARN_ON(already_visited != (atomic_inc_return(&sg->sgc->ref) > 1));

NIT: I think it would be better to not have any side effect of calling
WARN_ON(). Ie: do the atomic_inc_return() outside the WARN_ON() condition.
Having two bool already_visited_sg and already_visited_sgc will make the code
more readable too.

Thanks

--
Qais Yousef

> +
> + /* If we have already visited that group, it's already initialized. */
> + if (already_visited)
> + return sg;
>
> if (child) {
> cpumask_copy(sched_group_span(sg), sched_domain_span(child));
> --
> 2.20.1
>