Re: [PATCH] sched/topology: Disable sched_asym_cpucapacity on domain destruction

From: Quentin Perret
Date: Mon Oct 14 2019 - 08:16:59 EST

Hi Valentin,

On Monday 14 Oct 2019 at 12:47:10 (+0100), Valentin Schneider wrote:
> While the static key is correctly initialized as being disabled, it will
> remain forever enabled once turned on. This means that if we start with an
> asymmetric system and hotplug out enough CPUs to end up with an SMP system,
> the static key will remain set - which is obviously wrong. We should detect
> this and turn off things like misfit migration and EAS wakeups.

FWIW we already clear the EAS static key properly (based on the sd
pointer, not the static key), so this is really only for the
capacity-aware stuff.

> Having that key enabled should also mandate
> per_cpu(sd_asym_cpucapacity, cpu) != NULL
> for all CPUs, but this is obviously not true with the above.
> On top of that, sched domain rebuilds first lead to attaching the NULL
> domain to the affected CPUs, which means there will be a window where the
> static key is set but the sd_asym_cpucapacity shortcut points to NULL even
> if asymmetry hasn't been hotplugged out.
> Disable the static key when destroying domains, and let
> build_sched_domains() (re) enable it as needed.
> Cc: <stable@xxxxxxxxxxxxxxx>
> Fixes: df054e8445a4 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations")
> Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> ---
> kernel/sched/topology.c | 2 ++
> 1 file changed, 2 insertions(+)
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index b5667a273bf6..c49ae57a0611 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2123,7 +2123,8 @@ static void detach_destroy_domains(const struct cpumask *cpu_map)
> {
> int i;
> + static_branch_disable_cpuslocked(&sched_asym_cpucapacity);
> +
> rcu_read_lock();
> for_each_cpu(i, cpu_map)
> cpu_attach_domain(NULL, &def_root_domain, i);

So what happens it you have mutiple root domains ? You might skip
build_sched_domains() for one of them and end up not setting the static
key when you should no ?

I suppose an alternative would be to play with static_branch_inc() /
static_branch_dec() from build_sched_domains() or something along those