Re: [PATCH v3 1/2] sched/topology: Don't try to build empty sched domains

From: Valentin Schneider
Date: Tue Oct 22 2019 - 08:46:36 EST


On 22/10/2019 12:43, Dietmar Eggemann wrote:
> First I thought we could get away with a little less drama by only
> preventing arch_scale_cpu_capacity() from being called with a
> CPU id >= nr_cpu_ids.
>
> @@ -1894,6 +1894,9 @@ static struct sched_domain_topology_level
> 	struct sched_domain_topology_level *tl, *asym_tl = NULL;
> 	unsigned long cap;
>
> +	if (cpumask_empty(cpu_map))
> +		return NULL;
> +
>
> Until I tried to hp CPU4 back in after CPU4/5 had been hp'ed out (your
> example further below) and got another:
>
> [ 68.014564] Unable to handle kernel paging request at virtual address fffe8009903d8ee0
> ...
> [ 68.191293] Call trace:
> [ 68.193712] partition_sched_domains_locked+0x1a4/0x4a0
> [ 68.198882] rebuild_sched_domains_locked+0x4d0/0x7b0
> [ 68.203880] rebuild_sched_domains+0x24/0x40
> [ 68.208104] cpuset_hotplug_workfn+0xe0/0x5f8
> ...
>
> @@ -2213,6 +2216,11 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
> 	 * will be recomputed in function
> 	 * update_tasks_root_domain().
> 	 */
> +	if (cpumask_empty(doms_cur[i]))
> +		printk("doms_cur[%d] empty\n", i);
> +
> 	rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
>
> doms_cur[i] is empty when hp'ing in CPU4 again.
>
> Your patch fixes this as well.
>

Thanks for giving it a spin!
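
FWIW the bogus virtual address in that oops is exactly what you'd expect
from the hunk you quoted: cpumask_any() on an empty doms_cur[i] returns
nr_cpu_ids, cpu_rq() then resolves to a per-CPU address past the end of
the runqueues area, and dereferencing ->rd on that faults. Roughly (a
sketch reconstructed from the current code, not a verbatim quote):

	/*
	 * With an empty doms_cur[i]:
	 *   cpumask_any() -> nr_cpu_ids (no bit set),
	 *   cpu_rq()      -> per_cpu(runqueues, nr_cpu_ids), i.e. an
	 *                    out-of-range per-CPU address,
	 *   ->rd          -> dereference of a bogus pointer, hence the
	 *                    "Unable to handle kernel paging request".
	 */
	rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
	dl_clear_root_domain(rd);

So whatever check we add has to run before we ever index anything with the
result of a cpumask op on a potentially empty mask.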

> Might be worth noting that this is not only about asym CPU capacity
> handling but also about missing checks after cpumask operations when the
> cpuset ends up empty.

Aye, we end up saving whatever we're given (doms_cur = doms_new at the end
of the rebuild). As you pointed out, this is also an issue for the operation
done by

f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting information")

but it was introduced after the asymmetry check, which is why I'm tagging
the latter for stable.
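
To spell out why that check is the older offender: with an empty cpu_map,
the very first thing asym_cpu_capacity_level() does already reads out of
bounds, before any topology level is looked at. Sketch of the relevant
lines (reconstructed from memory, not a verbatim quote):

  static struct sched_domain_topology_level
  *asym_cpu_capacity_level(const struct cpumask *cpu_map)
  {
	struct sched_domain_topology_level *tl, *asym_tl = NULL;
	unsigned long cap;

	/*
	 * cpumask_first() on an empty cpu_map returns nr_cpu_ids, so
	 * arch_scale_cpu_capacity() reads per-CPU capacity data out of
	 * bounds - the access your first hunk short-circuits.
	 */
	cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));
	...
  }

Both that and the deadline accounting path boil down to trusting the result
of a cpumask operation on a mask that can now be empty, which is your point
above.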