Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group

From: Peter Zijlstra
Date: Mon Aug 21 2017 - 17:26:21 EST


On Mon, Aug 21, 2017 at 11:14:00PM +0200, Peter Zijlstra wrote:
> +static int
> +find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int sd_flag)
> +{
> + struct sched_domain *tmp;
> + int new_cpu = cpu;
> +
> + while (sd) {
> + struct sched_group *group;
> + int weight;
> +
> + if (!(sd->flags & sd_flag)) {
> + sd = sd->child;
> + continue;
> + }
> +
> + group = find_idlest_group(sd, p, cpu, sd_flag);
> + if (!group) {
> + sd = sd->child;
> + continue;
> + }
> +
> + new_cpu = find_idlest_group_cpu(group, p, cpu);
> + if (new_cpu == -1 || new_cpu == cpu) {
> + /* Now try balancing at a lower domain level of cpu */
> + sd = sd->child;
> + continue;
> + }
> +
> + /* Now try balancing at a lower domain level of new_cpu */
> + cpu = new_cpu;
> + weight = sd->span_weight;
> + sd = NULL;
> + for_each_domain(cpu, tmp) {
> + if (weight <= tmp->span_weight)
> + break;
> + if (tmp->flags & sd_flag)
> + sd = tmp;
> + }

This find-the-sd-for-another-cpu thing is horrific. And it has always
bugged me that the whole thing is O(n^2) to find a CPU.

I understand why it has this form, but scanning each CPU more than once
is just offensive.

> + /* while loop will break here if sd == NULL */
> + }
> +
> + return new_cpu;
> +}