Re: [PATCH 6/6] sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine

From: Peter Zijlstra
Date: Tue Feb 13 2018 - 09:01:49 EST


On Tue, Feb 13, 2018 at 01:37:30PM +0000, Mel Gorman wrote:
> +static void
> +update_wa_numa_placement(struct task_struct *p, int prev_cpu, int target)
> +{
> +	unsigned long interval;
> +
> +	if (!static_branch_likely(&sched_numa_balancing))
> +		return;
> +
> +	/* If balancing has no preference then continue gathering data */
> +	if (p->numa_preferred_nid == -1)
> +		return;
> +
> +	/*
> +	 * If the wakeup is not affecting locality then it is neutral from
> +	 * the perspective of NUMA balancing so continue gathering data.
> +	 */
> +	if (cpus_share_cache(prev_cpu, target))
> +		return;

Dang, I wanted to mention this before, but it slipped my mind. The
comment and code don't match.

Did you want to write:

	if (cpu_to_node(prev_cpu) == cpu_to_node(target))
		return;
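
For illustration, a minimal userspace sketch of how the two checks can
disagree. This is not kernel code: the topology tables and the mock_*
and wakeup_neutral_* helpers are hypothetical stand-ins for
cpus_share_cache() and cpu_to_node(), and the assumed machine has two
last-level-cache (LLC) domains per NUMA node.

```c
#include <stdbool.h>

/*
 * Hypothetical topology, for illustration only: 8 CPUs, 2 NUMA nodes,
 * 2 LLC domains per node.
 */
#define MOCK_NR_CPUS 8

static const int mock_cpu_llc[MOCK_NR_CPUS]  = { 0, 0, 1, 1, 2, 2, 3, 3 };
static const int mock_cpu_node[MOCK_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Stand-in for cpus_share_cache(): do both CPUs sit in the same LLC domain? */
static bool mock_cpus_share_cache(int a, int b)
{
	return mock_cpu_llc[a] == mock_cpu_llc[b];
}

/* Stand-in for cpu_to_node(): which NUMA node does this CPU belong to? */
static int mock_cpu_to_node(int cpu)
{
	return mock_cpu_node[cpu];
}

/* The check as posted: neutral only if prev_cpu and target share a cache. */
static bool wakeup_neutral_by_cache(int prev_cpu, int target)
{
	return mock_cpus_share_cache(prev_cpu, target);
}

/* The check as suggested: neutral if prev_cpu and target are on one node. */
static bool wakeup_neutral_by_node(int prev_cpu, int target)
{
	return mock_cpu_to_node(prev_cpu) == mock_cpu_to_node(target);
}
```

With this table, a wakeup that moves a task from CPU 0 to CPU 2 crosses
LLC domains but stays on node 0, so the cache-based check does not
treat it as neutral while the node-based check does.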

> +	/*
> +	 * Temporarily prevent NUMA balancing trying to place waker/wakee after
> +	 * wakee has been moved by wake_affine. This will potentially allow
> +	 * related tasks to converge and update their data placement. The
> +	 * 4 * numa_scan_period is to allow the two-pass filter to migrate
> +	 * hot data to the waker's node.
> +	 */
> +	interval = max(sysctl_numa_balancing_scan_delay,
> +		       p->numa_scan_period << 2);
> +	p->numa_migrate_retry = jiffies + msecs_to_jiffies(interval);
> +
> +	interval = max(sysctl_numa_balancing_scan_delay,
> +		       current->numa_scan_period << 2);
> +	current->numa_migrate_retry = jiffies + msecs_to_jiffies(interval);
> +}
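
The retry arithmetic in the quoted hunk can be sketched in userspace as
follows. This is only an approximation under stated assumptions:
MOCK_HZ, mock_msecs_to_jiffies() and wa_retry_deadline() are
hypothetical names, the tick rate of 250 is an assumed value, and the
millisecond conversion is a simplified round-up rather than the
kernel's msecs_to_jiffies().

```c
/* Assumed tick rate for the sketch; the real value is a kernel config choice. */
#define MOCK_HZ 250UL

/* Simplified stand-in for msecs_to_jiffies(): round up to whole ticks. */
static unsigned long mock_msecs_to_jiffies(unsigned long ms)
{
	return (ms * MOCK_HZ + 999UL) / 1000UL;
}

/*
 * Deadline before which placement is not retried:
 *   now + max(scan_delay, 4 * scan_period), converted from ms to jiffies.
 * 'now' plays the role of jiffies; both time arguments are in milliseconds.
 */
static unsigned long wa_retry_deadline(unsigned long now,
				       unsigned long scan_delay_ms,
				       unsigned long scan_period_ms)
{
	unsigned long interval = scan_period_ms << 2;	/* 4 * scan period */

	if (interval < scan_delay_ms)
		interval = scan_delay_ms;

	return now + mock_msecs_to_jiffies(interval);
}
```

For example, with a scan delay of 1000 ms and a scan period of 1000 ms,
the interval is 4000 ms, which at the assumed 250 ticks per second is
1000 jiffies past 'now'. With a scan period of 100 ms the scan delay
dominates and the deadline is 250 jiffies past 'now'.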