Re: [PATCH] sched: Fix numabalancing to work with isolated cpus

From: Mel Gorman
Date: Tue Apr 04 2017 - 16:37:34 EST


On Tue, Apr 04, 2017 at 10:57:28PM +0530, Srikar Dronamraju wrote:
> When performing load balancing, numabalancing only looks at
> task->cpus_allowed to see if the task can run on the target cpu. If
> isolcpus kernel parameter is set, then isolated cpus will not be part of
> mask task->cpus_allowed.
>
> For example: (On a Power 8 box running in smt 1 mode)
>
> isolcpus=56,64,72,80,88
>
> Cpus_allowed_list: 0-55,57-63,65-71,73-79,81-87,89-175
> /proc/20996/task/20996/status:Cpus_allowed_list: 0-55,57-63,65-71,73-79,81-87,89-175
> /proc/20996/task/20997/status:Cpus_allowed_list: 0-55,57-63,65-71,73-79,81-87,89-175
> /proc/20996/task/20998/status:Cpus_allowed_list: 0-55,57-63,65-71,73-79,81-87,89-175
>
> Note: offline cpus are excluded in cpus_allowed_list.
>
> However a task might call sched_setaffinity() that includes all possible
> cpus in the system including the isolated cpus.
>
> For example:
> perf bench numa mem --no-data_rand_walk -p 4 -t $THREADS -G 0 -P 3072 -T 0 -l 50 -c -s 1000
> would call sched_setaffinity that resets the cpus_allowed mask.
>
> Cpus_allowed_list: 0-55,57-63,65-71,73-79,81-87,89-175
> Cpus_allowed_list: 0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
> Cpus_allowed_list: 0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
> Cpus_allowed_list: 0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
> Cpus_allowed_list: 0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
>
> The isolated cpus are part of the cpus allowed list. In the above case,
> numabalancing ends up scheduling some of these tasks on isolated cpus.
>
> To avoid this, please check for isolated cpus before choosing a target
> cpu.
>
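
To make the check under discussion concrete, here is a minimal user-space
sketch, not the patch itself. It assumes the allowed and isolated CPUs are
given as list strings in the same format as the Cpus_allowed_list lines
quoted above. The names cpu_set_sketch, parse_cpu_list and
cpu_is_valid_target are hypothetical and do not exist in the kernel; the
kernel would operate on cpumasks rather than on strings.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 1024

/* Hypothetical user-space stand-in for a kernel cpumask. */
struct cpu_set_sketch {
	bool bit[MAX_CPUS];
};

/*
 * Parse a CPU list such as "0-55,57-63" into *set.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpu_list(const char *s, struct cpu_set_sketch *set)
{
	for (int i = 0; i < MAX_CPUS; i++)
		set->bit[i] = false;

	while (*s) {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi;

		if (end == s)
			return -1;
		hi = lo;
		if (*end == '-') {
			/* Range form "lo-hi". */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= MAX_CPUS)
			return -1;
		for (long c = lo; c <= hi; c++)
			set->bit[c] = true;

		if (*end == ',')
			end++;
		else if (*end != '\0')
			return -1;
		s = end;
	}
	return 0;
}

/*
 * A CPU is an acceptable balancing target only if the task may run
 * there AND the CPU is not isolated. Looking at the allowed set alone
 * is the behaviour described in the report above.
 */
static bool cpu_is_valid_target(int cpu,
				const struct cpu_set_sketch *allowed,
				const struct cpu_set_sketch *isolated)
{
	if (cpu < 0 || cpu >= MAX_CPUS)
		return false;
	return allowed->bit[cpu] && !isolated->bit[cpu];
}
```

With the allowed list set by perf bench in the example above and
isolcpus=56,64,72,80,88, this sketch accepts CPU 48 as a target and
rejects CPU 56, even though both appear in the allowed list.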

Hmm, would this also prevent a task running inside a cgroup that is
allowed access to isolated CPUs from balancing? I severely doubt it
matters because if a process is isolated from interference then it
follows that automatic NUMA balancing should not be involved. If
anything the protection should be absolute, but either way:

Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>

--
Mel Gorman
SUSE Labs