Re: [tip:sched/core] sched/numa: Use effective_load() to balance NUMA loads
From: Rik van Riel
Date: Wed Jul 09 2014 - 12:03:10 EST
On 07/05/2014 06:44 AM, tip-bot for Rik van Riel wrote:
> Commit-ID: 6dc1a672ab15604947361dcd02e459effa09bad5
> Gitweb: http://git.kernel.org/tip/6dc1a672ab15604947361dcd02e459effa09bad5
> Author: Rik van Riel <riel@xxxxxxxxxx>
> AuthorDate: Mon, 23 Jun 2014 11:46:14 -0400
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Sat, 5 Jul 2014 11:17:35 +0200
>
> sched/numa: Use effective_load() to balance NUMA loads
>
> When CONFIG_FAIR_GROUP_SCHED is enabled, the load that a task places
> on a CPU is determined by the group the task is in. The active groups
> on the source and destination CPU can be different, resulting in a
> different load contribution by the same task at its source and at its
> destination. As a result, the load needs to be calculated separately
> for each CPU, instead of estimated once with task_h_load().
>
> Getting this calculation right allows some workloads to converge,
> where previously the last thread could get stuck on another node,
> without being able to migrate to its final destination.
Self-NAK
This patch should be reverted.
It turns out that the tree I am working on was missing
changeset a003a25b227d59ded9197ced109517f037d01c27,
which makes weighted_cpuload and update_numa_stats use
p->se.avg.load_avg_contrib instead of p->se.load.weight.
This means using effective_load(), which operates on
load.weight, is inappropriate.
Sorry for the noise.
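
To make the unit mismatch concrete, here is a minimal user-space sketch. It is not kernel code: `struct toy_task`, `runnable_pct` and every `toy_*` helper are hypothetical names, and the "average contribution" is modelled as weight scaled by a runnable percentage, which is only a rough stand-in for what load_avg_contrib tracks. It shows what can go wrong when a CPU's load is summed in averaged units but the delta for a moved task is taken from its static weight.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a scheduling entity. */
struct toy_task {
	long weight;		/* static weight, in the spirit of se.load.weight */
	long runnable_pct;	/* assumed share of time runnable, 0..100 */
};

/* Toy averaged contribution: static weight scaled by the runnable share. */
static long toy_avg_contrib(const struct toy_task *t)
{
	assert(t->runnable_pct >= 0 && t->runnable_pct <= 100);
	return t->weight * t->runnable_pct / 100;
}

/* CPU load accumulated in averaged units. */
static long toy_cpu_load(const struct toy_task *tasks, int n)
{
	long sum = 0;

	for (int i = 0; i < n; i++)
		sum += toy_avg_contrib(&tasks[i]);
	return sum;
}

/* Estimated load after moving t away, delta in the same averaged units. */
static long toy_load_after_move_avg(long cpu_load, const struct toy_task *t)
{
	return cpu_load - toy_avg_contrib(t);
}

/* Same estimate, but with the delta taken from the static weight. */
static long toy_load_after_move_weight(long cpu_load, const struct toy_task *t)
{
	return cpu_load - t->weight;
}
```

With two tasks of weight 1024 that are runnable 25% and 50% of the time, the toy CPU load is 768. Removing the first task in matching units leaves 512, while subtracting its static weight leaves -256, an estimate that no longer means anything.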
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index f287d0b..d6526d2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1151,6 +1151,7 @@ static void task_numa_compare(struct task_numa_env *env,
> struct rq *src_rq = cpu_rq(env->src_cpu);
> struct rq *dst_rq = cpu_rq(env->dst_cpu);
> struct task_struct *cur;
> + struct task_group *tg;
> long src_load, dst_load;
> long load;
> long imp = (groupimp > 0) ? groupimp : taskimp;
> @@ -1225,14 +1226,21 @@ static void task_numa_compare(struct task_numa_env *env,
> * In the overloaded case, try and keep the load balanced.
> */
> balance:
> - load = task_h_load(env->p);
> - dst_load = env->dst_stats.load + load;
> - src_load = env->src_stats.load - load;
> + src_load = env->src_stats.load;
> + dst_load = env->dst_stats.load;
> +
> + /* Calculate the effect of moving env->p from src to dst. */
> + load = env->p->se.load.weight;
> + tg = task_group(env->p);
> + src_load += effective_load(tg, env->src_cpu, -load, -load);
> + dst_load += effective_load(tg, env->dst_cpu, load, load);
>
> if (cur) {
> - load = task_h_load(cur);
> - dst_load -= load;
> - src_load += load;
> + /* Cur moves in the opposite direction. */
> + load = cur->se.load.weight;
> + tg = task_group(cur);
> + src_load += effective_load(tg, env->src_cpu, load, load);
> + dst_load += effective_load(tg, env->dst_cpu, -load, -load);
> }
>
> if (load_too_imbalanced(src_load, dst_load, env))
>
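
For anyone reading the quoted changelog without the scheduler source at hand, the sketch below illustrates why the same task can contribute a different load at its source and at its destination. It is a toy model under stated assumptions, not the kernel's effective_load(): it assumes a single-level task group whose weight on a CPU is the group's shares scaled by the fraction of the group's load queued on that CPU. The `toy_*` names and all parameters are hypothetical.

```c
#include <assert.h>

/*
 * Toy per-CPU group weight: the group's shares scaled by the fraction
 * of the group's total load that sits on this CPU. Assumed model only.
 */
static long toy_group_weight(long shares, long cpu_load, long total_load)
{
	assert(cpu_load >= 0 && total_load >= cpu_load);
	if (total_load == 0)
		return 0;
	return shares * cpu_load / total_load;
}

/*
 * Change in the group's weight on one CPU when a task of weight w is
 * added there (w > 0) or removed from there (w < 0). As in the quoted
 * hunk, each CPU is evaluated on its own, so the group total is
 * adjusted by the same w.
 */
static long toy_effective_delta(long shares, long cpu_load, long total_load,
				long w)
{
	long before = toy_group_weight(shares, cpu_load, total_load);
	long after = toy_group_weight(shares, cpu_load + w, total_load + w);

	return after - before;
}
```

Take a group with 1024 shares and three tasks of weight 1024, all on the source CPU. Removing one task from the source changes the group's weight there by toy_effective_delta(1024, 3072, 3072, -1024), which is 0, because the group still has all of its load on that CPU. Adding the same task to an empty destination gives toy_effective_delta(1024, 0, 3072, 1024), which is 256. A single up-front estimate of the task's load cannot capture both numbers.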