Re: [PATCH 7/7] sched/fair: don't wake affine recently load balanced tasks

From: Brendan Jackman
Date: Tue Aug 01 2017 - 06:54:43 EST


Hi Josef,

I happened to be thinking about something like this while investigating
a totally different issue with ARM big.LITTLE. Comment below...

On Fri, Jul 14 2017 at 13:21, Josef Bacik wrote:
> From: Josef Bacik <jbacik@xxxxxx>
>
> The wake affinity logic will move tasks between two cpu's that appear to be
> loaded equally at the current time, with a slight bias towards cache locality.
> However on a heavily loaded system the load balancer has a better insight into
> what needs to be moved around, so instead keep track of the last time a task was
> migrated by the load balancer. If it was recent, opt to let the process stay on
> it's current CPU (or an idle sibling).
>
> Signed-off-by: Josef Bacik <jbacik@xxxxxx>
> ---
> include/linux/sched.h | 1 +
> kernel/sched/fair.c | 11 +++++++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 1a0eadd..d872780 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -528,6 +528,7 @@ struct task_struct {
> unsigned long wakee_flip_decay_ts;
> struct task_struct *last_wakee;
>
> + unsigned long last_balance_ts;
> int wake_cpu;
> #endif
> int on_rq;
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 034d5df..6a98a38 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5604,6 +5604,16 @@ static int wake_wide(struct task_struct *p)
> unsigned int slave = p->wakee_flips;
> int factor = this_cpu_read(sd_llc_size);
>
> + /*
> + * If we've balanced this task recently we don't want to undo all of
> + * that hard work by the load balancer and move it to the current cpu.
> + * Constantly overriding the load balancers decisions is going to make
> + * it question its purpose in life and give it anxiety and self worth
> + * issues, and nobody wants that.
> + */
> + if (time_before(jiffies, p->last_balance_ts + HZ))
> + return 1;
> +
> if (master < slave)
> swap(master, slave);
> if (slave < factor || master < slave * factor)
> @@ -7097,6 +7107,7 @@ static int detach_tasks(struct lb_env *env)
> goto next;
>
> detach_task(p, env);
> + p->last_balance_ts = jiffies;

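I guess this timestamp should also be set in the active balance path
(active_load_balance_cpu_stop())? Tasks moved there are migrated by the
load balancer too, so wake_wide() should treat them the same way.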
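
To make that concrete, here is a rough userspace model, not kernel code.
It assumes that both the periodic path and the active balance path end up
calling a common detach_task() helper (that is my reading of fair.c, worth
double-checking), in which case stamping the task there would cover both.
struct task, periodic_balance(), active_balance(), recently_balanced() and
time_before_ul(), the HZ value, and the dst_cpu argument are all made up
for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel definitions; the HZ value is arbitrary. */
#define HZ 250UL
static unsigned long jiffies;

/* Same wraparound-safe idiom as the kernel's time_before(a, b). */
static bool time_before_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical, stripped-down stand-in for struct task_struct. */
struct task {
	int cpu;
	unsigned long last_balance_ts;
};

/*
 * Assumed to be shared by both balance paths, so the timestamp is
 * recorded here rather than in each caller.
 */
static void detach_task(struct task *p, int dst_cpu)
{
	assert(p != NULL);
	p->cpu = dst_cpu;
	p->last_balance_ts = jiffies;
}

/* Models the periodic load balance path (detach_tasks()). */
static void periodic_balance(struct task *p, int dst_cpu)
{
	detach_task(p, dst_cpu);
}

/* Models the active balance path. */
static void active_balance(struct task *p, int dst_cpu)
{
	detach_task(p, dst_cpu);
}

/* Models the check the patch adds at the top of wake_wide(). */
static bool recently_balanced(const struct task *p)
{
	assert(p != NULL);
	return time_before_ul(jiffies, p->last_balance_ts + HZ);
}
```

With the stamp in the shared helper, a task moved by either path reports
recently_balanced() as true for the next HZ jiffies (about one second) and
false afterwards, including when jiffies wraps in between.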

> list_add(&p->se.group_node, &env->tasks);
>
> detached++;

Cheers,
Brendan