Re: [PATCH 1/4 v2] sched/fair: relax constraint on task's load during load balance

From: Valentin Schneider
Date: Wed Sep 23 2020 - 10:43:45 EST



On 21/09/20 08:24, Vincent Guittot wrote:
> Some use cases, like 9 always-running tasks on 8 CPUs, can't be balanced,
> and the load balancer currently migrates the waiting task between the CPUs
> in an almost random manner. Whether a rq succeeds in pulling a task depends
> on the value of nr_balance_failed of its domains and on its ability to
> detach the task faster than the others. This behavior results in an unfair
> distribution of running time between tasks: some CPUs will run the same
> task most of the time, if not always, whereas others will share their time
> between several tasks.
>
> Instead of using nr_balance_failed as a boolean to relax the condition
> for detaching a task, the LB will use nr_balance_failed to relax the
> threshold between the task's load and the imbalance. This mechanism
> prevents the same rq or domain from always winning the load balance fight.
>
> Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>

Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
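
For readers following the changelog, here is a minimal userspace sketch of
the idea it describes: the failure count scales down the task's load before
it is compared with the imbalance, instead of acting as an on/off switch.
It is an illustration under stated assumptions, not the code from the patch.
The helper names shift_bounded() and can_detach() are hypothetical, and
halving the load per failed attempt (a right shift) is an assumed way of
"relaxing the threshold"; the changelog does not spell out the exact formula.

```c
#include <limits.h>

/*
 * Hypothetical helper: right-shift val by shift, clamping the shift
 * count so it never reaches the width of the type (which would be
 * undefined behavior in C).
 */
static unsigned long shift_bounded(unsigned long val, unsigned int shift)
{
	unsigned int max_shift = sizeof(val) * CHAR_BIT - 1;

	return val >> (shift < max_shift ? shift : max_shift);
}

/*
 * Hypothetical check: return 1 if the task may be detached, 0 otherwise.
 *
 * Assumption: each failed balance attempt halves the load that is
 * compared against the imbalance, so a task that is too heavy to move
 * at first becomes eligible after enough failures.
 */
static int can_detach(unsigned long task_load, unsigned long imbalance,
		      unsigned int nr_balance_failed)
{
	return shift_bounded(task_load, nr_balance_failed) <= imbalance;
}
```

With a task load of 1024 and an imbalance of 100, can_detach() rejects the
task for failure counts 0 through 3 (the scaled loads are 1024, 512, 256 and
128) and accepts it at 4 (scaled load 64). Because the threshold loosens
gradually with each domain's own failure count, no single rq or domain gets
a fixed advantage once an arbitrary cutoff is crossed.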