Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily

From: Valentin Schneider
Date: Wed Jan 06 2021 - 10:14:38 EST


On 06/01/21 14:34, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5428b8723e61..69a455113b10 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> raw_spin_unlock_irqrestore(&busiest->lock,
> flags);
> - env.flags |= LBF_ALL_PINNED;
> + if (busiest->nr_running == 1)
> + env.flags |= LBF_ALL_PINNED;

So LBF_ALL_PINNED *can* be set if busiest->nr_running > 1, because
before we get there we have:

    if (nr_running > 1) {
            env.flags |= LBF_ALL_PINNED;
            detach_tasks(&env); // Removes LBF_ALL_PINNED if > 0 tasks can be pulled
            ...
    }

What about following the logic used by detach_tasks() and only ever
clearing the flag? Something like the below: set the flag unconditionally
up front, and if nr_running > 1, then we'll have gone through
detach_tasks() and will have cleared the flag (if possible).
---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..211c86ba3f5b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9623,6 +9623,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
env.src_rq = busiest;

ld_moved = 0;
+ /* Clear this as soon as we find a single pullable task */
+ env.flags |= LBF_ALL_PINNED;
if (busiest->nr_running > 1) {
/*
* Attempt to move tasks. If find_busiest_group has found
@@ -9630,7 +9632,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
* still unbalanced. ld_moved simply stays zero, so it is
* correctly treated as an imbalance.
*/
- env.flags |= LBF_ALL_PINNED;
env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);

more_balance:
@@ -9756,10 +9757,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
raw_spin_unlock_irqrestore(&busiest->lock,
flags);
- env.flags |= LBF_ALL_PINNED;
goto out_one_pinned;
}

+ env.flags &= ~LBF_ALL_PINNED;
+
/*
* ->active_balance synchronizes accesses to
* ->active_balance_work. Once set, it's cleared
---
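
To make the intended flag behaviour easier to check, here is a rough
userspace model of the diff above. It is only a sketch, not kernel code:
struct model_rq, nr_pullable, curr_allows_this_cpu and
model_balance_flags() are hypothetical names made up for illustration,
and the value given to LBF_ALL_PINNED is arbitrary. It also assumes the
active balance check is always reached, whereas load_balance() only gets
there when nothing was moved; that does not change the final flag value
in the cases modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define LBF_ALL_PINNED 0x01 /* arbitrary value for this sketch */

/* Hypothetical, heavily simplified stand-in for the busiest runqueue. */
struct model_rq {
	unsigned int nr_running;   /* tasks on the busiest CPU */
	unsigned int nr_pullable;  /* tasks whose affinity allows this_cpu */
	bool curr_allows_this_cpu; /* models cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr) */
};

/* Returns the value env.flags would end up with in this model. */
static unsigned int model_balance_flags(const struct model_rq *busiest)
{
	unsigned int flags = 0;

	assert(busiest->nr_pullable <= busiest->nr_running);

	/* Assume everything is pinned; only ever clear from here on. */
	flags |= LBF_ALL_PINNED;

	if (busiest->nr_running > 1) {
		/* Models detach_tasks(): one pullable task clears the flag. */
		if (busiest->nr_pullable > 0)
			flags &= ~LBF_ALL_PINNED;
	}

	/* Models the active balance path. */
	if (!busiest->curr_allows_this_cpu)
		return flags; /* goto out_one_pinned: flag left untouched */

	/* The running task may move to this_cpu, so not everything is pinned. */
	flags &= ~LBF_ALL_PINNED;
	return flags;
}
```

With a single running task that is pinned away from this_cpu the flag
stays set, as before. With several tasks, it stays set only if
detach_tasks() found nothing pullable and the running task is pinned
too, which is the case the original patch was trying to single out.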