Re: [PATCH 1/4] sched/fair: make sure to try to detach at least one movable task

From: Vincent Guittot
Date: Tue Sep 13 2022 - 04:54:25 EST


On Mon, 12 Sept 2022 at 10:44, Dietmar Eggemann
<dietmar.eggemann@xxxxxxx> wrote:
>
> On 25/08/2022 14:27, Vincent Guittot wrote:
>
> s/sched/fair: make/sched/fair: Make
>
> > During load balance, we try at most env->loop_max times to move a task.
> > But it can happen that the loop_max LRU tasks (i.e. the tail of
> > the cfs_tasks list) can't be moved to dst_cpu because of affinity.
> > In this case, keep looping through the list until we find at least one.
> >
> > The maximum number of detached tasks remains the same as before.
>
> Not sure how this relates to the patch? Isn't this given by the
> `env->imbalance <= 0` check at the end of detach_tasks()?

The number of detached tasks can't be higher than loop_max in
detach_tasks(), and that remains true with this patch because we
continue to loop past loop_max only if we haven't yet found a task that
can move to the destination CPU.
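
To make the bound concrete, here is a simplified user-space sketch of the
patched exit condition. It is not the kernel code: toy_env,
toy_detach_tasks() and the movable[] array are made up for illustration,
and it assumes LBF_ALL_PINNED is set on entry and cleared as soon as one
task is allowed on the destination CPU. The imbalance, nr_migrate and
list-rotation handling of the real detach_tasks() is left out.

```c
#include <stdbool.h>

#define LBF_ALL_PINNED 0x01 /* illustrative value, not taken from fair.c */

/* Hypothetical, stripped-down stand-in for struct lb_env. */
struct toy_env {
	unsigned int flags;
	unsigned int loop;
	unsigned int loop_max;
};

/*
 * Scan tasks from the tail (index nr_tasks - 1) towards the head.
 * movable[i] says whether task i is allowed on the destination CPU.
 * Returns the number of tasks that would have been detached.
 */
static int toy_detach_tasks(struct toy_env *env, const bool *movable,
			    int nr_tasks)
{
	int detached = 0;
	int i;

	for (i = nr_tasks - 1; i >= 0; i--) {
		env->loop++;
		/*
		 * Past loop_max we stop, unless no movable task has
		 * been seen yet (LBF_ALL_PINNED still set).
		 */
		if (env->loop > env->loop_max &&
		    !(env->flags & LBF_ALL_PINNED))
			break;

		if (!movable[i])
			continue; /* pinned: look at the next task */

		/* Found a movable task: clear the flag and detach it. */
		env->flags &= ~LBF_ALL_PINNED;
		detached++;
	}

	return detached;
}
```

With loop_max = 2 and five tasks whose three tail entries are pinned,
the sketch keeps scanning past loop_max, detaches exactly one task and
then stops on the next iteration. With five movable tasks it stops after
detaching two, so the count never exceeds loop_max.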

>
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index da388657d5ac..02b7b808e186 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8052,8 +8052,12 @@ static int detach_tasks(struct lb_env *env)
> > p = list_last_entry(tasks, struct task_struct, se.group_node);
> >
> > env->loop++;
> > - /* We've more or less seen every task there is, call it quits */
> > - if (env->loop > env->loop_max)
> > + /*
> > + * We've more or less seen every task there is, call it quits
>
> I never understood this `more or less`. Either we have seen all tasks or
> not?
>
> > + * unless we haven't found any movable task yet.
> > + */
> > + if (env->loop > env->loop_max &&
> > + !(env->flags & LBF_ALL_PINNED))
> > break;
> >
> > /* take a breather every nr_migrate tasks */
> > @@ -10182,7 +10186,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >
> > if (env.flags & LBF_NEED_BREAK) {
> > env.flags &= ~LBF_NEED_BREAK;
> > - goto more_balance;
> > + /* Stop if we tried all running tasks */
>
> Would say s/running/runnable but I see that we do use running/runnable
> interchangeably.
>
> > + if (env.loop < busiest->nr_running)
> > + goto more_balance;
> > }
> >
> > /*
>
> IMHO, there will be some interaction with the `All tasks on this
> runqueue were pinned by CPU affinity` check at the end of load_balance()?