Re: [PATCH v3 04/10] sched/fair: rework load_balance

From: Vincent Guittot
Date: Wed Oct 16 2019 - 07:57:07 EST


On Wed, 16 Oct 2019 at 09:21, Parth Shah <parth@xxxxxxxxxxxxx> wrote:
>
>
>
> On 9/19/19 1:03 PM, Vincent Guittot wrote:
>
> [...]
>
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 585 ++++++++++++++++++++++++++++++++++------------------
> > 1 file changed, 380 insertions(+), 205 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 017aad0..d33379c 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7078,11 +7078,26 @@ static unsigned long __read_mostly max_load_balance_interval = HZ/10;
> >
> > enum fbq_type { regular, remote, all };
> >
> > +/*
> > + * group_type describes the group of CPUs at the moment of the load balance.
> > + * The enum is ordered by pulling priority, with the group with lowest priority
> > + * first so the groupe_type can be simply compared when selecting the busiest
> > + * group. see update_sd_pick_busiest().
> > + */
> > enum group_type {
> > - group_other = 0,
> > + group_has_spare = 0,
> > + group_fully_busy,
> > group_misfit_task,
> > + group_asym_packing,
> > group_imbalanced,
> > - group_overloaded,
> > + group_overloaded
> > +};
> > +
> > +enum migration_type {
> > + migrate_load = 0,
> > + migrate_util,
> > + migrate_task,
> > + migrate_misfit
> > };
> >
> > #define LBF_ALL_PINNED 0x01
> > @@ -7115,7 +7130,7 @@ struct lb_env {
> > unsigned int loop_max;
> >
> > enum fbq_type fbq_type;
> > - enum group_type src_grp_type;
> > + enum migration_type balance_type;
> > struct list_head tasks;
> > };
> >
> > @@ -7347,7 +7362,7 @@ static int detach_tasks(struct lb_env *env)
> > {
> > struct list_head *tasks = &env->src_rq->cfs_tasks;
> > struct task_struct *p;
> > - unsigned long load;
> > + unsigned long util, load;
> > int detached = 0;
> >
> > lockdep_assert_held(&env->src_rq->lock);
> > @@ -7380,19 +7395,53 @@ static int detach_tasks(struct lb_env *env)
> > if (!can_migrate_task(p, env))
> > goto next;
> >
> > - load = task_h_load(p);
> > + switch (env->balance_type) {
> > + case migrate_load:
> > + load = task_h_load(p);
> >
> > - if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
> > - goto next;
> > + if (sched_feat(LB_MIN) &&
> > + load < 16 && !env->sd->nr_balance_failed)
> > + goto next;
> >
> > - if ((load / 2) > env->imbalance)
> > - goto next;
> > + if ((load / 2) > env->imbalance)
> > + goto next;
> > +
> > + env->imbalance -= load;
> > + break;
> > +
> > + case migrate_util:
> > + util = task_util_est(p);
> > +
> > + if (util > env->imbalance)
>
> Can you please explain what would happen for
> `if (util/2 > env->imbalance)` ?
> just like when migrating load, even util shouldn't be migrated if
> env->imbalance is just near the utilization of the task being moved, isn't it?

I have chosen uti and not util/2 to be conservative because
migrate_util is used to fill spare capacity.
With `if (util/2 > env->imbalance)`, we can more easily overload the
local group or pick too much utilization from the overloaded group.

>
> > + goto next;
> > +
> > + env->imbalance -= util;
> > + break;
> > +[ ... ]
>
> Thanks,
> Parth
>