Re: [PATCH 3/3] sched/fair: fix unnecessary increase of balance interval

From: Vincent Guittot
Date: Thu Dec 20 2018 - 09:50:19 EST


On Thu, 20 Dec 2018 at 12:22, Valentin Schneider
<valentin.schneider@xxxxxxx> wrote:
>
> On 20/12/2018 07:55, Vincent Guittot wrote:
> > In case of active balance, we increase the balance interval to cover
> > pinned-task cases not covered by the all_pinned logic. Nevertheless, the
> > active migration triggered by asym packing should be treated as the normal
> > unbalanced case, and the interval reset to its default value; otherwise
> > active migration for asym_packing can easily be delayed for hundreds of ms
> > because of this pinned-task detection mechanism.
> > The same happens for the other conditions tested in need_active_balance(),
> > like misfit task and when the capacity of src_cpu is reduced compared to
> > dst_cpu (see comments in need_active_balance() for details).
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 40 +++++++++++++++++++++++++++-------------
> > 1 file changed, 27 insertions(+), 13 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 487c73e..9b1e701 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8849,21 +8849,25 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > */
> > #define MAX_PINNED_INTERVAL 512
> >
> > -static int need_active_balance(struct lb_env *env)
> > +static inline bool
> > +asym_active_balance(struct lb_env *env)
> > {
> > - struct sched_domain *sd = env->sd;
> > + /*
> > + * ASYM_PACKING needs to force migrate tasks from busy but
> > + * lower priority CPUs in order to pack all tasks in the
> > + * highest priority CPUs.
> > + */
> > + return env->idle != CPU_NOT_IDLE && (env->sd->flags & SD_ASYM_PACKING) &&
> > + sched_asym_prefer(env->dst_cpu, env->src_cpu);
> > +}
> >
> > - if (env->idle != CPU_NOT_IDLE) {
> > +static inline bool
> > +voluntary_active_balance(struct lb_env *env)
> > +{
> > + struct sched_domain *sd = env->sd;
> >
> > - /*
> > - * ASYM_PACKING needs to force migrate tasks from busy but
> > - * lower priority CPUs in order to pack all tasks in the
> > - * highest priority CPUs.
> > - */
> > - if ((sd->flags & SD_ASYM_PACKING) &&
> > - sched_asym_prefer(env->dst_cpu, env->src_cpu))
> > - return 1;
> > - }
> > + if (asym_active_balance(env))
> > + return 1;
> >
> > /*
> > * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
> > @@ -8881,6 +8885,16 @@ static int need_active_balance(struct lb_env *env)
> > if (env->src_grp_type == group_misfit_task)
> > return 1;
> >
> > + return 0;
> > +}
> > +
>
> Yeah so that's the active balance classification I was afraid of, and I
> don't agree with it.
>
> The way I see things, we happen to have some mechanisms that let us know
> straight away if we need an active balance (asym packing, misfit, lowered
> capacity), and we rely on the sd->nr_balance_failed threshold for the
> scenarios where we don't have any more information.
>
> We do happen to have a threshold because we don't want to go overboard with
> it, but when it is reached it's a clear sign that we *do* want to active
> balance because that's all we can do to try and solve the imbalance.
>
> To me, those are all legitimate reasons to active balance. So they're all
> "voluntary" really; we *do* want all of these.
>
> > +static int need_active_balance(struct lb_env *env)
> > +{
> > + struct sched_domain *sd = env->sd;
> > +
> > + if (voluntary_active_balance(env))
> > + return 1;
> > +
> > return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
> > }
> >
> > @@ -9142,7 +9156,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> > } else
> > sd->nr_balance_failed = 0;
> >
> > - if (likely(!active_balance)) {
> > + if (likely(!active_balance) || voluntary_active_balance(&env)) {
>
> So now we reset the interval for all active balances (except the
> last-resort active balance case), even when it is done as a last resort
> because all other tasks were pinned.
>
> Arguably the current code isn't much better (always increasing the interval
> when active balancing), but at least it covers this case. It would be a
> waste not to take this into account when we can detect this scenario

So I'd like to recall the $subject of this patchset: fixing some known
issues for asym_packing.
While looking at this, we added a few other voluntary active
balances because it seemed "obvious" that those active migrations were
voluntary ones. But in fact, so far we don't have any use case that
shows a problem for those additional cases.

The default behavior for active migration is to increase the interval.
Now you want to extend the exception to other active migration use
cases, whereas it's not clear that those don't fall under the default
active migration case, for which we should keep increasing the interval.

What you want is to change the scheduler's behavior for use cases that
haven't raised any problem, whereas asym_packing has known problems.

Changing the default behavior for active migration is not the subject of
this patchset and should be treated in a separate one, like the one
discussed with Peter a few days ago.

> (I'll reiterate my LBF_ALL_PINNED suggestion).
>
> > /* We were unbalanced, so reset the balancing interval */
> > sd->balance_interval = sd->min_interval;
> > } else {
> >