Re: [PATCH 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance

From: Peter Zijlstra
Date: Thu Apr 08 2021 - 12:47:21 EST


On Tue, Apr 06, 2021 at 04:17:10PM -0700, Ricardo Neri wrote:
> On Tue, Apr 06, 2021 at 01:17:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 05, 2021 at 09:11:07PM -0700, Ricardo Neri wrote:
> > > @@ -8507,6 +8619,10 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > >  	if (!sgs->sum_h_nr_running)
> > >  		return false;
> > >
> > > +	if (sgs->group_type == group_asym_packing &&
> > > +	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg))
> > > +		return false;
> >
> > All of this makes my head hurt; but afaict this isn't right.
> >
> > Your update_sg_lb_stats() change means we unconditionally set
> > sgs->group_asym_packing, and then this is here to undo that. But it's
> > not clear that this covers all cases correctly.
>
> We could not make a decision to set sgs->group_asym_packing in
> update_sg_lb_stats() because we don't have information about the dst_cpu
> and its SMT siblings if any. That is the reason I proposed to delay the
> decision to update_sd_pick_busiest(), where we can compare local and
> sgs.

Yeah, I sorta got that.

> > Even if !sched_asym_prefer(), we could end up selecting this sg as
> > busiest, but you're just bailing out here.
>
> Even if sgs->group_asym_packing is unconditionally set, sgs can still
> be classified as group_overloaded or group_imbalanced. In such cases
> we wouldn't bail out. sgs could no longer be classified as
> group_fully_busy or group_has_spare, though, and there we would bail
> out. Is your concern about these? I can fix up these two cases.

Yes. Either explain (in a comment) why those cases are not relevant, or
handle them properly.

Because when reading this, it wasn't at all obvious that this is correct
or as intended.
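
To make the two problem cases concrete, here is a minimal userspace sketch,
not kernel code. It assumes the classification precedence is overloaded,
imbalanced, asym_packing, misfit, fully_busy, has_spare. Every toy_* name is
hypothetical, and can_pull merely stands in for the result of
asym_can_pull_tasks().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's group types; ordering is an assumption. */
enum toy_group_type {
	toy_has_spare = 0,
	toy_fully_busy,
	toy_misfit_task,
	toy_asym_packing,
	toy_imbalanced,
	toy_overloaded,
};

/* Hypothetical, heavily reduced version of the per-group statistics. */
struct toy_sgs {
	bool overloaded;
	bool imbalanced;
	bool asym_packing;	/* unconditionally set in the patch under review */
	bool misfit;
	bool has_capacity;
	unsigned int sum_h_nr_running;
};

/* Assumed precedence: the first matching condition wins. */
static enum toy_group_type toy_classify(const struct toy_sgs *sgs)
{
	if (sgs->overloaded)
		return toy_overloaded;
	if (sgs->imbalanced)
		return toy_imbalanced;
	if (sgs->asym_packing)
		return toy_asym_packing;
	if (sgs->misfit)
		return toy_misfit_task;
	if (!sgs->has_capacity)
		return toy_fully_busy;
	return toy_has_spare;
}

/* Models only the two early "return false" checks quoted in the hunk. */
static bool toy_bails_out(const struct toy_sgs *sgs, bool can_pull)
{
	if (!sgs->sum_h_nr_running)
		return true;
	return toy_classify(sgs) == toy_asym_packing && !can_pull;
}

/* Walks through the cases discussed in the thread. */
void toy_check_cases(void)
{
	struct toy_sgs busy = { .asym_packing = true, .has_capacity = false,
				.sum_h_nr_running = 4 };
	struct toy_sgs spare = { .asym_packing = true, .has_capacity = true,
				 .sum_h_nr_running = 1 };
	struct toy_sgs over = { .asym_packing = true, .overloaded = true,
				.sum_h_nr_running = 8 };

	/* The flag hides what would otherwise be fully_busy / has_spare. */
	assert(toy_classify(&busy) == toy_asym_packing);
	assert(toy_classify(&spare) == toy_asym_packing);
	/* Both groups are then rejected when the destination cannot pull. */
	assert(toy_bails_out(&busy, false));
	assert(toy_bails_out(&spare, false));
	/* Overloaded outranks the flag, so no early bail-out happens. */
	assert(toy_classify(&over) == toy_overloaded);
	assert(!toy_bails_out(&over, false));
}
```

In this model, clearing asym_packing on the busy and spare groups lets them
fall through to toy_fully_busy and toy_has_spare again, which are the two
cases that need either a comment or proper handling.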