Re: [PATCH v3 5/6] sched/fair: Consider SMT in ASYM_PACKING load balance

From: Ricardo Neri
Date: Tue May 18 2021 - 15:08:33 EST


On Fri, May 14, 2021 at 07:14:15PM -0700, Ricardo Neri wrote:
> On Fri, May 14, 2021 at 11:47:45AM +0200, Peter Zijlstra wrote:
> > On Thu, May 13, 2021 at 08:49:08AM -0700, Ricardo Neri wrote:
> > > include/linux/sched/topology.h | 1 +
> > > kernel/sched/fair.c | 101 +++++++++++++++++++++++++++++++++
> > > 2 files changed, 102 insertions(+)
> > >
> > > diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> > > index 8f0f778b7c91..43bdb8b1e1df 100644
> > > --- a/include/linux/sched/topology.h
> > > +++ b/include/linux/sched/topology.h
> > > @@ -57,6 +57,7 @@ static inline int cpu_numa_flags(void)
> > > #endif
> > >
> > > extern int arch_asym_cpu_priority(int cpu);
> > > +extern bool arch_asym_check_smt_siblings(void);
> > >
> > > struct sched_domain_attr {
> > > int relax_domain_level;
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index c8b66a5d593e..3d6cc027e6e6 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -106,6 +106,15 @@ int __weak arch_asym_cpu_priority(int cpu)
> > > return -cpu;
> > > }
> > >
> > > +/*
> > > + * For asym packing, first check the state of SMT siblings before deciding to
> > > + * pull tasks.
> > > + */
> > > +bool __weak arch_asym_check_smt_siblings(void)
> > > +{
> > > + return false;
> > > +}
> > > +
> > > /*
> > > * The margin used when comparing utilization with CPU capacity.
> > > *
> >
> > > @@ -8458,6 +8550,9 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
> > > if (group == sds->local)
> > > return false;
> > >
> > > + if (arch_asym_check_smt_siblings())
> > > + return asym_can_pull_tasks(env->dst_cpu, sds, sgs, group);
> > > +
> > > return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> > > }
> >
> > So I'm thinking that this is a property of having ASYM_PACKING at a core
> > level, rather than some arch special. Wouldn't something like this be
> > more appropriate?
> >
> > ---
> > --- a/include/linux/sched/topology.h
> > +++ b/include/linux/sched/topology.h
> > @@ -57,7 +57,6 @@ static inline int cpu_numa_flags(void)
> > #endif
> >
> > extern int arch_asym_cpu_priority(int cpu);
> > -extern bool arch_asym_check_smt_siblings(void);
> >
> > struct sched_domain_attr {
> > int relax_domain_level;
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -107,15 +107,6 @@ int __weak arch_asym_cpu_priority(int cp
> > }
> >
> > /*
> > - * For asym packing, first check the state of SMT siblings before deciding to
> > - * pull tasks.
> > - */
> > -bool __weak arch_asym_check_smt_siblings(void)
> > -{
> > - return false;
> > -}
> > -
> > -/*
> > * The margin used when comparing utilization with CPU capacity.
> > *
> > * (default: ~20%)
> > @@ -8550,7 +8541,8 @@ sched_asym(struct lb_env *env, struct sd
> > if (group == sds->local)
> > return false;
> >
> > - if (arch_asym_check_smt_siblings())
> > + if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> > + (group->flags & SD_SHARE_CPUCAPACITY))
> > return asym_can_pull_tasks(env->dst_cpu, sds, sgs, group);
>
> Thanks Peter for the quick review! This makes sense to me. The only
> reason we proposed arch_asym_check_smt_siblings() is because we were
> about breaking powerpc (I need to study how they set priorities for SMT,
> if applicable). If you think this is not an issue I can post a
> v4 with this update.

As far as I can see, priorities in powerpc are set by the CPU number.
However, I am not sure how CPUs are enumerated? If CPUs in brackets are
SMT sibling, Does an enumeration looks like A) [0, 1], [2, 3] or B) [0, 2],
[1, 3]? I guess B is the right answer. Otherwise, both SMT siblings of a
core would need to be busy before a new core is used.

Still, I think the issue described in the cover letter may be
reproducible in powerpc as well. If CPU3 is offlined, and [0, 2] pulled
tasks from [1, -] so that both CPU0 and CPU2 become busy, CPU1 would not be
able to help since CPU0 has the highest priority.

I am cc'ing the linuxppc list to get some feedback.

Thanks and BR,
Ricardo