Re: [PATCH v2 1/7] sched/fair: Generalize asym_packing logic for SMT local sched group

From: Ricardo Neri
Date: Mon Dec 12 2022 - 12:45:52 EST


On Tue, Dec 06, 2022 at 06:22:41PM +0100, Dietmar Eggemann wrote:
> On 22/11/2022 21:35, Ricardo Neri wrote:
> > When balancing load between two physical cores, an idle destination CPU can
> > help another core only if all of its SMT siblings are also idle. Otherwise,
> > there is not increase in throughput. It does not matter whether the other
> > core is composed of SMT siblings.
> >
> > Simply check if there are any tasks running on the local group and the
> > other core has exactly one busy CPU before proceeding. Let
> > find_busiest_group() handle the case of more than one busy CPU. This is
> > true for SMT2, SMT4, SMT8, etc.
>
> [...]

Thank you very much for your feedback, Dietmar!

>
> > Changes since v1:
> > * Reworded commit message and inline comments for clarity.
> > * Stated that this changeset does not impact STM4 or SMT8.
> > ---
> > kernel/sched/fair.c | 29 +++++++++--------------------
> > 1 file changed, 9 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index e4a0b8bd941c..18c672ff39ef 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8900,12 +8900,10 @@ static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > struct sched_group *sg)
>
> I'm not sure why you change asym_smt_can_pull_tasks() together with
> removing SD_ASYM_PACKING from SMT level (patch 5/7)?

In x86 we have SD_ASYM_PACKING at the MC, CLS* and, before my patches, SMT
sched domains.

>
> update_sg_lb_stats()
>
> ... && env->sd->flags & SD_ASYM_PACKING && .. && sched_asym()
> ^^^^^^^^^^^^
> sched_asym()
>
> if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> (group->flags & SD_SHARE_CPUCAPACITY))
> return asym_smt_can_pull_tasks()
> ^^^^^^^^^^^^^^^^^^^^^^^^^
>
> So x86 won't have a sched domain with SD_SHARE_CPUCAPACITY and
> SD_ASYM_PACKING anymore. So sched_asym() would call sched_asym_prefer()
> directly on MC. What do I miss here?

asym_smt_can_pull_tasks() is used above the SMT level *and* when either the
local or sg sched groups are composed of CPUs that are SMT siblings.

In fact, asym_smt_can_pull_tasks() can only be called above the SMT level.
This is because the flags of a sched_group in a sched_domain are equal to
the flags of the child sched_domain. Since SMT is the lowest sched_domain,
its groups' flags are 0.

sched_asym() calls sched_asym_prefer() directly if balancing at the
SMT level and, at higher domains, if the child domain is not SMT.

This meets the requirement of Power7, where SMT siblings have different
priorities; and of x86, where physical cores have different priorities.

Thanks and BR,
Ricardo

* The target of these patches is Intel hybrid processors, on which cluster
scheduling is disabled - cabdc3a8475b ("sched,x86: Don't use cluster
topology for x86 hybrid CPUs"). Also, I have not observed topologies in
which CPUs of the same cluster have different priorities.

We are also looking into re-enabling cluster scheduling on hybrid
topologies.