Re: [PATCH v2 3/7] sched: Teach arch_asym_cpu_priority() the idle state of SMT siblings
From: Ricardo Neri
Date: Mon Dec 12 2022 - 12:46:33 EST
On Tue, Dec 06, 2022 at 06:54:39PM +0100, Dietmar Eggemann wrote:
> On 22/11/2022 21:35, Ricardo Neri wrote:
> > Some processors (e.g., Intel processors with ITMT) use asym_packing to
> > balance load between physical cores with SMT. In such case, a core with all
> > its SMT siblings idle is more desirable than another with one or more busy
> > siblings.
> >
> > Other processors (e.g, Power7 with SMT8) use asym_packing to balance load
> > among SMT siblings of different priority, regardless of their idle state.
> >
> > Add a new parameter, check_smt, that architectures can use as needed.
>
> [...]
>
> > Changes since v1:
> > * Introduced this patch.
>
> [...]
>
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d18947a9c03e..0e4251f83807 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -142,8 +142,11 @@ __setup("sched_thermal_decay_shift=", setup_sched_thermal_decay_shift);
> > #ifdef CONFIG_SMP
> > /*
> > * For asym packing, by default the lower numbered CPU has higher priority.
> > + *
> > + * When doing ASYM_PACKING at the "MC" or higher domains, architectures may
>
> There is this new CLUSTER level between SMT and MC.
This is a good catch! I will update the comment to say "domains above SMT".
>
> > + * want to check the idle state of the SMT siblngs of @cpu.
>
> s/siblngs/siblings
>
> The scheduler calls sched_asym_prefer(..., true) from
> find_busiest_queue(), asym_active_balance() and nohz_balancer_kick()
In these places we are comparing two specific CPUs, of which the idle
state of its siblings impact their throughput and, in consequence, the
decision of attempt to balance load.
In the places were sched_asym_prefer(...., false) is called we compare the
destination CPU with a CPU that bears the priority of a sched group,
regardless of the idle state of its siblings.
> even from SMT layer on !x86.
This is true, but the default arch_asym_cpu_priority ignores check_smt.
> So I guess a `bool check_smt` wouldn't be
> sufficient to distinguish whether sched_smt_siblings_idle() should be
> called or not.
But it is the caller who determines whether the idle state of the SMT
siblings of @cpu may be relevant.
> To me this comment is a little bit misleading. Not an
> issue currently since there is only the x86 overwrite right now.
If my justification make sense to you, I can expand the comment to state
that the caller decides whether check_smt is needed but arch-specific
implementations are free to ignore it.
Thanks and BR,
Ricardo