RE: [Linuxarm] Re: [PATCH] sched/fair: remove redundant test_idle_cores for non-smt

From: Song Bao Hua (Barry Song)
Date: Mon Mar 22 2021 - 01:09:27 EST




> -----Original Message-----
> From: Li, Aubrey [mailto:aubrey.li@xxxxxxxxxxxxxxx]
> Sent: Monday, March 22, 2021 5:37 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>;
> vincent.guittot@xxxxxxxxxx; mingo@xxxxxxxxxx; peterz@xxxxxxxxxxxxx;
> juri.lelli@xxxxxxxxxx; dietmar.eggemann@xxxxxxx; rostedt@xxxxxxxxxxx;
> bsegall@xxxxxxxxxx; mgorman@xxxxxxx
> Cc: valentin.schneider@xxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; xuwei (O) <xuwei5@xxxxxxxxxx>; Zengtao (B)
> <prime.zeng@xxxxxxxxxxxxx>; guodong.xu@xxxxxxxxxx; yangyicong
> <yangyicong@xxxxxxxxxx>; Liguozhu (Kenneth) <liguozhu@xxxxxxxxxxxxx>;
> linuxarm@xxxxxxxxxxxxx
> Subject: [Linuxarm] Re: [PATCH] sched/fair: remove redundant test_idle_cores
> for non-smt
>
> Hi Barry,
>
> On 2021/3/21 6:14, Barry Song wrote:
> > update_idle_core() is only called when sched_smt_present is set,
> > but test_idle_cores() is called on all machines, even those
> > without SMT.
>
> The patch looks good to me.
> May I know for what case we need to keep CONFIG_SCHED_SMT for non-smt
> machines?


Hi Aubrey,

I think the arm64 defconfig has always enabled CONFIG_SCHED_SMT:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/configs/defconfig

It is probably true for x86 as well.

I don't think Linux distributions will build a separate kernel
for machines without SMT. So basically the kernel relies on
parsing the topology at runtime to figure out whether SMT is
present, rather than on a rebuild.
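
To illustrate the build-time vs. runtime distinction, here is a rough
userspace sketch, not kernel code. All demo_* and DEMO_* names are
hypothetical stand-ins: DEMO_CONFIG_SCHED_SMT for the Kconfig option,
demo_smt_present for the sched_smt_present static key, and
demo_has_idle_cores for sds->has_idle_cores.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_SCHED_SMT: fixed when the kernel is built. */
#define DEMO_CONFIG_SCHED_SMT 1

/* Hypothetical stand-in for sched_smt_present: decided at boot from topology. */
static bool demo_smt_present;

/* Hypothetical stand-in for sds->has_idle_cores. */
static bool demo_has_idle_cores;

/* Assumed model of topology parsing: SMT means more than one thread per core. */
static void demo_parse_topology(int threads_per_core)
{
	demo_smt_present = threads_per_core > 1;
}

/* Mirrors the shape of the patched test_idle_cores(): skip the lookup without SMT. */
static bool demo_test_idle_cores(bool def)
{
#if DEMO_CONFIG_SCHED_SMT
	if (demo_smt_present)
		return demo_has_idle_cores;
#endif
	return def;
}
```

With the option built in, a machine reporting one thread per core still
gets the default back from demo_test_idle_cores() without touching the
shared state, which is the behaviour the patch is after.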


>
> Thanks,
> -Aubrey
>
>
> > This could contribute to up to 8%+ hackbench performance loss on a
> > machine like Kunpeng 920, which has no SMT. This patch removes the
> > redundant test_idle_cores() for non-SMT machines.
> >
> > We run the hackbench command below with the -g parameter ranging
> > from 2 to 14. For each value of g, we run the command 10 times and
> > take the average time:
> > $ numactl -N 0 hackbench -p -T -l 20000 -g $1
> >
> > hackbench reports the time needed to complete a certain number of
> > message transmissions between a certain number of tasks,
> > for example:
> > $ numactl -N 0 hackbench -p -T -l 20000 -g 10
> > Running in threaded mode with 10 groups using 40 file descriptors each
> > (== 400 tasks)
> > Each sender will pass 20000 messages of 100 bytes
> >
> > The below is the result of hackbench w/ and w/o this patch:
> > g=           2        4        6        8       10       12       14
> > w/o:    1.8151   3.8499   5.5142   7.2491   9.0340  10.7345  12.0929
> > w/ :    1.8428   3.7436   5.4501   6.9522   8.2882   9.9535  11.3367
> >                                     +4.1%    +8.3%    +7.3%    +6.3%
> >
> > Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 2e2ab1e..de42a32 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6038,9 +6038,11 @@ static inline bool test_idle_cores(int cpu, bool def)
> >  {
> >  	struct sched_domain_shared *sds;
> >
> > -	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> > -	if (sds)
> > -		return READ_ONCE(sds->has_idle_cores);
> > +	if (static_branch_likely(&sched_smt_present)) {
> > +		sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> > +		if (sds)
> > +			return READ_ONCE(sds->has_idle_cores);
> > +	}
> >
> >  	return def;
> >  }
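
For reference, the percentages in the quoted hackbench table appear to
be the relative reduction in average run time, (w/o - w/) / w/o. A small
sketch of that arithmetic follows; the function and array names are
hypothetical, and the values are copied from the table for g = 8..14.

```c
#include <stddef.h>

/* Average hackbench times from the quoted table, g = 8, 10, 12, 14. */
static const double demo_time_without[] = { 7.2491, 9.0340, 10.7345, 12.0929 };
static const double demo_time_with[]    = { 6.9522, 8.2882,  9.9535, 11.3367 };

/* Relative reduction in run time, in percent, going from t_without to t_with. */
static double demo_time_reduction_pct(double t_without, double t_with)
{
	return (t_without - t_with) / t_without * 100.0;
}

/* Largest reduction across the n measured group counts. */
static double demo_best_reduction_pct(const double *without, const double *with,
				      size_t n)
{
	double best = 0.0;

	for (size_t i = 0; i < n; i++) {
		double pct = demo_time_reduction_pct(without[i], with[i]);

		if (pct > best)
			best = pct;
	}
	return best;
}
```

Under that assumption the largest reduction is at g = 10, roughly 8.3%,
which matches the "8%+" figure in the commit message.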

Thanks
Barry