RE: [RFC PATCH v5 4/4] scheduler: Add cluster scheduler level for x86

From: Song Bao Hua (Barry Song)
Date: Wed Mar 31 2021 - 06:08:36 EST




> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Wednesday, March 24, 2021 12:15 PM
> To: 'Tim Chen' <tim.c.chen@xxxxxxxxxxxxxxx>; catalin.marinas@xxxxxxx;
> will@xxxxxxxxxx; rjw@xxxxxxxxxxxxx; vincent.guittot@xxxxxxxxxx; bp@xxxxxxxxx;
> tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; lenb@xxxxxxxxxx; peterz@xxxxxxxxxxxxx;
> dietmar.eggemann@xxxxxxx; rostedt@xxxxxxxxxxx; bsegall@xxxxxxxxxx;
> mgorman@xxxxxxx
> Cc: msys.mizuma@xxxxxxxxx; valentin.schneider@xxxxxxx;
> gregkh@xxxxxxxxxxxxxxxxxxx; Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>;
> juri.lelli@xxxxxxxxxx; mark.rutland@xxxxxxx; sudeep.holla@xxxxxxx;
> aubrey.li@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx;
> xuwei (O) <xuwei5@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> guodong.xu@xxxxxxxxxx; yangyicong <yangyicong@xxxxxxxxxx>; Liguozhu (Kenneth)
> <liguozhu@xxxxxxxxxxxxx>; linuxarm@xxxxxxxxxxxxx; hpa@xxxxxxxxx
> Subject: RE: [RFC PATCH v5 4/4] scheduler: Add cluster scheduler level for x86
>
>
>
> > -----Original Message-----
> > From: Tim Chen [mailto:tim.c.chen@xxxxxxxxxxxxxxx]
> > Sent: Wednesday, March 24, 2021 11:51 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>;
> > catalin.marinas@xxxxxxx; will@xxxxxxxxxx; rjw@xxxxxxxxxxxxx;
> > vincent.guittot@xxxxxxxxxx; bp@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> > mingo@xxxxxxxxxx; lenb@xxxxxxxxxx; peterz@xxxxxxxxxxxxx;
> > dietmar.eggemann@xxxxxxx; rostedt@xxxxxxxxxxx; bsegall@xxxxxxxxxx;
> > mgorman@xxxxxxx
> > Cc: msys.mizuma@xxxxxxxxx; valentin.schneider@xxxxxxx;
> > gregkh@xxxxxxxxxxxxxxxxxxx; Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>;
> > juri.lelli@xxxxxxxxxx; mark.rutland@xxxxxxx; sudeep.holla@xxxxxxx;
> > aubrey.li@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx;
> > xuwei (O) <xuwei5@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> > guodong.xu@xxxxxxxxxx; yangyicong <yangyicong@xxxxxxxxxx>; Liguozhu (Kenneth)
> > <liguozhu@xxxxxxxxxxxxx>; linuxarm@xxxxxxxxxxxxx; hpa@xxxxxxxxx
> > Subject: Re: [RFC PATCH v5 4/4] scheduler: Add cluster scheduler level for x86
> >
> >
> >
> > On 3/18/21 9:16 PM, Barry Song wrote:
> > > From: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > >
> > > There are x86 CPU architectures (e.g. Jacobsville) where L2 cache
> > > is shared among a cluster of cores instead of being exclusive
> > > to one single core.
> > >
> > > To prevent oversubscription of L2 cache, load should be
> > > balanced between such L2 clusters, especially for tasks with
> > > no shared data.
> > >
> > > Also with cluster scheduling policy where tasks are woken up
> > > in the same L2 cluster, we will benefit from keeping tasks
> > > related to each other and likely sharing data in the same L2
> > > cluster.
> > >
> > > Add CPU masks of CPUs sharing the L2 cache so we can build such
> > > L2 cluster scheduler domain.
> > >
> > > Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> >
> >
> > Barry,
> >
> > Can you also add this chunk to the patch?
> > Thanks.
>
> Sure, Tim, Thanks. I'll put that into patch 4/4 in v6.

Hi Tim,
You might want to take a look at this qemu patchset:
https://lore.kernel.org/qemu-devel/20210331095343.12172-1-wangyanan55@xxxxxxxxxx/T/#t

Someone is trying to leverage this cluster topology
to improve the performance of KVM virtual machines.

>
> >
> > Tim
> >
> >
> > diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> > index 2a11ccc14fb1..800fa48c9fcd 100644
> > --- a/arch/x86/include/asm/topology.h
> > +++ b/arch/x86/include/asm/topology.h
> > @@ -115,6 +115,7 @@ extern unsigned int __max_die_per_package;
> >
> >  #ifdef CONFIG_SMP
> >  #define topology_die_cpumask(cpu) (per_cpu(cpu_die_map, cpu))
> > +#define topology_cluster_cpumask(cpu) (cpu_clustergroup_mask(cpu))
> >  #define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
> >  #define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
> >
>

Thanks
Barry