Re: [RFC PATCH v5 1/4] topology: Represent clusters of CPUs within a die

From: Jonathan Cameron
Date: Fri Mar 19 2021 - 05:38:28 EST


On Fri, 19 Mar 2021 06:57:08 +0000
"Song Bao Hua (Barry Song)" <song.bao.hua@xxxxxxxxxxxxx> wrote:

> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> > Sent: Friday, March 19, 2021 7:35 PM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > Cc: tim.c.chen@xxxxxxxxxxxxxxx; catalin.marinas@xxxxxxx; will@xxxxxxxxxx;
> > rjw@xxxxxxxxxxxxx; vincent.guittot@xxxxxxxxxx; bp@xxxxxxxxx;
> > tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; lenb@xxxxxxxxxx; peterz@xxxxxxxxxxxxx;
> > dietmar.eggemann@xxxxxxx; rostedt@xxxxxxxxxxx; bsegall@xxxxxxxxxx;
> > mgorman@xxxxxxx; msys.mizuma@xxxxxxxxx; valentin.schneider@xxxxxxx; Jonathan
> > Cameron <jonathan.cameron@xxxxxxxxxx>; juri.lelli@xxxxxxxxxx;
> > mark.rutland@xxxxxxx; sudeep.holla@xxxxxxx; aubrey.li@xxxxxxxxxxxxxxx;
> > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-acpi@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; xuwei (O) <xuwei5@xxxxxxxxxx>;
> > Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; guodong.xu@xxxxxxxxxx; yangyicong
> > <yangyicong@xxxxxxxxxx>; Liguozhu (Kenneth) <liguozhu@xxxxxxxxxxxxx>;
> > linuxarm@xxxxxxxxxxxxx; hpa@xxxxxxxxx
> > Subject: Re: [RFC PATCH v5 1/4] topology: Represent clusters of CPUs within
> > a die
> >
> > On Fri, Mar 19, 2021 at 05:16:15PM +1300, Barry Song wrote:
> > > diff --git a/Documentation/admin-guide/cputopology.rst b/Documentation/admin-guide/cputopology.rst
> > > index b90dafc..f9d3745 100644
> > > --- a/Documentation/admin-guide/cputopology.rst
> > > +++ b/Documentation/admin-guide/cputopology.rst
> > > @@ -24,6 +24,12 @@ core_id:
> > > identifier (rather than the kernel's). The actual value is
> > > architecture and platform dependent.
> > >
> > > +cluster_id:
> > > +
> > > + the Cluster ID of cpuX. Typically it is the hardware platform's
> > > + identifier (rather than the kernel's). The actual value is
> > > + architecture and platform dependent.
> > > +
> > > book_id:
> > >
> > > the book ID of cpuX. Typically it is the hardware platform's
> > > @@ -56,6 +62,14 @@ package_cpus_list:
> > > human-readable list of CPUs sharing the same physical_package_id.
> > > (deprecated name: "core_siblings_list")
> > >
> > > +cluster_cpus:
> > > +
> > > + internal kernel map of CPUs within the same cluster.
> > > +
> > > +cluster_cpus_list:
> > > +
> > > + human-readable list of CPUs within the same cluster.
> > > +
> > > die_cpus:
> > >
> > > internal kernel map of CPUs within the same die.
> >
> > Why are these sysfs files in this file, and not in a Documentation/ABI/
> > file which can be correctly parsed and shown to userspace?
>
> Well, those ABIs have been there for a long time. It looks like this:
>
> [root@ceph1 topology]# ls
> core_id core_siblings core_siblings_list physical_package_id thread_siblings thread_siblings_list
> [root@ceph1 topology]# pwd
> /sys/devices/system/cpu/cpu100/topology
> [root@ceph1 topology]# cat core_siblings_list
> 64-127
> [root@ceph1 topology]#
>
> >
> > Any chance you can fix that up here as well?
>
> Yes, we will send a separate patch to address this, which won't
> be in this patchset. This patchset will be based on that one.
>
> >
> > Also note that "list" is not something that goes in sysfs, sysfs is "one
> > value per file", and a list is not "one value". How do you prevent
> > overflowing the buffer of the sysfs file if you have a "list"?
> >
>
> At a glance, the list uses "-" for a range rather than a real list:
> [root@ceph1 topology]# cat core_siblings_list
> 64-127
>
> Anyway, I will check whether it has any chance of overflowing.

In theory it could be alternate CPUs as a comma-separated list,
so it would get interesting around 500-1000 CPUs (guessing).

Hopefully no one has that crazy a CPU numbering scheme, but it's possible.
(Note that cluster is fine for this, but I guess it might eventually
happen for core_siblings_list, i.e. the CPUs within a package.)

It shouldn't crash or anything like that, but the list might be cut off early.
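
A rough way to sanity-check that guess, as a userspace sketch rather than
anything taken from the kernel: the Python below mimics the range-collapsing
list format seen in core_siblings_list and counts how many alternate CPUs
(0,2,4,...) fit in one buffer. The 4096-byte limit is an assumption (one 4 KiB
page per sysfs read), and format_cpulist and max_alternate_cpus are
hypothetical helper names made up for this illustration.

```python
PAGE_SIZE = 4096  # assumption: sysfs output is limited to one 4 KiB page


def format_cpulist(cpus):
    """Format CPU ids as a comma-separated list, collapsing runs into 'a-b'."""
    cpus = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend j to the end of the current run of consecutive ids.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        parts.append(str(cpus[i]) if i == j else f"{cpus[i]}-{cpus[j]}")
        i = j + 1
    return ",".join(parts)


def max_alternate_cpus(limit=PAGE_SIZE):
    """Largest n for which '0,2,...,2(n-1)' plus a newline fits in limit bytes."""
    n = 0
    while len(format_cpulist(range(0, 2 * (n + 1), 2))) + 1 <= limit:
        n += 1
    return n


if __name__ == "__main__":
    print(format_cpulist(range(64, 128)))  # contiguous CPUs collapse to a range
    print(max_alternate_cpus())            # entries that fit in one page
```

Contiguous numbering stays tiny no matter how many CPUs there are, which is
why cluster is fine; only a sparse numbering scheme with no collapsible runs
gets anywhere near the limit.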

On sysfs file conversion, that got mentioned earlier but I forgot
to remind Barry about it when he took this patch into his series.
Sorry about that!

Jonathan


>
> > thanks,
> >
> > greg k-h
>
> Thanks
> Barry
>