Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer

From: Sudeep Holla
Date: Thu Jan 02 2020 - 08:59:53 EST


On Thu, Jan 02, 2020 at 12:47:01PM +0000, Zengtao (B) wrote:
> > -----Original Message-----
> > From: Sudeep Holla [mailto:sudeep.holla@xxxxxxx]
> > Sent: Thursday, January 02, 2020 7:30 PM
> > To: Zengtao (B)
> > Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> > linux-kernel@xxxxxxxxxxxxxxx; Morten Rasmussen
> > Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts
> > with lower layer
> >
> > On Thu, Jan 02, 2020 at 03:05:40AM +0000, Zengtao (B) wrote:
> > > Hi Sudeep:
> > >
> > > Thanks for your reply.
> > >
> > > > -----Original Message-----
> > > > From: Sudeep Holla [mailto:sudeep.holla@xxxxxxx]
> > > > Sent: Wednesday, January 01, 2020 12:41 AM
> > > > To: Zengtao (B)
> > > > Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> > > > linux-kernel@xxxxxxxxxxxxxxx; Sudeep Holla; Morten Rasmussen
> > > > Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts
> > > > with lower layer
> > > >
> > > > On Mon, Dec 23, 2019 at 04:16:19PM +0800, z00214469 wrote:
> > > > > As we know, from sched domain's perspective, the DIE layer should be
> > > > > larger than or at least equal to the MC layer, and in some cases, MC
> > > > > is defined by the arch specified hardware, MPIDR for example, but NUMA
> > > > > can be defined by users,
> > > >
> > > > Who are the users you are referring above ?
> > > For example, when I use QEMU to start a guest Linux, I can define the
> > > NUMA topology of the guest however I want.
> >
> > OK, and how is the information passed to the kernel? DT or ACPI?
> > We need to fix the mismatch, if any, during the initial parse of that
> > information.
> >
>
> Both. With the current QEMU, the correct CPU topology is not passed
> to Linux. Luckily, drjones planned to deal with the issue.
> https://patchwork.ozlabs.org/cover/939301/
>
> > > > > with the following system configurations:
> > > >
> > > > Do you mean ACPI tables or DT or some firmware tables ?
> > > >
> > > > > *************************************
> > > > > NUMA: 0-2, 3-7
> > > >
> > > > Is the above simply wrong with respect to hardware and it actually match
> > > > core_siblings ?
> > > >
> > > Actually, we can't simply say this is wrong, I just want to show an example.
> > > And this example also can be:
> > > NUMA: 0-23, 24-47
> > > core_siblings: 0-15, 16-31, 32-47
> > >
> >
> > Are you sure of the above ? Possible values w.r.t hardware config:
> > core_siblings: 0-15, 16-23, 24-31, 32-47
> >
> > But what you have specified above is still wrong core_siblings IMO.
> >
> It depends on the hardware; on my platform there are 16 cores per cluster.
>

Sorry, I made a mistake in my examples above. I was assuming 8 CPUs
per cluster but didn't represent it well. Anyway, my point was:

Can a few CPUs in a cluster be part of one NUMA node while the remaining
CPUs of the same cluster are part of another NUMA node? That sounds like
an interesting and quite complex topology to me. What does the cache
topology look like in that case?
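
To make the question concrete, here is a minimal user-space sketch (not
kernel code, and not what the patch does) that checks whether any cluster
is split across NUMA nodes. The helper names, the 64-bit mask
representation, and the limit of 64 CPUs are all assumptions made purely
for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask with CPUs lo..hi (inclusive) set; this sketch only handles hi < 64. */
static uint64_t cpu_range(unsigned int lo, unsigned int hi)
{
	assert(lo <= hi && hi < 64);
	uint64_t upto_hi = (hi == 63) ? UINT64_MAX
				      : ((UINT64_C(1) << (hi + 1)) - 1);
	uint64_t below_lo = (UINT64_C(1) << lo) - 1;

	return upto_hi & ~below_lo;
}

/* Hypothetical helper: 1 if the cluster shares CPUs with the node
 * but is not fully contained in it. */
static int cluster_straddles_node(uint64_t cluster, uint64_t node)
{
	uint64_t common = cluster & node;

	return common != 0 && common != cluster;
}

/* Hypothetical helper: number of clusters split across NUMA nodes. */
static int count_split_clusters(const uint64_t *clusters, int nr_clusters,
				const uint64_t *nodes, int nr_nodes)
{
	int split = 0;

	for (int c = 0; c < nr_clusters; c++) {
		for (int n = 0; n < nr_nodes; n++) {
			if (cluster_straddles_node(clusters[c], nodes[n])) {
				split++;
				break;	/* count each cluster once */
			}
		}
	}
	return split;
}
```

With the quoted example (NUMA: 0-23, 24-47; core_siblings: 0-15, 16-31,
32-47), only the 16-31 cluster is reported as split, since CPUs 16-23
fall in the first node and 24-31 in the second.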

--
Regards,
Sudeep