Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()

From: Sudeep Holla
Date: Fri Jul 08 2022 - 04:05:45 EST


Hi Darren,

I will let Ionela or Dietmar cover some of the scheduler aspects as
I don't have much knowledge in that area.

On Thu, Jul 07, 2022 at 05:10:19PM -0700, Darren Hart wrote:
> On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> > From: Ionela Voinescu <ionela.voinescu@xxxxxxx>
>
> Hi Sudeep and Ionela,
>
> >
> > Currently the cluster identifier is not set on DT based platforms.
> > The reset or default value is -1 for all the CPUs. Once we assign the
> > cluster identifier values correctly, the cluster_sibling mask will be
> > populated and returned by cpu_clustergroup_mask() to contribute in the
> > creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> > enabled.
> >
> > To avoid topologies that will result in questionable or incorrect
> > scheduling domains, impose restrictions regarding the span of clusters,
>
> Can you provide a specific example of a valid topology that results in
> the wrong thing currently?
>

As a simple example: Juno, with 2 clusters and an L2 for each cluster. IIUC
MC is preferred over CLS there, and the MC and CLS domains span exactly
the same CPUs.
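
To make the restriction concrete, here is a rough userspace sketch of the
idea, not the kernel code. mask_t, struct cpu_topo and the helper names are
made up for illustration, and falling back to the thread siblings is my
assumption about what the "limited" mask looks like.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint32_t mask_t;

/* Hypothetical per-CPU topology record, loosely modelled on the masks discussed here. */
struct cpu_topo {
	mask_t thread_sibling;  /* SMT siblings of this CPU, including itself */
	mask_t cluster_sibling; /* CPUs sharing this CPU's cluster */
	mask_t coregroup;       /* span that would be used for the MC level */
};

/* Return true if every CPU in a is also in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Model of the span restriction: if the cluster covers the same CPUs as
 * the core group, or more, return the (assumed) thread-sibling fallback
 * so that CLS becomes the redundant level and MC is kept.
 */
static mask_t clustergroup_mask(const struct cpu_topo *t)
{
	if (mask_subset(t->coregroup, t->cluster_sibling))
		return t->thread_sibling;

	return t->cluster_sibling;
}
```

With a Juno-like layout (CPU numbering assumed) where CPUs 0-3 have both
cluster_sibling and coregroup equal to 0x0f, clustergroup_mask() returns
0x01 for CPU 0. With 2-CPU clusters inside a larger core group (0x03
inside 0xff), the cluster mask is returned unchanged.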

> >
> > While previously the scheduling domain builder code would have removed MC
> > as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> > cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> > now CLS will be removed and MC kept.
> >
>
> This is not desirable for all systems, particularly those which don't
> have an L3 but do share other resources - such as the snoop filter in
> the case of the Ampere Altra.
>
> While not universally supported, we agreed in the discussion on the
> above patch to allow systems to define clusters independently from the
> L3 as an LLC since this is also independently defined in PPTT.
>
> Going back to my first comment - does this fix an existing system with a
> valid topology?

Yes, Juno, as mentioned above.

> It's not clear to me what that would look like. The Ampere Altra presents
> a cluster level in PPTT because that is the desirable topology for the
> system.

That is absolutely the wrong reason. The firmware should present it because
the hardware is built that way, not because some OSPM wants something in
some way. Sorry, but that is not what DT/ACPI is designed for. If two
different OSPMs want different things, then one set of ACPI tables will
not be sufficient.

> If it's not desirable for another system to have the cluster topology -
> shouldn't it not present that layer to the kernel in the first place?

Absolutely, 100% yes: it must present it if the hardware is designed that
way. No ifs or buts.

--
Regards,
Sudeep