Re: [RFC PATCH] topology: Represent clusters of CPUs within a die.
From: Valentin Schneider
Date: Mon Oct 19 2020 - 09:48:11 EST
+Cc Jeremy
On 19/10/20 14:10, Morten Rasmussen wrote:
> Hi Jonathan,
> The problem I see is that the benefit of keeping tasks together due to
> the interconnect layout might vary significantly between systems. So if
> we introduce a new cpumask for cluster it has to have represent roughly
> the same system properties otherwise generic software consuming this
> information could be tricked.
>
> If there is a provable benefit of having interconnect grouping
> information, I think it would be better represented by a distance matrix
> like we have for NUMA.
>
> Morten
That's my queue to paste some of that stuff I've been rambling on and off
about!
With regards to cache / interconnect layout, I do believe that if we
want to support in the scheduler itself then we should leverage some
distance table rather than to create X extra scheduler topology levels.
I had a chat with Jeremy on the ACPI side of that sometime ago. IIRC given
that SLIT gives us a distance value between any two PXM, we could directly
express core-to-core distance in that table. With that (and if that still
lets us properly discover NUMA node spans), we could let the scheduler
build dynamic NUMA-like topology levels representing the inner quirks of
the cache / interconnect layout.
It's mostly pipe dreams for now, but there seems to be more and more
hardware where that would make sense; somewhat recently the PowerPC guys
added something to their arch-specific code in that regards.