Re: [PATCH v16 0/9] Add support for Sub-NUMA cluster (SNC) systems
From: Reinette Chatre
Date: Tue Mar 19 2024 - 13:53:16 EST
(+Maciej)
Hi Tony,
(Please add x86/resctrl to Subject prefix of cover letter)
On 3/12/2024 2:42 PM, Tony Luck wrote:
> The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
> that share an L3 cache into two or more sets. This plays havoc with the
> Resource Director Technology (RDT) monitoring features. Prior to this
> patch Intel has advised that SNC and RDT are incompatible.
>
> Some of these CPU support an MSR that can partition the RMID counters in
> the same way. This allows monitoring features to be used. With the caveat
> that users must be aware that Linux may migrate tasks more frequently
> between SNC nodes than between "regular" NUMA nodes, so reading counters
> from all SNC nodes may be needed to get a complete picture of activity
> for tasks.
>
> Cache and memory bandwidth allocation features continue to operate at
> the scope of the L3 cache.
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
>
> ---
> Changes since v15: Link: https://lore.kernel.org/all/20240228112935.8087-tony.luck@xxxxxxxxx/
>
> 0) Note that v14 Reviewed/Testing tags have been removed because of the
> extent of refactoring to catch up with upstream. But nothing
> fundamental changed, so everything should look familiar.
>
> 1) Refactor to apply on top of Link: https://lore.kernel.org/all/20240308213846.77075-1-tony.luck@xxxxxxxxx/
> [So base commit is either tip x86/cache, or upstream current merge PLUS
> the two patches in that series]
>
> 2) Add patch 9 which adds files showing mappings from domains to CPUs
> Reinette suggested this, James thinks it duplicates information
> that can be gathered from /sys/devices/system/
> Discussion here: Link: https://lore.kernel.org/all/ZetcM9GO2PH6SC0j@agluck-desk3/
> This part is a nice-to-have. I'm fine if just the first eight patches
> are applied without this while the discussion continues.
I agree to drop patch #9.
The core support for SNC continues to look good to me (I just had a few nitpicks).
What remains is the user interface, which continues to gather opinions [3]. These new
discussions were prompted by user space needing a way to determine if resctrl supports
SNC. This started by using the "size" file but, thinking about it more, user space could
also look at whether the number of L3 control domains is different from the number
of L3 monitoring domains? I am adding Maciej for his opinion (please also include him
in future versions of this series).
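To make the domain-count idea concrete, here is a rough user space sketch. It
assumes that control domains can be counted from the "L3:" line of the schemata
file and that monitoring domains show up as mon_L3_XX directories under mon_data.
The function names are made up for illustration, and the sketch does not handle
CDP (L3CODE/L3DATA) or any new files that a later version of this series may add.

```python
import os
import re


def count_l3_control_domains(resctrl_root="/sys/fs/resctrl"):
    """Count the domain entries on the L3 line of the schemata file.

    Assumes a line of the form "L3:0=fff;1=fff", possibly with leading
    whitespace. Returns 0 if no L3 line is present.
    """
    with open(os.path.join(resctrl_root, "schemata")) as f:
        for line in f:
            name, _, domains = line.strip().partition(":")
            if name.strip() == "L3":
                return len([d for d in domains.split(";") if "=" in d])
    return 0


def count_l3_monitor_domains(resctrl_root="/sys/fs/resctrl"):
    """Count the mon_L3_XX directories under mon_data.

    Assumes each L3 monitoring domain is represented by one directory.
    """
    mon_data = os.path.join(resctrl_root, "mon_data")
    if not os.path.isdir(mon_data):
        return 0
    return sum(1 for entry in os.listdir(mon_data)
               if re.fullmatch(r"mon_L3_\d+", entry))


def l3_domain_counts_differ(resctrl_root="/sys/fs/resctrl"):
    """Heuristic only: True when both counts are non-zero and differ."""
    ctrl = count_l3_control_domains(resctrl_root)
    mon = count_l3_monitor_domains(resctrl_root)
    return ctrl > 0 and mon > 0 and ctrl != mon
```

On a mounted resctrl file system, calling l3_domain_counts_differ() with no
arguments would perform the check. Whether differing counts should be treated
as "SNC is enabled" is exactly the open interface question.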
Apart from the user space requirement to know if SNC is supported by resctrl, there
is also the interface with which user space obtains the monitoring data.
James highlighted [1] that the interface used in this series uses existing files to
represent different content, and can thus be considered as "broken". It is not obvious
to me how to "fix" this. Should we continue to explore interfaces like [2] that
attempt to add SNC support into resctrl or should the message continue to be
that SNC "plays havoc with the RDT monitoring features" and users wanting to use
SNC and RDT at the same time are expected to adapt to the peculiar interface ...
or is the preference that after this series "SNC and RDT are compatible" and
thus presented with an intuitive interface?
Reinette
[1] https://lore.kernel.org/lkml/88430722-67b3-4f7d-8db2-95ee52b6f0b0@xxxxxxx/
[2] https://lore.kernel.org/lkml/SJ1PR11MB608309F47C00F964E16205D6FC2D2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
[3] https://lore.kernel.org/lkml/SJ1PR11MB608310C72D7189C139EA6302FC212@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/