Re: [PATCH v5 2/2] perf pmu intel: Adjust cpumasks for sub-NUMA clusters on Emeraldrapids

From: Chun-Tse Shao

Date: Fri May 15 2026 - 14:02:23 EST


Thanks, I have submitted the v6 patch:
lore.kernel.org/20260515172710.428474-1-ctshao@xxxxxxxxxx
I also added SPR (Sapphirerapids) to the SNC2 handling in that patch.
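
For reference, here is a minimal standalone sketch of one way the cpuid
string literals could be shared between the two call sites, as suggested
in the review below. It is not necessarily what v6 does. The names
CPUID_EMR, CPUID_GNR, cpuid_matches(), snc_is_supported() and
snc2_map_for() are hypothetical, and cpuid_matches() is only a stand-in
built on POSIX regex, not perf's strcmp_cpuid_str().

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared definitions: each cpuid pattern is spelled once. */
#define CPUID_EMR "GenuineIntel-6-CF"    /* Emeraldrapids */
#define CPUID_GNR "GenuineIntel-6-A[DE]" /* Graniterapids */

static const char *const snc_supported_cpuids[] = { CPUID_EMR, CPUID_GNR };

/* SNC2 lookup tables as they appear in the quoted patch. */
static const uint8_t emr_snc2_map[] = { 0, 0, 1, 1 };
static const uint8_t gnr_snc2_map[] = { 1, 1, 0, 0 };

/*
 * Stand-in matcher (assumption, not perf's helper): returns 1 when the
 * pattern matches starting at the first character of cpuid, else 0.
 */
static int cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	regmatch_t m;
	int ok;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return 0;
	ok = regexec(&re, cpuid, 1, &m, 0) == 0 && m.rm_so == 0;
	regfree(&re);
	return ok;
}

/* First call site: is SNC supported on this CPU model at all? */
static int snc_is_supported(const char *cpuid)
{
	size_t i;
	size_t n = sizeof(snc_supported_cpuids) / sizeof(snc_supported_cpuids[0]);

	for (i = 0; i < n; i++) {
		if (cpuid_matches(snc_supported_cpuids[i], cpuid))
			return 1;
	}
	return 0;
}

/* Second call site: pick the SNC2 map for this model, or NULL if unknown. */
static const uint8_t *snc2_map_for(const char *cpuid, size_t *len)
{
	assert(cpuid && len);

	if (cpuid_matches(CPUID_EMR, cpuid)) {
		*len = sizeof(emr_snc2_map);
		return emr_snc2_map;
	}
	if (cpuid_matches(CPUID_GNR, cpuid)) {
		*len = sizeof(gnr_snc2_map);
		return gnr_snc2_map;
	}
	*len = 0;
	return NULL;
}
```

With the patterns defined once, adding a new model means touching one
macro and one table entry rather than two separate string literals.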

-CT

On Thu, Apr 9, 2026 at 9:43 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Tue, Apr 07, 2026 at 01:38:43PM -0700, Chun-Tse Shao wrote:
> > Similar to GNR [1], Emeraldrapids supports sub-NUMA clusters as well.
> > Adjust cpumasks using the same logic as for GNR in [1].
> >
> > Tested on Emeraldrapids with SNC2 enabled:
> > $ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a -- sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > N0 30 72125876670 UNC_CHA_CLOCKTICKS
> > N0 4 8815163648 UNC_M_CLOCKTICKS
> > N1 30 72124958844 UNC_CHA_CLOCKTICKS
> > N1 4 8815014974 UNC_M_CLOCKTICKS
> > N2 30 72121049022 UNC_CHA_CLOCKTICKS
> > N2 4 8814592626 UNC_M_CLOCKTICKS
> > N3 30 72117133854 UNC_CHA_CLOCKTICKS
> > N3 4 8814012840 UNC_M_CLOCKTICKS
> >
> > 1.001574118 seconds time elapsed
> >
> > [1] lore.kernel.org/20250515181417.491401-1-irogers@xxxxxxxxxx
> >
> > Reviewed-by: Zide Chen <zide.chen@xxxxxxxxx>
> > Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
> > Signed-off-by: Chun-Tse Shao <ctshao@xxxxxxxxxx>
> > ---
> > tools/perf/arch/x86/util/pmu.c | 56 +++++++++++++++++++++++-----------
> > 1 file changed, 38 insertions(+), 18 deletions(-)
> >
> > diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> > index 938be36ec0f7..3743f5145505 100644
> > --- a/tools/perf/arch/x86/util/pmu.c
> > +++ b/tools/perf/arch/x86/util/pmu.c
> > @@ -30,8 +30,9 @@ static bool x86__is_snc_supported(void)
> >
> > if (!checked_if_snc_supported) {
> >
> > - /* Graniterapids supports SNC configuration. */
> > + /* Emeraldrapids and Graniterapids support SNC configuration. */
> > static const char *const supported_cpuids[] = {
> > + "GenuineIntel-6-CF", /* Emeraldrapids */
> > "GenuineIntel-6-A[DE]", /* Graniterapids */
>
> It'd be great if we can share these string literals..
>
>
> > };
> > char *cpuid = get_cpuid_str((struct perf_cpu){0});
> > @@ -141,23 +142,42 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
> > // Compute the IMC SNC using lookup tables.
> > unsigned int imc_num;
> > int snc_nodes = snc_nodes_per_l3_cache();
> > - const u8 snc2_map[] = {1, 1, 0, 0};
> > - const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
> > - const u8 *snc_map;
> > - size_t snc_map_len;
> > -
> > - switch (snc_nodes) {
> > - case 2:
> > - snc_map = snc2_map;
> > - snc_map_len = ARRAY_SIZE(snc2_map);
> > - break;
> > - case 3:
> > - snc_map = snc3_map;
> > - snc_map_len = ARRAY_SIZE(snc3_map);
> > - break;
> > - default:
> > - /* Error or no lookup support for SNC with >3 nodes. */
> > - return 0;
> > + char *cpuid;
> > + static const u8 emr_snc2_map[] = { 0, 0, 1, 1 };
> > + static const u8 gnr_snc2_map[] = { 1, 1, 0, 0 };
> > + static const u8 snc3_map[] = { 1, 1, 0, 0, 2, 2 };
> > + static const u8 *snc_map;
> > + static size_t snc_map_len;
> > +
> > + /* snc_map is not initialized yet. Look it up only once to avoid expensive operations. */
> > + if (!snc_map) {
> > + switch (snc_nodes) {
> > + case 2:
> > + cpuid = get_cpuid_str((struct perf_cpu){ 0 });
> > + if (cpuid) {
> > + if (strcmp_cpuid_str("GenuineIntel-6-CF", cpuid) == 0) {
> > + snc_map = emr_snc2_map;
> > + snc_map_len = ARRAY_SIZE(emr_snc2_map);
> > + } else if (strcmp_cpuid_str("GenuineIntel-6-A[DE]", cpuid) == 0) {
> > + snc_map = gnr_snc2_map;
> > + snc_map_len = ARRAY_SIZE(gnr_snc2_map);
>
> ... in here as well.
>
> Thanks,
> Namhyung
>
>
> > + }
> > + free(cpuid);
> > + }
> > + break;
> > + case 3:
> > + snc_map = snc3_map;
> > + snc_map_len = ARRAY_SIZE(snc3_map);
> > + break;
> > + default:
> > + /* Error or no lookup support for SNC with >3 nodes. */
> > + return 0;
> > + }
> > +
> > + if (!snc_map) {
> > + pr_warning("Unexpected: cannot find SNC map config\n");
> > + return 0;
> > + }
> > }
> >
> > /* Compute SNC for PMU. */
> > --
> > 2.53.0.1213.gd9a14994de-goog
> >