RE: [PATCH v5 8/8] selftests/resctrl: Adjust effective L3 cache size when SNC enabled

From: Shaopeng Tan (Fujitsu)
Date: Thu Sep 07 2023 - 01:23:07 EST


Hi Tony,

> Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
> nodes. Systems may support splitting into either two or four nodes.
>
> When SNC mode is enabled the effective amount of L3 cache available for
> allocation is divided by the number of nodes per L3.
>
> Detect which SNC mode is active by comparing the number of CPUs that share
> a cache with CPU0, with the number of CPUs on node0.
>
> This gives some hope of tests passing. But additional test infrastructure
> changes are needed to bind tests to nodes and guarantee memory allocation
> from the local node.
>
> Reported-by: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@xxxxxxxxxxx>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> tools/testing/selftests/resctrl/resctrl.h | 1 +
> tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++++
> 2 files changed, 58 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 87e39456dee0..a8b43210b573 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -13,6 +13,7 @@
> #include <signal.h>
> #include <dirent.h>
> #include <stdbool.h>
> +#include <ctype.h>
> #include <sys/stat.h>
> #include <sys/ioctl.h>
> #include <sys/mount.h>
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> index fb00245dee92..79eecbf9f863 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -130,6 +130,61 @@ int get_resource_id(int cpu_no, int *resource_id)
> return 0;
> }
>
> +/*
> + * Count number of CPUs in a /sys bit map
> + */
> +static int count_sys_bitmap_bits(char *name)
> +{
> + FILE *fp = fopen(name, "r");
> + int count = 0, c;
> +
> + if (!fp)
> + return 0;
> +
> + while ((c = fgetc(fp)) != EOF) {
> + if (!isxdigit(c))
> + continue;
> + switch (c) {
> + case 'f':
> + count++;
> + case '7': case 'b': case 'd': case 'e':
> + count++;
> + case '3': case '5': case '6': case '9': case 'a': case 'c':
> + count++;
> + case '1': case '2': case '4': case '8':
> + count++;
> + }
> + }
> + fclose(fp);
> +
> + return count;
> +}
> +
> +/*
> + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0
> + * Try to get this right, even if a few CPUs are offline so that the number
> + * of CPUs in node0 is not exactly half or a quarter of the CPUs sharing the
> + * LLC of CPU0.
> + */
> +static int snc_ways(void)
> +{
> + int node_cpus, cache_cpus;
> +
> + node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
> + cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
> +
> + if (!node_cpus || !cache_cpus) {
> + fprintf(stderr, "Warning could not determine Sub-NUMA Cluster mode\n");
> + return 1;
> + }
> +
> + if (4 * node_cpus >= cache_cpus)
> + return 4;
> + else if (2 * node_cpus >= cache_cpus)
> + return 2;


If "4 * node_cpus >= cache_cpus" is not true, then
"2 * node_cpus >= cache_cpus" can never be true either,
so the "return 2" branch is unreachable.
Should the checks be in the following order instead?

+ if (2 * node_cpus >= cache_cpus)
+ return 2;
+ else if (4 * node_cpus >= cache_cpus)
+ return 4;

Best regards,
Shaopeng TAN