Re: [PATCH V2 12/13] selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth

From: Ilpo Järvinen
Date: Fri Oct 04 2024 - 10:23:37 EST


On Thu, 12 Sep 2024, Reinette Chatre wrote:

> The MBA test incrementally throttles memory bandwidth, each time
> followed by a comparison between the memory bandwidth observed
> by the performance counters and resctrl respectively.
>
> While a comparison between performance counters and resctrl is
> generally appropriate, they do not have an identical view of
> memory bandwidth. For example RAS features or memory performance
> features that generate memory traffic may drive accesses that are
> counted differently by performance counters and MBM respectively,
> for instance generating "overhead" traffic which is not counted
> against any specific RMID. As a ratio, this different view of memory
> bandwidth becomes more apparent at low memory bandwidths.
>
> It is not practical to enable/disable the various features that
> may generate memory bandwidth to give performance counters and
> resctrl an identical view. Instead, do not compare performance
> counters and resctrl view of memory bandwidth when the memory
> bandwidth is low.
>
> Bandwidth throttling behaves differently across platforms
> so it is not appropriate to drop measurement data simply based
> on the throttling level. Instead, use a threshold of 750MiB
> that has been observed to support adequate comparison between
> performance counters and resctrl.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> ---
> Changes since V1:
> - Fix code alignment and spacing.
> - Modify flow to use "continue" instead of "break" now that
> earlier changes decrease throttling.
> - Expand comment of define to elaborate causes of discrepancy
> between performance counters and MBM.
> ---
> tools/testing/selftests/resctrl/mba_test.c | 7 +++++++
> tools/testing/selftests/resctrl/resctrl.h | 10 ++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index d8d9637c1951..5c6063d0a77c 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -98,6 +98,13 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
>
>  		avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
>  		avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
> +		if (avg_bw_imc < THROTTLE_THRESHOLD || avg_bw_resc < THROTTLE_THRESHOLD) {
> +			ksft_print_msg("Bandwidth below threshold (%d MiB). Dropping results from MBA schemata %u.\n",
> +				       THROTTLE_THRESHOLD,
> +				       ALLOCATION_MIN + ALLOCATION_STEP * allocation);
> +			continue;
> +		}
> +
>  		avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
>  		avg_diff_per = (int)(avg_diff * 100);
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index dc01dc75cba5..eb151ac74938 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -43,6 +43,16 @@
>
> #define DEFAULT_SPAN (250 * MB)
>
> +/*
> + * Memory bandwidth (in MiB) below which the bandwidth comparisons
> + * between iMC and resctrl are considered unreliable. For example RAS
> + * features or memory performance features that generate memory traffic
> + * may drive accesses that are counted differently by performance counters
> + * and MBM respectively, for instance generating "overhead" traffic which
> + * is not counted against any specific RMID.
> + */
> +#define THROTTLE_THRESHOLD 750
> +
> /*
> * fill_buf_param: "fill_buf" benchmark parameters
> * @buf_size: Size (in bytes) of buffer used in benchmark.
>
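
For readers following the thread, the check added above can be sketched
in isolation roughly as follows. This is only an illustration, not code
from the patch: bw_below_threshold() and bw_diff_percent() are
hypothetical helper names, and only THROTTLE_THRESHOLD and the
arithmetic are taken from the quoted hunks.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Same value as THROTTLE_THRESHOLD in the quoted resctrl.h hunk (MiB). */
#define THROTTLE_THRESHOLD 750

/*
 * Hypothetical helper: returns true when either averaged bandwidth is
 * below the threshold, i.e. when the patch would skip the comparison.
 */
bool bw_below_threshold(unsigned long avg_bw_imc, unsigned long avg_bw_resc)
{
	return avg_bw_imc < THROTTLE_THRESHOLD ||
	       avg_bw_resc < THROTTLE_THRESHOLD;
}

/*
 * Hypothetical helper: difference between the two views as a percentage
 * of the iMC value, mirroring the avg_diff/avg_diff_per computation.
 * The caller must ensure avg_bw_imc is non-zero.
 */
int bw_diff_percent(unsigned long avg_bw_imc, unsigned long avg_bw_resc)
{
	float avg_diff = (float)labs((long)avg_bw_resc - (long)avg_bw_imc) /
			 avg_bw_imc;

	return (int)(avg_diff * 100);
}
```

The ratio effect described in the commit message follows from the
division by avg_bw_imc: with made-up numbers, a fixed 50 MiB
discrepancy is 10% of a 500 MiB measurement but only 1% of a 5000 MiB
one, so the same "overhead" traffic is far more visible at low
bandwidth.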

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>

--
i.