Re: [PATCH v8 14/25] x86/resctrl: Add data structures and definitions for ABMC assignment

From: Reinette Chatre
Date: Tue Oct 15 2024 - 23:22:07 EST


Hi Babu,

On 10/9/24 10:39 AM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as the
> counter is assigned. The bandwidth events will be tracked by the hardware
> until the user changes the configuration. Each resctrl group can configure
> maximum two counters, one for total event and one for local event.
>
> The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
> Configuration is done by setting the counter id, bandwidth source (RMID)
> and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
> Configuration).
>
> Attempts to read or write the MSR when ABMC is not enabled will result
> in a #GP(0) exception.
>
> Introduce the data structures and definitions for MSR L3_QOS_ABMC_CFG
> (0xC000_03FDh):
> =========================================================================
> Bits Mnemonic Description Access Reset
> Type Value
> =========================================================================
> 63 CfgEn Configuration Enable R/W 0
>
> 62 CtrEn Enable/disable counting R/W 0
>
> 61:53 – Reserved MBZ 0
>
> 52:48 CtrID Counter Identifier R/W 0
>
> 47 IsCOS BwSrc field is a CLOSID R/W 0
> (not an RMID)
>
> 46:44 – Reserved MBZ 0
>
> 43:32 BwSrc Bandwidth Source R/W 0
> (RMID or CLOSID)
>
> 31:0 BwType Bandwidth configuration R/W 0
> to track for this counter
> ==========================================================================
>
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> ---

...

> ---
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/kernel/cpu/resctrl/internal.h | 33 ++++++++++++++++++++++++++
> 2 files changed, 34 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 43c9dc473aba..2c281c977342 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1196,6 +1196,7 @@
> #define MSR_IA32_SMBA_BW_BASE 0xc0000280
> #define MSR_IA32_EVT_CFG_BASE 0xc0000400
> #define MSR_IA32_L3_QOS_EXT_CFG 0xc00003ff
> +#define MSR_IA32_L3_QOS_ABMC_CFG 0xc00003fd
>

As Tony mentioned, also please correct order of this MSR.

> /* AMD-V MSRs */
> #define MSR_VM_CR 0xc0010114
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 86e3e188c119..de397468b945 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -602,6 +602,39 @@ union cpuid_0x10_x_edx {
> unsigned int full;
> };
>
> +/*
> + * ABMC counters can be configured by writing to L3_QOS_ABMC_CFG.

"ABMC counters are configured by writing to L3_QOS_ABMC_CFG."

> + * Reading L3_QOS_ABMC_DSC returns the configuration of the counter id
> + * specified in L3_QOS_ABMC_CFG.cntr_id.

First and only mention/use of L3_QOS_ABMC_DSC in this series. If this register
is not used then references to it can be removed.

> + * @bw_type : Bandwidth configuration(supported by BMEC)

"configuration(supported" -> "configuration (supported"

> + * tracked by the @cntr_id.
> + * @bw_src : Bandwidth source (RMID or CLOSID).
> + * @reserved1 : Reserved.
> + * @is_clos : @bw_src field is a CLOSID (not an RMID).
> + * @cntr_id : Counter identifier.
> + * @reserved : Reserved.
> + * @cntr_en : Counting enable bit.
> + * @cfg_en : Configuration enable bit.
> + *
> + * Configuration and counting:
> + * cfg_en=0, : No configuration changes applied.

Can this be expanded? (sidenote: It is taking a long time to get clarity on how
to interact with hardware. These incremental cryptic fragments make it difficult
to know how to interact with the hardware.)

For example, "No configuration changes applied. Counter can be configured across
multiple writes to MSR while @cfg_en=0. Configuration applied when @cfg_en=1."

> + * cfg_en=1, cntr_en=0 : Configure cntr_id and but no counting the events.

hmmm ... still the same (""but no counting the events") strange language I
highlighted in V7 ...

I think it will make things easier to understand if similar language is used
between the descriptions of the different fields.

"Apply @cntr_id configuration but do not count events."

> + * cfg_en=1, cntr_en=1 : Configure cntr_id and start counting the events.

"Apply @cntr_id configuration and start counting events."

Can it be added here which of these settings (or combination of settings) result
in counters being reset?

> + */
> +union l3_qos_abmc_cfg {
> + struct {
> + unsigned long bw_type :32,
> + bw_src :12,
> + reserved1: 3,
> + is_clos : 1,
> + cntr_id : 5,
> + reserved : 9,
> + cntr_en : 1,
> + cfg_en : 1;
> + } split;
> + unsigned long full;
> +};
> +
> void rdt_last_cmd_clear(void);
> void rdt_last_cmd_puts(const char *s);
> __printf(1, 2)

Reinette