Re: [PATCH v3 2/2] x86/resctrl: Remove hard-coded memory bandwidth event configuration

From: Moger, Babu
Date: Fri Jan 05 2024 - 19:13:36 EST


Hi Reinette,

On 1/5/2024 3:18 PM, Reinette Chatre wrote:
Hi Babu,

On 1/4/2024 1:21 PM, Babu Moger wrote:
If the BMEC (Bandwidth Monitoring Event Configuration) feature is
supported, the bandwidth events can be configured. The maximum supported
bandwidth bitmask can be determined by following CPUID command.

CPUID_Fn80000020_ECX_x03 [Platform QoS Monitoring Bandwidth Event
Configuration] Read-only. Reset: 0000_007Fh.
Bits Description
31:7 Reserved
6:0 Identifies the bandwidth sources that can be tracked.

The bandwidth sources can change with the processor generations.
Remove the hard-coded value and detect using CPUID command. Also,
I do not think "Remove the hard-coded value" is accurate anymore.

Will change it to.

"Read the supported bandwidth sources using CPUID command. Also,"

Also I need to update the subject line.


print the valid bitmask when the user tries to configure invalid value.

The CPUID details are documentation in the PPR listed below [1].
"are documentation" -> "are documented"
Sure.

[1] Processor Programming Reference (PPR) Vol 1.1 for AMD Family 19h Model
11h B1 - 55901 Rev 0.25.

Fixes: dc2a3e857981 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
Same comment about "Link:" as for patch 1/2.
Sure.

---
v3: Changed the event_mask name to mbm_cfg_mask. Added comments about the field.
Reverted the masking of event configuration to original code.
Few minor comment changes.

v2: Earlier sent as a part of ABMC feature.
https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@xxxxxxx/
But this is not related to ABMC. Sending it separate now.
Removed the global resctrl_max_evt_bitmask. Added event_mask as part of
the resource.
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++--
3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index d2979748fae4..e3dc35a00a19 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -394,6 +394,8 @@ struct rdt_parse_data {
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
+ * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
+ * Monitoring Event Configuration (BMEC) is supported.
* @cdp_enabled: CDP state of this resource
*
* Members of this structure are either private to the architecture
@@ -408,6 +410,7 @@ struct rdt_hw_resource {
struct rdt_resource *r);
unsigned int mon_scale;
unsigned int mbm_width;
+ unsigned int mbm_cfg_mask;
bool cdp_enabled;
};
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f136ac046851..acca577e2b06 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -813,6 +813,12 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
return ret;
if (rdt_cpu_has(X86_FEATURE_BMEC)) {
+ u32 eax, ebx, ecx, edx;
+
+ /* Detect list of bandwidth sources that can be tracked */
+ cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
+ hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+
if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
mbm_total_event.configurable = true;
mbm_config_rftype_init("mbm_total_bytes_config");
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 69a1de92384a..5b5a8f0ffb2f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1617,12 +1617,14 @@ static void mon_event_config_write(void *info)
static int mbm_config_write_domain(struct rdt_resource *r,
struct rdt_domain *d, u32 evtid, u32 val)
Not specific to this patch, but since the valid mask is per resource I do not think
it is necessary to check user provided value for every domain. The user provided value
can be checked earlier and only once in mon_config_write() before iterating over all
domains to write the value.
Yes. Agree.

{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
struct mon_config_info mon_info = {0};
int ret = 0;
/* mon_config cannot be more than the supported set of events */
- if (val > MAX_EVT_CONFIG_BITS) {
- rdt_last_cmd_puts("Invalid event configuration\n");
+ if ((val & hw_res->mbm_cfg_mask) != val) {
+ rdt_last_cmd_printf("Invalid input: The maximum valid bitmask is 0x%02x\n",
+ hw_res->mbm_cfg_mask);
I think keeping "Invalid event configuration" is useful to create a detailed message of:
"Invalid event configuration: maximum valid bitmask is 0x%02x"

Sure.

Thanks

Babu