Hi Babu,
On 12/12/24 12:15 PM, Babu Moger wrote:
Introduce the interface file "mbm_assign_mode" to list monitor modes
supported.
The "mbm_cntr_assign" mode provides the option to assign a counter to
an RMID, event pair and monitor the bandwidth as long as it is assigned.
On AMD systems "mbm_cntr_assign" is backed by the ABMC (Assignable
Bandwidth Monitoring Counters) hardware feature and is enabled by default.
The "default" mode is the existing monitoring mode that works without the
explicit counter assignment, instead relying on dynamic counter assignment
by hardware that may result in hardware not dedicating a counter resulting
in monitoring data reads returning "Unavailable".
Provide an interface to display the monitor mode on the system.
$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_cntr_assign]
default
Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
---
v10: Added few more text to user documentation clarify on the default mode.
v9: Updated user documentation based on comments.
v8: Commit message update.
v7: Updated the descriptions/commit log in resctrl.rst to generic text.
Thanks to James and Reinette.
Rename mbm_mode to mbm_assign_mode.
Introduced mutex lock in rdtgroup_mbm_mode_show().
v6: Added documentation for mbm_cntr_assign and legacy mode.
Moved mbm_mode fflags initialization to static initialization.
v5: Changed interface name to mbm_mode.
It will be always available even if ABMC feature is not supported.
Added description in resctrl.rst about ABMC mode.
Fixed display abmc and legacy consistantly.
v4: Fixed the checks for legacy and abmc mode. Default it ABMC.
v3: New patch to display ABMC capability.
---
Documentation/arch/x86/resctrl.rst | 33 ++++++++++++++++++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 31 ++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 30586728a4cd..1e4a1f615496 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -257,6 +257,39 @@ with the following files:
# cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
0=0x30;1=0x30;3=0x15;4=0x15
+"mbm_assign_mode":
+ Reports the list of monitoring modes supported. The enclosed brackets
+ indicate which mode is enabled.
+ ::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+ [mbm_cntr_assign]
+ default
+
+ "mbm_cntr_assign":
+
The text below jumps into usage with assumption that user space already
understands the feature. How about starting with some context? Something like
"A monitoring event can only accumulate data while it is backed by a hardware
counter."
+ In mbm_cntr_assign mode user-space is able to specify which of the
+ events in CTRL_MON or MON groups should have a counter assigned using the
+ "mbm_assign_control" file. The number of counters available is described
+ in the "num_mbm_cntrs" file. Changing the mode may cause all counters on
+ a resource to reset.
+
+ The mode is useful on AMD platforms which support more CTRL_MON and MON
+ groups than hardware counters, meaning 'unassigned' events on CTRL_MON or
+ MON groups will report 'Unavailable'.
The "meaning 'unassigned'" is not clear to me since in "mbm_cntr_assign" mode
these events will (at end of this series) actually return "Unassigned", no? Perhaps
this portion can be dropped for now and the text found in patch #20 about returning
"Unassigned" can be placed here instead ... but this should probably be done in
patch #19 that adds that capability.
+
+ AMD Platforms with ABMC (Assignable Bandwidth Monitoring Counters) feature
+ enable this mode by default so that counters remain assigned even when the
+ corresponding RMID is not in use by any processor.
+
+ "default":
+
+ In default mode, resctrl assumes there is a hardware counter for each
+ event within every CTRL_MON and MON group. On AMD platforms, it is
+ recommended to use mbm_cntr_assign mode if supported, because reading
+ "mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable' if
+ there is no counter associated with that event.
+
"max_threshold_occupancy":
Read/write file provides the largest value (in
bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d54c2701c09c..f25ff1430014 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -845,6 +845,30 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
return ret;
}
+static int rdtgroup_mbm_assign_mode_show(struct kernfs_open_file *of,
+ struct seq_file *s, void *v)
I remember this topic from earlier version yet I still see many instances
of the "rdtgroup_" namespace used for functions that do not interact with
resource groups. Could you please check this series and fix this?
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ mutex_lock(&rdtgroup_mutex);
+
+ if (r->mon.mbm_cntr_assignable) {
+ if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+ seq_puts(s, "[mbm_cntr_assign]\n");
+ seq_puts(s, "default\n");
+ } else {
+ seq_puts(s, "mbm_cntr_assign\n");
+ seq_puts(s, "[default]\n");
+ }
+ } else {
+ seq_puts(s, "[default]\n");
+ }
+
+ mutex_unlock(&rdtgroup_mutex);
+
+ return 0;
+}
+
#ifdef CONFIG_PROC_CPU_RESCTRL
/*
@@ -1901,6 +1925,13 @@ static struct rftype res_common_files[] = {
.seq_show = mbm_local_bytes_config_show,
.write = mbm_local_bytes_config_write,
},
+ {
+ .name = "mbm_assign_mode",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdtgroup_mbm_assign_mode_show,
+ .fflags = RFTYPE_MON_INFO,
+ },
{
.name = "cpus",
.mode = 0644,
Reinette