Re: [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/

From: Moger, Babu
Date: Tue Apr 15 2025 - 16:29:52 EST


Hi Reinette,

On 4/11/25 17:04, Reinette Chatre wrote:
> Hi Babu
>
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Create the configuration directory and files for mbm_cntr_assign mode.
>> These configurations will be used to assign MBM events in mbm_cntr_assign
>> mode, with two default configurations created upon mounting.
>>
>> Example:
>> $ cd /sys/fs/resctrl/
>> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>> local_reads, remote_reads, local_non_temporal_writes,
>> remote_non_temporal_writes, local_reads_slow_memory,
>> remote_reads_slow_memory, dirty_victim_writes_all
>>
>> $ cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> local_reads, local_non_temporal_writes, local_reads_slow_memory
>>
>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>> ---
>> v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
>> ---
>> Documentation/arch/x86/resctrl.rst | 29 ++++++++++
>> arch/x86/kernel/cpu/resctrl/internal.h | 2 +
>> arch/x86/kernel/cpu/resctrl/monitor.c | 1 +
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 77 ++++++++++++++++++++++++++
>> 4 files changed, 109 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 71ed1cfed33a..99f9f4b9b501 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -306,6 +306,35 @@ with the following files:
>> # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>> 0=30;1=30
>>
>> +"counter_configs:
>
> (mismatched quotes)
>

Sure.

> This organization needs some extra thought ... consider that the section starts with
> "If RDT monitoring is available there will be an "L3_MON" directory
> with the following *files*:"
>

Sure.

>
>> + The directory for storing event configuration files, which will be used to
>> + assign counters when the mbm_cntr_assign mode is enabled.
>
> Needs more imperative tone.

Sure.
>> +
>> + Following types of events are supported:
>> +
>> +	==== ========================== ============================================================
>> +	Bits Name                       Description
>> +	==== ========================== ============================================================
>> +	6    dirty_victim_writes_all    Dirty Victims from the QOS domain to all types of memory
>> +	5    remote_reads_slow_memory   Reads to slow memory in the non-local NUMA domain
>> +	4    local_reads_slow_memory    Reads to slow memory in the local NUMA domain
>> +	3    remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
>> +	2    local_non_temporal_writes  Non-temporal writes to local NUMA domain
>> +	1    remote_reads               Reads to memory in the non-local NUMA domain
>> +	0    local_reads                Reads to memory in the local NUMA domain
>> +	==== ========================== ============================================================
>> +
>> + Two default configurations, mbm_local_bytes and mbm_total_bytes, will be created
>
> "will be created" -> "are created" ... or maybe just:
> There are two default configurations: mbm_local_bytes and mbm_total_bytes.

Looks good.

>
>> + upon mounting.
>
> "upon mounting" seems unnecessary.
>

ok.

>> + ::
>> +
>> + # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>> + local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
>> + local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all
>> +
>> + # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> + local_reads, local_non_temporal_writes, local_reads_slow_memory
>> +
>> "max_threshold_occupancy":
>> Read/write file provides the largest value (in
>> bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index b7d1a59f09f8..a943450bf2c8 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -282,11 +282,13 @@ struct mbm_cntr_cfg {
>> #define RFTYPE_RES_CACHE BIT(8)
>> #define RFTYPE_RES_MB BIT(9)
>> #define RFTYPE_DEBUG BIT(10)
>> +#define RFTYPE_CONFIG BIT(11)
>
> hmmm ... these flags are becoming quite complex. Even so, RFTYPE_CONFIG would be
> unique to this new feature so I think a more specific name would be appropriate.
> Maybe even "RFTYPE_MBM_EVENT_CONFIG".

Sure.

>
>> #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
>> #define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
>> #define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
>> #define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
>> #define RFTYPE_MON_BASE (RFTYPE_BASE | RFTYPE_MON)
>> +#define RFTYPE_MON_CONFIG (RFTYPE_CONFIG | RFTYPE_MON)
>
> Why is this flag needed?
>

Not required. Will remove it.

>>
>> /* List of all resource groups */
>> extern struct list_head rdt_all_groups;
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 58476c065921..4525295b1725 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1264,6 +1264,7 @@ int __init resctrl_mon_resource_init(void)
>>  	if (r->mon.mbm_cntr_assignable) {
>>  		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
>>  		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
>> +		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
>>  	}
>>
>>  	return 0;
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index aba23e2096db..b2122a1dd36c 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1907,6 +1907,25 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>  	return ret ?: nbytes;
>> }
>>
>> +static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>> +{
>> +	struct mbm_assign_config *assign_config = of->kn->parent->priv;
>> +	bool sep = false;
>> +	int i;
>> +
>> +	for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
>> +		if (assign_config->val & mbm_evt_values[i].evt_val) {
>> +			if (sep)
>> +				seq_puts(seq, ", ");
>
> seq_putc()

Sure.

>
>> +			seq_printf(seq, "%s", mbm_evt_values[i].evt_name);
>> +			sep = true;
>> +		}
>> +	}
>> +	seq_puts(seq, "\n");
> seq_putc()

Sure.
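
As a point of reference for the expected output format, here is a minimal
userspace sketch of the same bitmask-to-name walk, writing into a
caller-supplied buffer instead of a seq_file. The table mirrors the bit
layout from the documentation hunk above; evt_names and
format_event_filter() are illustrative names and are not part of the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Event names indexed by bit position, per the table in resctrl.rst. */
static const char *const evt_names[] = {
	"local_reads",                /* bit 0 */
	"remote_reads",               /* bit 1 */
	"local_non_temporal_writes",  /* bit 2 */
	"remote_non_temporal_writes", /* bit 3 */
	"local_reads_slow_memory",    /* bit 4 */
	"remote_reads_slow_memory",   /* bit 5 */
	"dirty_victim_writes_all",    /* bit 6 */
};

#define NUM_EVT_NAMES (sizeof(evt_names) / sizeof(evt_names[0]))

/*
 * Write the comma-separated names of all bits set in @val into @buf,
 * followed by a newline. Returns the length written, or -1 if @buf
 * is too small. (Hypothetical helper, for illustration only.)
 */
static int format_event_filter(unsigned int val, char *buf, size_t len)
{
	size_t pos = 0;
	size_t i;
	int sep = 0;
	int n;

	for (i = 0; i < NUM_EVT_NAMES; i++) {
		if (!(val & (1u << i)))
			continue;
		/* Emit the separator only between names, never before the first. */
		n = snprintf(buf + pos, len - pos, "%s%s", sep ? ", " : "", evt_names[i]);
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;
		sep = 1;
	}

	n = snprintf(buf + pos, len - pos, "\n");
	if (n < 0 || (size_t)n >= len - pos)
		return -1;

	return (int)(pos + (size_t)n);
}
```

With val = 0x15 (bits 0, 2 and 4) this produces the mbm_local_bytes line
from the commit message, and 0x7f produces the mbm_total_bytes one as a
single unwrapped line.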

>> +
>> +	return 0;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>> @@ -2019,6 +2038,12 @@ static struct rftype res_common_files[] = {
>>  		.seq_show	= mbm_local_bytes_config_show,
>>  		.write		= mbm_local_bytes_config_write,
>>  	},
>> +	{
>> +		.name		= "event_filter",
>> +		.mode		= 0444,
>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>> +		.seq_show	= event_filter_show,
>> +	},
>>  	{
>>  		.name		= "mbm_assign_mode",
>>  		.mode		= 0444,
>> @@ -2314,6 +2339,52 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>  	return ret;
>> }
>>
>> +static int resctrl_mkdir_info_configs(void *priv, char *name, unsigned long fflags)
>
> Why a void * instead of struct rdt_resource *?

Yes. Will change it.

>
> Also please fix spacing.

Sure.

>
> Also, why do fflags need to be provided as parameter? These are so custom I think the
> hardcoding should be contained here instead of the caller. With this the function name

Will remove the fflags parameter.

> can also be made specific to what it does ... perhaps "resctrl_mkdir_counter_configs()"
> (please feel free to improve).

Sounds good.

>
>
>> +{
>> +	struct kernfs_node *l3_mon_kn, *kn_subdir, *kn_subdir2;
>> +	int ret, i;
>> +
>> +	l3_mon_kn = kernfs_find_and_get(kn_info, name);
>> +	if (!l3_mon_kn)
>> +		return -ENOENT;
>> +
>> +	kn_subdir = kernfs_create_dir(l3_mon_kn, "counter_configs", l3_mon_kn->mode, priv);
>> +	if (IS_ERR(kn_subdir)) {
>> +		kernfs_put(l3_mon_kn);
>> +		return PTR_ERR(kn_subdir);
>> +	}
>> +
>> +	ret = rdtgroup_kn_set_ugid(kn_subdir);
>> +	if (ret) {
>> +		kernfs_put(l3_mon_kn);
>> +		return ret;
>> +	}
>> +
>> +	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
>
> This can instead work through the resource's evt_list and use a flag (TBD how to
> adapt "configurable") to determine if a directory should be created for it.

Yes. Will look into this.

>
>> +		kn_subdir2 = kernfs_create_dir(kn_subdir, mbm_assign_configs[i].name,
>> +					       kn_subdir->mode, &mbm_assign_configs[i]);
>> +		if (IS_ERR(kn_subdir)) {
>
> IS_ERR(kn_subdir2)?

Yes.
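
For anyone following along: kn_subdir was already validated before the
loop, so IS_ERR(kn_subdir) cannot catch a failed create here, and an error
pointer in kn_subdir2 would be passed on to rdtgroup_kn_set_ugid(). A
minimal userspace sketch of the intended pattern is below. The
ERR_PTR()/IS_ERR()/PTR_ERR() helpers are simplified stand-ins modelled on
include/linux/err.h, and fake_create_dir() and create_children() are
hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_node;

/* Hypothetical creator: fails with -EEXIST for the name "dup". */
static void *fake_create_dir(const char *name)
{
	if (strcmp(name, "dup") == 0)
		return ERR_PTR(-EEXIST);
	return &dummy_node;
}

/*
 * Create one child per name. The check must be on the pointer that was
 * just returned (child), not on the already-validated parent.
 */
static int create_children(void *parent, const char *const *names, int n)
{
	int i;

	if (IS_ERR(parent))
		return (int)PTR_ERR(parent);

	for (i = 0; i < n; i++) {
		void *child = fake_create_dir(names[i]);

		if (IS_ERR(child))
			return (int)PTR_ERR(child);
	}

	return 0;
}
```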

>
>> +			ret = PTR_ERR(kn_subdir2);
>> +			goto config_out;
>> +		}
>> +
>> +		ret = rdtgroup_kn_set_ugid(kn_subdir2);
>> +		if (ret)
>> +			goto config_out;
>> +
>> +		ret = rdtgroup_add_files(kn_subdir2, fflags);
>> +		if (!ret)
>> +			kernfs_activate(kn_subdir);
>> +	}
>> +
>> +config_out:
>> +	kernfs_put(l3_mon_kn);
>> +	if (ret)
>> +		kernfs_remove(kn_subdir);
>> +
>> +	return ret;
>> +}
>> +
>> static unsigned long fflags_from_resource(struct rdt_resource *r)
>> {
>>  	switch (r->rid) {
>> @@ -2360,6 +2431,12 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>>  		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
>>  		if (ret)
>>  			goto out_destroy;
>> +
>> +		if (r->mon.mbm_cntr_assignable) {
>> +			ret = resctrl_mkdir_info_configs(r, name, RFTYPE_MON_CONFIG);
>> +			if (ret)
>> +				goto out_destroy;
>> +		}
>>  	}
>>
>>  	ret = rdtgroup_kn_set_ugid(kn_info);
>
> Reinette
>

--
Thanks
Babu Moger