Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
From: Reinette Chatre
Date: Wed Mar 04 2026 - 12:08:08 EST

Hi Ben,

On 3/4/26 3:07 AM, Ben Horgan wrote:
> Hi Reinette,
>
> On 3/3/26 18:09, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>> Hi Reinette,
>>>
>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>> Hi Ben,
>>>>
>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>> The features BMEC and ABMC provide separate interfaces for configuring which
>>>>> bandwidth types a counter tracks. Currently
>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>> supported.
>>>>>
>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>> assignment which reduces the number of hardware monitors a system
>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>
>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>
>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>> these two features are independent and the bandwidth types are limited to a
>>>>> choice of only read or write.
>>>>
>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>> a feature that allows user to configure events globally per domain and another
>>>> feature that allows user to configure events per PMG?
>>>
>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>> or both.
>>
>> Thank you for confirming.
>>
>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>
>>
>>>> These different features are how I understand assignable counters and BMEC to
>>>
>>> We are each approaching this from a different view point. I've just been looking at
>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>
>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>> counter mode the counter assignment is per monitoring group AND event as a pair:
>> (CTRL_MON/MON group, event).
>
> Yes but these counters aren't necessarily fungible. For MPAM the
> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
> hardware counters. An MPAM bandwidth counter just counts all traffic on
> a link with the only configurability being for read/write. The counters
> are just placed at different point in the topology to get the different
> events.

The distinction between "different hardware counters for mbm_local_bytes and
mbm_total_bytes" and "The counters are just placed at different point in the
topology" is not clear to me. The former implies different counters for the
two events while the latter implies the same counters are used for both events
but perhaps accumulated/displayed differently?

I re-read the thread starting with
https://lore.kernel.org/lkml/CALPaoCh+mRLJEfhKBve3hRf+vHHoObjvWRt74OfpopgtR9g9FQ@xxxxxxxxxxxxxx/
and it sounded to me as though MPAM would only expose the mbm_total_bytes event.

Ignoring for a moment that counters could be configured to count different
transactions, and so assuming all counters count the same transactions, could
you please clarify how MPAM determines the counts returned by
mbm_local_bytes and mbm_total_bytes respectively?

>>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
>>
>> It is not clear to me how sharing counters is at play here.
>
> I was just saying it wasn't possible for bandwidth counters. For
> llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
> recount to get the current cache occupancy.

ack.

>>> The intent of this patch is to allow splitting these two features of ABMC,
>>> bandwidth type configuration and hardware counter assignment in order to just
>>
>> Why keep BMEC, which by its name does event configuration? And then on top
>> of that it is event configuration at a scope that MPAM does not support?
>>
>>> support the hardware counter assignment.
>>>
>>> I'm still not understanding the distinction you are making though.
>>> The files are,
>>> With ABMC:
>>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
>>
>> This is an event configuration that is global without any assignment. This
>> interface communicates to user space which transactions are counted when
>> this particular event is assigned to a CTRL_MON/MON group. This interface
>> is intended to be extensible. The interface starts with the original mbm_local_bytes
>> and mbm_total_bytes events in order to be backward compatible. The vision is that
>> if the user prefers to count different transactions then they could create
>> a new event with the transactions needing counting. For example,
>>
>> # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>> # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
>>
>> The events are just tracked and managed in software with the above interface,
>> no hardware configuration is involved at this point in the above example*.
>>
>> The new "just_local_slow" can then be assigned to a monitor group via
>> mbm_L3_assignments that will at that time consume one hardware counter and
>> program it with the event (which transactions to monitor) and monitor group
>> details (PARTID, PMG).
>>
>> This is based on original suggestion by Peter in a way that we thus expect to
>> work for customers. See [1].
>>
>>> and with BMEC they are:
>>> info/L3_MON/mbm_[local,total]_bytes_config
>
> I see, this makes the intent much clearer to me. Thanks for sharing this
> plan. I think the general idea is good. To me this implies that for MPAM
> to support event configuration we'd want ABMC enabled at the same time.
> Which indeed makes sense as you can then count read and write
> separately for a given CTRL_MON/MON group without requiring twice the
> number of hardware counters.
>
> However, I now spot an existing issue: bundling mbm_local_bytes and
> mbm_total_bytes together for one pool of counters doesn't work for MPAM.
> As noted above they require different sets of hardware counters. With
> the current counter assignment mode interface the num_mbm_cntrs is
> scoped to all mbm counters. In an MPAM system that supports both
> mbm_local_bytes and mbm_total_bytes this could lead to
> num_mbm_total_cntrs and a num_mbm_local_cntrs or something equivalent.

Is this just needed because the MPAM driver does not support counter
configuration yet?

>> This is essentially both an event configuration and assignment that is not
>> compatible with assignable counters. With this interface the user
>> both configures which transactions are counted by a particular event and
>> programs all counters in a domain (across all resource groups) to use that
>> particular configuration. Due to this incompatibility resctrl fs will not expose
>> BMEC files when assignable counters are enabled.
>>
>>
>>> In both cases they allow configuration for two event types,
>>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
>>
>> The way I see it:
>> BMEC: per domain across all resource groups event configuration and assignment that
>> applies to all counters - intended to support the "default" mode where there
>> is no counter assignment from user space.
>> assignable counters: event configuration via event_filter with assignment done
>> separately using per resource group mbm_L3_assignments file
>
> Makes sense.
>
>>
>>>
>>>> be and to support both at the same time requires a user interface that is
>>>> confusing since the user can concurrently configure events globally per-domain
>>>> and per resource group.
>>>
>>> Sure.
>>>
>>>>
>>>> Could you please elaborate how event configuration works on MPAM? I find this
>>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>>> capabilities are and how you expect resctrl fs to expose these features to
>>>> an MPAM user and how said user is expected to interact with resctrl fs to use
>>>> the features.
>>>
>>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>>> counting events and otherwise not.
>>
>> ok. This sounds like assignable counters to me. I do not believe BMEC comes
>> into play.
>>
>>>
>>> I haven't put much thought into how we would support event configuration with
>>> MPAM but we would want something that allows the configuration per hardware
>>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
>>
>> This is what assignable counters already does, no?
>
> Isn't that only with the future plan you shared above?

Assigning a counter to a (PARTID, PMG) pair is what assignable counters does
today.

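As a rough illustration of the model discussed in this thread, where a counter
is assigned to a (CTRL_MON/MON group, event) pair out of a limited pool, here
is a minimal user-space sketch. It is not resctrl or MPAM driver code: the
names NUM_CNTRS, mon_event, hw_cntr, cntr_find(), cntr_assign() and
cntr_unassign() are all hypothetical, and the integer "group" merely stands in
for whatever identifies the monitoring group in hardware.

```c
#include <stdbool.h>

/* Hypothetical pool size: fewer counters than (group, event) pairs. */
#define NUM_CNTRS 2

/* Illustrative event ids only; not the kernel's event enum. */
enum mon_event { EVT_MBM_TOTAL, EVT_MBM_LOCAL };

struct hw_cntr {
	bool in_use;
	int group;              /* stand-in for the group's hardware id */
	enum mon_event event;   /* which event this counter backs */
};

static struct hw_cntr cntr_pool[NUM_CNTRS];

/* Return the counter currently backing (group, event), or -1 if none. */
int cntr_find(int group, enum mon_event event)
{
	for (int i = 0; i < NUM_CNTRS; i++) {
		if (cntr_pool[i].in_use && cntr_pool[i].group == group &&
		    cntr_pool[i].event == event)
			return i;
	}
	return -1;
}

/*
 * Assign a counter to (group, event). Returns the counter index, or -1
 * when the pool is exhausted. Assigning an already-assigned pair returns
 * the counter it already holds.
 */
int cntr_assign(int group, enum mon_event event)
{
	int idx = cntr_find(group, event);

	if (idx >= 0)
		return idx;

	for (int i = 0; i < NUM_CNTRS; i++) {
		if (!cntr_pool[i].in_use) {
			cntr_pool[i] = (struct hw_cntr){
				.in_use = true, .group = group, .event = event,
			};
			return i;
		}
	}
	return -1;
}

/* Release the counter backing (group, event), if there is one. */
void cntr_unassign(int group, enum mon_event event)
{
	int idx = cntr_find(group, event);

	if (idx >= 0)
		cntr_pool[idx].in_use = false;
}
```

The sketch uses a single pool shared by all events. If, as suggested for MPAM
earlier in the thread, the two events need different sets of hardware
counters, one pool per event would be the closer model.
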
Reinette