Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode

From: Reinette Chatre

Date: Wed Mar 04 2026 - 17:51:01 EST


Hi Ben,

On 3/4/26 1:01 PM, Ben Horgan wrote:
> Hi Reinette,
>
> On 3/4/26 19:23, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 3/4/26 9:37 AM, Ben Horgan wrote:
>>> On 3/4/26 17:02, Reinette Chatre wrote:
>>>> On 3/4/26 3:07 AM, Ben Horgan wrote:
>>>>> On 3/3/26 18:09, Reinette Chatre wrote:
>>>>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>>>>> Hi Ben,
>>>>>>>>
>>>>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>>>>> bandwidth types a counter tracks. Currently
>>>>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>>>>> supported.
>>>>>>>>>
>>>>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>>>>
>>>>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>>>>
>>>>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>>>>> choice of only read or write.
>>>>>>>>
>>>>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>>>>> a feature that allows user to configure events globally per domain and another
>>>>>>>> feature that allows user to configure events per PMG?
>>>>>>>
>>>>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>>>>> or both.
>>>>>>
>>>>>> Thank you for confirming.
>>>>>>
>>>>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>>>>
>>>>>>
>>>>>>>> These different features are how I understand assignable counters and BMEC to
>>>>>>>
>>>>>>> We are each approaching this from a different view point. I've just been looking at
>>>>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>>>>
>>>>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>>>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>>>>> (CTRL_MON/MON group, event).
>>>>>
>>>>> Yes but these counters aren't necessarily fungible. For MPAM the
>>>>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>>>>> hardware counters. An MPAM bandwidth counter just counts all traffic on
>>>>> a link with the only configurability being for read/write. The counters
>>>>> are just placed at different point in the topology to get the different
>>>>> events.
>>>>
>>>> The distinction between "different hardware counters for mbm_local_bytes and
>>>> mbm_total_bytes" and "The counters are just placed at different point in the
>>>> topology" is not clear to me. The former implies different counters for the
>>>> two events while the latter implies the same counters are used for both events
>>>> but perhaps accumulated/displayed differently?
>>>
>>> For a given RIS (the MPAM device hardware unit, of which an MSC may
>>> consist of one or more), there are MPAMF_MBWUMON_IDR.NUM_MON hardware
>>> bandwidth counters which measure traffic passing a specific point with no
>>> filtering for where it's going. The filtering of this counter is
>>> set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).
>>
>> Thank you for the details. Is the expectation that user should be able to
>> program all these counters via resctrl? If an MSC consists of multiple RIS
>> with different counters then things get complicated very fast. Could it be
>> constrained to only expose the maximum number of counters supported by
>> all RIS at a particular scope? This would match what the existing
>> num_mbm_cntrs file supports.
>
> Not individually, no, they will generally just be one per cache slice or
> memory controller and all be programmed together as a component.

Is this where the risk of double counting comes in? That is, adding the
memory bandwidth at the cache to the memory bandwidth at the memory controller
for a total memory bandwidth count?

...


> So, to try and bring this back to what can be done now for MPAM to
> fit into the counter mode assignment interface. Just support
> mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
> the event_filter file always display all the bandwidth types and make
> that the only value that be the only value it accepts (instead of hiding
> the event_filter file). If you agree I'll respin with that.

From resctrl side this sounds fine. I don't have any insight into what, if any,
kind of gymnastics the MPAM driver needs to do to make the discovered MSCs with
their varying scope and internal vs external counts fit into this. If the initial
implementation indeed forces some components into categories that are not a good
match then when resctrl later does get support for diverse components there may
be surprises to user space along the way. For example, user space may not see the
same memory bandwidth numbers reported by the same events on the same system as
the interface evolves.

"make that the only value that be the only value it accepts" - are you saying that
whatever is displayed when user views the "event_filter" file is what the
user can write to the "event_filter" file? I find this a challenging interface
for user space to use. The expectation is that the user can write any supported
memory transaction to that file and when writing fails it can only be because
of an invalid memory transaction. How can user space know that events are not
configurable at all? It sounds as though user space is expected to try configuring
the event with a memory transaction and then, presumably, check last_cmd_status?

Could this not be simplified by making the "event_filter" file read-only on
MPAM systems?
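
To illustrate how a read-only file could simplify things for user space, here
is a minimal sketch of a check that relies only on the file mode. It assumes
that the mode bits of "event_filter" reflect whether the event is configurable,
which is the proposal above and not existing behavior. The helper name
evt_filter_is_writable() is hypothetical.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical user space helper: return true if any write permission
 * bit is set on @path, false otherwise (including when stat() fails).
 *
 * Mode bits are inspected instead of calling access(2) because
 * access() reports W_OK for root regardless of the file mode.
 */
bool evt_filter_is_writable(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;

	return (st.st_mode & (S_IWUSR | S_IWGRP | S_IWOTH)) != 0;
}
```

User space could call this on the event's "event_filter" file (assumed here to
live under /sys/fs/resctrl/info/L3_MON/event_configs/) and skip the write
attempt and last_cmd_status check entirely when it returns false.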

Reinette