Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
From: Reinette Chatre
Date: Wed Mar 19 2025 - 14:37:31 EST
Hi Babu,
On 3/14/25 9:18 AM, Moger, Babu wrote:
> On 3/13/2025 4:21 PM, Reinette Chatre wrote:
>> On 3/13/25 1:13 PM, Moger, Babu wrote:
>>> On 3/13/25 11:08, Reinette Chatre wrote:
>>>> On 3/12/25 11:14 AM, Moger, Babu wrote:
>>>>> On 3/12/25 12:14, Reinette Chatre wrote:
>>>>>> On 3/12/25 9:03 AM, Moger, Babu wrote:
>>>>>>> On 3/12/25 10:07, Reinette Chatre wrote:
>>
>>
>>> Here are the steps. Just copying steps from Peters proposal.
>>> https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@xxxxxxxxxxxxxx/
>>
>> Thank you very much for detailing the steps. It is starting the fall into place
>> for me.
>>
>>>
>>>
>>> 1. Mount the resctrl
>>> mount -t resctrl resctrl /sys/fs/resctrl
>>
>> I assume that on ABMC system the plan remains to have ABMC enabled by default, which
>> will continue to depend on BMEC.
>
> Yes. ABMC will be enabled by default. ABMC will use the configurations from info/L3_MON/counter_configs. ABMC will not depend on BMEC.
I see. The previous dependency was thus just something enforced by OS to support the
chosen implementation?
Looks like the two features share some registers.
>
>> How would the existing BMEC implementation be impacted in this case?
>
> BMEC will only work with pre-ABMC(or default) mode.
ok. Does this mean that if a user boots kernel with "rdt=!bmec" then ABMC will keep working?
>> Without any changes to BMEC support the mbm_total_bytes_config and mbm_local_bytes_config
>> files will remain and user space may continue to use them to change the event
>> configurations with confusing expecations/results on an ABMC system.
>>
>> One possibility may be that a user may see below on ABMC system even if BMEC is supported:
>> # cat /sys/fs/resctrl/info/L3_MON/mon_features
>> llc_occupancy
>> mbm_total_bytes
>> mbm_local_bytes
>>
>> With the above a user cannot be expected to want to interact with mbm_total_bytes_config
>> and mbm_local_bytes_config, which may be the simplest to do.
>
> yes.
>
>>
>> To follow that, we should also consider how "mon_features" will change with this
>> implementation.
>
> May be
>
> # cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_local_bytes
> counter_configs/mbm_total_bytes/event_filter
> counter_configs/mbm_local_bytes/event_filter
>
I read the docs again to understand what kind of flexibility we have here. As I interpret it
the "mon_features" is associated with "events" and there is a clear documented association
between the "events" listed in this file and which files a user can expect to exist in the
"mon_data" directory. Considering this I think it may be helpful to provide the
counter configurations in this file. This matches well with mbm_total_bytes/mbm_local_bytes
being treated as "counter configurations".
Whether counter configuration is supported could be determined by existence of
the "counter_configs" directory?
For example,
# cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_local_bytes
# mkdir /sys/fs/resctrl/info/L3_MON/counter_configs/only_read_fills
# cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_local_bytes
only_read_fills
This could possibly be a way to support user interface when configuring the
counter. For example, a user may easily create a new counter configuration
by creating a directory, but there may be some requirements wrt its configuration
that need to be met before that configuration/event may appear in the
"mon_features" file.
>>> 2. When ABMC is supported two default configurations will be created.
>>>
>>> a. info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>> b. info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>
>>> These files will be populated with default total and local events
>>> # cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>> VictimBW
>>> RmtSlowFill
>>> RmtNTWr
>>> RmtFill
>>> LclFill
>>> LclNTWr
>>> LclSlowFill
>>
>> Looks good. Here we could perhaps start nitpicking about naming and line separation.
>> I think it may be easier if the fields are separated by comma, but more on that
>> below ...
>>
>>>
>>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>> LclFill,
>>> LclNTWr
>>> LclSlowFill
>>>
>>> 3. Users will have options to update the event configuration.
>>> echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>
>> We need to be clear on how user space interacts with this file. For example,
>> can user space "append" configurations? Specifically, if the file has
>> contents like your earlier example:
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclFill
>> LclNTWr
>> LclSlowFill
>>
>> Should above be created with (note "append" needed for second and third):
>> echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> echo LclNTWr >> info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> echo LclSlowFill >> info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>
>> Is it possible to set multiple configurations in one write like below?
>> echo "LclFill,LclNTWr,LclSlowFill" > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>
> Yes. We should support that.
Reading Peter's response (https://lore.kernel.org/lkml/CALPaoCj7aSVxHisQTdKQ5KN0-aNzN8rRkRPVc7pjGMLSxfPvrA@xxxxxxxxxxxxxx/)
it sounds as though this part is now in the fine-tuning phase.
If there are other formats that is more convenient for user space then we should surely
consider that.
>
>>
>> (note above where it may be easier for user space to use comma (or some other field separator)
>> when providing multiple configurations at a time, with this, to match, having output in
>> commas may be easier since it makes user interface copy&paste easier)
>>
>> If file has content like:
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclNTWr
>> LclSlowFill
>>
>> What is impact of the following:
>> echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>
>> Is it (append):
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclFill
>> LclNTWr
>> LclSlowFill
>>
>> or (overwrite):
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclFill
>>
>> I do think the interface will be more intuitive it if follows regular file
>> operations wrt "append" and such. I have not looked into how kernfs supports
>> "append".
>
> Just searching quickly, I have not seen any append operations on kernfs.
>
>
>> As alternative, we can try to work the previous mbm_assign_control syntax in here (use + and -).
>>
>> For example:
>>
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclNTWr
>> # echo "+LclFill,-LclNTWr,+LclSlowFill" > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> LclFill,LclSlowFill
>>
>> With something like above resctrl just deals with file writes as before.
>
> Or without complicating much we can just support basic operations.
>
> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> LclFill, LclNTWr, LclSlowFill
>
> # echo "LclFill, LclNTWr, LclSlowFill, VictimBW" > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>
> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> LclFill, LclNTWr, LclSlowFill, VictimBW
>
> # echo "LclFill, LclNTWr" > info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>
> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> LclFill, LclNTWr
>
Looks good to me. As I see it this could be expanded to support "append" if needed.
>>
>>
>>>
>>> 4. As usual the events can be read from the mon_data directories.
>>> #mkdir /sys/fs/resctrl/test
>>> #cd /sys/fs/resctr/test
>>> #cat test/mon_data/mon_data/mon_L3_00/mbm_tota_bytes
>>> 101010
>>> #cat test/mon_data/mon_data/mon_L3_00/mbm_local_bytes
>>> 32323
>>>
>>> 5. There will be 3 files created in each group's mon_data directory when
>>> ABMC is supported.
>>>
>>> a. test/mon_data/mon_L3_00/assign_exclusive
>>> b. test/mon_data/mon_L3_00/assign_shared
>>> c. test/mon_data/mon_L3_00/unassign
>>>
>>>
>>> 6. Events can be assigned/unassigned by these commands
>>>
>>> # echo mbm_total_bytes > test/mon_data/mon_L3_00/assign_exclusive
>>> # echo mbm_local_bytes > test/mon_data/mon_L3_01/assign_exclusive
>>> # echo mbm_local_bytes > test/mon_data/mon_L3_01/unassign
>>>
>>>
>>> Note:
>>> I feel 3 files are excessive here. We can probably achieve everything in
>>> just one file.
>>
>> Could you please elaborate what your concern is? You mention that it is
>> excessive but it is not clear to me what issues may arise by
>> having three files instead of one.
>
> All these 3 properties are mutually exclusive. Only one can true at a time. Example:
> #cat assign_exclusive
> 0
> #cat assign_shared
> 0
> #cat uassigned
> 1
>
> Three operations to find out the assign state.
ah - good point.
>
> Instead of that
> #cat mon_l3_assignments
> unassigned
>
>
>>
>> I do think, and Peter also mentioned [1] this, that it may be useful,
>> to "put a group/resource-scoped assign_* file higher in the hierarchy
>> to make it easier for users who want to configure all domains the
>> same for a group."
>>
>> Placing *additional* files higher in hierarchy (used to manage counters in all
>> domains) may be more useful that trying to provide the shared/exclusive/unassign
>> in one file per domain.
>
> Yea. To make it better we can add "mon_l3_assignments" in groups main directory. We can do all the operation in just one file.
>
> https://lore.kernel.org/lkml/efb5293f-b0ef-4c94-bf10-9ca7ebb3b53f@xxxxxxx/
I am hesitant to respond to that message considering the corporate preamble that
sneaked in so I'll just add some thoughts here:
Having the file higher in hierarchy does seem more useful. It may be useful to reduce
amount of parsing to get to needed information. Compare with below two examples that can
be used to convey the same information:
# cat /sys/fs/resctrl/test/mon_L3_assignments
mbm_total_bytes: 0=unassigned; 1=unassigned
mbm_local_bytes: 0=unassigned; 1=unassigned
#cat /sys/fs/resctrl/test/mon_L3_assignments
0=_; 1=_
We need to take care that it is always clear what "0" or "1" means ...
Tony has been mentioning a lot of interesting things about scope
changes. I assume the "L3" in mon_L3_assignments will dictate the scope?
With a syntax like above the needed information can be presented in one line.
For example,
#cat /sys/fs/resctrl/test/mon_L3_assignments
0=mbm_total_bytes; 1=mbm_local_bytes
The caveat is that is only for assigned counters, not shared, so this needs
to change.
To support shared assignment ... I wonder if it will be useful to users to
get the information on which other monitor groups the counter is shared _with_?
Some examples:
a) Just indicate whether a counter is shared or dedicated. (Introduce flags).
#cat /sys/fs/resctrl/test/mon_L3_assignments
0=mbm_total_bytes:s; 1=mbm_local_bytes:d
b) Indicate which groups a counter is shared with:
#cat /sys/fs/resctrl/testA/mon_L3_assignments
0=mbm_total_bytes:s(testB); 1=mbm_local_bytes:d(not needed but perhaps empty for consistent interface?)
#cat /sys/fs/resctrl/testB/mon_L3_assignments
0=mbm_total_bytes:s(testA); 1=mbm_local_bytes:d(?)
... (b) may just be overkill and we should instead follow Tony's
guideline (see https://lore.kernel.org/lkml/Z9CiwLrhuTODruCj@agluck-desk3/ )
that users should be able to keep track themselves.
Reinette