Re: [RFC] mpam,x86,fs/resctrl: Generic schema description Proof of Concept
From: Shaopeng Tan (Fujitsu)
Date: Thu Jun 18 2026 - 21:43:47 EST
Hello Reinette,
>On 6/11/26 6:30 PM, Shaopeng Tan (Fujitsu) wrote:
>> Hello Reinette, Ben, Drew,
>>
>>> On Thu, Jun 04, 2026 at 02:47:39PM -0700, Reinette Chatre wrote:
>>>>> The ability to change scope is much needed for RISC-V. There are
>>>>> compromises in my RFC [1] as a result of trying to map everything to
>>>>> either L2 or L3 scope.
>>>>>
>>>>> I would also like to see a non-cpu cache scope for monitoring too, but
>>>>> would that be better discussed outside the context of this proof of
>>>>> concept?
>>>>
>>>> I also think it would be good for it to be clear that monitoring is based on
>>>> scope, not a resource. With the MB controls supporting different scope I do think
>>>> that this would be a good next step. A previous musing from me on this topic can
>>>> be found (at the end of) https://lore.kernel.org/lkml/fb1e2686-237b-4536-acd6-15159abafcba@xxxxxxxxx/
>>>>
>>>> I have not yet considered how this can be built on top of this PoC though.
>>>
>>> Thanks for explaining. I like how you show an example of
>>> mon_data/mon_NODE_00/mbm_total_bytes in that thread. I believe that sort
>>> of scheme would work well for RISC-V as a bandwidth controller
>>> implementing the CBQRI spec can be located anywhere within the system.
>>
>> I have a few questions regarding the scope parameter and the name "mbm_total_bytes".
>>
>> First, concerning the scope parameter, does the "NODE" specified in "scope" refer to a NUMA node?
>
>Yes.
>
>> If so, wouldn't using "NUMA" directly be more explicit and user-friendly?
>> Could you please explain why "NODE" is used instead of "NUMA"?
>
>By using "node" the idea is to create an interface that is familiar to user space. Since
>/sys/devices/system/node already exists to expose the NUMA layout to user space I expect that
>using similar terminology would make the resctrl interface easier to use since it would be clear
>that a "node" in /sys/devices/system/node would have a matching "node" in resctrl.
>
>>
>> Second, regarding the naming of "mbm_total_bytes",
>> the meaning of this name seems to differ from what is typically found under mon_data/mon_L3_<id>/mbm_total_bytes.
>
>Could you please elaborate how the meaning is different? The only information I have about
>the systems needing this is in Nvidia's portion of
>https://lpc.events/event/19/contributions/2093/attachments/1958/4172/resctrl%20Microconference%20LPC%202025%20Tokyo.pdf
>
>> To avoid confusion and better reflect its different nature,
>> how about considering an alternative name such as mbm_global_bytes or another more appropriate identifier.
Current MBM events in resctrl are per-L3 and monitor memory bandwidth between the L3 and memory,
and mbm_total_bytes monitors the total external bandwidth from an L3 to the next level of the memory hierarchy.
Is my understanding of this correct?
Based on NVIDIA's information, they even assume that CPU-less nodes will not have an L3 cache.
I believe NVIDIA's objective is to collect traffic data for each NUMA node,
without needing to distinguish which specific L3 cache the traffic originated from.
This aligns with the fact that, on ARM architectures (e.g., on NVIDIA's platforms or Fujitsu's MONAKA),
the MPAM MSC (Memory System Component) can also be located on the memory controller.
In such cases, it becomes impossible to distinguish which L3 cache the traffic came from.
Therefore, the proposed 'mbm_global_bytes' is meant to account for traffic from all L3 caches in the system.
Best regards
Shaopeng TAN