Re: [RFC PATCH 01/19] x86,fs/resctrl: Add support for Global Bandwidth Enforcement (GLBE)
From: Reinette Chatre
Date: Fri Feb 13 2026 - 11:18:03 EST
Hi Babu,
On 2/12/26 5:51 PM, Moger, Babu wrote:
> On 2/12/2026 6:05 PM, Reinette Chatre wrote:
>> On 2/12/26 11:09 AM, Babu Moger wrote:
>>> On 2/11/26 21:51, Reinette Chatre wrote:
>>>> On 2/11/26 1:18 PM, Babu Moger wrote:
>>>>> On 2/11/26 10:54, Reinette Chatre wrote:
>>>>>> On 2/10/26 5:07 PM, Moger, Babu wrote:
>>>>>>> On 2/9/2026 12:44 PM, Reinette Chatre wrote:
>>>>>>>> On 1/21/26 1:12 PM, Babu Moger wrote:
>>
>> ...
>>
>>>>>> Another question, when setting aside possible differences between MB and GMB.
>>>>>>
>>>>>> I am trying to understand how user may expect to interact with these interfaces ...
>>>>>>
>>>>>> Consider the starting state example as below where the MB and GMB ceilings are the
>>>>>> same:
>>>>>>
>>>>>> # cat schemata
>>>>>> GMB:0=2048;1=2048;2=2048;3=2048
>>>>>> MB:0=2048;1=2048;2=2048;3=2048
>>>>>>
>>>>>> Would something like below be accurate? Specifically, showing how the GMB limit impacts the
>>>>>> MB limit:
>>>>>> # echo "GMB:0=8;2=8" > schemata
>>>>>> # cat schemata
>>>>>> GMB:0=8;1=2048;2=8;3=2048
>>>>>> MB:0=8;1=2048;2=8;3=2048
>>>>> Yes. That is correct. It will cap the MB setting to 8. Note that we are ignoring unit differences here to keep it simple.
>>>> Thank you for confirming.
>>>>
>>>>>> ... and then when user space resets GMB the MB can reset like ...
>>>>>>
>>>>>> # echo "GMB:0=2048;2=2048" > schemata
>>>>>> # cat schemata
>>>>>> GMB:0=2048;1=2048;2=2048;3=2048
>>>>>> MB:0=2048;1=2048;2=2048;3=2048
>>>>>>
>>>>>> if I understand correctly this will only apply if the MB limit was never set so
>>>>>> another scenario may be to keep a previous MB setting after a GMB change:
>>>>>>
>>>>>> # cat schemata
>>>>>> GMB:0=2048;1=2048;2=2048;3=2048
>>>>>> MB:0=8;1=2048;2=8;3=2048
>>>>>>
>>>>>> # echo "GMB:0=8;2=8" > schemata
>>>>>> # cat schemata
>>>>>> GMB:0=8;1=2048;2=8;3=2048
>>>>>> MB:0=8;1=2048;2=8;3=2048
>>>>>>
>>>>>> # echo "GMB:0=2048;2=2048" > schemata
>>>>>> # cat schemata
>>>>>> GMB:0=2048;1=2048;2=2048;3=2048
>>>>>> MB:0=8;1=2048;2=8;3=2048
>>>>>>
>>>>>> What would be most intuitive way for user to interact with the interfaces?
>>>>> I see that you are trying to display the effective behaviors above.
>>>> Indeed. My goal is to get an idea how user space may interact with the new interfaces and
>>>> what would be a reasonable expectation from resctrl be during these interactions.
>>>>
>>>>> Please keep in mind that MB and GMB units differ. I recommend showing only the values the user has explicitly configured, rather than the effective settings, as displaying both may cause confusion.
>>>> hmmm ... this may be subjective. Could you please elaborate how presenting the effective
>>>> settings may cause confusion?
>>>
>>> I mean in many cases, we cannot determine the effective settings correctly. It depends on benchmarks or applications running on the system.
>>>
>>> Even with MB (without GMB support), even though we set the limit to 10GB, it may not use the whole 10GB. Memory is a shared resource. So, the effective bandwidth usage depends on other applications running on the system.
>>
>> Sounds like we interpret "effective limits" differently. To me the limits(*) are deterministic.
>> If I understand correctly, if the GMB limit for domains A and B is set to x GB then that places
>> an x GB limit on MB for domains A and B also. Displaying any MB limit in the schemata that is
>> larger than x GB for domain A or domain B would be inaccurate, no?
>
> Yea. But, I was thinking not to mess with values written at registers.
This is not about what is written to the registers but how the combined values
written to registers control system behavior and how to accurately reflect the
resulting system behavior to user space.
>> When considering your example where the MB limit is 10GB.
>>
>> Consider an example where there are two domains in this example with a configuration like below.
>> (I am using a different syntax from schemata file that will hopefully make it easier to exchange
>> ideas when not having to interpret the different GMB and MB units):
>>
>> MB:0=10GB;1=10GB
>>
>> If user space can create a GMB domain that limits shared bandwidth to 10GB that can be displayed
>> as below and will be accurate:
>>
>> MB:0=10GB;1=10GB
>> GMB:0=10GB;1=10GB
>>
>> If user space then reduces the combined bandwidth to 2GB then the MB limit is wrong since it
>> is actually capped by the GMB limit:
>>
>> MB:0=10GB;1=10GB <==== Does not reflect possible per-domain memory bandwidth which is now capped by GMB
>> GMB:0=2GB;1=2GB
>>
>> Would something like below not be more accurate that reflects that the maximum average bandwidth
>> each domain could achieve is 2GB?
>>
>> MB:0=2GB;1=2GB <==== Reflects accurate possible per-domain memory bandwidth
>> GMB:0=2GB;1=2GB
>
> That is reasonable. Will check how we can accommodate that.
Right, this is not about the values in the L3BE registers but instead how those values
are impacted by GLBE registers and how to most accurately present the resulting system
configuration to user space. Thank you for considering.
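A minimal sketch of the display semantics discussed above, assuming MB and GMB are expressed in the same unit (GB, as in the simplified example) and that resctrl keeps the user-requested MB value separately from what it shows. The names BandwidthState, effective_mb() and format_line() are hypothetical and only illustrate the idea; this is not how resctrl is implemented:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BandwidthState:
    """Per-domain limits as requested by user space, all in GB (assumption)."""
    mb: Dict[int, int] = field(default_factory=dict)   # per-domain MB limit
    gmb: Dict[int, int] = field(default_factory=dict)  # global GMB limit

    def effective_mb(self) -> Dict[int, int]:
        # The MB value shown to user space is capped by the GMB limit of the
        # same domain. The requested MB value itself is left untouched so it
        # can be shown again once the GMB limit is raised.
        return {dom: min(limit, self.gmb.get(dom, limit))
                for dom, limit in self.mb.items()}


def format_line(name: str, values: Dict[int, int]) -> str:
    """Render one line in the simplified 'MB:0=10GB;1=10GB' syntax."""
    return name + ":" + ";".join(f"{d}={v}GB" for d, v in sorted(values.items()))


state = BandwidthState(mb={0: 10, 1: 10}, gmb={0: 10, 1: 10})
state.gmb = {0: 2, 1: 2}          # user space lowers the global limit
print(format_line("MB", state.effective_mb()))
print(format_line("GMB", state.gmb))
```

With the requested MB value kept as separate state, raising GMB back to 10GB makes the displayed MB limit return to 10GB without user space having to rewrite it.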
>
>>
>> (*) As a side-note we may have to start being careful with how we use "limits" because of the planned
>> introduction of a "MAX" as a bandwidth control that is an actual limit as opposed to the
>> current control that is approximate.
>>
>>>>> We also need to track the previous settings so we can revert to the earlier value when needed. The best approach is to document this behavior clearly.
>>>> Yes, this will require resctrl to maintain more state.
>>>>
>>>> Documenting behavior is an option but I think we should first consider if there are things
>>>> resctrl can do to make the interface intuitive to use.
>>>>
>>>>>>>> From the description it sounds as though there is a new "memory bandwidth
>>>>>>>> ceiling/limit" that seems to imply that MBA allocations are limited by
>>>>>>>> GMBA allocations while the proposed user interface present them as independent.
>>>>>>>>
>>>>>>>> If there is indeed some dependency here ... while MBA and GMBA CLOSID are
>>>>>>>> enumerated separately, under which scenario will GMBA and MBA support different
>>>>>>>> CLOSID? As I mentioned in [1] from user space perspective "memory bandwidth"
>>>>>>> I can see the following scenarios where MBA and GMBA can operate independently:
>>>>>>> 1. If the GMBA limit is set to ‘unlimited’, then MBA functions as an independent CLOS.
>>>>>>> 2. If the MBA limit is set to ‘unlimited’, then GMBA functions as an independent CLOS.
>>>>>>> I hope this clarifies your question.
>>>>>> No. When enumerating the features the number of CLOSID supported by each is
>>>>>> enumerated separately. That means GMBA and MBA may support different number of CLOSID.
>>>>>> My question is: "under which scenario will GMBA and MBA support different CLOSID?"
>>>>> No. There is no such scenario.
>>>>>> Because of a possible difference in number of CLOSIDs it seems the feature supports possible
>>>>>> scenarios where some resource groups can support global AND per-domain limits while other
>>>>>> resource groups can just support global or just support per-domain limits. Is this correct?
>>>>> The system can support up to 16 CLOSIDs. All of them support all the features: LLC, MB, GMB, SMBA. Yes, we have separate enumeration for each feature. Are you suggesting we change it?
>>>> It is not a concern to have different CLOSIDs between resources that are actually different,
>>>> for example, having LLC or MB support different number of CLOSIDs. Having the possibility to
>>>> allocate the *same* resource (memory bandwidth) with varying number of CLOSIDs does present a
>>>> challenge though. Would it be possible to have a snippet in the spec that explicitly states
>>>> that MB and GMB will always enumerate with the same number of CLOSIDs?
>>>
>>> I have confirmed that is always the case. For all current and planned implementations, MB and GMB will have the same number of CLOSIDs.
>>
>> Thank you very much for confirming. Is this something the architects would be willing to
>> commit to with a snippet in the PQoS spec?
>
> I checked on that. Here is the response.
>
> "I do not plan to add a statement like that to the spec. The CPUID enumeration allows for them to have different number of CLOS's supported for each. However, it is true that for all current and planned implementations, MB and GMB will have the same number of CLOS."
Thank you for asking. At this time the definition of a resource's "num_closids" is:
"num_closids":
		The number of CLOSIDs which are valid for this
		resource. The kernel uses the smallest number of
		CLOSIDs of all enabled resources as limit.
Without commitment from architecture we could expand definition of "num_closids" when
adding multiple controls to indicate that it is the smallest number of CLOSIDs supported
by all controls.
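As a rough sketch of that expanded definition: a resource's "num_closids" would be the smallest count among its controls, and the kernel-wide limit remains the smallest across all enabled resources. The function names and the enumerated counts below are hypothetical; the GMB count is deliberately set lower than MB only to show the effect, since all current and planned implementations are said to enumerate the same number for both:

```python
from typing import Dict


def resource_num_closids(controls: Dict[str, int]) -> int:
    """Smallest number of CLOSIDs supported by all controls of one resource."""
    return min(controls.values())


def kernel_closid_limit(resources: Dict[str, Dict[str, int]]) -> int:
    """Smallest num_closids across all enabled resources."""
    return min(resource_num_closids(c) for c in resources.values())


# Hypothetical enumeration: resource name -> {control name: CLOSID count}.
enumerated = {
    "L3": {"L3": 16},
    "MB": {"MB": 16, "GMB": 8},   # GMB=8 is made up for illustration
}

for name, controls in enumerated.items():
    print(f"info/{name}/num_closids: {resource_num_closids(controls)}")
print("usable CLOSIDs:", kernel_closid_limit(enumerated))
```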
>>>> Please see below where I will try to support this request more clearly and you can decide if
>>>> it is reasonable.
>>>>
>>>>>>>> can be seen as a single "resource" that can be allocated differently based on
>>>>>>>> the various schemata associated with that resource. This currently has a
>>>>>>>> dependency on the various schemata supporting the same number of CLOSID which
>>>>>>>> may be something that we can reconsider?
>>>>>>> After reviewing the new proposal again, I’m still unsure how all the pieces will fit together. MBA and GMBA share the same scope and have inter-dependencies. Without the full implementation details, it’s difficult for me to provide meaningful feedback on new approach.
>>>>>> The new approach is not final so please provide feedback to help improve it so
>>>>>> that the features you are enabling can be supported well.
>>>>> Yes, I am trying. I noticed that the proposal appears to affect how the schemata information is displayed (in the info directory). It seems to introduce additional resource information. I don't see any harm in displaying it if it benefits certain architectures.
>>>> It benefits all architectures.
>>>>
>>>> There are two parts to the current proposals.
>>>>
>>>> Part 1: Generic schema description
>>>> I believe there is consensus on this approach. This is actually something that is long
>>>> overdue and something like this would have been great to have with the initial AMD
>>>> enabling. With the generic schema description forming part of resctrl the user can learn
>>>> from resctrl how to interact with the schemata file instead of relying on external information
>>>> and documentation.
>>>
>>> ok.
>>>
>>>> For example, on an Intel system that uses percentage based proportional allocation for memory
>>>> bandwidth the new resctrl files will display:
>>>> info/MB/resource_schemata/MB/type:scalar linear
>>>> info/MB/resource_schemata/MB/unit:all
>>>> info/MB/resource_schemata/MB/scale:1
>>>> info/MB/resource_schemata/MB/resolution:100
>>>> info/MB/resource_schemata/MB/tolerance:0
>>>> info/MB/resource_schemata/MB/max:100
>>>> info/MB/resource_schemata/MB/min:10
>>>>
>>>>
>>>> On an AMD system that uses absolute allocation with 1/8 GBps steps the files will display:
>>>> info/MB/resource_schemata/MB/type:scalar linear
>>>> info/MB/resource_schemata/MB/unit:GBps
>>>> info/MB/resource_schemata/MB/scale:1
>>>> info/MB/resource_schemata/MB/resolution:8
>>>> info/MB/resource_schemata/MB/tolerance:0
>>>> info/MB/resource_schemata/MB/max:2048
>>>> info/MB/resource_schemata/MB/min:1
>>>>
>>>> Having such interface will be helpful today. Users do not need to first figure out
>>>> whether they are on an AMD or Intel system, and then read the docs to learn the AMD units,
>>>> before interacting with resctrl. resctrl will be the generic interface it intends to be.
>>>
>>> Yes. That is a good point.
>>>
>>>> Part 2: Supporting multiple controls for a single resource
>>>> This is a new feature, on which there also appears to be consensus, that is needed by MPAM and
>>>> Intel RDT where it is possible to use different controls for the same resource. For example,
>>>> there can be a minimum and maximum control associated with the memory bandwidth resource.
>>>>
>>>> For example,
>>>> info/
>>>> └─ MB/
>>>>    └─ resource_schemata/
>>>>       ├─ MB/
>>>>       ├─ MB_MIN/
>>>>       ├─ MB_MAX/
>>>>       ┆
>>>>
>>>>
>>>> Here is where the big question comes in for GLBE - is this actually a new resource
>>>> for which resctrl needs to add interfaces to manage its allocation, or is it instead
>>>> an additional control associated with the existing memory bandwidth resource?
>>>
>>> It is not a new resource. It is a new control mechanism to address a limitation of the memory bandwidth resource.
>>>
>>> So, it is a new control for the existing memory bandwidth resource.
>>
>> Thank you for confirming.
>>
>>>
>>>> For me things are actually pointing to GLBE not being a new resource but instead being
>>>> a new control for the existing memory bandwidth resource.
>>>>
>>>> I understand that for a PoC it is simplest to add support for GLBE as a new resource as is
>>>> done in this series, but considering it as an actual unique resource does not seem
>>>> appropriate since resctrl already has a "memory bandwidth" resource. User space expects
>>>> to find all the resources that it can allocate in info/ - I do not think it is correct
>>>> to have two separate directories/resources for memory bandwidth here.
>>>>
>>>> What if, instead, it looks something like:
>>>>
>>>> info/
>>>> └── MB/
>>>>     └── resource_schemata/
>>>>         ├── GMB/
>>>>         │   ├──max:4096
>>>>         │   ├──min:1
>>>>         │   ├──resolution:1
>>>>         │   ├──scale:1
>>>>         │   ├──tolerance:0
>>>>         │   ├──type:scalar linear
>>>>         │   └──unit:GBps
>>>>         └── MB/
>>>>             ├──max:8192
>>>>             ├──min:1
>>>>             ├──resolution:8
>>>>             ├──scale:1
>>>>             ├──tolerance:0
>>>>             ├──type:scalar linear
>>>>             └──unit:GBps
>>>
>>> Yes. It definitely looks very clean.
>>>
>>>> With an interface like above GMB is just another control/schema used to allocate the
>>>> existing memory bandwidth resource. With the planned files it is possible to express the
>>>> different maximums and units used by the MB and GMB schema. Users no longer need to
>>>> dig for the unit information in the docs, it is available in the interface.
>>>
>>>
>>> Yes. That is reasonable.
>>>
>>> Is the plan to just update the resource information in /sys/fs/resctrl/info/<resource_name> ?
>>
>> I do not see any resource information that needs to change. As you confirmed,
>> MB and GMB have the same number of CLOSIDs and looking at the rest of the
>> enumeration done in patch #2 all other properties exposed in top level of
>> /sys/fs/resctrl/info/MB are the same for MB and GMB. Specifically,
>> thread_throttle_mode, delay_linear, min_bandwidth, and bandwidth_gran have
>> the same values for MB and GMB. All other content in
>> /sys/fs/resctrl/info/MB would be new as part of the new "resource_schemata"
>> sub-directory.
>>
>> Even so, I believe we could expect that a user using any new schemata file entry
>> introduced after the "resource_schemata" directory is introduced is aware of how
>> the properties are exposed and will not use the top level files in /sys/fs/resctrl/info/MB
>> (for example min_bandwidth and bandwidth_gran) to understand how to interact with
>> the new schema.
>>
>>
>>>
>>> Also, will the display of /sys/fs/resctrl/schemata change ?
>>
>> There are no plans to change any of the existing schemata file entries.
>>
>>>
>>> Current display:
>>
>> When viewing "current" as what this series does in schemata file ...
>>
>>>
>>> GMB:0=4096;1=4096;2=4096;3=4096
>>> MB:0=8192;1=8192;2=8192;3=8192
>>
>> yes, the schemata file should look like this on boot when all is done. All other
>> user facing changes are to the info/ directory where user space learns about
>> the new control for the resource and how to interact with the control.
>>
>>>> Doing something like this does depend on GLBE supporting the same number of CLOSIDs
>>>> as MB, which seems to be how this will be implemented. If there is indeed a confirmation
>>>> of this from AMD architecture then we can do something like this in resctrl.
>>>
>>> I don't see this being an issue. I will get consensus on it.
>>>
>>> I am wondering about the time frame and who is leading this change. Not sure if that has been discussed already.
>>> I can definitely help.
>>
>> A couple of features depend on the new schema descriptions as well as support for multiple
>> controls: min/max bandwidth controls on the MPAM side, region aware MBA and MBM on the Intel
>> side, and GLBE on the AMD side. I am hoping that the folks working on these features can
>> collaborate on the needed foundation. Since there are no patches for this yet I cannot say
>> if there is a leader for this work yet. At this time this role appears to be available if you
>> would like to see this moving forward in order to meet your goals.
>
>
> I joined this feature effort a bit later, so I may not yet have full context on the MPAM and region‑aware requirements. I’m happy to provide all the necessary information for GMB and MB from the AMD side, and I’m also available to help with reviews and testing.
I understand there is a lot involved. With so many folks dependent on this work I anticipate
that any effort will get support from the various content experts. Your knowledge of resctrl
fs will be valuable in this effort.
Reinette