Re: [RFC] mpam,x86,fs/resctrl: Generic schema description Proof of Concept
From: Drew Fustini
Date: Fri Jun 05 2026 - 15:35:56 EST
On Thu, Jun 04, 2026 at 02:05:08PM -0700, Reinette Chatre wrote:
> >> I plumbed in support for the MB_MIN resource schema which also works under light
> >> testing. The only fs resctrl code change I needed was:
> >>
> >> --- a/include/linux/resctrl.h
> >> +++ b/include/linux/resctrl.h
> >> @@ -483,6 +483,9 @@ static inline u32 resctrl_get_default_ctrlval(struct
> >> resctrl_ctrl *ctrl)
> >> case RESCTRL_CTRL_BITMAP:
> >> return BIT_MASK(ctrl->cache.cbm_len) - 1;
> >> case RESCTRL_CTRL_SCALAR:
> >> + if (ctrl->name == RESCTRL_CTRL_NAME_MIN)
> >> + return ctrl->membw.min_bw;
> >> +
> >> return ctrl->membw.max_bw;
> >> }
> >>
> >>
> >> At least on MPAM systems, we use a default of 0 for minimum bandwidth controls
> >> as the maximum bandwidth controls only take effect if their value is higher than
> >> the minimum bandwidth value. I have specialised this on the ctrl->name which
> >> breaks your ctrl->type based classification but that's fixable by just adding a
> >> default field to membw.
> >
> > This should be useful for RISC-V.
> >
> > RESCTRL_CTRL_NAME_MIN maps well to CBQRI Rbwb (reserved bandwidth
> > blocks). The sum of Rbwb across all control groups must be less than
> > MRBWB (maximum number of reserved bandwidth blocks). As a result, MB_MIN
> > needs to default to 1 so that the sum does not violate that rule. In my
> > RFC series, I added default_to_min to resctrl_membw [1] but this
> > solution looks cleaner.
>
> As I mentioned in response to Ben [2] there seems to be a mismatch between
> architecture requirements here. resctrl uses the value returned by
> resctrl_get_default_ctrlval() as the control value that means "no throttling".
> For Intel this means min == max but this does not seem to be the case for MPAM
> and CBQRI. I am not familiar enough with either to have an alternative proposal here
> so I need to become familiar now. There is a bit of backlog on other resctl
> work right now so this will take me some time to sort out.

Thanks for pointing this out. In that case, it doesn't seem to match
what I was thinking of for MB_MIN. The CBQRI reserved bandwidth blocks
(Rbwb) control can be thought of as a minimum amount of guaranteed
bandwidth for a control group. Each RCID (e.g. CLOSID) must be assigned
at least 1 bandwidth block per the spec. Therefore, the membw.min_bw
would need to be 1.
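
A rough sketch of the Rbwb rule as described in this thread (at least
1 block per RCID, and the sum across all RCIDs less than MRBWB). The
helper name and signature are made up for illustration and are not
taken from the CBQRI series:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per the thread: each RCID must be assigned at least 1 bandwidth block. */
#define RBWB_MIN 1u

/*
 * Hypothetical helper: check a proposed per-RCID Rbwb assignment.
 * rbwb[i] is the number of reserved blocks for RCID i, and mrbwb is
 * the maximum number of reserved bandwidth blocks (MRBWB).
 */
static bool rbwb_alloc_is_valid(const uint32_t *rbwb, size_t nr_rcids,
				uint32_t mrbwb)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < nr_rcids; i++) {
		if (rbwb[i] < RBWB_MIN)
			return false;
		sum += rbwb[i];
	}

	/* The sum of Rbwb across all groups must stay below MRBWB. */
	return sum < mrbwb;
}
```

With every group at the default of 1, the check passes as long as the
number of RCIDs is smaller than MRBWB.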

There is also a max bandwidth reservation across all control groups
(RCIDs / CLOSIDs) so that there will be some amount of unreserved
bandwidth. Mweight (1-255) controls how much of that unreserved
bandwidth pool a group can use. Mweight of 0 means no shared
bandwidth. I think the membw.min_bw would need to be 255 so that all
groups get an equal share of the unreserved pool.

It seems like that would be an incorrect use of membw.min_bw in both
cases?

> > There is no equivalent to MB (percentage throttle) in RISC-V so I would
> > want it to be valid to have MB_MIN (minimum reservation) without MB.
> >
> > I rebased my RISC-V CBQRI v6 series on top of this proof of concept and
> > was able to validate it works okay in Qemu:
> >
> > MB_WGHT:72=255
> > MB_MIN:72=756
> > L2:64=fff;65=fff
> > L3:75=ffff
>
> Ideally any new support should not break existing user space and the existing
> user interface expects a MB entry in the schemata file when the MB resource exists.
> Is it possible to emulate the percentage based MB control with MB_WGHT or MB_MIN?
> This sounds similar as what is/was planned for MPAM [2].

Yes, I think that Mweight could be mapped to the MB concept of
throttling. All groups could start with the max Mweight of 255 which
can be represented as 100%.
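
One possible way to express that mapping, as a sketch only. The linear
scaling and round-to-nearest behaviour are assumptions, and the
function names are hypothetical:

```c
#include <stdint.h>

/* Largest Mweight value, shown to user space as 100%. */
#define MWEIGHT_MAX 255u

/* Convert an MB-style percentage (0..100) to an Mweight (0..255). */
static uint32_t mb_percent_to_mweight(uint32_t percent)
{
	if (percent > 100)
		percent = 100;

	/* Linear scaling, rounded to the nearest weight. */
	return (percent * MWEIGHT_MAX + 50) / 100;
}

/* Convert an Mweight (0..255) back to an MB-style percentage. */
static uint32_t mweight_to_mb_percent(uint32_t mweight)
{
	if (mweight > MWEIGHT_MAX)
		mweight = MWEIGHT_MAX;

	/* Linear scaling, rounded to the nearest percent. */
	return (mweight * 100 + MWEIGHT_MAX / 2) / MWEIGHT_MAX;
}
```

With this scaling 100% maps to Mweight 255 and 0% maps to Mweight 0,
which is the "no shared bandwidth" case.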

However, I'm not sure what to do about membw.min_bw. Mweight = 0 means
it cannot use any of the shared unreserved bandwidth pool. If
resctrl_get_default_ctrlval() is designed to mean "no throttling", then
it seems like the membw.min_bw would need to be 255. But that feels
weird for the min_bw value to be equal to the max weight for unreserved
bandwidth.

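
A sketch of what keeping the default separate from the minimum could
look like, along the lines of the "default field" idea quoted earlier
in this thread. The struct and function names here are hypothetical
stand-ins, not the kernel's struct resctrl_membw or the real
resctrl_get_default_ctrlval():

```c
#include <stdint.h>

/* Hypothetical stand-in for a scalar control description. */
struct sketch_scalar_ctrl {
	uint32_t min_val;	/* smallest value user space may write */
	uint32_t max_val;	/* largest value user space may write */
	uint32_t default_val;	/* value given to a newly created group */
};

/* The default is read from its own field instead of min or max. */
static uint32_t sketch_default_ctrlval(const struct sketch_scalar_ctrl *c)
{
	return c->default_val;
}

/*
 * Mweight: 0 means no share of the unreserved pool and 255 is the
 * largest weight, so the default is 255 while the minimum stays 0.
 */
static const struct sketch_scalar_ctrl sketch_mweight = {
	.min_val = 0,
	.max_val = 255,
	.default_val = 255,
};

/*
 * Rbwb: at least 1 block per RCID, and the default is also 1.
 * max_blocks is a caller-supplied, platform-specific limit.
 */
static struct sketch_scalar_ctrl sketch_rbwb_ctrl(uint32_t max_blocks)
{
	struct sketch_scalar_ctrl c = {
		.min_val = 1,
		.max_val = max_blocks,
		.default_val = 1,
	};

	return c;
}
```

This way min_val keeps its plain meaning for both controls and only
default_val differs between them.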
> Something that may be of interest is a proposal that Chenyu is refining to address an
> issue with the region-aware MBA support where there is no intuitive backward compatible
> interface. This was highlighted in the plumbers slides (see slide titled "Open: maintaining
> backward compatibility when region aware"). The current idea to deal with this is to
> introduce a "mode" associated with the resource controls. For example,
>
> # cat /sys/fs/resctrl/info/MB/resource_schemata/mode
> [legacy] native
>
> By default the "legacy" mode will be enabled and exposes the "MB" default control to user
> space via the schemata file. In support of this each new control has a new property file
> named "status" that can have value "enabled" or "disabled". Only "enabled" controls are
> present in the schemata file but all controls are always present in the resource_schemata
> directory. By writing to the "mode" file user space acknowledges familiarity with the new
> "resource_schemata" based interface and can change the status of a control and
> thus manage its visibility in the schemata file.
> Could something like this work for CBQRI?

Yes, I think that would work. There are no existing users of resctrl on
RISC-V so I think having users opt into this resource_schemata interface
would work, especially if that allows a truer representation of the
controls in the CBQRI spec.

Thanks,
Drew