Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and context switch handling

From: Ben Horgan

Date: Thu Feb 19 2026 - 05:22:31 EST


Hi Reinette,

On 2/17/26 18:51, Reinette Chatre wrote:
> Hi Ben,
>
> On 2/16/26 7:18 AM, Ben Horgan wrote:
>> On Thu, Feb 12, 2026 at 10:37:21AM -0800, Reinette Chatre wrote:
>>> On 2/12/26 5:55 AM, Ben Horgan wrote:
>>>> On Wed, Feb 11, 2026 at 02:22:55PM -0800, Reinette Chatre wrote:
>>>>> On 2/11/26 8:40 AM, Ben Horgan wrote:
>>>>>> On Tue, Feb 10, 2026 at 10:04:48AM -0800, Reinette Chatre wrote:
>
>>>>>>> It looks like MPAM has a few more capabilities here and the Arm levels are numbered differently
>>>>>>> with EL0 meaning user space. We should thus aim to keep things as generic as possible. For example,
>>>>>>> instead of CPL0 using something like "kernel" or ... ?
>>>>>>
>>>>>> Yes, PLZA does open up more possibilities for MPAM usage. I've talked to James
>>>>>> internally and here are a few thoughts.
>>>>>>
>>>>>> If the use case is just an option to run all tasks with the same closid/rmid
>>>>>> (partid/pmg) configuration when they are running in the kernel then I'd favour a
>>>>>> mount option. The resctrl filesystem interface doesn't need to change and
>>>>>
>>>>> I view mount options as an interface of last resort. Why would a mount option be needed
>>>>> in this case? The existence of the file used to configure the feature seems sufficient?
>>>>
>>>> If we are taking away a closid from the user then the number of CTRL_MON groups
>>>> that can be created changes. It seems reasonable for user-space to expect
>>>> num_closid to be a fixed value.
>>>
>>> I do not see why we need to take away a CLOSID from the user. Consider a user space that
>>
>> Yes, just slightly simpler to take away a CLOSID but could just go with the
>> default CLOSID is also used for the kernel. I would be ok with a file saying the
>> mode, like the mbm_event file does for counter assignment. It is slightly misleading
>> that a configuration file is under info but necessary as we don't have another
>> location global to the resctrl mount.
>
> Indeed, the "info" directory has evolved more into a "config" directory.
>
>>> runs with just two resource groups, for example, "high priority" and "low priority", it seems
>>> reasonable to make it possible to let the "low priority" tasks run with "high priority"
>>> allocations when in kernel space without needing to dedicate a new CLOSID? More reasonable
>>> when only considering memory bandwidth allocation though.
>>>
>>>>
>>>>>
>>>>> Also ...
>>>>>
>>>>> I do not think resctrl should unnecessarily place constraints on what the hardware
>>>>> features are capable of. As I understand, both PLZA and MPAM support use cases where
>>>>> tasks may use different CLOSID/RMID (PARTID/PMG) when running in the kernel. Limiting
>>>>> this to only one CLOSID/PARTID seems like an unmotivated constraint to me at the moment.
>>>>> This may be because I am not familiar with all the requirements here so please do
>>>>> help with insight on how the hardware feature is intended to be used as it relates
>>>>> to its design.
>>>>>
>>>>> We have to be very careful when constraining a feature this much. If resctrl does something
>>>>> like this it essentially restricts what users could do forever.
>>>>
>>>> Indeed, we don't want to unnecessarily restrict ourselves here. I was hoping a
>>>> fixed kernel CLOSID/RMID configuration option might just give all we need for
>>>> usecases we know we have and be minimally intrusive enough to not preclude a
>>>> more featureful PLZA later when new usecases come about.
>>>
>>> Having ability to grow features would be ideal. I do not see how a fixed kernel CLOSID/RMID
>>> configuration leaves room to build on top though. Could you please elaborate?
>>
>> If we initially go with a single new configuration file, e.g. kernel_mode, which
>> could be "match_user" or "use_root", this would be the only initial change to the
>> interface needed. If more usecases present themselves a new mode could be added,
>> e.g. "configurable", and an interface to actually change the rmid/closid for the
>> kernel could be added.
>
> Something like this could be a base to work from. I think only the two ("match_user" and
> "use_root") are a bit limiting for even the initial implementation though.
> As I understand, "use_root" implies using the allocations of the default group but
> does not indicate what MON group (which RMID/PMG) should be used to monitor the
> work done in kernel space. A way to specify the actual group may be needed?

Yeah, I'm not sure that flexibility is strictly necessary but it will make
the interface easier to use.
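
To make the modes concrete, here is a rough user-space model in Python (not
kernel code) of how the proposed kernel_mode values could pick the CLOSID/RMID
used while a task runs in the kernel. The Ids type, ROOT and kernel_ids() are
names made up for this sketch, and using the default group's RMID for
"use_root" is only an assumption, since which MON group to use there is the
open question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ids:
    """A CLOSID/RMID pair (PARTID/PMG on MPAM)."""
    closid: int
    rmid: int


# Assumption for illustration: the default (root) group uses CLOSID 0 / RMID 0.
ROOT = Ids(closid=0, rmid=0)


def kernel_ids(mode: str, user: Ids, configured: Optional[Ids] = None) -> Ids:
    """Return the IDs a task would run with in the kernel for a given mode."""
    if mode == "match_user":
        # Kernel work is accounted exactly like the task's user space work.
        return user
    if mode == "use_root":
        # Assumption: both allocation and monitoring fall back to the root group.
        return ROOT
    if mode == "configurable":
        # Possible later extension: an explicitly chosen kernel group.
        if configured is None:
            raise ValueError("configurable mode needs an explicit kernel group")
        return configured
    raise ValueError(f"unknown kernel_mode: {mode!r}")
```

With a way to specify the actual group, "use_root" would just be the
"configurable" case with the default group pre-selected.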

>
>>> I wonder if the benefit of the fixed CLOSID/RMID is perhaps mostly in the cost of
>>> context switching which I do not think is a concern for MPAM but it may be for PLZA?
>>>
>>> One option to support fixed kernel CLOSID/RMID at the beginning and leave room to build
>>> may be to create the kernel_group or "tasks_kernel" interface as a baseline but in first
>>> implementation only allow user space to write the same group to all "kernel_group" files or
>>> to only allow to write to one of the "tasks_kernel" files in the resctrl fs hierarchy. At
>>> that time the associated CLOSID/RMID would become the "fixed configuration" and attempts to
>>> write to others can return "ENOSPC"?
>>
>> I think we'd have to be sure of the final interface if we go this way.
>
> I do not think we should aim to know the final interface since that requires knowing all future
> hardware features and their implementations in advance. Instead we should aim to have something
> that we can build on that is accompanied by documentation that supports future flexibility (some may
> refer to this as "weasel words").

Makes sense.
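
For reference, the restricted baseline suggested further up (the first group
written becomes the fixed configuration and writes naming any other group
fail with ENOSPC) could be modelled roughly as below. This is a Python sketch
of the semantics only; KernelGroupPolicy and write_kernel_group() are
hypothetical names and nothing here reflects an actual implementation.

```python
import errno


class KernelGroupPolicy:
    """Model: the first group written becomes the fixed kernel configuration."""

    def __init__(self):
        self.fixed_group = None  # no kernel group chosen yet

    def write_kernel_group(self, group: str) -> int:
        """Return 0 on success or a negative errno, like a kernfs write handler."""
        if self.fixed_group is None:
            # The first write fixes the CLOSID/RMID used for kernel work.
            self.fixed_group = group
            return 0
        if group == self.fixed_group:
            # Writing the same group to another "kernel_group" file is allowed.
            return 0
        # Any other group is rejected in this first implementation.
        return -errno.ENOSPC
```

Lifting the restriction later would only turn the ENOSPC case into a success,
so user space written against the baseline would keep working.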

>
>>> From what I can tell this still does not require to take away a CLOSID/RMID from user space
>>> though. Dedicating a CLOSID/RMID to kernel work can still be done but be in control of user
>>> that can, for example leave the "tasks" and "cpus" files empty.
>>>
>>>> One complication with the fixed kernel CLOSID/RMID option is that for x86 you
>>>> may want to be able to monitor a task's resource usage whether or not it is in
>>>> the kernel or userspace and so only have a fixed CLOSID. However, for MPAM this
>>>> wouldn't work as PMG (~RMID) is scoped to PARTID (~CLOSID).
>>>>
>>>>>
>>>>>> userspace software doesn't need to change. This could either take away a
>>>>>> closid/rmid from userspace and dedicate it to the kernel or perhaps have a
>>>>>> policy to have the default group as the kernel group. If you use the default
>>>>>
>>>>> Similar to above I do not see PLZA or MPAM preventing sharing of CLOSID/RMID (PARTID/PMG)
>>>>> between user space and kernel. I do not see a motivation for resctrl to place such
>>>>> constraint.
>>>>>
>>>>>> configuration, at least for MPAM, the kernel may not be running at the highest
>>>>>> priority as a minimum bandwidth can be used to give a priority boost. (Once we
>>>>>> have a resctrl schema for this.)
>>>>>>
>>>>>> It could be useful to have something a bit more featureful though. Is there a
>>>>>> need for the two mappings, task->cpl0 config and task->cpl1 config, to be independent or
>>>>>> would a single task->(cpl0 config, cpl1 config) mapping be sufficient? It seems awkward that
>>>>>> it's not a single write to move a task. If a single mapping is sufficient, then
>>>>>
>>>>> Moving a task in x86 is currently two writes by writing the CLOSID and RMID separately.
>>>>> I think the MPAM approach is better and there may be opportunity to do this in a similar
>>>>> way and both architectures use the same field(s) in the task_struct.
>>>>
>>>> I was referring to the userspace file write but unifying on the same fields in
>>>> task_struct could be good. The single write is necessary for MPAM as PMG is
>>>> scoped to PARTID and I don't think x86 behaviour changes if it moves to the same
>>>> approach.
>>>>
>>>
>>> ah - I misunderstood. You are suggesting to have one file that user writes to
>>> to set both user space and kernel space CLOSID/RMID? This sounds like what the
>>
>> Yes, the kernel_groups idea does partially have this as once you've set the
>> kernel_group for a CTRL_MON or MON group then the user space configuration
>> dictates the kernel space configuration. As you pointed out, this is also
>> a drawback of the kernel_groups idea.
>>
>>> existing "tasks" file does but only supports the same CLOSID/RMID for both user
>>> space and kernel space. To support the new hardware features where the CLOSID/RMID
>>> can be different we cannot just change "tasks" interface and would need to keep it
>>> backward compatible. So far I assumed that it would be ok for the "tasks" file
>>> to essentially get new meaning as the CLOSID/RMID for just user space work, which
>>> seems to require a second file for kernel space as a consequence? So far I have
>>> not seen an option that does not change meaning of the "tasks" file.
>>
>> Would it make sense to have some new type of entries in the tasks file,
>> e.g. k_ctrl_<pid>, k_mon_<pid> to say, in the kernel, use the closid of this
>> CTRL_MON for this task pid or use the rmid of this CTRL_MON/MON group for this task
>> pid? We would still probably need separate files for the cpu configuration.
>
> I am obligated to nack such a change to the tasks file since it would impact any
> existing user space parsing of this file.
>

Good to know. Do you consider the format of the tasks file fully fixed?
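
I can see the problem. As a sketch of what I assume a typical user space
reader of the tasks file looks like (one decimal PID per line, parse_tasks()
is a made-up helper), the k_ctrl_<pid> style entries I floated would make it
fail outright:

```python
def parse_tasks(text):
    """Parse the contents of a resctrl "tasks" file: one decimal PID per line."""
    return [int(line) for line in text.splitlines() if line.strip()]


# Today's format parses cleanly.
pids = parse_tasks("101\n202\n")

# A hypothetical prefixed entry is not a decimal PID, so int() raises.
try:
    parse_tasks("101\nk_ctrl_202\n")
    broke_parser = False
except ValueError:
    broke_parser = True
```
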

Thanks,

Ben