Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and context switch handling

From: Ben Horgan

Date: Tue Feb 17 2026 - 10:57:15 EST


Hi Babu,

On 2/16/26 22:52, Moger, Babu wrote:
> Hi Ben,
>
> On 2/16/2026 9:41 AM, Ben Horgan wrote:
>> Hi Babu, Reinette,
>>
>> On 2/14/26 00:10, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 2/13/26 8:37 AM, Moger, Babu wrote:
>>>> Hi Reinette,
>>>>
>>>> On 2/10/2026 10:17 AM, Reinette Chatre wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On 1/28/26 9:44 AM, Moger, Babu wrote:
>>>>>>
>>>>>>
>>>>>> On 1/28/2026 11:41 AM, Moger, Babu wrote:
>>>>>>>> On Wed, Jan 28, 2026 at 10:01:39AM -0600, Moger, Babu wrote:
>>>>>>>>> On 1/27/2026 4:30 PM, Luck, Tony wrote:
>>>>>>>> Babu,
>>>>>>>>
>>>>>>>> I've read a bit more of the code now and I think I understand more.
>>>>>>>>
>>>>>>>> Some useful additions to your explanation.
>>>>>>>>
>>>>>>>> 1) Only one CTRL group can be marked as PLZA
>>>>>>>
>>>>>>> Yes. Correct.
>>>>>
>>>>> Why limit it to one CTRL_MON group and why not support it for MON
>>>>> groups?
>>>>
>>>> There can be only one PLZA configuration in a system. The values in
>>>> the MSR_IA32_PQR_PLZA_ASSOC register (RMID, RMID_EN, CLOSID,
>>>> CLOSID_EN) must be identical across all logical processors. The only
>>>> field that may differ is PLZA_EN.
>>
>> Does this have any effect on hypervisors?
>
> Because the hypervisor runs at CPL0, there could be some use case. I have
> not completely understood that part.
>
>>
>>>
>>> ah - this is a significant part that I missed. Since this is a per-
>>> CPU register it seems
>>
>> I also missed that.
>>
>>> to have the ability for expanded use in the future where different
>>> CLOSID and RMID may be
>>> written to it? Is PLZA leaving room for such future enhancement or
>>> does the spec contain
>>> the text that states "The values in the MSR_IA32_PQR_PLZA_ASSOC
>>> register (RMID, RMID_EN,
>>> CLOSID, CLOSID_EN) must be identical across all logical processors."?
>>> That is, "forever
>>> and always"?
>>>
>>> If I understand correctly MPAM could have different PARTID and PMG
>>> for kernel use so we
>>> need to consider these different architectural behaviors.
>>
>> Yes, MPAM has a per-cpu register MPAM1_EL1.
>>
>
> oh ok.
>
>>>
>>>> I was initially unsure which RMID should be used when PLZA is
>>>> enabled on MON groups.
>>>>
>>>> After re-evaluating, enabling PLZA on MON groups is still feasible:
>>>>
>>>> 1. Only one group in the system can have PLZA enabled.
>>>> 2. If PLZA is enabled on CTRL_MON group then we cannot enable PLZA
>>>> on MON group.
>>>> 3. If PLZA is enabled on the CTRL_MON group, then the CLOSID and
>>>> RMID of the CTRL_MON group can be written.
>>>> 4. If PLZA is enabled on a MON group, then the CLOSID of the
>>>> CTRL_MON group can be used, while the RMID of the MON group can be
>>>> written.
>>
>> Given that CLOSID and RMID are fixed once in the PLZA configuration
>> could this be simplified by just assuming they have the values of the
>> default group, CLOSID=0 and RMID=0, and let the user base their
>> configuration on that?
>>
>
> I didn't understand this question. There are 16 CLOSIDs and 1024 RMIDs.
> We can use any one of these to enable PLZA.  It is not fixed in that sense.

Sorry, I wasn't clear. What I'm trying to understand is what you gain by
this flexibility. The CLOSID and RMID values are just identifiers within
the hardware; their only meaning comes from the grouping and the
controls/monitors set up by resctrl (or any other software interface).
Given that, would you lose anything by just saying the PLZA group has
CLOSID=0 and RMID=0? Is there value in changing the PLZA CLOSID and
RMID, or can the same effect be achieved by just changing the resctrl
configuration?

I was also wondering if using the default group this way would mean that
you wouldn't need to reserve the group for only kernel use.
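
To make the question concrete, here is a toy model of what I mean. It is
not resctrl code: the struct, the function names and the single
"bandwidth percent" control are all made up for illustration, and only
the count of 16 CLOSIDs is taken from your reply. It compares pointing
PLZA at a different CLOSID with keeping PLZA at CLOSID 0 and rewriting
the configuration behind CLOSID 0.

```c
#include <stdint.h>

#define NUM_CLOSID 16	/* CLOSID count quoted earlier in this thread */

/* Hypothetical model: one bandwidth limit (percent) per CLOSID. */
struct toy_ctrl {
	uint8_t mba_percent[NUM_CLOSID];
};

/* The limit that CPL0 execution ends up with for a given PLZA CLOSID. */
static uint8_t plza_effective_limit(const struct toy_ctrl *c,
				    uint32_t plza_closid)
{
	return c->mba_percent[plza_closid % NUM_CLOSID];
}

/* Option A: leave the configuration alone, retarget the PLZA CLOSID. */
static uint8_t option_retarget(const struct toy_ctrl *c, uint32_t new_closid)
{
	return plza_effective_limit(c, new_closid);
}

/* Option B: PLZA CLOSID stays 0, rewrite the control behind CLOSID 0. */
static uint8_t option_reconfigure(struct toy_ctrl *c, uint8_t wanted_percent)
{
	c->mba_percent[0] = wanted_percent;
	return plza_effective_limit(c, 0);
}
```

With mba_percent[5] set to 40, option_retarget(&c, 5) and
option_reconfigure(&c, 40) both return 40, so in this model kernel mode
sees the same control either way. What the model leaves out is that user
tasks in the default group also run with CLOSID 0, so option B changes
their allocation as well.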

>
>
>>>>
>>>> I am thinking this approach should work.
>>>>
>>>>>
>>>>> Limiting it to a single CTRL group seems restrictive in a few ways:
>>>>> 1) It requires that the "PLZA" group has a dedicated CLOSID. This
>>>>> reduces the
>>>>>      number of use cases that can be supported. Consider, for
>>>>> example, an existing
>>>>>      "high priority" resource group and a "low priority" resource
>>>>> group. The user may
>>>>>      just want to let the tasks in the "low priority" resource
>>>>> group run as "high priority"
>>>>>      when in CPL0. This of course may depend on what resources are
>>>>> allocated, for example
>>>>>      cache may need more care, but if, for example, user is only
>>>>> interested in memory
>>>>>      bandwidth allocation this seems a reasonable use case?
>>>>> 2) Similar to what Tony [1] mentioned this does not enable what the
>>>>> hardware is
>>>>>      capable of in terms of number of different control groups/
>>>>> CLOSID that can be
>>>>>      assigned to MSR_IA32_PQR_PLZA_ASSOC. Why limit PLZA to one
>>>>> CLOSID?
>>>>> 3) The feature seems to support RMID in MSR_IA32_PQR_PLZA_ASSOC
>>>>> similar to
>>>>>      MSR_IA32_PQR_ASSOC. With this, it should be possible for user
>>>>> space to, for
>>>>>      example, create a resource group that contains tasks of
>>>>> interest and create
>>>>>      a monitor group within it that monitors all tasks' bandwidth
>>>>> usage when in CPL0.
>>>>>      This will give user space better insight into system behavior
>>>>> and from what I can
>>>>>      tell is supported by the feature but not enabled?
>>>>
>>>>
>>>> Yes, as long as PLZA is enabled on only one group in the entire system
>>>>
>>>>>
>>>>>>>
>>>>>>>> 2) It can't be the root/default group
>>>>>>>
>>>>>>> This is something I added to keep the default group in an
>>>>>>> undisturbed,
>>>>>
>>>>> Why was this needed?
>>>>>
>>>>
>>>> With the new approach mentioned above we can enable it in the default
>>>> group also.
>>>>
>>>>>>>
>>>>>>>> 3) It can't have sub monitor groups
>>>>>
>>>>> Why not?
>>>>
>>>> Ditto. With the new approach mentioned above we can enable it in the
>>>> default group also.
>>>>
>>>>>
>>>>>>>> 4) It can't be pseudo-locked
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>>
>>>>>>>> Would a potential use case involve putting *all* tasks into the
>>>>>>>> PLZA group? That
>>>>>>>> would avoid any additional context switch overhead as the PLZA
>>>>>>>> MSR would never
>>>>>>>> need to change.
>>>>>>>
>>>>>>> Yes. That can be one use case.
>>>>>>>
>>>>>>>>
>>>>>>>> If that is the case, maybe for the PLZA group we should allow
>>>>>>>> user to
>>>>>>>> do:
>>>>>>>>
>>>>>>>> # echo '*' > tasks
>>>>>
>>>>> Dedicating a resource group to "PLZA" seems restrictive while also
>>>>> adding many
>>>>> complications since this designation makes resource group behave
>>>>> differently and
>>>>> thus the files need to get extra "treatments" to handle this "PLZA"
>>>>> designation.
>>>>>
>>>>> I am wondering if it will not be simpler to introduce just one new
>>>>> file, for example
>>>>> "tasks_cpl0" in both CTRL_MON and MON groups. When user space
>>>>> writes a task ID to the
>>>>> file it "enables" PLZA for this task and that group's CLOSID and
>>>>> RMID is the associated
>>>>> task's "PLZA" CLOSID and RMID. This gives user space the
>>>>> flexibility to use the same
>>>>> resource group to manage user space and kernel space allocations
>>>>> while also supporting
>>>>> various monitoring use cases. This still supports the "dedicate a
>>>>> resource group to PLZA"
>>>>> use case where user space can create a new resource group with
>>>>> certain allocations but the
>>>>> "tasks" file will be empty and "tasks_cpl0" contains the tasks
>>>>> needing to run with
>>>>> the resource group's allocations when in CPL0.
>>>>
>>>> Yes. We should be able to do that. We need both tasks_cpl0 and cpus_cpl0.
>>>>
>>>> We need to make sure only one group can be configured in the system,
>>>> and not allow it in other groups when it is already enabled.
>>>
>>> As I understand this means that only one group can have content in its
>>> tasks_cpl0/tasks_kernel file. There should not be any special
>>> handling for
>>> the remaining files of the resource group since the resource group is
>>> not
>>> dedicated to kernel work and can be used as a user space resource
>>> group also.
>>> If user space wants to create a dedicated kernel resource group there
>>> can be
>>> a new resource group with an empty tasks file.
>>>
>>> hmmm ... but if user space writes a task ID to a tasks_cpl0/
>>> tasks_kernel file then
>>> resctrl would need to create new syntax to remove that task ID.
>>>
>>> Possibly MPAM can build on this by allowing user space to write to
>>> multiple
>>> tasks_cpl0/tasks_kernel files? (and the next version of PLZA may too)
>>>
>>> Reinette
>>>
>>>
>>>>
>>>> Thanks
>>>> Babu
>>>>
>>>>>
>>>>> Reinette
>>>>>
>>>>> [1] https://lore.kernel.org/lkml/aXpgragcLS2L8ROe@agluck-desk3/
>>>>>
>>>>
>>>
>>>
>>
>> Thanks,
>>
>> Ben
>>
>>
>

Thanks,

Ben