Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem

From: Reinette Chatre

Date: Tue Mar 31 2026 - 18:27:29 EST


Hi Babu,

On 3/30/26 11:46 AM, Babu Moger wrote:
> On 3/27/26 17:11, Reinette Chatre wrote:
>> On 3/26/26 10:12 AM, Babu Moger wrote:
>>> On 3/24/26 17:51, Reinette Chatre wrote:
>>>> On 3/12/26 1:36 PM, Babu Moger wrote:

>>>>>        Tony suggested using global variables to store the kernel mode
>>>>>        CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>>>>        coming from rdtgroup structure with the new interface. Accessing
>>>>>        them requires holding the associated lock, which would make the
>>>>>        context switch path unnecessarily expensive. So, dropped the idea.
>>>>>        https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>>>>        Let me know if there are other ways to optimize this.
>>>> I do not see why the context switch path needs to be touched at all with this
>>>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>>>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>>>> info/kernel_mode_assignment?
>>> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID.  PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
>> Based on previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@xxxxxxx/
>> and this implementation all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
>> same for all CPUs on the system, not just per QoS domain. Could you please confirm?
>
> Sorry for the confusion. It is "per QoS domain".
>
> All the fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be set to the same value for all HW threads in the QOS domain for consistent operation (Per-QosDomain).

Thank you for clarifying. To build on this, what would be the best way for resctrl to interpret this?
As I see it, all values in PQR_PLZA_ASSOC apply to *all* resources, yet (theoretically?) every resource
can have domains that span different CPUs. There thus seems to be a built-in assumption of what a "domain"
means for PQR_PLZA_ASSOC, so it sounds to me as though, instead of saying that "PQR_PLZA_ASSOC needs
to be the same in QoS domain", it may be more accurate to, for example, say that "PQR_PLZA_ASSOC has L3 scope"?

This seems to be what this implementation does since it hardcodes the PQR_PLZA_ASSOC scope to the L3
resource, but that creates a dependency on the L3 resource that would make PLZA unusable if, for example,
the user boots with "rdt=!l3cat" while wanting to use PLZA to manage MBA allocations when in kernel?
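
The per-domain rule quoted above can be modelled in a few lines. This is only a
user-space sketch, not resctrl code: struct plza_state, cpu_plza[], NR_CPUS and
the three helpers are hypothetical names, and the struct is a software view of
the register, not its actual MSR bit layout. It shows the shared fields being
written for every CPU in a domain mask while plza_en stays a per-CPU choice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical small system for the model */

/* Hypothetical software view of one CPU's PQR_PLZA_ASSOC, not the MSR layout. */
struct plza_state {
	uint32_t closid;
	uint32_t rmid;
	bool plza_en;
};

static struct plza_state cpu_plza[NR_CPUS];

/* Write the shared fields on every CPU in @cpu_mask; plza_en is left alone. */
static void domain_set_assoc(uint32_t cpu_mask, uint32_t closid, uint32_t rmid)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(cpu_mask & (1u << cpu)))
			continue;
		cpu_plza[cpu].closid = closid;
		cpu_plza[cpu].rmid = rmid;
	}
}

/* plza_en is the only field that may differ between CPUs of a domain. */
static void cpu_set_plza_en(int cpu, bool en)
{
	cpu_plza[cpu].plza_en = en;
}

/* True when all CPUs in @cpu_mask agree on closid and rmid. */
static bool domain_consistent(uint32_t cpu_mask)
{
	const struct plza_state *first = NULL;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(cpu_mask & (1u << cpu)))
			continue;
		if (!first)
			first = &cpu_plza[cpu];
		else if (cpu_plza[cpu].closid != first->closid ||
			 cpu_plza[cpu].rmid != first->rmid)
			return false;
	}
	return true;
}
```

Which CPU mask gets passed in is exactly the open question here: the model does
not care whether the mask comes from an L3 domain or from some other scope.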

...

> Yes, I agree with your concerns. The goal here is to make the interface less disruptive while still addressing the different use cases.

I consider changing resctrl behavior when values are written to existing resctrl files
to be disruptive. This is something we explicitly discussed during v1 as something to
be avoided so this implementation that overloads the tasks file again is unexpected.

>      Background: Customers have identified an issue with the QoS
>      Bandwidth Control feature: when a CLOS is aggressively throttled
>      and execution transitions into kernel mode, kernel operations are
>      also subject to the same aggressive throttling.
>
> Privilege-Level Zero Association (PLZA) allows a user to specify a
> COS and/or RMID to be used during execution at Privilege Level Zero.
> When PLZA is enabled on a hardware thread, any execution that enters
> Privilege Level Zero will have its transactions associated with the
> PLZA COS and/or RMID. Otherwise, the thread continues to use the COS
> and RMID specified by PQR_ASSOC. In other words, the hardware
> provides a dedicated COS and/or RMID specifically for kernel-mode
> execution.
ack.

>
> There are multiple ways this feature can be applied. For simplicity, the discussion below focuses only on CLOSID.
>
>
>      1. Global PLZA enablement
>
> PLZA can be configured as a global feature by setting PQR_PLZA_ASSOC.closid = CLOSID and PQR_PLZA_ASSOC.plza_en = 1 on all threads in the system. A dedicated CLOSID is reserved for this purpose,

Also discussed during v1 is that there is no need to dedicate a CLOSID for this purpose.
There could be an "unthrottled" CLOSID to which all high priority user space tasks as
well as all kernel work of all tasks are assigned.
If user space chooses to dedicate a CLOSID for kernel work then that should be supported and
the interface can allow that, but there is no need for resctrl to enforce this.

> and all CPU threads use its allocations whenever they enter Privilege Level Zero. This CLOSID does not need to be associated with any resctrl group.

The CLOSID has to be associated with a resource group to be able to manage its
resource allocations, no?

> The user can explicitly enable or disable this feature.
ack.

> There is no context switch overhead but there is no flexibility with this approach.

Flexibility is subjective. As I understand it, this supports the only use case we have learned about so far:
https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/

>      2. Group based PLZA allocation: PLZA is managed via a dedicated
>      resctrl group. A separate resctrl group can be created
>      specifically for PLZA, with a dedicated CLOSID used exclusively
>      for kernel mode execution. This approach can be further divided
>      into two association models:

So far this sounds like global allocation since both need a dedicated resource group.
Whether this group is dedicated to kernel work or shared between kernel and user space work
is up to the user. There is no motivation for why a CLOSID should ever be forced to be
exclusive to kernel mode execution.

>
> i) CPU based association
> CPUs are assigned to the PLZA group, and PLZA is enabled only on
> those CPUs. This effectively creates a dedicated PLZA group. MSRs
> (PQR_PLZA_ASSOC) are programmed only when the user changes CPU
> assignments. This approach requires no changes to the context switch
> code and introduces no additional context switch overhead.
>
> ii) Task based association
> Tasks are explicitly assigned by the user to the PLZA group. Tasks
> need to be updated when the user adds a new task. Also, this requires
> updates during task scheduling so that the MSRs (PQR_PLZA_ASSOC)
> are programmed on each context switch, which introduces additional
> context switch overhead.

As discussed during v1 any changes needed to support per task assignment would
need to be done with new files dedicated to this purpose. Do not overload the
existing resctrl tasks/cpus/cpus_list files.
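
For reference, the context switch cost described for task based association can
be pictured with a small user-space model. This is a sketch under assumptions,
not the patch's code: struct task, struct cpu_state and sched_in() are
hypothetical names, and the msr_writes counter stands in for an actual MSR
write. It caches the last value written per CPU so that a write only happens
when the incoming task's PLZA setting differs from what the CPU already has.

```c
#include <stdbool.h>

/* Hypothetical per-task flag: does this task run kernel work with PLZA on? */
struct task {
	bool plza_en;
};

/* Hypothetical per-CPU cache of the last value written to the register. */
struct cpu_state {
	bool cur_plza_en;
	unsigned long msr_writes;	/* stands in for real MSR writes */
};

/* Called on every context switch; only "writes" when the setting changes. */
static void sched_in(struct cpu_state *cpu, const struct task *next)
{
	if (cpu->cur_plza_en == next->plza_en)
		return;

	cpu->cur_plza_en = next->plza_en;
	cpu->msr_writes++;
}
```

Even with the cached comparison the check itself runs on every switch, which is
the overhead that the global and CPU based variants avoid.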

> I tried to fit these requirements into the interface files in
> /sys/fs/resctrl/info/. I may have missed a few things while trying to
> achieve it. As usual, I am open to discussion and
> recommendations.

Many of these items were already discussed as part of v1 so I think we may be
talking past each other here. I tried to highlight the relevant points raised
during v1 discussion that I thought there already was agreement on.

The one new aspect is that I assumed this implementation would only be for
global configuration and assignment. It looks like you want to support both
global configuration and per-task assignment. Originally I did not consider that
configuration and assignment could occur at different scopes, so we may need to come up
with new modes to distinguish them. Consider the addition of two modes as below:

# cat info/kernel_mode
[inherit_ctrl_and_mon]
global_assign_ctrl_inherit_mon_set_all
global_assign_ctrl_assign_mon_set_all
global_assign_ctrl_inherit_mon_set_individual
global_assign_ctrl_assign_mon_set_individual

The above introduces "set_all" and "set_individual" suffixes to the original two
modes.

global_assign_ctrl_inherit_mon_set_all
global_assign_ctrl_assign_mon_set_all:

The above are the original two modes, renamed to make it clear that when one of
these modes is activated _all_ tasks run with the assignment.

global_assign_ctrl_inherit_mon_set_individual
global_assign_ctrl_assign_mon_set_individual:

The above are two new modes. In these modes user space also assigns a resource
group globally but then needs to follow that up by activating every task
separately to run with this assignment.
One way in which this can be accomplished could be to have "kernel_mode_tasks",
"kernel_mode_cpus", and "kernel_mode_cpus_list" files become visible (or be
created) in the resource group found in info/kernel_mode_assignment. User
space interacts with the new files to set which tasks and/or CPUs run with
PLZA enabled.
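
To make the proposed mode list concrete, here is a minimal user-space sketch of
how the names could be parsed and shown with the active mode in brackets. The
mode strings are the ones listed above; enum kmode, kmode_names[],
kmode_parse(), kmode_show() and kmode_is_set_individual() are hypothetical
names used only for illustration and do not exist in resctrl.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical enumeration of the modes proposed for info/kernel_mode. */
enum kmode {
	KMODE_INHERIT_CTRL_AND_MON,
	KMODE_ASSIGN_CTRL_INHERIT_MON_SET_ALL,
	KMODE_ASSIGN_CTRL_ASSIGN_MON_SET_ALL,
	KMODE_ASSIGN_CTRL_INHERIT_MON_SET_INDIVIDUAL,
	KMODE_ASSIGN_CTRL_ASSIGN_MON_SET_INDIVIDUAL,
	KMODE_COUNT
};

static const char * const kmode_names[KMODE_COUNT] = {
	"inherit_ctrl_and_mon",
	"global_assign_ctrl_inherit_mon_set_all",
	"global_assign_ctrl_assign_mon_set_all",
	"global_assign_ctrl_inherit_mon_set_individual",
	"global_assign_ctrl_assign_mon_set_individual",
};

/* Map user input (optional trailing newline) to a mode, or -1 if unknown. */
static int kmode_parse(const char *buf)
{
	size_t len = strcspn(buf, "\n");

	for (int i = 0; i < KMODE_COUNT; i++) {
		if (strlen(kmode_names[i]) == len &&
		    !strncmp(buf, kmode_names[i], len))
			return i;
	}
	return -1;
}

/* Print one mode per line with the active one in brackets; -1 on overflow. */
static int kmode_show(char *out, size_t size, int active)
{
	size_t n = 0;

	for (int i = 0; i < KMODE_COUNT; i++) {
		int w = snprintf(out + n, size - n,
				 i == active ? "[%s]\n" : "%s\n",
				 kmode_names[i]);
		if (w < 0 || (size_t)w >= size - n)
			return -1;
		n += (size_t)w;
	}
	return (int)n;
}

/* Only the set_individual modes would need the per-group kernel_mode_* files. */
static bool kmode_is_set_individual(int mode)
{
	return mode == KMODE_ASSIGN_CTRL_INHERIT_MON_SET_INDIVIDUAL ||
	       mode == KMODE_ASSIGN_CTRL_ASSIGN_MON_SET_INDIVIDUAL;
}
```

A predicate like kmode_is_set_individual() is where the visibility of
"kernel_mode_tasks", "kernel_mode_cpus" and "kernel_mode_cpus_list" could be
decided, if those modes turn out to be needed at all.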

Even so, as I understand it, global_assign_ctrl_inherit_mon_set_all and
global_assign_ctrl_assign_mon_set_all address the only known use case. Do you know
if there are use cases for global_assign_ctrl_inherit_mon_set_individual and
global_assign_ctrl_assign_mon_set_individual? The latter two add significant
complexity to resctrl while I have not heard about any use case for them.

Reinette